Trusting STRING_SPLIT() order in Azure SQL Database


By:   |   Updated: 2021-11-11   |   Comments (1)   |   Related: More > TSQL


Problem

I recently wrote about splitting strings with OPENJSON() to support returning an ordinal position, and it seems like every time I write about this topic, something in our ecosystem changes. Well, something has changed in Azure SQL Database that is going to make a lot of people happy.

Solution

This is always a hot topic because folks need to break apart delimited strings, and often they need to rely on the output to be in the right order - or at least have some way to reliably determine the original sequence. As a result, some come up with all kinds of (usually costlier) alternatives to using STRING_SPLIT. Others believe STRING_SPLIT is already going to give the strings back in the same order as they appear in the original string; this likely stems from the documentation originally stating:

  • The sort order of the output rows matches the order of the substrings in the input string.

In November 2018, that string was replaced with the following:

  • The output rows might be in any order. The order is not guaranteed to match the order of the substrings in the input string.

Several people have argued with me that they can’t imagine why it would work any other way, and that applying ROW_NUMBER() over some constant (like NULL, or @@SPID, or @@TRANCOUNT) should guarantee that the row numbers are applied in input order. I have been reassured by SQL Server database engine experts that this is not a guarantee we should assume and is, in fact, why they changed the documentation in the first place: to warn us against making that assumption.

In a previous tip, A way to improve STRING_SPLIT in SQL Server - and you can help, I asked for your assistance in voting up a suggestion that would solve the problem:

The request is simple: add a column to the output of STRING_SPLIT so we can deterministically order by exactly the input order, or reverse order, or pluck the 3rd element, or match the elements to another list by ordinal position instead of by crossing our fingers.

There have been over 700 votes on that feedback item, but the only indication Microsoft has even seen it is that the status changed to "Under Review."

Then something strange happened. I recently downloaded the latest version of SQL Server Management Studio, and was testing the OPENJSON() function I showed in the previous tip on this topic, and I saw this:

IntelliSense not knowing right from wrong

Well, that’s silly, I thought, because of course STRING_SPLIT() takes exactly two arguments. When I hovered over the function name, the help tooltip showed me this third argument (emphasis mine):

IntelliSense way ahead of its time

enable_ordinal, huh? I thought for sure I was getting punked, and that someone messed with IntelliSense on me.

I tried this code:

SELECT * FROM STRING_SPLIT(N'foo,bar', N',', 1);

Predictably, it yielded:

Msg 8144, Level 16, State 3
Procedure or function STRING_SPLIT has too many arguments specified.

What was happening? Well, the next time I connected to an Azure SQL Database instance, I found out. I tried the query there and, lo and behold, the syntax was accepted by the engine (even if Azure Data Studio’s IntelliSense hasn’t been updated yet):

Cloud FTW

The estimates are still the same, and it still only supports single-character delimiters, but hey, I’ll take minor improvements anytime.

Conclusion

This is only currently available in Azure SQL Database, and I have not done any performance testing on this (either between STRING_SPLIT with or without the optional enable_ordinal argument, or against other functions that provide the ordinal output in other ways). But if performance is comparable in large-scale tests, this is going to simplify a lot of environments, especially when it makes it to the on-premises editions of SQL Server (and it will).

Next Steps

See these tips and other resources relating to splitting strings in SQL Server:






get scripts

next tip button



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

View all my tips


Article Last Updated: 2021-11-11

Comments For This Article




Friday, November 12, 2021 - 11:46:24 AM - Martin Smith Back To Top (89441)
I was happy to see that adding an `ORDER BY ordinal" did not add a sort to the execution plan.


download














get free sql tips
agree to terms