A way to improve STRING_SPLIT in SQL Server - and you can help
STRING_SPLIT has two glaring limitations that I've written about here before (SQL Server STRING SPLIT Limitations): there is no indication of input order, and the separator is limited to a single character. I think the SQL Server team can fix this not by changing or overloading the existing function but, rather, by adding a new function that addresses these shortcomings.
Every time I write anything about splitting strings in SQL Server, I promise myself that it will be the last one I need to write. I'm going to stop trying to fool myself with that promise because I will probably continue into retirement.
The inspiration behind this specific tip came to me as I was trying to work around another issue where order mattered (a list of price changes) and where existing workarounds don't solve it easily (since they rely on pattern matching, but prices can repeat).
Imagine needing to pivot this sequence of values in order:
257.55, 324.12, 326.55, 324.12, 337.45
To get this output, and guaranteed to be in the right order, is a lot more cumbersome and inefficient than it might seem:
324.12 -- repeat
I'm not going to solve that here, as that is an article of its own. My point here is that I once again found myself wishing that STRING_SPLIT could return a key so we could explicitly order the results when we care. I acknowledge that overloading a function is not trivial, and the risk to backward compatibility is high, so adding a parameter that has to be optional is not a path the product team is likely to take.
However, there is precedent for introducing new functions instead of changing or overloading existing ones: CONCAT. This function was first introduced in SQL Server 2012 as syntactic sugar for messy string concatenation operations. In SQL Server 2017, they added CONCAT_WS (WS = “with separator”) to address the need to define how to concat strings together. They didn't change the original function by adding a parameter, they just created a whole new function.
With this case in mind, why not add a function STRING_SPLIT_WK (WK = “with key”)? The output would simply add an additional column (“key”) that would indicate the sequence of the value in the original string.
So this call:
SELECT * FROM STRING_SPLIT_WK('257.55, 324.12, 326.55, 324.12, 337.45', ',');
Would yield this output:
It doesn't even have to guarantee the values would be returned in that
order, because I can explicitly say
ORDER BY [key]
(or add other functionality, like return only the first, last, or
As a bonus, they could also address the delimiter length limitation here, supporting 5, or 50, or 255 characters. Users could benefit from the longer delimiter and ignore the key, just like they can use CONCAT_WS for consistency but pass an empty string for the separator when they just want CONCAT behavior.
There's nothing you can do differently with this information on current versions of SQL Server. But if you'd like to see this functionality in the future (whether by overload or by a new function), please go vote, and state your business case in a comment, on this UserVoice item:
Read on for related tips and other resources involving string splitting:
- SQL Server 2016 STRING_SPLIT Function
- Parsing string data with the new SQL Server 2016 STRING_SPLIT function
- Dealing with the single-character delimiter in SQL Server's STRING_SPLIT function
- SQL Server Split String Replacement Code with STRING_SPLIT
- SQL Server STRING SPLIT Limitations
- Solve old problems with SQL Server's new STRING_AGG and STRING_SPLIT functions
- SQL Server CLR and SQL split functions to parse a delimited string
- Splitting Delimited Strings Using XML in SQL Server
- Removing Duplicates from Strings in SQL Server
- Comparing string splitting / concatenation methods
About the author
View all my tips
Article Last Updated: 2020-08-11