There are several tips and articles on the internet that discuss how to split a delimited list into multiple rows. This tip shows two approaches to this problem a T-SQL function and a CLR function.
The following examples show two different approaches to splitting a delimited list into multiple rows. Both approaches return the same result set, but this tip will show the different techniques to deploy the functions.
There are several versions of a T-SQL approach that you can find on the internet, but here is one that parses a string based on a delimiter and returns one row per value. The code has been put together to show you how this is done versus finding the most optimal approach for this problem.
Just copy the code below and execute it in a query window.
CREATE FUNCTION [dbo].[fnParseStringTSQL] (@string NVARCHAR(MAX),@separator NCHAR(1))
RETURNS @parsedString TABLE (string NVARCHAR(MAX))
DECLARE @position int
SET @position = 1
SET @string = @string + @separator
WHILE charindex(@separator,@string,@position) <> 0
INSERT into @parsedString
SELECT substring(@string, @position, charindex(@separator,@string,@position) - @position)
SET @position = charindex(@separator,@string,@position) + 1
This can be done either manually or you can do this using Visual Studio. If you deploy from Visual Studio a lot of these steps are simplified.
Step 1 - CLR code
Before we get started the first thing that needs to be done is to enable the CLR on your SQL Server. This can be done by using the SQL Server Surface Area Configuration tool. Refer to this tip CLR String Sort Function in SQL Server 2005 for more information.
Copy and save the VB.Net code below in a file called: C:\fnParseString.vb or whatever you prefer.
Partial Public Class UserDefinedFunctions
TableDefinition:="StringCol NVARCHAR(MAX)")> _
Public Shared Function parseStringCLR(<SqlFacet(MaxSize:=-1)> ByVal Input As String, _
ByVal Separator As Char) As IEnumerable
Dim Result() As String
Result = Input.Split(Separator)
Public Shared Sub GetNextToken(ByVal row As Object, ByRef TheToken As String)
TheToken = CStr(row)
Step 2 - Compile CLR Code - (You only need to do this if you are not using Visual Studio to develop and deploy)
In order to use this code, the code has to be compiled first to create a DLL.
The following command is run from a command line to compile the CLR code using the vbc.exe application. This is found in the .NET 2.0 framework directory. This may be different on your server or desktop. Also, this code should be compiled on the machine where the code will run.
The code should now be compiled in a file called: C:\fnParseString.dll
Step 3 - Create Assembly and Function - (You only need to do this if you are not using Visual Studio to develop and deploy)
After the code has been compiled you need to create the assembly and the function with SQL Server. To do this, run these commands in the database where you want to create the function.
For the function you will see three components that are referenced CLRFunctions.UserDefinedFunctions .parseStringCLR .
CLRFunctions - the assembly reference
UserDefinedFunctions - the class reference in the VB code
parseStringCLR - the function reference in the VB code
CREATE ASSEMBLY CLRFunctions FROM 'C:\fnParseStringCLR.dll'
CREATE FUNCTION [dbo].fnParseStringCLR(@string [nvarchar](4000), @separator [nchar](1))
RETURNS TABLE (
[StringCol] [nvarchar](max) NULL
) WITH EXECUTE AS CALLER
EXTERNAL NAME CLRFunctions.UserDefinedFunctions.parseStringCLR
Step 4 - Test It
To test the functions, run the following SELECT statements.
SELECT * FROM dbo.fnParseStringTSQL('SQL Server 2000|SQL Server 2005|SQL Server 2008|SQL Server 7.0','|')
SELECT * FROM dbo.fnParseStringCLR('SQL Server 2000|SQL Server 2005|SQL Server 2008|SQL Server 7.0','|')
SELECT * FROM dbo.fnParseStringTSQL('Apple,Banana,Pear',',')
SELECT * FROM dbo.fnParseStringCLR('Apple,Banana,Pear',',')
Here is the output from the above queries:
Result set for the first two queries
Result set for the second two queries
Here is another test that has over 1000 delimited items. With this amount of data we can see that the CLR code is faster. Although each time you run this you may get different execution times, in my tests the CLR was always almost three times faster.
A reader asked how these functions can be used against data that exists in a table. Here is a simple example of populating a table and then using these functions:
CREATE TABLE #temp (id int, stringData varchar(100))
INSERT INTO #temp VALUES (1,'SQL Server 2000,SQL Server 2005,SQL Server 2008,SQL Server 7.0,SQL Server 2012,SQL Server 2008 R2')
INSERT INTO #temp VALUES (2,'Apple,Banana,Pear')
SELECT y.id, fn.string
FROM #temp AS y CROSS APPLY dbo.fnParseStringTSQL(y.stringData, ',') AS fn
DROP TABLE #temp
Summary With the above you can see the two different approaches to tackling this issue. For the most part you are probably fine with either code set, but do some testing to see which approach works best.
Give this example a try and see what other functions you could write that could take advantage of the CLR
If you have CLR functions that you want to share with the rest of the MSSQLTips.com community, please send them to firstname.lastname@example.org so we can post them for others to use.
If you don't know how to write either VB or C# now is the time to begin learning. You will find a lot of things these languages will make a lot easier to implement.
Do some testing and also take on other functions that you have written in T-SQL to see if they can run any faster in the CLR
You can also use CTEs under SQL Server 2005 to do the same thing.
The advantage is that you can specify a JOIN on a CTE :
DECLARE @str VARCHAR(64)
SET @str = 'toto,titi,tutu,tata';
WITH CTE (Deb, Fin) AS
SELECT 1 Deb, CHARINDEX(',', @str + ',') Fin
SELECT Fin + 1, CHARINDEX(',', @str + ',', Fin + 1)
WHERE CHARINDEX( ',', @str + ',', Fin + 1 ) > 0
SELECT SUBSTRING(@str, Deb , Fin - Deb), Deb, Fin
DECLARE @String VARCHAR(64) SET @String = 'un,deux,trois,quatre,cinq';
DECLARE @Separator CHAR(1) SET @Separator = ',';
WITH INDICES AS
SELECT 0 Deb, 1 Fin
SELECT Fin, CHARINDEX(@Separator, @String, Fin) + LEN(@Separator)
WHERE Fin > Deb
WHEN Fin > LEN(@Separator) THEN Fin - Deb - LEN(@Separator)
ELSE LEN(@String) - Deb + 1
LEN(@String) - Deb + 1 Lgr
FROM INDICES WHERE Deb > 0
As in most examples I have seen, this works great as long as you have a variable or supply a delimited string for it to work on. I'm sure you know what RBAR means, and I'm trying to avoid that by diong set related processing, sin instead of typing in a string, what if I have a table with onle colum in it and 2000 entries of string text I want to process? All the examples I have tried return one recors, I want to iterate (try not to use the word loop) through all 2000 entries and insert the results into a table, but I can't get to work, and there are no examples of this, but hundreds of examples on how to do just one string.
From Jeff Moden's efforts, this is the best performing UDF I have seen which uses a Numbers table
@ListVARCHAR(8000) ,@DelimiterCHAR(1) )RETURNSTABLE WITHSCHEMABINDING AS RETURN --this UNIONs a "1" (first element) with n+1 values (subsequent elements) relative to delimiter positions --eliminates need to prepend a delimiter WITHElementStartCTE AS ( SELECT 1 ASStart UNIONALL SELECT n+1 FROM dbo.NumbersWITH (NOLOCK) WHERE n<=DATALENGTH(@List) ANDSUBSTRING(@List,n,1)=@Delimiter )
--ISNULL\NULLIF logic handles last element eliminating need to append delimiter
,ElementStartAndLengthCTE AS ( SELECT Start ,ISNULL(NULLIF(CHARINDEX(@Delimiter,@List,Start),0)-Start,DATALENGTH(@List))ASLength FROM ElementStartCTE )
SUBSTRING(@List,Start,Length)ASItem FROM ElementStartAndLengthCTE;
Thursday, January 17, 2013 - 1:38:41 PM - Steven Willis
In a cooperative effort over the last few years Jeff Moden over at SQLServerCentral has pretty much put this issue to rest.
http://www.sqlservercentral.com/articles/Tally+Table/72993/ ... go to the bottom of the article and download all the code.
The function DelimitedSplit8K has been objectively performance tested against many other non-CLR versions. Jeff Moden even provides the testing script and procedures for anyone to use. The latest version of the function does not require an external numbers/tally table. I have personally tested this and competing versions of splitter functions using Jeff's test harness and DelimitedSplit8K cannot be beat without using a CLR.
By the way, Jeff also provides the code for a splitter CLR (and test results) and the CLR is by far the fastest splitter method. But if one is working on servers without full sysadmin control it is often not possible to build and run CLRs. Both functions produce exactly the same output columns so they can be used interchageably.
The DelimitedSplit8K function's only limitation is that the string being parsed must be be less than VARCHAR(8000) or NVARCHAR(4000). Splitting a value that requires a VARCHAR(MAX) datatype will kill performance and there are some other splitters that handle those rare situations better, though splitting such blobs will always be a performance problem. The function is also optimized for a single delimiter only with values of any size including random sizes. For strings with multiple delimiters (i.e., 2-dimenisonal) there is a variation of DelimitedSplit8K that has been shown to perform faster than trying to do multiple CROSS APPLYs with the same query. Details on that variation can be found here: http://www.sqlservercentral.com/Forums/FindPost1401102.aspx
DelimitedSplit8K is an iTVF and thus produces a virtual table as output. As in the example in a previous post the function can be used by using a JOIN or a CROSS APPLY.