SQL Server CLR function to concatenate values in a column

By:   |   Comments (3)   |   Related: 1 | 2 | > Common Runtime Language


Problem

Normalization is an important principal in database design, but it's often the opposite of what's needed in a good report. If a column is normalized, but the user really wants to see the values as a short comma separated list, how can I write a query that produces the list? Concatenating the values in a column would be pretty easy if SQL Server had a concatenate aggregate function, which it doesn't. What's more, for efficiency sake it's important to write the reporting queries without using cursors.

Solution

SQL Server 2005 introduced user-defined aggregate functions that can be written in the SQLCLR. These are great when you need an aggregate that SQL Server doesn't supply. The funny thing is that we don't have to write the Concatenate aggregate. Microsoft has supplied it in the sample code that comes with SQL Server 2005.

During the installation of SQL Server there's an option to install Sample Code. If you select that option a generous quantity of samples are added to your system. If you're installing to the default location on the C: drive you'll find the samples in the directory "C:\Program Files\Microsoft SQL Server\90\Samples\". The sample with the Concatenate aggregate is in the directory "Engine\Programmability\CLR\StringUtilities\" under Samples. The SQLCLR samples come in both C# and Visual Basic.Net. For this tip I'll be using the VB version in the VB subdirectory. If you prefer C#, the code in the CS subdirectory has the same functionality.

This samples rely on a strong key to identify the code to SQL Server. To compile the project, you'll need to supply a strong key file, Microsoft doesn't ship one with the samples. There are instructions in the readme_stringutils.htm file that is in the StringUtilities directory on how to create the key. Let's walk through the process:

  • Open up a command window and navigate to the directory at the top of the sample code:
    "c:\program files\Microsoft SQL Server\90\Samples\"
  • Run the sn.exe program that creates a pair of cryptographic keys. sn.exe is part of the .Net SDK so you may need to use it's complete path.
"C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin\"sn.exe -k Samplekey.snk

The snk file is used by several of the sample projects. Not just StringUtils so you won't have to take this step again.

Now it's almost time to start coding. Using Visual Studio 2005 Professional, or above, open the StringUtilities.sln file to get started. The samples ship without specifying a connection to a database so the next task is to make the connection to a database. Using the menu choose Project and then StringUtils Properties. Along the left side of the properties screen select the Database tab. This is what you'll see:

string utilities solution

The green arrow points to the blank string. If you hide the Solution Explorer and scroll over to the right you can press the Browse button that is hidden in the picture above. Add a reference to any database that you like on your local machine. Here's the New Database Reference dialog as I add a connection to AdventureWorks on the local machine:

new database reference

Press OK and then OK again on the Add Database Reference dialog and your connection string is set. Open the Solution Explorer again and take a look at the code in the Concat.vb file.

The sample declares the aggregate function Concatenate by creating the concatenate class and decorating it with the SqlUserDefinedAggregate attribute. I've re-wrapped the text so it's more readable and now the code for Concatenate starts like this:

concatenate code

Aggregate functions are implemented by writing four methods

Method Name Description
Init Initializes the aggregate
Accumulate This method is called for each row to be aggregated it provides the value to aggregate.
Merge Called when two aggregates must be merged to produce a final result
Terminate Called by SQL Server when it wants the result of the aggregate.

Let's take a look at each of those functions in the Concat.VB file. The Init method is small and called only once each time the Concatenate aggregate is created. All it has to do is create a new StringBuilder object. StringBuilder is a .Net framework class that will do most of the concatenation work.

concatenate code

The accumulate method is called once for every value to aggregate. It should decide what to do when it's called with a null value. The Concatenate.Accumulate method in the sample just ignores nulls. If the string has a value it's Appended to the StringBuilder along with a comma. Concatenating with commas as the separator is hard-coded into the Accumulate method. In SQL Server 2005 aggregates take only one argument so there's no way to pass in the character used to separate the aggregated strings. Here's the Concatenate.Accumulate code:

concatenate code

The merge method is used when SQL Server has to combine two aggregate objects to produce a combined results. Using multiple aggregate methods allows SQL Server to process the query on multiple threads for better performance. It calls Merge to aggregate the intermediate results. The parameter to Merge is a second Concatenate object distinct from the one that is called. Since it's the same class, the Merge method can refer to the private members of the parameter and all it has to do is Append that string to the current intermediateResult StringBuilder. Here's Concatenate.Merge:

concatenate code

Finally the job of the Terminate method is to return the result of the aggregate to SQL Server. Concatenate.Terminate strips the final comma from the intermediate result and returns the result as a new SqlString.

concatenate code

After taking a look at the code it's time to Build and Deploy it to SQL Server with the Deploy menu command. If you run into an error message complaining that the compiler can't find the SampleKey.snk file you'll need to go back to the instructions above and be sure that the SampleKey.snk file has been created and that it's in the correct directory. This is the text of the error message that I got before getting everything correctly:

Error reading key file 'c:\Program Files\Microsoft SQL Server \90\Samples\SampleKey.snk' -- The system cannot find the file specified.

Building and deploying a SQLCLR project have been covered in previous tips. Using Visual Studio to build and deploy is covered in Writing to an operating system file using the SQL Server SQLCLR, or to compile a .Net program from the command line and then load it into SQL Server is covered in CLR function to delete older backup and log files in SQL Server,

Once built and deployed the Concatenate aggregate can be used in a query. Here's a pretty basic one in the AdventureWorks database, where we built the StringUtilties assembly:

select TerritoryName
     , dbo.Concatenate(StateProvinceName)
from Sales.vSalesPerson
where TerritoryName is NOT NULL
group by TerritoryName

The following is the result of running the script in SQL Server Management Studio:

sample output

For short lists the Concatenate aggregate works great. But it must be used with caution. Like all User-Defined functions there is a performance penalty that has to be paid when using a function on tens of thousands of rows or more. I had one situation where I had to remove Concatenate for performance reasons when working with a query on 3 million rows.

Another limitation on aggregates in SQL Server 2005 is that their internal storage can only use up to 8000 bytes. Let's see what happens when the string concatenated grows to over 8000 bytes:

select datalength(dbo.Concatenate(Name))
from Sales.Store

Here are the results:

Msg 6522, Level 16, State 2, Line 2
A .NET Framework error occurred during execution of user-defined 
               routine or aggregate "Concatenate": 
System.Data.SqlTypes.SqlTypeException: The buffer is insufficient. 
               Read or write operation failed.
System.Data.SqlTypes.SqlTypeException: 
   at System.Data.SqlTypes.SqlBytes.Write(Int64 offset, 
               Byte[] buffer, Int32 offsetInBuffer, Int32 count)
   at System.Data.SqlTypes.StreamOnSqlBytes.Write(Byte[] 
               buffer, Int32 offset, Int32 count)
   at System.IO.BinaryWriter.Write(String value)
   at Microsoft.Samples.SqlServer.Concatenate.Write(BinaryWriter w)

The 8000 byte limitation was addressed in SQL Server 2008 and later.

Next Steps
  • Explore and build the sample StringUtilities project in either VB.Net or C#.
  • Use Concatenate in your queries.
  • Look for other aggregates that might make sense in your environment.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Andy Novick Andy Novick is a SQL Server Developer in the Boston area with 25 years of database and application development experience.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Wednesday, March 10, 2021 - 5:11:13 PM - Raphael Ferreira Back To Top (88376)
Am I missing something here? Can't you just use Pivot in a CTE and then concatenate the pivoted attributes? Honest question. TIA, Raphael.

Friday, February 20, 2009 - 1:01:52 AM - ALZDBA Back To Top (2802)

It is a nice demo.

But it needs to be mentioned you should always do performance tests to compare it to performing the same thing using regular TSQL (as shown above).

 In many cases the optimal TSQL-code will outperform the CLR in doing the same data handling.

some nice threads:

http://www.sqlservercentral.com/Forums/Topic622800-386-1.aspx

http://www.sqlservercentral.com/Forums/Topic365107-386-2.aspx

http://www.sqlservercentral.com/Forums/Topic458140-386-1.aspx?

http://www.sqlservercentral.com/Forums/Topic365107-386-1.aspx

 


Thursday, February 19, 2009 - 1:49:33 PM - jwr4 Back To Top (2798)

We've been doing this for years with T-SQL:

SELECT @List = COALESCE(@List + '; ', '') + [LegalName] + COALESCE('; ' + [ShortName], '') + COALESCE('; ' + [AlternateName], '')

FROM dbo.prime_Org_Contact c INNER JOIN

dbo.prime_Organization o ON c.OrgID = o.ID

WHERE c.ContactID = @ContactID

RETURN @List

This returns a CSV list of orgs that a contact is associated with. We populate a search table with a bunch of these UDFS so we can full text search. Unfortunately, a view using a UDF that does data access cannot be full text indexed.

 

JR















get free sql tips
agree to terms