Split Delimited String into Columns in SQL Server with PARSENAME
By: Aubrey Love | Comments (18) | Related: More > TSQL
Typically, in SQL Server data tables and views, values such as a person's name or their address is stored in either a concatenated string or as individual columns for each part of the whole value. For example: John Smith 123 Happy St Labor Town, CA. This data could be stored in a single column with all the data, in two columns separating the person's name from their address or in multiple columns with a column for each piece of the total value of the data.
With this, DBA's are inevitably always having to concatenate or parse the values to suit our customer's needs.
To build on our sample from above, I have an address column that has the full address (street number and name, city and state) separated by commas in a concatenated column. I want to break that data into three columns (street, city and state) respectively.
For this tip, let's create a test database and test table for the parsing example. In this scenario, we have a street name and number, followed by the city, then the state.
USE master; GO CREATE DATABASE myTestDB; GO USE myTestDB; GO
The following code creates a table in the new (test) database with only two columns - an id column and an address column.
CREATE TABLE dbo.custAddress( colID INT IDENTITY PRIMARY KEY , myAddress VARCHAR(200) ); GO
Populate the table with some generic data. Notice that the address is one string value in one column.
INSERT INTO dbo.custAddress(myAddress) VALUES('7890 – 20th Ave E Apt 2A, Seattle, VA') , ('9012 W Capital Way, Tacoma, CA') , ('5678 Old Redmond Rd, Fletcher, OK') , ('3456 Coventry House Miner Rd, Richmond, TX') GO
Confirm the table creation and data entry with the following SQL SELECT statement.
SELECT * FROM dbo.custAddress; GO
Your results should return, as expected, two columns - the id column and the address column. Notice the address column is returned as a single string value in image 1.
Breaking down the data into individual columns. The next step will be to parse out the three individual parts of the address that are separated by a comma.
SELECT REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 1)) AS [Street] , REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 2)) AS [City] , REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 3)) AS [State] FROM dbo.custAddress; GO
Since we didn't call the id column in the SELECT statement above, the results only returned the address string divided into three columns, each column returning a portion of the address for the respective column as shown in image 2.
Here in the real world, DBA's are often faced with more complex tables or views and not just a simple two column table as in the above sample. Although the sample above is a great primer for dissecting how to parse a string value, the following section demonstrates a more complex situation.
Again, create a sample table in our test database to work with. Here, we create a slightly more complex table with some additional columns.
CREATE TABLE employeeData( colID INT IDENTITY PRIMARY KEY , empID INT , empName VARCHAR(50) , empAddress VARCHAR(200) , empPhone VARCHAR(12) , jobClass VARCHAR(50) ); GO
Insert some generic data into the test table.
INSERT INTO employeeData(empID, empName, empAddress, empPhone, jobClass) VALUES (1, 'John, M, Smith, Jr', '123 Happy Hollow, BarnYard, OK, 90294', '202-555-0118', 'Programmer') , (2, 'Joe, S, Jones, Sr', '456 Sad Ln, BarnDoor, TX, 90295', '202-555-0195', 'Tester') , (3, 'Sammy, L, Smuthers', '5655 Medow Lane, Pastuer, CA, 90296', '202-555-0192', 'Sales') , (4, 'Henry, R, Lakes, Esq', '8749 Sunshine Park, Glenndale, HA, 90297', '202-555-0141', 'Manager') , (5, 'Harry, Q, Public, Jr', '555 Somber Ln, Levy, OR, 90298', '202-555-0137', 'Graphical Designer') GO
Now, let's create a mixed SELECT statement to pull all the columns that breaks down the "empName", "empAddress" and "empPhone" columns into multiple columns.
In the following block of code, notice that I restarted my "place / position" count on each column that I want to parse. Each time you start parsing a new column, you must start your count over. You should also note that the "empName" and "empAddress" string values are separated with a comma while the "empPhone" string value is separated by a hyphen and thus should be reflected in your "parse" function between the first set of single quotes.
SELECT colID , empID -- The following section breaks down the "empName" column into three columns. , REVERSE(PARSENAME(REPLACE(REVERSE(empName), ',', '.'), 1)) AS FirstName , REVERSE(PARSENAME(REPLACE(REVERSE(empName), ',', '.'), 2)) AS MiddleName , REVERSE(PARSENAME(REPLACE(REVERSE(empName), ',', '.'), 3)) AS LastName -- The following section breaks down the "empAddress" column into four columns. , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 1)) AS Street , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 2)) AS City , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 3)) AS State , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 4)) AS ZipCode -- The following section breaks down the "empPhone" column into three columns , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 1)) AS AreaCode , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 2)) AS Prefix , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 3)) AS LastFour , jobClass FROM employeeData; GO
The results should return thirteen columns from the six columns queried against as shown in image 3.
In summary, the PARSENAME function is a handy addition to your T-SQL toolkit for writing queries involving delimited data. It allows for parsing out and returning individual segments of a string value into separate columns. Since the PARSENAME function breaks down the string, you are not obligated to return all the delimited values. As in our sample above, you could have returned only the area code from the "empPhone" column to filter certain area codes in your search.
About the author
View all my tips