Learn more about SQL Server tools

 

Tutorials          DBA          Dev          BI          Career          Categories          Events          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Events

DBA    Dev    BI    Categories

 

Standardize SQL Server data with text lookup and replace function


By:   |   Read Comments (5)   |   Related Tips: More > Scripts

Problem
Have you ever had the need to replace multiple words in a data column within SQL Server with a new word or a new phrase for an entire table?  A good example would be to update a product catalog with new words either for standardization or just to reflect a new name for the product.  A simple technique would be to use the REPLACE command as outlined in this prior tip.  This is great if you only need to update a few words, but what if you need to scan the entire product catalog and make several changes across the board, sometimes multiple changes for one data value?

Solution
By using a combination of the text manipulation commands from this prior tip and also using a cursor we can cursor through the data one time and update all words in the data column based on values that exist in a synonyms table to reflect the new way we want the data to be stored. The downside is that we will need to use a cursor, but the upside is that we can scan through all of our data one time instead of having to do multiple table scans to find the word we are looking for and then doing the update.

Here is the script that will scan through an entire table using a cursor, for each row of data the text is broken down word by word and then looked up in a synonyms table to see if there is a standardized way of storing this word. If found the new value is brought in to create an updated version of our data based on the lookup.

Here is the code that does the conversion.

SET NOCOUNT ON

DECLARE 
@word VARCHAR(50), 
       
@position INT
       
@newProductName VARCHAR(500), 
       
@oldProductName VARCHAR(500), 
       
@newWord VARCHAR(50),
       
@ProductName VARCHAR(500)

DECLARE load_cursor CURSOR FOR
       SELECT 
ProductName
       
FROM Products

OPEN load_cursor
FETCH NEXT FROM load_cursor INTO @ProductName

WHILE @@FETCH_STATUS 0
BEGIN
       SET 
@oldProductName @ProductName
       
SET @ProductName LTRIM(RTRIM(@ProductName))
       
       
SET @newProductName @ProductName

       
SET @position = CHARINDEX(' '@ProductName1)

       
BEGIN
               WHILE 
@position 0
               
BEGIN
                       SET 
@word LTRIM(RTRIM(LEFT(@ProductName@position 1)))
                       
IF @word <> ''
                       
BEGIN
                               
                               SELECT 
@newWord NULL
                               
SELECT @newWord synonym FROM Synonyms WHERE word @word 
                               
IF @newWord IS NOT NULL
                               
BEGIN
                                       SET 
@newProductName REPLACE(@newProductName@word@newWord)
                               
END
                       END
                       SET 
@ProductName RIGHT(@ProductNameLEN(@ProductName) - @position)
                       
SET @position = CHARINDEX(' '@ProductName1)
               
END
               
               SET 
@word @ProductName
               
SELECT @newWord NULL
               
SELECT @newWord synonym FROM Synonyms WHERE word @word
               
IF @newWord IS NOT NULL
                       
SET @newProductName REPLACE(@newProductName@ProductName@newWord)
       
END

       IF 
@oldProductName <> @newProductName
       
BEGIN
               SELECT 
@oldProductName AS OldProductName@newProductName AS NewProductName
       
END

       FETCH 
NEXT FROM load_cursor INTO @ProductName
END

CLOSE 
load_cursor
DEALLOCATE load_cursor
GO

So to show how this can be used we need to create a new table called synonyms.  We will also create a unique index on the word column.  This will allow us to use the index for the data lookup and also make sure that only one row is returned. 

To run the example create this table in the Northwind database or create this in your own database to use with your data.

CREATE TABLE [dbo].[Synonyms] (
       
[synonym] [varchar] (50COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
       
[word] [varchar] (50COLLATE SQL_Latin1_General_CP1_CI_AS NULL 
ON [PRIMARY]
GO

 
CREATE  UNIQUE  INDEX [IX_word] ON [dbo].[Synonyms]([word]ON [PRIMARY]
GO

To test against the Product table in the Northwind database we need to create some data in the synonyms table.  Here is a sample of what we will create.

Synonym Word
Jam Marmalade
Chowda Chowder
Wicked Hot Hot
King Sir

You can either create the data manually or run this script to populate the table.

INSERT INTO synonyms VALUES ('Jam','Marmalade')
INSERT INTO synonyms VALUES ('Chowda','Chowder')
INSERT INTO synonyms VALUES ('Wicked Hot','Hot')

INSERT INTO synonyms VALUES ('King','Sir')

So after creating the table, loading the data and running the script above these are the results we get when using the data in the Northwind database.  As you can see we have updated the data based on the our word replacements.  Also, you will notice that we have updated "Sir Rodney's Marmalade" to "King Rodney's Jam".  The process did the replacement for both "Sir" and for "Marmalade".

OldProductName NewProductName
Sir Rodney's Marmalade King Rodney's Jam
Sir Rodney's Scones King Rodney's Scones
Jack's New England Clam Chowder Jack's New England Clam Chowda
Louisiana Fiery Hot Pepper Sauce Louisiana Fiery Wicked Hot Pepper Sauce
Louisiana Hot Spiced Okra Louisiana Wicked Hot Spiced Okra

Next Steps

  • Download all the scripts here
  • Modify this code to work with your data
  • Make some changes so the final result updates the data instead of just displaying the before and after versions
  • Although the text manipulation commands are limited, get creative with what is available to solve your data problem


Last Update:






About the author
MSSQLTips author Greg Robidoux Greg Robidoux is the President of Edgewood Solutions and a co-founder of MSSQLTips.com.

View all my tips


 









Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 


Get free SQL tips:

*Enter Code refresh code     



Wednesday, February 20, 2013 - 2:27:20 PM - Jewel Back To Top

Thank you so much!  I used this in a slightly different manner.  I used it to replace bad values in a column with nothing.


Friday, March 12, 2010 - 6:28:42 AM - admin Back To Top

Here is an update to this.  This is updating the table PRODUCTS that has a primary key of PRODUCTID.

 SET NOCOUNT ON

DECLARE 
@word VARCHAR(50
), 
       
@position INT

       
@newProductName VARCHAR(500
), 
       
@oldProductName VARCHAR(500
), 
       
@newWord VARCHAR(50
),
       
@ProductName VARCHAR(500
),
       @ProductID INT
 

DECLARE load_cursor 
CURSOR FOR
       SELECT 
ProductName, ProductID 
       
FROM 
Products

OPEN 
load_cursor
FETCH NEXT FROM load_cursor INTO 
@ProductName, @ProductID 

WHILE @@FETCH_STATUS 
0
BEGIN
       SET 
@oldProductName 
@ProductName
       
SET @ProductName LTRIM(RTRIM(@ProductName
))
       
       
SET @newProductName 
@ProductName

       
SET @position = CHARINDEX(' '@ProductName1
)

       
BEGIN
               WHILE 
@position 
0
               
BEGIN
                       SET 
@word LTRIM(RTRIM(LEFT(@ProductName@position 1
)))
                       
IF @word <> 
''
                       
BEGIN
                               
                               SELECT 
@newWord 
NULL
                               
SELECT @newWord synonym FROM Synonyms WHERE word 
@word 
                               
IF @newWord IS 
NOT NULL
                               
BEGIN
                                       SET 
@newProductName REPLACE(@newProductName@word@newWord
)
                               
END
                       END
                       SET 
@ProductName RIGHT(@ProductNameLEN(@ProductName) - @position
)
                       
SET @position = CHARINDEX(' '@ProductName1
)
               
END
               
               SET 
@word 
@ProductName
               
SELECT @newWord 
NULL
               
SELECT @newWord synonym FROM Synonyms WHERE word 
@word
               
IF @newWord IS 
NOT NULL
                       
SET @newProductName REPLACE(@newProductName@ProductName@newWord
)
       
END

       IF 
@oldProductName <> 
@newProductName
       
BEGIN 
               UPDATE Products SET
ProductName = @newProductName WHERE P
roductID = @ProductID 
       
END

       FETCH 
NEXT FROM load_cursor INTO 
@ProductName, @ProductID 
END

CLOSE 
load_cursor
DEALLOCATE 
load_cursor
GO


Thursday, March 11, 2010 - 8:45:31 PM - jdevm Back To Top
Unfortunately you have now lost me regarding the 2nd part which mentions updating the cursor

Thursday, March 11, 2010 - 6:54:29 PM - admin Back To Top

You can change this part of the code to do an update versus just showing the data. 

       IF @oldProductName <> @newProductName
       
BEGIN
               SELECT 
@oldProductName AS OldProductName@newProductName AS 
NewProductName
       
END

You would also need to add in some additional logic to determine which row you are updating.  This could be added to the cursor.


Thursday, March 11, 2010 - 5:56:54 PM - jdevm Back To Top
How does one change this script to replace the content instead of just updating it on screen?

Learn more about SQL Server tools