solving sql server problems for millions of dbas and developers since 2006



SQL Server DBA Tips SQL Server Developer Tips SQL Server Business Intelligence Tips SQL Server Career Tips SQL Server Tip Categories SQL Server Tutorials SQL Server Webcasts SQL Server Whitepapers SQL Server Tools SQL Server Questions and Answers MSSQLTips Authors About MSSQLTips SQL Server User Groups MSSLQTips Giveaways MSSQLTips Advertising Options

MSSQLTips Facebook Page MSSQLTips LinkedIn Page MSSQLTips RSS Feed MSSQLTips Twitter Page MSSQLTips Google+ Page





Standardize SQL Server data with text lookup and replace function

By: | Read Comments (4) | Print

Greg is the President of Edgewood Solutions and a co-founder of MSSQLTips.com.

Related Tips: More

Problem
Have you ever had the need to replace multiple words in a data column within SQL Server with a new word or a new phrase for an entire table?  A good example would be to update a product catalog with new words either for standardization or just to reflect a new name for the product.  A simple technique would be to use the REPLACE command as outlined in this prior tip.  This is great if you only need to update a few words, but what if you need to scan the entire product catalog and make several changes across the board, sometimes multiple changes for one data value?

Solution
By using a combination of the text manipulation commands from this prior tip and also using a cursor we can cursor through the data one time and update all words in the data column based on values that exist in a synonyms table to reflect the new way we want the data to be stored. The downside is that we will need to use a cursor, but the upside is that we can scan through all of our data one time instead of having to do multiple table scans to find the word we are looking for and then doing the update.

Here is the script that will scan through an entire table using a cursor, for each row of data the text is broken down word by word and then looked up in a synonyms table to see if there is a standardized way of storing this word. If found the new value is brought in to create an updated version of our data based on the lookup.

Here is the code that does the conversion.

SET NOCOUNT ON

DECLARE 
@word VARCHAR(50), 
       
@position INT
       
@newProductName VARCHAR(500), 
       
@oldProductName VARCHAR(500), 
       
@newWord VARCHAR(50),
       
@ProductName VARCHAR(500)

DECLARE load_cursor CURSOR FOR
       SELECT 
ProductName
       
FROM Products

OPEN load_cursor
FETCH NEXT FROM load_cursor INTO @ProductName

WHILE @@FETCH_STATUS 0
BEGIN
       SET 
@oldProductName @ProductName
       
SET @ProductName LTRIM(RTRIM(@ProductName))
       
       
SET @newProductName @ProductName

       
SET @position = CHARINDEX(' '@ProductName1)

       
BEGIN
               WHILE 
@position 0
               
BEGIN
                       SET 
@word LTRIM(RTRIM(LEFT(@ProductName@position 1)))
                       
IF @word <> ''
                       
BEGIN
                               
                               SELECT 
@newWord NULL
                               
SELECT @newWord synonym FROM Synonyms WHERE word @word 
                               
IF @newWord IS NOT NULL
                               
BEGIN
                                       SET 
@newProductName REPLACE(@newProductName@word@newWord)
                               
END
                       END
                       SET 
@ProductName RIGHT(@ProductNameLEN(@ProductName) - @position)
                       
SET @position = CHARINDEX(' '@ProductName1)
               
END
               
               SET 
@word @ProductName
               
SELECT @newWord NULL
               
SELECT @newWord synonym FROM Synonyms WHERE word @word
               
IF @newWord IS NOT NULL
                       
SET @newProductName REPLACE(@newProductName@ProductName@newWord)
       
END

       IF 
@oldProductName <> @newProductName
       
BEGIN
               SELECT 
@oldProductName AS OldProductName@newProductName AS NewProductName
       
END

       FETCH 
NEXT FROM load_cursor INTO @ProductName
END

CLOSE 
load_cursor
DEALLOCATE load_cursor
GO

So to show how this can be used we need to create a new table called synonyms.  We will also create a unique index on the word column.  This will allow us to use the index for the data lookup and also make sure that only one row is returned. 

To run the example create this table in the Northwind database or create this in your own database to use with your data.

CREATE TABLE [dbo].[Synonyms] (
       
[synonym] [varchar] (50COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
       
[word] [varchar] (50COLLATE SQL_Latin1_General_CP1_CI_AS NULL 
ON [PRIMARY]
GO

 
CREATE  UNIQUE  INDEX [IX_word] ON [dbo].[Synonyms]([word]ON [PRIMARY]
GO

To test against the Product table in the Northwind database we need to create some data in the synonyms table.  Here is a sample of what we will create.

Synonym Word
Jam Marmalade
Chowda Chowder
Wicked Hot Hot
King Sir

You can either create the data manually or run this script to populate the table.

INSERT INTO synonyms VALUES ('Jam','Marmalade')
INSERT INTO synonyms VALUES ('Chowda','Chowder')
INSERT INTO synonyms VALUES ('Wicked Hot','Hot')

INSERT INTO synonyms VALUES ('King','Sir')

So after creating the table, loading the data and running the script above these are the results we get when using the data in the Northwind database.  As you can see we have updated the data based on the our word replacements.  Also, you will notice that we have updated "Sir Rodney's Marmalade" to "King Rodney's Jam".  The process did the replacement for both "Sir" and for "Marmalade".

OldProductName NewProductName
Sir Rodney's Marmalade King Rodney's Jam
Sir Rodney's Scones King Rodney's Scones
Jack's New England Clam Chowder Jack's New England Clam Chowda
Louisiana Fiery Hot Pepper Sauce Louisiana Fiery Wicked Hot Pepper Sauce
Louisiana Hot Spiced Okra Louisiana Wicked Hot Spiced Okra

Next Steps

  • Download all the scripts here
  • Modify this code to work with your data
  • Make some changes so the final result updates the data instead of just displaying the before and after versions
  • Although the text manipulation commands are limited, get creative with what is available to solve your data problem


Related Tips: More | Become a paid author


Last Update: 9/12/2006

Share: Share 






Comments and Feedback:

Thursday, March 11, 2010 - 5:56:54 PM - jdevm Read The Tip
How does one change this script to replace the content instead of just updating it on screen?

Thursday, March 11, 2010 - 6:54:29 PM - admin Read The Tip

You can change this part of the code to do an update versus just showing the data. 

       IF @oldProductName <> @newProductName
       
BEGIN
               SELECT 
@oldProductName AS OldProductName@newProductName AS 
NewProductName
       
END

You would also need to add in some additional logic to determine which row you are updating.  This could be added to the cursor.


Thursday, March 11, 2010 - 8:45:31 PM - jdevm Read The Tip
Unfortunately you have now lost me regarding the 2nd part which mentions updating the cursor

Friday, March 12, 2010 - 6:28:42 AM - admin Read The Tip

Here is an update to this.  This is updating the table PRODUCTS that has a primary key of PRODUCTID.

 SET NOCOUNT ON

DECLARE 
@word VARCHAR(50
), 
       
@position INT

       
@newProductName VARCHAR(500
), 
       
@oldProductName VARCHAR(500
), 
       
@newWord VARCHAR(50
),
       
@ProductName VARCHAR(500
),
       @ProductID INT
 

DECLARE load_cursor 
CURSOR FOR
       SELECT 
ProductName, ProductID 
       
FROM 
Products

OPEN 
load_cursor
FETCH NEXT FROM load_cursor INTO 
@ProductName, @ProductID 

WHILE @@FETCH_STATUS 
0
BEGIN
       SET 
@oldProductName 
@ProductName
       
SET @ProductName LTRIM(RTRIM(@ProductName
))
       
       
SET @newProductName 
@ProductName

       
SET @position = CHARINDEX(' '@ProductName1
)

       
BEGIN
               WHILE 
@position 
0
               
BEGIN
                       SET 
@word LTRIM(RTRIM(LEFT(@ProductName@position 1
)))
                       
IF @word <> 
''
                       
BEGIN
                               
                               SELECT 
@newWord 
NULL
                               
SELECT @newWord synonym FROM Synonyms WHERE word 
@word 
                               
IF @newWord IS 
NOT NULL
                               
BEGIN
                                       SET 
@newProductName REPLACE(@newProductName@word@newWord
)
                               
END
                       END
                       SET 
@ProductName RIGHT(@ProductNameLEN(@ProductName) - @position
)
                       
SET @position = CHARINDEX(' '@ProductName1
)
               
END
               
               SET 
@word 
@ProductName
               
SELECT @newWord 
NULL
               
SELECT @newWord synonym FROM Synonyms WHERE word 
@word
               
IF @newWord IS 
NOT NULL
                       
SET @newProductName REPLACE(@newProductName@ProductName@newWord
)
       
END

       IF 
@oldProductName <> 
@newProductName
       
BEGIN 
               UPDATE Products SET
ProductName = @newProductName WHERE P
roductID = @ProductID 
       
END

       FETCH 
NEXT FROM load_cursor INTO 
@ProductName, @ProductID 
END

CLOSE 
load_cursor
DEALLOCATE 
load_cursor
GO



Post a Comment or Question

Keep it clean and stay on the subject or we may delete your comment.
Your email address is not published. Required fields are marked with an asterisk (*)

*Name   *Email   Notify for updates
Comments
*Enter Code refresh code


 
New SQL Monitor v3.0

New SQL Monitor v3.0


Sponsor Information
Try the free performance monitoring tool from Idera!

Quickly and accurately deploy database changes with Red Gate's SQL Compare - the industry standard comparison and deployment tool.

What grade do you think your SQL Servers get? Find out with Edgewood's SQL Server Health Check consulting services.

Free Trial: Get Proactive Insight with Spotlight® for SQL Server Enterprise.

Join the over million SQL Server Professionals who get their issues resolved daily.

Learn SQL Server 2012, Performance Tuning, Development, Administration, Replication and more - free webcasts


Copyright (c) 2006-2012 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | user groups
Some names and products listed are the registered trademarks of their respective owners.


Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com