REPLACE versus TRANSLATE for SQL Server Text Data Manipulation
By: Jared Westover | Updated: 2023-04-19 | Comments (5) | Related: More > TSQL
When reviewing or demoing your old code, do you ever get embarrassed? Most times, for me, it's caused by a lack of comments. However, I usually apologize when showing queries with a nested group of REPLACE() functions. Even if everyone uses REPLACE() this way, it doesn't look good. If you need to replace several occurrences of text in a column, is there a better way than using several nested REPLACE() functions?
Could the SQL TRANSLATE() function replace nesting REPLACE() functions? In this article, I'll look at the TRANSLATE() function in detail, including how it's different from REPLACE(). Can TRANSLATE() act as a complete substitute for nested REPLACE() functions? What are the differences between the two? These are just a few questions I'll answer as we go through this tutorial. By the end, you'll know which function to choose moving forward, depending on your query requirements.
Exploring SQL REPLACE()
As the name implies, the REPLACE() function replaces one or a series of characters with another. It's easily one of the simplest yet most powerful functions in SQL. You can also use it as part of an UPDATE statement.
SELECT REPLACE('Luke I am your friend', 'friend', 'father') AS [Name Replacement];
Name Replacement --------------------- Luke I am your father
For as long as I've used SQL Server, REPLACE() has existed. It's a favorite among developers when trying to clean up data. We've all been there, especially when users modify data without controls. They could fill out a form indicating someone's name and include a # sign. Their name may have a # sign, but it was likely a mistake. Ideally, controls exist on the front end to stop this from happening, but that's a different conversation.
Something you commonly see is nesting several REPLACE() functions to account for multiple strings to replace on the same column. Often, you want to replace the value with a blank or empty string. For example, let's say we want to replace any non-numeric characters in a phone number. The code below should do the trick.
-- Original Phone Number 1-(555)-555-5555 DECLARE @PhoneNumber NVARCHAR(50) = '1-(555)-555-5555' SELECT REPLACE(REPLACE(REPLACE(@PhoneNumber,'-',''),'(',''),')','') AS [New Phone Number];
New Phone Number ----------------- 15555555555
In the example above, there is absolutely nothing wrong with nesting REPLACE(). I've likely used up to 8 or 10 REPLACE() functions when the situation called for it. However, I'll admit it does make your T-SQL hard to read the deeper you go. This pattern reminds me of a nested CASE expression. Is there a better way of doing this to remove the clutter? That's where our next function comes into play.
Exploring SQL TRANSLATE()
Microsoft first introduced the TRANSLATE() function in SQL Server 2017. We could use TRANSLATE() in Oracle long before then. You must ensure your database compatibility level is at 140 or higher to take advantage of it. Microsoft defines TRANSLATE() as a function that returns the string provided as a first argument after some characters specified in the second argument are translated into a destination set of characters specified in the third argument.
When first looking at TRANSLATE(), I had no idea you could swap it for REPLACE() based on the name. I assumed it had something to do with language translation. However, after seeing an example, the confusion cleared up.
What does the basic syntax of TRANSLATE() look like? One crucial point is that the second and third arguments are a list of characters, list being the keyword.
TRANSLATE ( Input String, List of Characters, List of Characters to Replace With )
Let's try and remove all the non-numeric numbers from our phone number.
-- Original Phone Number 1-(555)-555-5555 DECLARE @PhoneNumber NVARCHAR(50) = '1-(555)-555-5555'; SELECT TRANSLATE(@PhoneNumber, '()-', '') AS [New Phone Number];
The second and third arguments of the TRANSLATE built-in function must contain an equal number of characters.
You can see above that our attempt to replace the list of characters resulted in an error message indicating that the second and third arguments must be the same length. What if we try and use three blanks since the second argument has three characters?
-- Original Phone Number 1-(555)-555-5555 DECLARE @PhoneNumber NVARCHAR(50) = '1-(555)-555-5555'; SELECT TRANSLATE(@PhoneNumber, '()-', ' ') AS [New Phone Number];
New Phone Number ------------------- 1 555 555 5555
Well, it's doing what I'm telling it to, essentially replacing the characters with spaces. We still need to use a REPLACE() function to remove the blank spaces, granted only one of them. With the TRANSLATE() portion, I will use a less-used replacement than a space.
-- Original Phone Number 1-(555)-555-5555 DECLARE @PhoneNumber NVARCHAR(50) = '1-(555)-555-5555'; SELECT REPLACE(TRANSLATE(@PhoneNumber, '()-', '***'),'*','') AS [New Phone Number];
Remember to accomplish the task with REPLACE(). I needed three, one for the open, one for the closed, and one for the hyphen. Situations like this are where TRANSLATE() shines.
How are NULLs handled for each of the functions? Let's start with REPLACE(). If either the value being replaced or the replacement value is NULL, SQL returns a NULL. This also applies if you are nesting REPLACE() functions. You can see the results of the example below.
WITH PhoneNumbers AS ( SELECT '1-(555)-555-5555' AS Number UNION ALL SELECT '1-(555)-555-1234' UNION ALL SELECT NULL ) SELECT REPLACE(REPLACE(REPLACE(Number,'-',''),'(',''),')','') AS [New Phone Number] FROM PhoneNumbers; GO
New Phone Number ------------------ 15555555555 15555551234 NULL
My recommendation is to handle the potential for a NULL with either COALESCE() or ISNULL().
Now let's turn our attention to TRANSLATE(). Does translate handle NULLs any differently than REPLACE()? Not really. If the input or replace value is NULL, the result will be a NULL.
WITH cte_ImportantNumbers AS ( SELECT '999-99-9999' AS Number UNION ALL SELECT '000-00-0000' UNION ALL SELECT NULL ) SELECT TRANSLATE(Number, '0123456789',REPLICATE('*',10)) AS [New Important Number] FROM cte_ImportantNumbers; GO
New Important Number --------------------- ***-**-**** ***-**-**** NULL
As mentioned previously, remember to create a remediation plan if you have a potential for NULLs. Unless your column doesn't allow NULLs, they are always there waiting to spring into action.
Which One to Choose?
If you want to replace a string with another, stick with REPLACE(), for example, replacing the word "Home" with "Office." However, if you're trying to clean up a list of characters and need to nest multiple REPLACE() functions together, try TRANSLATE(). Since TRANSLATE() didn't exist before SQL Server 2017, you can't use it if you're on an older version. Keep using REPLACE() until you can upgrade. To answer the question from the beginning, can TRANSLATE() replace the REPLACE() function? The answer is kind of, but not entirely.
- REPLACE() and TRANSLATE() are both string functions and substitute characters for some others.
- TRANSLATE() only works on SQL Server 2017. You need to ensure your database compatibility is set to at least 140.
- With either REPLACE() or TRANSLATE(), if the string is NULL, the value returned is NULL. There is no difference in how either handle NULLs.
- Pick whichever function you are the most comfortable working with and get coding.
- Daniel Calbimonte assembled several articles about SQL Server functions. Here is the link to the one about TRANSLATE().
- The article said TRANSLATE() was introduced in SQL Server 2017. For a listing of other T-SQL enhancements, please check out this article by Koen Verbeeck.
- In this article, Aubrey Love compares three additional string functions: STUFF(), REPLACE(), and WRITE(). I had never heard of WRITE() before reading this.
About the author
View all my tips
Article Last Updated: 2023-04-19