Delete duplicate rows with no primary key on a SQL Server table

By: Greg Robidoux

Problem

Every once in a while a table gets created without a primary key and duplicate records get entered. The problem gets even worse when two rows in the table are completely identical, because there is no way to distinguish between them. So how do you delete the duplicate record?

Solution

One option that SQL Server gives you is the ability to set ROWCOUNT, which limits the number of rows affected by a command. The default value is 0, which means all rows, but it can be changed before running a command. So let's create a table and add 4 records, one of which is a duplicate.

Create a table called duplicateTest and add 4 records.

CREATE TABLE dbo.duplicateTest 
( 
[ID] [int] , 
[FirstName] [varchar](25), 
[LastName] [varchar](25)  
) ON [PRIMARY] 

INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith') 
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones') 
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White') 
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')

If we select all data we get the following:

SELECT * FROM dbo.duplicateTest

ID	FirstName	LastName
1	Bob	Smith
2	Dave	Jones
3	Karen	White
1	Bob	Smith

If we try to select the record for Bob Smith using all of the available values, such as with the following query:

SELECT * FROM dbo.duplicateTest WHERE ID = 1 AND FirstName = 'Bob' AND LastName = 'Smith'

We still get 2 rows of data:

ID	FirstName	LastName
1	Bob	Smith
1	Bob	Smith
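
When it is not yet known which rows are duplicated, a grouped query can list them. The following is a minimal sketch that assumes the dbo.duplicateTest table created above; the column alias copies is only illustrative. With the four sample rows it should return a single row for Bob Smith with a count of 2.

```sql
-- List every combination of values that appears more than once,
-- together with the number of copies of each combination.
SELECT ID, FirstName, LastName, COUNT(*) AS copies
FROM dbo.duplicateTest
GROUP BY ID, FirstName, LastName
HAVING COUNT(*) > 1;
```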

DELETE Duplicate Records Using ROWCOUNT

So to delete the duplicate record with SQL Server we can use the SET ROWCOUNT command to limit the number of rows affected by a query. By setting it to 1 we delete just one of these rows in the table. Note: the SELECT commands are only used to show the data before and after the delete occurs.

SELECT * FROM dbo.duplicateTest 

DECLARE @id int = 1

IF EXISTS (SELECT count(*) FROM dbo.duplicateTest WHERE ID = @id HAVING count(*) > 1 )
BEGIN
   SET ROWCOUNT 1 
   DELETE FROM dbo.duplicateTest WHERE ID = @id 
   SET ROWCOUNT 0 
END

SELECT * FROM dbo.duplicateTest

Here is a note from Microsoft about using SET ROWCOUNT:

Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in the next release of SQL Server. Avoid using SET ROWCOUNT together with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. Also, for DELETE, INSERT, and UPDATE statements that currently use SET ROWCOUNT, we recommend that you rewrite them to use the TOP syntax.

I tested the ROWCOUNT option with SQL Server 2017 and this option still works.

DELETE Duplicate Records Using TOP

With SQL Server 2005 and later we can also use the TOP clause when we issue the delete, as in the following example. Note: the SELECT commands are only used to show the data before and after the delete occurs.

-- delete all records and add records again
DELETE FROM dbo.duplicateTest
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith') 
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones') 
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White') 
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')	
	
SELECT * FROM dbo.duplicateTest 

DECLARE @id int = 1

IF EXISTS (SELECT count(*) FROM dbo.duplicateTest WHERE ID = @id HAVING count(*) > 1 )
   DELETE TOP(1) FROM dbo.duplicateTest WHERE ID = @id 

SELECT * FROM dbo.duplicateTest

So as you can see, SQL Server 2005 and later give you two options, SET ROWCOUNT and TOP, for deleting identical duplicate rows of data in your tables.

DELETE Multiple Duplicate Records Using TOP

One of the downsides to the above approaches is that they only delete one record at a time. So if there are more than two duplicates you have to rerun the commands.
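
One way to avoid rerunning the command by hand is to repeat the single-row delete in a loop. This is only a sketch, assuming the same dbo.duplicateTest table and, like the examples above, treating rows with the same ID as duplicates.

```sql
DECLARE @id int = 1;

-- Keep deleting one row at a time while more than one row has this ID.
-- The loop stops as soon as a single row is left.
WHILE (SELECT COUNT(*) FROM dbo.duplicateTest WHERE ID = @id) > 1
BEGIN
    DELETE TOP (1) FROM dbo.duplicateTest WHERE ID = @id;
END
```

A loop like this issues one DELETE per surplus row, so it is probably best suited to small numbers of duplicates.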

Here is another option, submitted by one of our readers, Basharat Bhat, for when there are more than two duplicates.

-- delete all records and add records again
DELETE FROM dbo.duplicateTest
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith') 
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones') 
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White') 
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')	
	
SELECT * FROM dbo.duplicateTest 

DECLARE @id int = 1

DELETE TOP (SELECT COUNT(*) -1 FROM dbo.duplicateTest WHERE ID = @id)  
FROM dbo.duplicateTest  
WHERE ID = @id

SELECT * FROM dbo.duplicateTest
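
If no rows match the ID, COUNT(*) - 1 works out to -1, which is not a meaningful value for TOP. A guarded variant could compute the number of surplus rows first and delete only when it is positive. This is a sketch under the same assumptions as above; the variable name @extra is hypothetical.

```sql
DECLARE @id int = 1;
DECLARE @extra int;

-- Number of surplus rows: every copy beyond the first.
SELECT @extra = COUNT(*) - 1 FROM dbo.duplicateTest WHERE ID = @id;

-- Delete only when there is at least one surplus row. This also skips
-- the case where no rows match and the count minus one is negative.
IF @extra > 0
    DELETE TOP (@extra) FROM dbo.duplicateTest WHERE ID = @id;
```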

DELETE Multiple Duplicate Records Using CTE

Here is another option submitted by one of our readers. This approach compares every column, rather than just the ID column as in the examples above, to decide whether rows are duplicates. It keeps one row from each group of identical rows and deletes the rest, however many duplicates there are.

DELETE FROM dbo.duplicateTest
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith') 
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones') 
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White') 
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')	

SELECT * FROM dbo.duplicateTest; 

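-- number the rows within each group of identical rows
WITH temp (rank1, id, fname, lname)
AS (
   SELECT ROW_NUMBER() OVER (PARTITION BY ID, FirstName, LastName ORDER BY ID, FirstName, LastName), *
   FROM dbo.duplicateTest
)
-- keep row 1 of each group and delete the rest
DELETE FROM temp WHERE rank1 > 1;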

SELECT * FROM dbo.duplicateTest;
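
Before running a delete like the one above, the same CTE can be pointed at a SELECT to preview which rows would be removed. This is a sketch rather than part of the reader's submission; the CTE name ranked is hypothetical, and it should be run before the DELETE, while the duplicates still exist.

```sql
-- Number the rows within each group of identical rows, then show
-- only the surplus copies (everything after the first row per group).
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY ID, FirstName, LastName
               ORDER BY ID
           ) AS rank1,
           ID, FirstName, LastName
    FROM dbo.duplicateTest
)
SELECT * FROM ranked WHERE rank1 > 1;
```

Because the rows in each group are identical, it does not matter which copy receives rank 1.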

DELETE Multiple Duplicate Records Using %%lockres%%

Here is another option submitted by another one of our readers. It uses %%lockres%%, an undocumented SQL Server virtual column that returns the lock resource for each row, as a pseudo row identifier to tell otherwise identical rows apart. Like the CTE approach, it compares every column rather than just the ID column, keeps one row from each group of identical rows, and deletes the rest.

DELETE FROM dbo.duplicateTest
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith') 
INSERT INTO dbo.duplicateTest VALUES(2, 'Dave','Jones') 
INSERT INTO dbo.duplicateTest VALUES(3, 'Karen','White') 
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')
INSERT INTO dbo.duplicateTest VALUES(1, 'Bob','Smith')	

SELECT * FROM dbo.duplicateTest; 

DELETE FROM a
FROM dbo.duplicateTest a
JOIN
(
SELECT MAX(%%lockres%%) pseudoID, id, FirstName, LastName
FROM dbo.duplicateTest
GROUP BY id, FirstName, LastName
) b ON b.id = a.id AND b.LastName = a.LastName AND b.FirstName = a.FirstName AND b.pseudoID <> a.%%lockres%%

SELECT * FROM dbo.duplicateTest;
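
To see what the pseudo identifier looks like, %%lockres%% can be selected alongside the regular columns. This is a small sketch assuming the same table. Because %%lockres%% is undocumented, its format and behavior are not guaranteed; on a heap such as this one the value appears to be a file:page:slot style locator that differs for each row.

```sql
-- Show the undocumented lock resource value next to each row.
-- Identical rows still get different pseudoID values.
SELECT %%lockres%% AS pseudoID, ID, FirstName, LastName
FROM dbo.duplicateTest;
```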

Next Steps

Take a look at how the ROWCOUNT command can be used to affect the results of your query
Also take a look at the TOP command and changes that have been implemented with SQL Server
Start using TOP instead of ROWCOUNT for SQL Server

About the author

Greg Robidoux is the President and founder of Edgewood Solutions, a technology services company delivering services and solutions for Microsoft SQL Server. He is also one of the co-founders of MSSQLTips.com. Greg has been working with SQL Server since 1999, has authored numerous database-related articles, and delivered several presentations related to SQL Server. Before SQL Server, he worked on many data platforms such as DB2, Oracle, Sybase, and Informix.

This author pledges the content of this article is based on professional experience and not AI generated.


Friday, March 22, 2024 - 1:28:03 PM - Brent Shaub
Helped me out of a big data jam today. Duplicate rows were exact and no primary key differentiated them. Thank you, Greg.

Wednesday, January 10, 2024 - 7:36:38 AM - Davood Taherkhani
That's Great.

Friday, September 11, 2020 - 9:40:19 AM - Greg Robidoux
Hi Josh, that is an option as well. Several ways to solve the problem. -Greg

Friday, September 11, 2020 - 9:38:37 AM - Greg Robidoux
Thanks Chuck. Let me test it out and I can update the article with your code. Thanks Greg

Thursday, September 10, 2020 - 9:55:06 PM - Josh
Couldn't you just solve this by creating a new table? INSERT INTO #newtable SELECT DISTINCT from the original.

Sunday, August 23, 2020 - 5:55:44 PM - Jeff Moden
Awesome, Greg. Thanks for the feedback and for taking the time to make the safety modifications to the article.

Tuesday, August 18, 2020 - 11:48:38 AM - Greg Robidoux
I made some updates to the article to make sure that a delete only occurs if there is more than one record. Also added a variable to use so the same value doesn't need to be put in multiple places. -Greg

Tuesday, August 18, 2020 - 11:26:53 AM - Greg Robidoux
Hi Jeff, Thanks for your feedback. What you said makes total sense. I guess I was assuming that someone using this already knows they have a duplicate so would only delete the record in that case. But I see what you are saying that it would be easy to make a mistake. I will make some updates to this. -Greg

Wednesday, December 18, 2019 - 10:07:50 AM - Jason
I haven't run into something like this since before CTEs and didn't even think of a CTE being updatable. Good little tidbit, thank you!

Tuesday, December 18, 2018 - 8:32:56 AM - Rick Dobson
nice read. thanks.

Monday, November 24, 2014 - 4:29:50 AM - Archana
Sir please tell how to delete updated record in sql 2008

Tuesday, May 14, 2013 - 6:21:20 AM - Ajay
First of All thank for your Post, I Need to remove both Duplicate Record in SQL any other way to resolve this.

Friday, December 21, 2012 - 6:39:11 AM - Dorababu
Hi recently I was asked by an interviewer and the question is as follows, I am having duplicate names irrespective of the original name. For example my name is Dorababu some of them inserted as Dhorababu or dorebabu like this. I would like to delete the remaining except Dorababu from the table how can I do this

Friday, April 27, 2012 - 8:45:00 AM - Jeff
Seems like you could also use select top 1

Wednesday, April 11, 2012 - 12:14:13 PM - Gadi35
Greg, that last code will cancel anything that exists and only leave the duplicates.

Friday, March 2, 2012 - 1:49:40 AM - manolitobsj
This post has been posted for more than two years but it helped me a lot. Thanks.

Sunday, May 18, 2008 - 11:57:57 PM - balakumar.sk
It didn't work Prakash, gave me the following error: Msg 156, Level 15, State 1, Line 2 Incorrect syntax near the keyword 'As'.

Tuesday, March 18, 2008 - 2:29:33 AM - Bals
The correlated query in TOP clause is not working .. Basharat Bhat has given only for single dup entry