Comparing performance for the MERGE statement to SELECT, INSERT, UPDATE or DELETE


By: Ben Snaidero   |   Read Comments (6)   |   Related Tips: More > T-SQL

Problem

Now that SQL Server 2008 has been out for quite a while, I've started to see a lot more use of the MERGE statement that became available in this version. While I do find it makes code more readable, I wondered whether there were any performance benefits or drawbacks to using this new statement compared with using the classic SELECT/INSERT/UPDATE/DELETE statements to complete the same task.

Solution

Test Setup

For this test we are going to compare the following four scenarios, each of which, at a high level, inserts a new record into a table or updates it if it already exists (based on the table's primary key). We will run each scenario twice: once against a table where the record being inserted/updated already exists, and once where it doesn't.

Here are the scenarios:

  1. SELECT record then INSERT or UPDATE based on result of SELECT statement
  2. UPDATE record then SELECT update rowcount and INSERT based on result of SELECT statement
  3. DELETE record then INSERT
  4. MERGE record

Here is the T-SQL used to implement the test scenarios outlined above, along with the code to create and load the test table.

-- Create test table and primary key
CREATE TABLE dbo.test (col1 INT NOT NULL, col2 VARCHAR(100))
GO
ALTER TABLE dbo.test ADD CONSTRAINT PK_test PRIMARY KEY CLUSTERED (col1)
  WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
-- Populate table
DECLARE @val INT
SELECT @val=1
WHILE @val < 100000
BEGIN  
   INSERT INTO dbo.test(col1, col2) VALUES(@val,'TEST' + CAST(@val AS VARCHAR))
   SELECT @val=@val+1
END
GO
-- test SELECT record then INSERT or UPDATE based on result of SELECT statement (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
SELECT @cnt=COUNT(1) FROM test WHERE col1=2
IF @cnt > 0 
  UPDATE test SET col2='update2' WHERE col1=2
ELSE
  INSERT INTO test VALUES (2,'update2')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test SELECT record then INSERT or UPDATE based on result of SELECT statement (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
SELECT @cnt=COUNT(1) FROM test WHERE col1=100001
IF @cnt > 0 
  UPDATE test SET col2='update100001' WHERE col1=100001
ELSE
  INSERT INTO test VALUES (100001,'update100001')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test UPDATE record then SELECT update rowcount and INSERT based on result of SELECT statement (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
UPDATE test SET col2='update4' WHERE col1=4
SELECT @cnt=@@ROWCOUNT
IF @cnt < 1
  INSERT INTO test VALUES (4,'update4')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test UPDATE record then SELECT update rowcount and INSERT based on result of SELECT statement (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
UPDATE test SET col2='update100004' WHERE col1=100004
SELECT @cnt=@@ROWCOUNT
IF @cnt < 1
  INSERT INTO test VALUES (100004,'update100004')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test DELETE record then INSERT (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DELETE FROM test WHERE col1=40000
INSERT INTO test VALUES (40000,'update40000')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test DELETE record then INSERT (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DELETE FROM test WHERE col1=100002
INSERT INTO test VALUES (100002,'update100002')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test MERGE record (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
MERGE test AS TARGET
USING (SELECT 50000 as col1,'update50000' as col2) AS SOURCE
ON (TARGET.col1=SOURCE.col1)
WHEN MATCHED THEN
  UPDATE SET TARGET.col2=SOURCE.col2
WHEN NOT MATCHED THEN
  INSERT (col1,col2) VALUES (SOURCE.col1,SOURCE.col2);
GO
sp_lock
GO
COMMIT TRANSACTION
-- test MERGE record (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
MERGE test AS TARGET
USING (SELECT 100003 as col1,'update100003' as col2) AS SOURCE
ON (TARGET.col1=SOURCE.col1)
WHEN MATCHED THEN
  UPDATE SET TARGET.col2=SOURCE.col2
WHEN NOT MATCHED THEN
  INSERT (col1,col2) VALUES (SOURCE.col1,SOURCE.col2);
GO
sp_lock
GO
COMMIT TRANSACTION
-- cleanup
DROP TABLE test
GO

Comparing Locks Acquired

Using sp_lock before the "COMMIT TRANSACTION" was issued for each scenario, I was able to determine that in every case an X (exclusive) lock was acquired on the row being inserted/updated, along with IX (intent exclusive) locks on both the table and the page where the record was located. Below is the output of sp_lock for the first scenario; all other scenarios followed the exact same pattern. More information on these lock types can be found here.

spid  dbid  ObjId       IndId  Type  Resource        Mode  Status
80    1     1659152956  0      TAB                   IX    GRANT
80    1     1659152956  1      PAG   1:310           IX    GRANT
80    1     1659152956  1      KEY   (020068e8b274)  X     GRANT

Comparing Resources Used

Now that we have confirmed that the MERGE statement acquires exactly the same locks as the classic SELECT, INSERT, UPDATE and DELETE statements would, let's focus on the performance of these statements. We will run through the scenarios described above and use SQL Profiler to capture some key performance metrics. Here are the results.

                                    CPU (ms)  Reads  Writes  Duration (ms)
Scenario #1  Record exists             0        4      0           5
             Record does not exist     0       12      0           8
Scenario #2  Record exists             0        6      0          10
             Record does not exist     0        4      0           2
Scenario #3  Record exists             0       12      1           9
             Record does not exist     0        4      0           1
Scenario #4  Record exists             0       10      1           2
             Record does not exist     0       10      0           9

Looking at these results we can see that Scenario #2 performs marginally better than the other scenarios. This can probably be attributed to the fact that its existence check reads the @@ROWCOUNT system function rather than querying the physical table. As I mention in the next steps, I would guess that if we performed a similar test using sets of data, rather than a single-record insert/update, the MERGE statement might perform better than the other scenarios. Looks like I may have to write a part 2 for this tip.
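As a rough sketch of what that set-based follow-up test might look like: instead of merging a single hard-coded row, the source becomes a whole table. The staging table name and its contents here are hypothetical, not part of the test above.

```sql
-- Hypothetical set-based upsert: merge an entire staging table into dbo.test
-- in one statement instead of looping row by row.
-- Assumes a staging table dbo.test_staging (col1 INT, col2 VARCHAR(100)).
MERGE dbo.test AS TARGET
USING dbo.test_staging AS SOURCE
   ON (TARGET.col1 = SOURCE.col1)
WHEN MATCHED THEN
  UPDATE SET TARGET.col2 = SOURCE.col2
WHEN NOT MATCHED THEN
  INSERT (col1, col2) VALUES (SOURCE.col1, SOURCE.col2);
```

The same metrics (CPU, reads, writes, duration) could then be captured with SQL Profiler and compared against looped versions of scenarios #1 through #3.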

Next Steps


Saturday, April 21, 2012 - 12:41:16 PM - Mladen Andrijasevic Back To Top

Ben,

 

Thanks for answering 

 

>>would acquire locks as it inserts so whichever process actually inserts/updates data first would hold the lock and any other process trying to insert/update the same record would receive the error.

 

 

Well, should it not WAIT until the other transaction ends? There seem to be only exclusive locks, and they are kept until the end of the transaction. And when the lock is released, MERGE should update (not insert) if the key already exists, rather than result in a PK violation. What is more, if I execute the same code from two connections without committing one of them, the other one waits. I agree that one should avoid using the serializable isolation level; that is precisely why I am trying to understand whether we can avoid it. SERIALIZABLE prevents phantom inserts through key-range locking, but in this particular case, even when I put SERIALIZABLE (no violations), there do not seem to be key-range locks, so why would a lesser isolation level not work? I will check the exact allocations of locks.

 

Thanks

Mladen


Friday, April 20, 2012 - 1:50:41 AM - Allan Hansen Back To Top

My own testing when I blogged about MERGE showed that I got about a 25% increase in performance using MERGE versus just a plain update/insert over roughly 66,000 records.




Thursday, April 19, 2012 - 11:51:41 PM - Ben Snaidero Back To Top

Mladen,

Without seeing a trace of your system under load, as well as the SQL you are running (specifically the execution plans), it's hard to diagnose exactly what's happening in your situation. That said, I'd assume that using the MERGE statement is no different than issuing an update or insert from multiple sessions: the SQL engine acquires locks as it inserts, so whichever process actually inserts/updates the data first holds the lock, and any other process trying to insert/update the same record receives the error. I've never liked using serializable transactions, as this reduces concurrency; most of these types of issues can usually be solved with proper indexing.

Thanks for reading

Ben


Thursday, April 19, 2012 - 3:20:55 PM - GMAguilar Back To Top

UPSERT has the best performance compared to the (UPDATE or INSERT) combination.

 

BEGIN TRAN

UPDATE ...

IF @@ROWCOUNT = 0
  INSERT ...

COMMIT


Thursday, April 19, 2012 - 8:20:28 AM - Andy Wilbourn Back To Top

In your sampling, perhaps you should replace sp_lock with sys.dm_tran_locks, as the docs show sp_lock is going away.
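For anyone following that suggestion, a minimal sys.dm_tran_locks query that approximates the sp_lock columns used in the tip might look like this (the column aliases are just chosen to mirror the sp_lock output):

```sql
-- Rough sp_lock replacement: show locks held by the current session
-- via the sys.dm_tran_locks DMV.
SELECT request_session_id            AS spid,
       resource_database_id          AS dbid,
       resource_associated_entity_id AS entity_id,
       resource_type                 AS [type],
       resource_description          AS resource,
       request_mode                  AS mode,
       request_status                AS status
FROM   sys.dm_tran_locks
WHERE  request_session_id = @@SPID;
```

Note that resource_associated_entity_id is not always an object id (for page and key resources it is a partition id), so it does not map one-for-one onto sp_lock's ObjId column.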


Thursday, April 19, 2012 - 8:19:35 AM - Mladen Andrijasevic Back To Top

Although MERGE is an atomic operation that does not need an explicit transaction, we do get PK violations under heavy load when identical key values are submitted from different connections, unless we put SET TRANSACTION ISOLATION LEVEL SERIALIZABLE or WITH (HOLDLOCK) in the MERGE statement. Why is that the case?
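For reference, the HOLDLOCK variant the commenter describes, applied to the test table from the tip, would look like this (the key value here is illustrative):

```sql
-- MERGE with a HOLDLOCK (serializable) hint on the target table. This takes
-- key-range locks, closing the window where two concurrent sessions can both
-- fail to match the key and then both attempt to insert it.
MERGE dbo.test WITH (HOLDLOCK) AS TARGET
USING (SELECT 123 AS col1, 'update123' AS col2) AS SOURCE
   ON (TARGET.col1 = SOURCE.col1)
WHEN MATCHED THEN
  UPDATE SET TARGET.col2 = SOURCE.col2
WHEN NOT MATCHED THEN
  INSERT (col1, col2) VALUES (SOURCE.col1, SOURCE.col2);
```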
