Use Caution with SQL Server’s MERGE Statement

Problem

SQL Server 2008 introduced the MERGE statement, which promised to be a simpler way to combine insert/update/delete statements, such as those used during ETL (extract, transform and load) operations. However, MERGE originally shipped with several “wrong results” and other bugs – some of which have been addressed, and some of which continue to exist in current versions. People also tend to make some leaps of faith regarding atomicity – they don’t realize that the single statement actually performs different operations separately, and thus can suffer from issues due to concurrency and race conditions just like separate statements can.

Solution

I have been recommending that – for now – people stick to their tried and true methods of separate statements. Here are the major reasons:

Bugs with the SQL Server Merge Statement

It can be quite difficult to validate and guarantee that you are immune from any of the bugs that still exist. A few Connect items that you should be aware of, that are either still active, closed as “Won’t Fix”/”By Design”, or have only been fixed in specific versions (often requiring a cumulative update or on-demand hotfix):

Connect issueCurrent / last known status
#773895 : MERGE Incorrectly Reports Unique Key Violations Won’t Fix
#771336 : Indexed view is not updated on data changes in base table Fixed only in 2012+
#766165 : MERGE evaluates filtered index per row, not post operation, which causes filtered index violation Won’t Fix
#723696 : Basic MERGE upsert causing deadlocks By Design
#713699 : A system assertion check has failed (“cxrowset.cpp”:1528) Won’t Fix
#699055 : MERGE query plans allow FK and CHECK constraint violations Active
#685800 : Parameterized DELETE and MERGE Allow Foreign Key Constraint Violations Won’t Fix
#654746 : merge in SQL2008 SP2 still suffers from “Attempting to set a non-NULL-able column’s value to NULL” Active
#635778 : NOT MATCHED and MATCHED parts of a SQL MERGE statement are not optimized Won’t Fix
#633132 : MERGE INTO WITH FILTERED SOURCE does not work properly Won’t Fix
#596086 : MERGE statement bug when INSERT/DELETE used and filtered index Won’t Fix
#583719 : MERGE statement treats non-nullable computed columns incorrectly in some scenarios Won’t Fix
#581548 : SQL2008 R2 Merge statement with only table variables fails Fixed only in 2012+
#539084 : Search condition on a non-key column and an ORDER BY in source derived table breaks MERGE completely Won’t Fix
#357419 : MERGE statement bypasses Referential Integrity Fixed only in 2012+
Merge statement fails when running db in Simple recovery model (2016) Fixed only with trace flag 692
MERGE statement assertion error when database is in simple recovery model (2017) Fixed only with trace flag 692
MERGE and INSERT with COLUMNSTORE index creates crash dump Unacknowledged
Support MERGE INTO target for memory optimized tables  Under Review
MERGE fails with a duplicate key error when using DELETE and INSERT actions  Under Review
CDC logging wrong operation type for a MERGE statement, when Unique index is present  Under Review
Query optimizer cannot make plan for merge statement  Under Review
Poor error message with MERGE when source/target appear in impossible places Won’t Fix
MERGE statement provokes deadlocking due to incorrect locking behavior  Under Review
Merge statement Delete does not update indexed view in all cases  Under Review
EXCEPTION_ACCESS_VIOLATION When referencing the “deleted” table from an OUTPUT statement during MERGE  Under Review
Fulltext index not updating after changing text column via MERGE statement on a partitioned table Under Review
MERGE on not matched by source UPDATE ignores variable declaration Under Review
“A severe error occurred on the current command.” with MERGE and OUTPUT Fixed only in 2014+

Some of these bugs can be worked around in the meantime using trace flags, but is the condensed format of MERGE really worth all of the extra testing that will require? Also, to give an idea of how many bugs might still have gone undetected, check out these informative blog posts by Paul White to understand how hard these bugs can be to even notice, never mind track down and fix.

SQL Server Merge Concurrency Issues

MERGE *looks* like it will take care of concurrency for you, because implicitly it seems to a single, atomic statement. However, under the covers, SQL Server really does perform the different operations independently. This means that you could end up with race conditions or primary key conflicts when multiple users are attempting to fire the same MERGE statement concurrently. Dan Guzman went into a lot of detail in his blog post a few years ago, but basically this means that unless you use a HOLDLOCK hint on your MERGE target, your statement is vulnerable to race conditions. In reviewing customer code and questions out in the community, I don’t recall ever coming across a HOLDLOCK hint naturally, except in cases where someone was demonstrating the very race condition I’m talking about. The pattern should be:

MERGE dbo.TableName WITH (HOLDLOCK) AS target
USING ... AS source ...;

And not what I typically see:

MERGE dbo.TableName AS target
USING ... AS source ...;

SQL Server Merge Effect on Triggers

Due to those same mechanics, converting your insert/update/delete code should be thoroughly tested when triggers are involved. Prevailing wisdom has instilled in most of us the mindset that, in SQL Server, triggers fire once per statement. However, with MERGE, this wisdom must be revisited, because it’s not quite true any longer. Let’s take this simple example:

SET NOCOUNT ON;
GO
CREATE TABLE dbo.MyTable(id INT);
GO
INSERT dbo.MyTable VALUES(1),(4);
GO
CREATE TRIGGER dbo.MyTable_All
 ON dbo.MyTable
 FOR INSERT, UPDATE, DELETE
AS
BEGIN
  PRINT 'Executing trigger. Rows affected: ' + RTRIM(@@ROWCOUNT);
  
  IF EXISTS (SELECT 1 FROM inserted) AND NOT EXISTS (SELECT 1 FROM deleted)
  BEGIN
    PRINT '  I am an insert...';
  END
  IF EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted)
  BEGIN
    PRINT '  I am an update...';
  END
  IF NOT EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted)
  BEGIN
    PRINT '  I am a delete...';
  END
END
GO

Now, when I first started playing with MERGE, I expected the trigger to fire once, just like it does for any multi-row operation. But this is not the case; it actually fires the trigger for each operation that ends up happening as a result of the MERGE command. For example, with this MERGE:

MERGE dbo.MyTable WITH (HOLDLOCK) AS Target
USING (VALUES(1),(2),(3)) AS Source(id)
ON Target.id = Source.id
WHEN MATCHED THEN UPDATE SET Target.id = Source.id
WHEN NOT MATCHED THEN INSERT(id) VALUES(Source.id)
WHEN NOT MATCHED BY SOURCE THEN DELETE;

I expected this output, since my MERGE operation affected a total of 4 rows:

Executing trigger. Rows affected: 4
  I am an insert...
  I am an update...
  I am a delete...

And then when I realized that the trigger gets fired multiple times, I expected this output, since I updated one row, inserted two new ones, and deleted one row:

Executing trigger. Rows affected: 2
  I am an insert...
Executing trigger. Rows affected: 1
  I am an update...
Executing trigger. Rows affected: 1
  I am a delete...

What I got was surprising to me – it made it seem like a hybrid between the trigger firing once and the trigger firing multiple times:

Executing trigger. Rows affected: 4
  I am an insert...
Executing trigger. Rows affected: 4
  I am an update...
Executing trigger. Rows affected: 4
  I am a delete...

So, the lesson here is, if you already have triggers and you’re switching to MERGE, don’t rely on @@ROWCOUNT for anything. This is true also if you have your INSERT / UPDATE / DELETE triggers in different modules – they’ll still all report the total @@ROWCOUNT for the operation even when no rows are affected by that branch. So when you do make this switch you’ll want to invest in refactoring or at least heavily testing your triggers in all unit test outcomes.

Improve Your Productivity or not with the SQL Server Merge Statement?

This is a minor and subjective gripe, but I’ve always found the Merge syntax quite daunting and difficult to memorize. I can’t imagine there are many people out there who can write a fully-featured MERGE statement without consulting Books Online, and without going through many more cycles of testing to verify that all branches function as expected.

Conclusion

I am not trying to spread fear, sound like Chicken Little, or make you wary about any new feature that is introduced in SQL Server, however the increasing number of issues I see with MERGE make me feel obligated to document the potential problems, make sure that you are very well aware of them before deciding to use the syntax. Personally, I have converted a few implementations back to separate statements, because there came a point where I no longer felt I could trust the behavior of MERGE. If you’re going to use it, please be careful!

Next Steps

One comment

  1. Aaron, I don’t know if you are interested, but I wrote a generator to generate upserts using regular CRUD statements. I always reference this blog post when I am explaining to people to avoid the use of MERGE.

    https://gist.github.com/tcartwright/cdec6b7df7e227c9b5beded468c6c588

Leave a Reply

Your email address will not be published. Required fields are marked *