Tips and Tricks to Growing Your SQL Server Estate from SMB to Enterprise Scale
Thursday, June 28, 2018 - click here to learn more
SQL Server 2008 introduced the MERGE statement, which promised to be a simpler way to combine insert/update/delete statements, such as those used during ETL (extract, transform and load) operations. However, MERGE originally shipped with several "wrong results" and other bugs - some of which have been addressed, and some of which continue to exist even in the early preview releases of SQL Server 2014. People also tend to make some leaps of faith regarding atomicity - they don't realize that the single statement actually performs different operations separately, and thus can suffer from issues due to concurrency and race conditions just like separate statements can.
I have been recommending that - for now - people stick to their tried and true methods of separate statements. Here are the major reasons:
Bugs with the SQL Server Merge Statement
It can be quite difficult to validate and guarantee that you are immune from any of the bugs that still exist. A few Connect items that you should be aware of, that are either still active, closed as "Won't Fix"/"By Design", or have only been fixed in specific versions (often requiring a cumulative update or on-demand hotfix):
Some of these bugs can be worked around in the meantime using trace flags, but is the condensed format of MERGE really worth all of the extra testing that will require? Also, keep in mind that this list is not necessarily exhaustive - just look at how many bugs have actually been reported on Connect that I haven't investigated and highlighted here. Also, to give an idea of how many bugs might still have gone undetected, check out these informative blog posts by Paul White to understand how hard these bugs can be to even notice, never mind track down and fix.
SQL Server Merge Concurrency Issues
MERGE *looks* like it will take care of concurrency for you, because implicitly it seems to a single, atomic statement. However, under the covers, SQL Server really does perform the different operations independently. This means that you could end up with race conditions or primary key conflicts when multiple users are attempting to fire the same MERGE statement concurrently. Dan Guzman went into a lot of detail in his blog post a few years ago, but basically this means that unless you use a HOLDLOCK hint on your MERGE target, your statement is vulnerable to race conditions. In reviewing customer code and questions out in the community, I don't recall ever coming across a HOLDLOCK hint naturally, except in cases where someone was demonstrating the very race condition I'm talking about. The pattern should be:
MERGE dbo.TableName WITH (HOLDLOCK) AS target USING ... AS source ...;
And not what I typically see:
MERGE dbo.TableName AS target USING ... AS source ...;
SQL Server Merge Effect on Triggers
Due to those same mechanics, converting your insert/update/delete code should be thoroughly tested when triggers are involved. Prevailing wisdom has instilled in most of us the mindset that, in SQL Server, triggers fire once per statement. However, with MERGE, this wisdom must be revisited, because it's not quite true any longer. Let's take this simple example:
SET NOCOUNT ON; GO CREATE TABLE dbo.MyTable(id INT); GO INSERT dbo.MyTable VALUES(1),(4); GO CREATE TRIGGER dbo.MyTable_All ON dbo.MyTable FOR INSERT, UPDATE, DELETE AS BEGIN PRINT 'Executing trigger. Rows affected: ' + RTRIM(@@ROWCOUNT); IF EXISTS (SELECT 1 FROM inserted) AND NOT EXISTS (SELECT 1 FROM deleted) BEGIN PRINT ' I am an insert...'; END IF EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted) BEGIN PRINT ' I am an update...'; END IF NOT EXISTS (SELECT 1 FROM inserted) AND EXISTS (SELECT 1 FROM deleted) BEGIN PRINT ' I am a delete...'; END END GO
Now, when I first started playing with MERGE, I expected the trigger to fire once, just like it does for any multi-row operation. But this is not the case; it actually fires the trigger for each operation that ends up happening as a result of the MERGE command. For example, with this MERGE:
MERGE dbo.MyTable WITH (HOLDLOCK) AS Target USING (VALUES(1),(2),(3)) AS Source(id) ON Target.id = Source.id WHEN MATCHED THEN UPDATE SET Target.id = Source.id WHEN NOT MATCHED THEN INSERT(id) VALUES(Source.id) WHEN NOT MATCHED BY SOURCE THEN DELETE;
I expected this output, since my MERGE operation affected a total of 4 rows:
Executing trigger. Rows affected: 4 I am an insert... I am an update... I am a delete...
And then when I realized that the trigger gets fired multiple times, I expected this output, since I updated one row, inserted two new ones, and deleted one row:
Executing trigger. Rows affected: 2 I am an insert... Executing trigger. Rows affected: 1 I am an update... Executing trigger. Rows affected: 1 I am a delete...
What I got was surprising to me - it made it seem like a hybrid between the trigger firing once and the trigger firing multiple times:
Executing trigger. Rows affected: 4 I am an insert... Executing trigger. Rows affected: 4 I am an update... Executing trigger. Rows affected: 4 I am a delete...
So, the lesson here is, if you already have triggers and you're switching to MERGE, don't rely on @@ROWCOUNT for anything. This is true also if you have your INSERT / UPDATE / DELETE triggers in different modules - they'll still all report the total @@ROWCOUNT for the operation even when no rows are affected by that branch. So when you do make this switch you'll want to invest in refactoring or at least heavily testing your triggers in all unit test outcomes.
Improve Your Productivity or not with the SQL Server Merge Statement?
This is a minor and subjective gripe, but I've always found the Merge syntax quite daunting and difficult to memorize. I can't imagine there are many people out there who can write a fully-featured MERGE statement without consulting Books Online, and without going through many more cycles of testing to verify that all branches function as expected.
I am not trying to spread fear, sound like Chicken Little, or make you wary about any new feature that is introduced in SQL Server, however the increasing number of issues I see with MERGE make me feel obligated to document the potential problems, make sure that you are very well aware of them before deciding to use the syntax. Personally, I have converted a few implementations back to separate statements, because there came a point where I no longer felt I could trust the behavior of MERGE. If you're going to use it, please be careful!
- Test and validate that all of your MERGE statements (and triggers) are operating as expected.
- Add the HOLDLOCK hint to any MERGE statements not already protected by adequate isolation levels in order to shield them from race conditions.
- Review the bugs above that have been fixed and verify that you are on a build of SQL Server that contains the fix.
- Review the following tips for specific usage of the MERGE statement:
Last Update: 2013-10-17
About the author
View all my tips
- Using MERGE in SQL Server to insert, update and de...
- SQL Server 2008 MERGE More than UPSERT...
- Resolving the MERGE statement attempted to UPDATE ...
- Comparing performance for the MERGE statement to S...
- Use Caution with SQL Server's MERGE Statement...
- Using the SQL Server MERGE Statement to Process Ty...
- More Database Developer Tips...