Problem
In a typical data warehousing application, you often need to perform INSERT, UPDATE and DELETE operations on a TARGET table during the ETL cycle by matching its records against a SOURCE table. For example, a products dimension table holds information about the products, and you need to sync this table with the latest product information from the source table. Without MERGE, you would write separate INSERT, UPDATE and DELETE statements, or do lookups, to refresh the target table with the updated product list. Although this seems straightforward at first glance, it becomes cumbersome when you have to do it often or on multiple tables, and performance can degrade significantly because the data is processed several times. In this tip we will walk through how to use the MERGE statement to do this in one pass.
Solution
Beginning with SQL Server 2008, you can use the MERGE SQL command to perform these operations in a single statement. This command is similar to what is commonly called an UPSERT (a fusion of the words UPDATE and INSERT), which Oracle also offers through its own MERGE statement; it inserts rows that don't exist and updates the rows that do exist. With the MERGE SQL command, developers can handle common data warehousing scenarios more effectively, such as checking whether a row exists and then executing an insert, update or delete.
The MERGE statement merges data from a source result set into a target table, based on a condition that you specify and on whether the data from the source already exists in the target. The command combines a sequence of conditional INSERT, UPDATE and DELETE commands into a single atomic statement, depending on the existence of a record. The general form of the MERGE SQL command is shown below:
MERGE <target_table> [AS TARGET]
USING <table_source> [AS SOURCE]
ON <search_condition>
[WHEN MATCHED
THEN <merge_matched> ]
[WHEN NOT MATCHED [BY TARGET]
THEN <merge_not_matched> ]
[WHEN NOT MATCHED BY SOURCE
THEN <merge_matched> ];
The MERGE statement works like separate INSERT, UPDATE and DELETE statements combined in one statement. You specify a "Source" record set, a "Target" table, and the join between the two. You then specify the type of data modification that should occur when the records in the two sets match or do not match. MERGE is especially useful for loading data warehouse tables, which can be very large and require specific actions depending on whether rows are present.
Putting it all together
In this example I will take a Products table as the target table and UpdatedProducts as the source table containing an updated list of products. I will then use the MERGE SQL command to synchronize the target table with the source table.
First, let's create a target table and a source table and populate them with some data.
MERGE SQL statement - Part 1
--Create a target table
--(column names match the MERGE statement in Part 2; the data types are assumed)
CREATE TABLE Products
(
ProductID INT PRIMARY KEY,
ProductName VARCHAR(100),
Rate MONEY
);
--Insert records into target table
INSERT INTO Products
VALUES
(1, 'Tea', 10.00),
(2, 'Coffee', 20.00),
(3, 'Muffin', 30.00),
(4, 'Biscuit', 40.00);
--Create source table
CREATE TABLE UpdatedProducts
(
ProductID INT PRIMARY KEY,
ProductName VARCHAR(100),
Rate MONEY
);
--Insert records into source table
INSERT INTO UpdatedProducts
VALUES
(1, 'Tea', 10.00),
(2, 'Coffee', 25.00),
(3, 'Muffin', 35.00),
(5, 'Pizza', 60.00);
SELECT * FROM Products;
SELECT * FROM UpdatedProducts;
Next I will use the MERGE SQL command to synchronize the target table with the refreshed data coming from the source table.
MERGE SQL statement - Part 2
--Synchronize the target table with
--refreshed data from source table
MERGE Products AS TARGET
USING UpdatedProducts AS SOURCE
ON (TARGET.ProductID = SOURCE.ProductID)
--When records are matched, update
--the records if there is any change
WHEN MATCHED AND (TARGET.ProductName <> SOURCE.ProductName
OR TARGET.Rate <> SOURCE.Rate) THEN
UPDATE SET TARGET.ProductName = SOURCE.ProductName,
TARGET.Rate = SOURCE.Rate
--When no records are matched, insert
--the incoming records from source
--table to target table
WHEN NOT MATCHED BY TARGET THEN
INSERT (ProductID, ProductName, Rate)
VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)
--When there is a row that exists in target table and
--same record does not exist in source table
--then delete this record from target table
WHEN NOT MATCHED BY SOURCE THEN
DELETE
--$action specifies a column of type nvarchar(10)
--in the OUTPUT clause that returns one of three
--values for each row: 'INSERT', 'UPDATE', or 'DELETE',
--according to the action that was performed on that row
OUTPUT $action,
DELETED.ProductID AS TargetProductID,
DELETED.ProductName AS TargetProductName,
DELETED.Rate AS TargetRate,
INSERTED.ProductID AS SourceProductID,
INSERTED.ProductName AS SourceProductName,
INSERTED.Rate AS SourceRate;
When the above statement is run, the OUTPUT clause returns one row for each change made. There were 2 updates, 1 delete and 1 insert.
If we select all records from the Products table we can see the final results. We can see the Coffee rate was updated from 20.00 to 25.00, the Muffin rate was updated from 30.00 to 35.00, Biscuit was deleted and Pizza was inserted.
The MERGE SQL statement requires a semicolon (;) as a statement terminator; Error 10713 is raised when a MERGE statement is executed without it.
When used after MERGE, @@ROWCOUNT returns to the client the total number of rows inserted, updated, and deleted.
At least one of the three MATCHED clauses must be specified when using the MERGE statement; the MATCHED clauses can be specified in any order. However, a variable cannot be updated more than once in the same MATCHED clause.
It may be obvious, but worth mentioning: the person executing the MERGE statement needs SELECT permission on the SOURCE table and INSERT, UPDATE and DELETE permissions on the TARGET table.
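As a minimal sketch of the @@ROWCOUNT behavior described above, the following uses two hypothetical table variables (@Target and @Source, not part of the original example) so that it can be run on its own. The variable @RowsAffected is also just an illustrative name.

```sql
-- Hypothetical table variables standing in for the target and source tables
DECLARE @Target TABLE (ProductID INT PRIMARY KEY, Rate DECIMAL(10, 2));
DECLARE @Source TABLE (ProductID INT PRIMARY KEY, Rate DECIMAL(10, 2));

INSERT INTO @Target VALUES (1, 10.00), (2, 20.00), (4, 40.00);
INSERT INTO @Source VALUES (1, 10.00), (2, 25.00), (5, 60.00);

DECLARE @RowsAffected INT;

MERGE @Target AS TARGET
USING @Source AS SOURCE
ON TARGET.ProductID = SOURCE.ProductID
WHEN MATCHED AND TARGET.Rate <> SOURCE.Rate THEN
    UPDATE SET TARGET.Rate = SOURCE.Rate
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductID, Rate) VALUES (SOURCE.ProductID, SOURCE.Rate)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

-- Capture the count immediately; the next statement resets @@ROWCOUNT
SET @RowsAffected = @@ROWCOUNT;

-- With this data, one update, one insert and one delete are expected
SELECT @RowsAffected AS RowsAffected;
```

Note that the count is a single total; it does not break the number down by action. The OUTPUT clause with $action, as used in Part 2, is the way to see the individual actions.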
The MERGE SQL statement can improve performance because all the data is read and processed only once. In previous versions, three different statements had to be written to handle the three different activities (INSERT, UPDATE and DELETE), so the data in both the source and target tables was evaluated and processed multiple times, at least once for each statement.
The MERGE SQL statement takes the same kinds of locks, minus the one Intent Shared (IS) lock that was taken by the SELECT statement in the 'IF EXISTS' check we used in previous versions of SQL Server.
For every insert, update, or delete action specified in the MERGE statement, SQL Server fires any corresponding AFTER triggers defined on the target table, but does not guarantee which action's triggers fire first or last. Triggers defined for the same action honor the order you specify.
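For comparison, here is a sketch of the three separate statements that the Part 2 MERGE replaces. It assumes the Products and UpdatedProducts tables from Part 1 already exist; the transaction wrapper is an addition here, used so that the three steps succeed or fail together as the single MERGE does.

```sql
-- Assumes the Products and UpdatedProducts tables from Part 1 already exist
BEGIN TRANSACTION;

-- 1) Update rows present in both tables whose values differ
UPDATE p
SET p.ProductName = u.ProductName,
    p.Rate = u.Rate
FROM Products AS p
INNER JOIN UpdatedProducts AS u
    ON p.ProductID = u.ProductID
WHERE p.ProductName <> u.ProductName
   OR p.Rate <> u.Rate;

-- 2) Insert rows that exist only in the source table
INSERT INTO Products (ProductID, ProductName, Rate)
SELECT u.ProductID, u.ProductName, u.Rate
FROM UpdatedProducts AS u
WHERE NOT EXISTS (SELECT 1 FROM Products AS p
                  WHERE p.ProductID = u.ProductID);

-- 3) Delete rows that exist only in the target table
DELETE p
FROM Products AS p
WHERE NOT EXISTS (SELECT 1 FROM UpdatedProducts AS u
                  WHERE u.ProductID = p.ProductID);

COMMIT TRANSACTION;
```

Each of the three statements joins or probes both tables on its own, which is the repeated work that a single MERGE avoids.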
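To observe the trigger behavior described above, one option is to log each trigger invocation to an audit table. The sketch below assumes the Products and UpdatedProducts tables from Part 1; the ProductsAudit table and the three trigger names are hypothetical. GO is the batch separator used by tools such as SSMS and sqlcmd.

```sql
-- Hypothetical audit table; all names below are illustrative only
CREATE TABLE ProductsAudit
(
    AuditID INT IDENTITY(1, 1) PRIMARY KEY,
    TriggerAction VARCHAR(10) NOT NULL,
    RowsSeen INT NOT NULL
);
GO

-- One AFTER trigger per action, each logging how many rows it saw
CREATE TRIGGER trg_Products_Insert ON Products AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO ProductsAudit (TriggerAction, RowsSeen)
    SELECT 'INSERT', COUNT(*) FROM inserted;
END;
GO

CREATE TRIGGER trg_Products_Update ON Products AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO ProductsAudit (TriggerAction, RowsSeen)
    SELECT 'UPDATE', COUNT(*) FROM inserted;
END;
GO

CREATE TRIGGER trg_Products_Delete ON Products AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO ProductsAudit (TriggerAction, RowsSeen)
    SELECT 'DELETE', COUNT(*) FROM deleted;
END;
GO

-- MERGE without an OUTPUT clause (see the note after this block)
MERGE Products AS TARGET
USING UpdatedProducts AS SOURCE
ON TARGET.ProductID = SOURCE.ProductID
WHEN MATCHED AND (TARGET.ProductName <> SOURCE.ProductName
                  OR TARGET.Rate <> SOURCE.Rate) THEN
    UPDATE SET TARGET.ProductName = SOURCE.ProductName,
               TARGET.Rate = SOURCE.Rate
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductID, ProductName, Rate)
    VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;

-- AuditID order shows the order in which the triggers fired on this run
SELECT AuditID, TriggerAction, RowsSeen
FROM ProductsAudit
ORDER BY AuditID;
```

The MERGE here omits OUTPUT because SQL Server rejects an OUTPUT clause without INTO when the target table has enabled triggers; with these triggers in place, the Part 2 statement would need OUTPUT ... INTO a table or table variable instead. The triggers count rows from the inserted and deleted tables rather than assuming at least one row was affected.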
Hi, thank you so much, this is a very nice article. I was facing a problem with how to tell which rows were updated in the final target table, and I got an idea by reading your article: I declared a Modified_date column in the target table, and in the matched condition I check whether any row values changed. When they do, Modified_date is updated, so the changed rows are easy to identify in the final result.
Friday, February 20, 2015 - 6:54:03 AM - Shmuel Milavski
Is there a way to show only the items that are not matched in the target table from the source table? I run the below script in sql 2008 and get the following which is great but I want to be able to see which record did not match in the source table so I don't have to go through 120 lines of data manually. The only table and row I am interested in is "stgitem.avgunitcost" which is the source table
use test truncate table dbo.stgitem
BULK insert dbo.stgitem FROM 'C:\PriceUpdate\priceupdate.csv' WITH ( KEEPIDENTITY, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' )
use test merge dbo.item using dbo.stgitem on item.code = stgitem.code when matched then update set item.avgunitcost = stgitem.avgunitcost,item.issuecost = stgitem.issuecost;
(120 row (s) affected)
(1 row (s) affected)
(1 row (s) affected)
(119 row (s) affected)
Friday, November 21, 2014 - 1:58:45 PM - Kimberly Ford
This was a fabulous article. I needed to share a few tables from my data warehouse with another team but didn't want to give them access to the entire database. So, I create another database with just the tables I needed, used PART 2 and I have a fantastic method of keeping the tables in the "off database" updated! Now to just create my stored procedure and I'm set. Definitely a huge help!!!
I am currently employing the merge function in my code, and for some reason the matched rows are not updated in the target table when the update is done; only the not matched rows get inserted into the target table.
I am trying to use WHEN NOT MATCHED BY TARGET THEN INSERT. Is it possible to insert into a different table than the target? I want to keep the target clean and build the not matched data in a different table. Someone suggested using a CTE, but I am not sure how to. Any suggestions are welcome, thanks.
Yes John, I think you are right: for a matched row you can use either UPDATE or INSERT, not both. But that does not mean you cannot achieve what you want. There is a trick for this.
Here is a tip which takes a similar approach for managing a slowly changing dimension (it updates the matching target record and then inserts the matched source record into the target table along with the new records).
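One way this kind of trick is often written is to wrap the MERGE in an outer INSERT that consumes its OUTPUT rows. The sketch below is an illustration under assumptions, not the code from the referenced tip: the ProductHistory and ProductStaging tables, the IsCurrent flag and the Changes alias are all hypothetical.

```sql
-- Hypothetical history-style table: one row per product version
CREATE TABLE ProductHistory
(
    ProductKey INT IDENTITY(1, 1) PRIMARY KEY,
    ProductID INT NOT NULL,
    ProductName VARCHAR(100) NOT NULL,
    Rate DECIMAL(10, 2) NOT NULL,
    IsCurrent BIT NOT NULL
);

INSERT INTO ProductHistory (ProductID, ProductName, Rate, IsCurrent)
VALUES (1, 'Tea', 10.00, 1), (2, 'Coffee', 20.00, 1);

-- Hypothetical staging table holding the incoming rows
CREATE TABLE ProductStaging
(
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(100) NOT NULL,
    Rate DECIMAL(10, 2) NOT NULL
);

INSERT INTO ProductStaging VALUES (2, 'Coffee', 25.00), (5, 'Pizza', 60.00);

-- The outer INSERT adds a new current row for every row the MERGE updated
INSERT INTO ProductHistory (ProductID, ProductName, Rate, IsCurrent)
SELECT ProductID, ProductName, Rate, 1
FROM
(
    MERGE ProductHistory AS TARGET
    USING ProductStaging AS SOURCE
    ON TARGET.ProductID = SOURCE.ProductID AND TARGET.IsCurrent = 1
    -- Matched and changed: mark the existing row as no longer current
    WHEN MATCHED AND TARGET.Rate <> SOURCE.Rate THEN
        UPDATE SET TARGET.IsCurrent = 0
    -- Brand new product: insert it directly
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ProductID, ProductName, Rate, IsCurrent)
        VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate, 1)
    OUTPUT $action, SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate
) AS Changes (MergeAction, ProductID, ProductName, Rate)
WHERE MergeAction = 'UPDATE';

SELECT * FROM ProductHistory ORDER BY ProductKey;
```

With this data, the old Coffee row is kept but flagged as not current, and a new current Coffee row is added alongside the new Pizza row. SQL Server places restrictions on the table targeted by the outer INSERT (for example, it cannot have triggers), so this is worth checking against the documentation before relying on it.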
I do have one question regarding the MERGE command. Can it handle an update and then an insert. For example, if there is a match between the Target and Source then I want one field in the Target updated and then the Source record added. How would this be done in the MERGE command? Since SQL keeps giving the error that it does not allow INSERTS on MATCHES.
Wednesday, August 07, 2013 - 12:54:52 PM - Abraham Babu
I have an UPDATE trigger on a table. The trigger works fine when a separate UPDATE statement is fired. However, when a MERGE statement issues an UPDATE, the trigger is not invoked. Is there any precaution I need to take here?
Wednesday, November 14, 2012 - 1:26:58 AM - Achilles
What if you don't have a source table? I want to insert a single row (id, value1, value2, value3, ...) if the id doesn't match or update the existing row if the id does match. Do I really have to create a one-row temporary table for this?
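For the single-row case asked about above, one option that avoids a temporary table is to use a VALUES table constructor as the source. This is a sketch that assumes the Products table from Part 1; the three variables and their values are hypothetical.

```sql
-- Hypothetical single-row values to insert or update
DECLARE @ProductID INT = 6,
        @ProductName VARCHAR(100) = 'Bagel',
        @Rate DECIMAL(10, 2) = 15.00;

MERGE Products AS TARGET
-- A one-row derived table built from the variables acts as the source
USING (VALUES (@ProductID, @ProductName, @Rate))
      AS SOURCE (ProductID, ProductName, Rate)
ON TARGET.ProductID = SOURCE.ProductID
-- The id already exists: update the row in place
WHEN MATCHED THEN
    UPDATE SET TARGET.ProductName = SOURCE.ProductName,
               TARGET.Rate = SOURCE.Rate
-- The id does not exist yet: insert the row
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductID, ProductName, Rate)
    VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate);
```

There is no WHEN NOT MATCHED BY SOURCE clause here on purpose; with a one-row source, a DELETE in that clause would remove every other row in the target.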
Will the MERGE work if the target table is not identical to the source table? For instance my target table only has a subset of the columns that are in the source table and it is those columns that I want updated. The columns are named the same in both tables but they are in different positions since the source table has many more columns than the destination table.
Friday, May 04, 2012 - 3:52:15 PM - joseph Jelasker
Sorry, please ignore the previous request because it had some typos. Please consider this one instead.
I have to have 3 INSERT statements into the target table inside the "NOT MATCHED" block. I can't do a bulk insert because each subsequent insert has to use the identity column value (basically the primary key) from the previous insert. I failed to do that. Can you please provide me with a solution?