Performance Considerations with SQL MERGE vs INSERT, UPDATE, DELETE

By:   |   Updated: 2023-03-08   |   Comments (6)   |   Related: 1 | 2 | 3 | 4 | 5 | 6 | 7 | > TSQL


Problem

Whenever someone mentions the MERGE statement, at least one person points out that the performance is sub-optimal compared to the same basic T-SQL INSERT, UPDATE, and DELETE statements. You can find a wide variety of articles either supporting or condemning MERGE. Like sushi or exercise, it's one of those SQL statements that people love or hate. Full disclosure, I like using MERGE. So, I'm putting the claims of it performing sub-optimally to the test. Perhaps by the end of this tutorial, I'll change my mind and stop using MERGE.

Solution

In this tutorial, we'll start by looking at the MERGE statement and why you would want to use it. I'll then look at alternatives to MERGE. I'll present the primary functionality MERGE provides that other options don't. We'll set up a decent size dataset and perform our comparison test. By the end of this tutorial, I hope to answer the question: Is MERGE slow? Perhaps, more importantly: Should new code projects include the MERGE statement?

Exploring the SQL Server MERGE Statement

Let's start by answering, "What is MERGE?" Microsoft defines MERGE as "a statement that runs insert, update, or delete operations on a target table from the results of a join with a source table." It seems simple, right? You have a source table, aka the source of truth. Then you have a destination table that you'll update based on the source of truth. The MERGE statement exists in other RDMS such as Oracle, MySQL, and Azure Synapse.

Data engineers commonly use MERGE for loading or syncing data into an environment, perhaps for a data warehouse. For example, if your database has an employee table synced nightly from an origination system, you could use MERGE to get the job done. MERGE inserts new employees, updates existing ones, and deletes ones missing from the source. Below is a simple example of using the MERGE statement.

MERGE dbo.DestinationTable AS dest
USING dbo.SourceTable AS src -- Source Clauses
ON (dest.SpecialKey = src.SpecialKey)
WHEN MATCHED THEN -- Matched Clauses
    UPDATE SET Column1 = src.Column1,
               Column2 = src.Column2,
               Column3 = src.Column3
WHEN NOT MATCHED BY TARGET THEN
    INSERT
    (
        Column1,
        Column2,
        Column3
    )
    VALUES
    (src.Column1, src.Column2, src.Column3);
GO

Don't forget to include the semicolon at the end of MERGE or SQL yells at you.

Unique Advantage of MERGE

One of the main reasons I advocate MERGE is that it offers the ability to use a special OUTPUT clause to reference columns not part of the inserted or deleted tables. The standard OUTPUT clause works with INSERT, UPDATE, and DELETE; however, you can't reference columns not being inserted or deleted. Daniel Hutmacher wrote a short blog post on the topic that's worth the time investment. This technique helps replace the logic of cursors or WHILE loops.

It's impossible to perform a search without coming across articles pointing out known issues with the MERGE statement. Aaron Bertrand has written several of them, and I recommend reading them. MERGE is not a perfect statement, but this article focuses on performance. When I say performance, I care primarily about the time something takes to complete.

Alternative to MERGE

What's the alternative to MERGE? It's straightforward—you individually perform INSERT, UPDATE, and DELETE statements. Microsoft recommends not using MERGE if you perform any of those individual operations. Of course, this would also mean you don't need access to the non-inserted and deleted columns mentioned above.

Building a Test Dataset

Now, let's set up the base for our test. It's important to mimic a real-world dataset versus a simplistic one with a single integer column. The basic idea is to update one table based on the source of another. Let's name our first table "EmployeeSource" and the second "EmployeeDestination." We'll also create a trusty numbers table. I rarely delete data in the real world due to compliance regulations. So, let's include a BIT column indicating a soft delete instead of deleting. Remember, you'll need about 7 GB of free space.

USE [master];
GO

IF DATABASEPROPERTYEX('MergeDemo', 'Version') IS NOT NULL
BEGIN
    ALTER DATABASE MergeDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE MergeDemo;
END;
GO

CREATE DATABASE MergeDemo;
GO

ALTER DATABASE MergeDemo SET RECOVERY SIMPLE;
GO

USE MergeDemo;
GO

DROP TABLE IF EXISTS dbo.Numbers;
GO

DECLARE @UpperBound INT = 10010000;
;WITH cteN (Number)
AS (SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_columns AS s1
        CROSS JOIN sys.all_columns AS s2)
SELECT [Number]
INTO dbo.Numbers
FROM cteN
WHERE [Number] <= @UpperBound;

CREATE UNIQUE CLUSTERED INDEX CIX_Number
ON dbo.Numbers ([Number])
WITH (FILLFACTOR = 100);

CREATE TABLE dbo.EmployeeDestination
(
    Id INT IDENTITY(1, 1) NOT NULL,
    EmployeeNumber CHAR(20) NOT NULL,
    FirstName NVARCHAR(256) NOT NULL,
    LastName NVARCHAR(256) NOT NULL,
    MiddleIntial CHAR(1) NULL,
    YearlySalary DECIMAL(36, 2) NOT NULL,
    DepartmentCode CHAR(5) NOT NULL,
    StartDate DATE NOT NULL,
    IsActive BIT NOT NULL
        DEFAULT 1,
    IsDeleted BIT NOT NULL
        DEFAULT 0,
    CONSTRAINT PK_Merge_Employee_Id
        PRIMARY KEY CLUSTERED (Id)
);

CREATE TABLE dbo.EmployeeSource
(
    RowNumber INT NOT NULL,
    EmployeeNumber CHAR(20) NOT NULL,
    FirstName NVARCHAR(256) NOT NULL,
    LastName NVARCHAR(256) NOT NULL,
    MiddleIntial CHAR(1) NULL,
    YearlySalary DECIMAL(36, 2) NOT NULL,
    DepartmentCode CHAR(5) NOT NULL,
    StartDate DATE NOT NULL,
    IsActive BIT NOT NULL,
    IsDeleted BIT NOT NULL
);
GO

The next code block performs the following actions:

  1. The first statement inserts 10 million rows of data into the "EmployeeDestination" table.
  2. The second statement inserts 1 million rows of data into the "EmployeeSource" table.
  3. The third statement updates 100,000 rows of data already in the "EmployeeSource" table.
  4. The last statement inserts 10,000 new rows of data into the "EmployeeSource" table.

After performing these actions, the goal is to insert 10,000 new rows of data and update 100,000 existing rows of data in the "EmployeeDestination" table.

INSERT INTO dbo.EmployeeDestination
(
    EmployeeNumber,
    FirstName,
    LastName,
    MiddleIntial,
    YearlySalary,
    DepartmentCode,
    StartDate,
    IsActive
)
SELECT TOP (10000000)
       CONCAT('E00', n.Number) AS EmployeeNumber,
       SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 6) AS FirstName,
       SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 7) AS LastName,
       SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 1) AS MiddleInitial,
       ABS(CHECKSUM(NEWID()) % 100000) + 45000 AS YearlySalary,
       CONCAT('M000', ABS(CHECKSUM(NEWID()) % 3) + 1) AS DepartmentCode,
       DATEADD(DAY, RAND(CHECKSUM(NEWID())) * (1 + DATEDIFF(DAY, '2020-01-01', '03-31-2023')), '2020-01-01') AS StartDate,
       CASE
           WHEN (n.Number % 1000) = 0 THEN
               0
           ELSE
               1
       END AS IsActive
FROM dbo.Numbers n;
GO

INSERT INTO dbo.EmployeeSource
(
    RowNumber,
    EmployeeNumber,
    FirstName,
    LastName,
    MiddleIntial,
    YearlySalary,
    DepartmentCode,
    StartDate,
    IsActive,
    IsDeleted
)
SELECT ROW_NUMBER() OVER (ORDER BY EmployeeNumber) AS RowNumber,
       EmployeeNumber,
       FirstName,
       LastName,
       MiddleIntial,
       YearlySalary,
       DepartmentCode,
       StartDate,
       IsActive,
       IsDeleted
FROM dbo.EmployeeDestination
WHERE (Id % 10) = 0;


UPDATE dbo.EmployeeSource
SET YearlySalary = ABS(CHECKSUM(NEWID()) % 100000) + 45000,
    IsDeleted = 1,
    DepartmentCode = 'M0004'
WHERE (RowNumber % 10) = 0;

INSERT INTO dbo.EmployeeSource
(
    RowNumber,
    EmployeeNumber,
    FirstName,
    LastName,
    MiddleIntial,
    YearlySalary,
    DepartmentCode,
    StartDate,
    IsDeleted,
    IsActive
)
SELECT TOP (10000)
       ROW_NUMBER() OVER (ORDER BY n.Number) AS RowNumber,
       CONCAT('E00', n.Number) AS EmployeeNumber,
       SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 6) AS FirstName,
       SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 7) AS LastName,
       SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 1) AS MiddleInitial,
       ABS(CHECKSUM(NEWID()) % 100000) + 45000 AS YearlySalary,
       CONCAT('M000', ABS(CHECKSUM(NEWID()) % 3) + 1) AS DepartmentCode,
       DATEADD(DAY, RAND(CHECKSUM(NEWID())) * (1 + DATEDIFF(DAY, '2020-01-01', '03-31-2023')), '2020-01-01') AS StartDate,
       0 AS IsDeleted,
       1 AS IsActive
FROM dbo.Numbers n
WHERE n.Number > 10000000;
GO

At this point in the process, I made a copy-only backup that I'll restore after each test. I wanted to ensure that certain factors like statistics or cache didn't affect the results.

Executing the SQL MERGE Statement

With either of these methods, I want to ensure SQL Server only updates rows where a modification exists. Said another way, the row in the source data and destination data are different. This technique is beneficial when dealing with temporal tables. Now that we have our dataset let's first run the MERGE statement:

MERGE dbo.EmployeeDestination AS dest
USING dbo.EmployeeSource AS src
ON (dest.EmployeeNumber = src.EmployeeNumber)
WHEN MATCHED AND EXISTS
                 (
                     SELECT FirstName,
                            LastName,
                            MiddleIntial,
                            YearlySalary,
                            DepartmentCode,
                            StartDate,
                            IsActive,
                            IsDeleted
                     FROM dbo.EmployeeSource
                     EXCEPT
                     SELECT FirstName,
                            LastName,
                            MiddleIntial,
                            YearlySalary,
                            DepartmentCode,
                            StartDate,
                            IsActive,
                            IsDeleted
                     FROM dbo.EmployeeDestination 
                 ) THEN
    UPDATE SET FirstName = src.FirstName,
               LastName = src.LastName,
               MiddleIntial = src.MiddleIntial,
               YearlySalary = src.YearlySalary,
               DepartmentCode = src.DepartmentCode,
               StartDate = src.StartDate,
               IsActive = src.IsActive,
               IsDeleted = src.IsDeleted
WHEN NOT MATCHED BY TARGET THEN
    INSERT
    (
        EmployeeNumber,
        FirstName,
        LastName,
        MiddleIntial,
        YearlySalary,
        DepartmentCode,
        StartDate,
        IsActive,
        IsDeleted
    )
    VALUES
    (src.EmployeeNumber, src.FirstName, src.LastName, src.MiddleIntial, src.YearlySalary, src.DepartmentCode,
     src.StartDate, src.IsActive, src.IsDeleted);
GO

After each iteration, I executed the above statement five times with a restore of the database. This process was laborious, but I wanted consistent comparisons.

Performing INSERT Statements and UPDATE Statements

Below is the code for performing the UPDATE and INSERT in separate statements. Like the MERGE, I only want to update rows where a modification occurred.

UPDATE dest
SET dest.FirstName = src.FirstName,
    dest.LastName = src.LastName,
    dest.MiddleIntial = src.MiddleIntial,
    dest.YearlySalary = src.YearlySalary,
    dest.DepartmentCode = src.DepartmentCode,
    dest.StartDate = src.StartDate,
    IsActive = src.IsActive,
    IsDeleted = src.IsDeleted
FROM dbo.EmployeeDestination_NonMerge dest
    INNER JOIN dbo.EmployeeSource src
        ON dest.EmployeeNumber = src.EmployeeNumber -- JOIN Condition
WHERE EXISTS
(
    SELECT FirstName,
           LastName,
           MiddleIntial,
           YearlySalary,
           DepartmentCode,
           StartDate,
           IsActive,
           IsDeleted
    FROM dbo.EmployeeSource
    EXCEPT
    SELECT FirstName,
           LastName,
           MiddleIntial,
           YearlySalary,
           DepartmentCode,
           StartDate,
           IsActive,
           IsDeleted
    FROM dbo.EmployeeDestination 
);

INSERT INTO dbo.EmployeeDestination 
(
    EmployeeNumber,
    FirstName,
    LastName,
    MiddleIntial,
    YearlySalary,
    DepartmentCode,
    StartDate,
    IsActive,
    IsDeleted
)
SELECT EmployeeNumber,
       FirstName,
       LastName,
       MiddleIntial,
       YearlySalary,
       DepartmentCode,
       StartDate,
       IsActive,
       IsDeleted
FROM dbo.EmployeeSource src
WHERE NOT EXISTS
(
    SELECT src.EmployeeNumber
    FROM dbo.EmployeeDestination dest
    WHERE src.EmployeeNumber = dest.EmployeeNumber
);
GO

As with MERGE, I executed the two statements five times with a restored database after each run.

Reviewing the Results

I captured the performance markers below with Extended Events. It's a helpful tool that allows for comprehensive and efficient monitoring and collection of system information from SQL Server. You could also use STATISTICS TIME & IO. Now back to the results.

The numbers reflect an average of five runs. There wasn't a significant difference between individual executions.

Method Logical Reads Writes CPU Time Duration Duration %
MERGE Statement 5992820 99404 838904667 79088005 18.9%
UPDATE/INSERT Operations 5829529 99316 720380333 66536671  

Before comparing the statements, I suspected MERGE might be slower. According to the results, MERGE is about 19% slower. The outcome surprised me. I would have guessed 3-5%. Does this change my mind? Yea, kind of. When starting a new project requiring a larger data sync, I'll likely choose the individual INSERT operations and UPDATE operations. A 19% difference is huge if your sync process takes hours. Dwain Camps found similar results when comparing the methods in this article.

Do you prefer the MERGE syntax? Then please keep on using it. I drive a particular route to my office because of the traffic pattern, even though it takes a bit longer sometimes.

I look forward to hearing about your experiences with MERGE in the comments below. If you feel the test was somehow skewed, please let me know. I based the original criteria on my specific needs.

Key Takeaways

  • A handy feature of MERGE is the ability to reference columns not in the inserted or deleted tables in the OUTPUT clause.
  • If you only perform individual INSERT, UPDATE, or DELETE statements, consider skipping MERGE.
  • Do you like using MERGE? Keep doing so but be aware that the command might add performance overhead.
Next Steps
  • Would you like to learn more about capturing performance metrics using Extended Events? Check out this link for a list of tips.
  • Arshad Ali wrote a detailed article on using the MERGE statement. I highly recommend giving it a read.
  • Aaron Bertrand wrote a massive collection of articles on proceeding with caution when using the MERGE statement.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jared Westover Jared Westover is a passionate technology specialist at Crowe, helping to build data solutions. For the past two decades, he has focused on SQL Server, Azure, Power BI, and Oracle. Besides writing for MSSQLTips.com, he has published 12 Pluralsight courses about SQL Server and Reporting Services. Jared enjoys reading and spending time with his wife and three sons when he's not trying to make queries go faster.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-03-08

Comments For This Article




Friday, October 13, 2023 - 11:59:02 AM - Jared Westover Back To Top (91667)
@Tim

Thank you for stopping by and adding a comment. I added the additional logic to avoid updating rows with no changes. Here is the first reference I made to it.

" I want to ensure SQL Server only updates rows where a modification exists."

I wouldn't include the additional logic if I didn't care about unnecessarily updating rows. An everyday use case involves temporal tables.

Thursday, October 12, 2023 - 10:32:32 AM - Tim Back To Top (91655)
Why not let the merge do what its supposed to do

MERGE dbo.EmployeeDestination AS dest
USING dbo.EmployeeSource AS src
ON (dest.EmployeeNumber = src.EmployeeNumber)
WHEN MATCHED
UPDATE SET FirstName = src.FirstName,
LastName = src.LastName,
MiddleIntial = src.MiddleIntial,
YearlySalary = src.YearlySalary,
DepartmentCode = src.DepartmentCode,
StartDate = src.StartDate,
IsActive = src.IsActive,
IsDeleted = src.IsDeleted
WHEN NOT MATCHED BY TARGET THEN
INSERT
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
)
VALUES
(src.EmployeeNumber, src.FirstName, src.LastName, src.MiddleIntial, src.YearlySalary, src.DepartmentCode,
src.StartDate, src.IsActive, src.IsDeleted);

There's no need to use the exists and another select statement

Wednesday, May 10, 2023 - 2:52:16 AM - Mikael Eriksson Back To Top (91173)
Looks like you made a mistake when checking for rows to update. You are just checking the entire table for if there are ay differences at all.
I think it would be better to check each matching row. Try this and tell me what you think.

WHEN MATCHED AND EXISTS
(
SELECT dest.FirstName,
dest.LastName,
dest.MiddleIntial,
dest.YearlySalary,
dest.DepartmentCode,
dest.StartDate,
dest.IsActive,
dest.IsDeleted
EXCEPT
SELECT src.FirstName,
src.LastName,
src.MiddleIntial,
src.YearlySalary,
src.DepartmentCode,
src.StartDate,
src.IsActive,
src.IsDeleted
)
THEN


Saturday, March 18, 2023 - 4:29:31 PM - Jared Westover Back To Top (91021)
@Carsten Saastamoinen

Thanks for taking the time to read and put together a reply. A couple of items piqued my interest.

1) No question that the 2nd set of MERGE/UPSERT statements performs faster since SQL is not checking if the individual columns of the row should be updated. I did outline that in the article above. Here is the first reference I made to it.
" I want to ensure SQL Server only updates rows where a modification exists."
I wouldn't include the additional logic if I didn't care about only updating those rows.

2) If I run all the statements you outlined, the last INSERT/UPDATE performs worse. However, the reason has nothing to do with either method.

For example, if you switch the order you execute the statements, you'll get different results.

The transaction log is growing larger, and it's at its peak on the last statement. Performing a CHECKPOINT before the last INSERT/UPDATE will be faster than the MERGE, even if you execute a CHECKPOINT before the second MERGE.

I specially restored the database after each test run so that I would avoid running into discrepancies with the cache, statistics, and transaction log.

Sunday, March 12, 2023 - 7:19:02 AM - Carsten Saastamoinen Back To Top (91000)
I do not agree in the conclusion. The statement should be modified, because there are a lot on unnecessary code. If I run the following script with the original code and the modified code on my computer, MERGE is faster.



Merge 1

SQL Server Execution Times: CPU time = 502042 ms, elapsed time = 86864 ms.

Merge 2

SQL Server Execution Times: CPU time = 8018 ms, elapsed time = 7419 ms.

Upsert 1

SQL Server Execution Times: CPU time = 508219 ms, elapsed time = 82928 ms.
SQL Server Execution Times: CPU time = 2889 ms, elapsed time = 573 ms.

Upsert 2

SQL Server Execution Times: CPU time = 10861 ms, elapsed time = 8963 ms.
SQL Server Execution Times: CPU time = 2751 ms, elapsed time = 968 ms.

USE master;
GO
DROP DATABASE IF EXISTS MergeDemo;
GO
CREATE DATABASE MergeDemo;
GO
ALTER DATABASE MergeDemo SET RECOVERY SIMPLE;
GO
USE MergeDemo;
GO
DECLARE @UpperBound INT = 10010000;

WITH cteN (Number)
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_columns AS s1
CROSS JOIN sys.all_columns AS s2
)
SELECT Number
INTO dbo.Numbers
FROM cteN
WHERE Number <= @UpperBound;
GO
CREATE UNIQUE CLUSTERED INDEX CIX_Number ON dbo.Numbers (Number)
WITH (FILLFACTOR = 100);

CREATE TABLE dbo.EmployeeDestination1
(
Id INT IDENTITY(1, 1) NOT NULL
CONSTRAINT PK_Merge_Employee_Id1 PRIMARY KEY,
EmployeeNumber CHAR(20) NOT NULL,
FirstName NVARCHAR(256) NOT NULL,
LastName NVARCHAR(256) NOT NULL,
MiddleIntial CHAR(1) NULL,
YearlySalary DECIMAL(36, 2) NOT NULL,
DepartmentCode CHAR(5) NOT NULL,
StartDate DATE NOT NULL,
IsActive BIT NOT NULL
CONSTRAINT DF_IsActive1 DEFAULT 1,
IsDeleted BIT NOT NULL
CONSTRAINT DF_IsDeleted1 DEFAULT 0
);

CREATE TABLE dbo.EmployeeDestination2
(
Id INT IDENTITY(1, 1) NOT NULL
CONSTRAINT PK_Merge_Employee_Id2 PRIMARY KEY,
EmployeeNumber CHAR(20) NOT NULL,
FirstName NVARCHAR(256) NOT NULL,
LastName NVARCHAR(256) NOT NULL,
MiddleIntial CHAR(1) NULL,
YearlySalary DECIMAL(36, 2) NOT NULL,
DepartmentCode CHAR(5) NOT NULL,
StartDate DATE NOT NULL,
IsActive BIT NOT NULL
CONSTRAINT DF_IsActive2 DEFAULT 1,
IsDeleted BIT NOT NULL
CONSTRAINT DF_IsDeleted2 DEFAULT 0
);

CREATE TABLE dbo.EmployeeDestination3
(
Id INT IDENTITY(1, 1) NOT NULL
CONSTRAINT PK_Merge_Employee_Id3 PRIMARY KEY,
EmployeeNumber CHAR(20) NOT NULL,
FirstName NVARCHAR(256) NOT NULL,
LastName NVARCHAR(256) NOT NULL,
MiddleIntial CHAR(1) NULL,
YearlySalary DECIMAL(36, 2) NOT NULL,
DepartmentCode CHAR(5) NOT NULL,
StartDate DATE NOT NULL,
IsActive BIT NOT NULL
CONSTRAINT DF_IsActive3 DEFAULT 1,
IsDeleted BIT NOT NULL
CONSTRAINT DF_IsDeleted3 DEFAULT 0
);

CREATE TABLE dbo.EmployeeDestination4
(
Id INT IDENTITY(1, 1) NOT NULL
CONSTRAINT PK_Merge_Employee_Id4 PRIMARY KEY,
EmployeeNumber CHAR(20) NOT NULL,
FirstName NVARCHAR(256) NOT NULL,
LastName NVARCHAR(256) NOT NULL,
MiddleIntial CHAR(1) NULL,
YearlySalary DECIMAL(36, 2) NOT NULL,
DepartmentCode CHAR(5) NOT NULL,
StartDate DATE NOT NULL,
IsActive BIT NOT NULL
CONSTRAINT DF_IsActive4 DEFAULT 1,
IsDeleted BIT NOT NULL
CONSTRAINT DF_IsDeleted4 DEFAULT 0
);

CREATE TABLE dbo.EmployeeSource
(
RowNumber INT NOT NULL,
EmployeeNumber CHAR(20) NOT NULL,
FirstName NVARCHAR(256) NOT NULL,
LastName NVARCHAR(256) NOT NULL,
MiddleIntial CHAR(1) NULL,
YearlySalary DECIMAL(36, 2) NOT NULL,
DepartmentCode CHAR(5) NOT NULL,
StartDate DATE NOT NULL,
IsActive BIT NOT NULL,
IsDeleted BIT NOT NULL
);
GO
INSERT INTO dbo.EmployeeDestination1
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
)
SELECT TOP (10000000)
CONCAT('E00', n.Number) AS EmployeeNumber,
SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 6) AS FirstName,
SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 7) AS LastName,
SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 1) AS MiddleInitial,
ABS(CHECKSUM(NEWID()) % 100000) + 45000 AS YearlySalary,
CONCAT('M000', ABS(CHECKSUM(NEWID()) % 3) + 1) AS DepartmentCode,
DATEADD(DAY, RAND(CHECKSUM(NEWID())) * (1 + DATEDIFF(DAY, '2020-01-01', '03-31-2023')), '2020-01-01') AS StartDate,
CASE
WHEN (n.Number % 1000) = 0 THEN
0
ELSE
1
END AS IsActive
FROM dbo.Numbers n;

INSERT INTO dbo.EmployeeDestination2
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
)
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
FROM dbo.EmployeeDestination1;

INSERT INTO dbo.EmployeeDestination3
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
)
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
FROM dbo.EmployeeDestination1;

INSERT INTO dbo.EmployeeDestination4
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
)
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive
FROM dbo.EmployeeDestination1;
GO
INSERT INTO dbo.EmployeeSource
(
RowNumber,
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
)
SELECT ROW_NUMBER() OVER (ORDER BY EmployeeNumber) AS RowNumber,
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination1
WHERE (Id % 10) = 0;

UPDATE dbo.EmployeeSource
SET YearlySalary = ABS(CHECKSUM(NEWID()) % 100000) + 45000,
IsDeleted = 1,
DepartmentCode = 'M0004'
WHERE RowNumber % 10= 0;

INSERT INTO dbo.EmployeeSource
(
RowNumber,
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsDeleted,
IsActive
)
SELECT TOP (100000)
ROW_NUMBER() OVER (ORDER BY n.Number) AS RowNumber,
CONCAT('E00', n.Number) AS EmployeeNumber,
SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 6) AS FirstName,
SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 7) AS LastName,
SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 1) AS MiddleInitial,
ABS(CHECKSUM(NEWID()) % 100000) + 45000 AS YearlySalary,
CONCAT('M000', ABS(CHECKSUM(NEWID()) % 3) + 1) AS DepartmentCode,
DATEADD(DAY, RAND(CHECKSUM(NEWID())) * (1 + DATEDIFF(DAY, '2020-01-01', '03-31-2023')), '2020-01-01') AS StartDate,
0 AS IsDeleted,
1 AS IsActive
FROM dbo.Numbers n
WHERE n.Number > 10000000;
GO
SELECT COUNT (*)
FROM dbo.EmployeeSource;

SELECT COUNT (*)
FROM dbo.EmployeeDestination1;

SELECT COUNT (*)
FROM dbo.EmployeeDestination2;

SELECT COUNT (*)
FROM dbo.EmployeeDestination3;

SELECT COUNT (*)
FROM dbo.EmployeeDestination4;
GO
PRINT 'Merge 1';

SET STATISTICS TIME ON;

MERGE dbo.EmployeeDestination1 AS dest
USING dbo.EmployeeSource AS src
ON (dest.EmployeeNumber = src.EmployeeNumber)
WHEN MATCHED AND EXISTS
(
SELECT FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeSource
EXCEPT
SELECT FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination1
) THEN
UPDATE SET FirstName = src.FirstName,
LastName = src.LastName,
MiddleIntial = src.MiddleIntial,
YearlySalary = src.YearlySalary,
DepartmentCode = src.DepartmentCode,
StartDate = src.StartDate,
IsActive = src.IsActive,
IsDeleted = src.IsDeleted
WHEN NOT MATCHED BY TARGET THEN
INSERT
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
)
VALUES
(src.EmployeeNumber, src.FirstName, src.LastName, src.MiddleIntial, src.YearlySalary, src.DepartmentCode,
src.StartDate, src.IsActive, src.IsDeleted);

SET STATISTICS TIME OFF;
GO
PRINT 'Merge 2';

SET STATISTICS TIME ON;

MERGE dbo.EmployeeDestination2 AS dest
USING dbo.EmployeeSource AS src
ON (dest.EmployeeNumber = src.EmployeeNumber)
WHEN MATCHED THEN
UPDATE SET FirstName = src.FirstName,
LastName = src.LastName,
MiddleIntial = src.MiddleIntial,
YearlySalary = src.YearlySalary,
DepartmentCode = src.DepartmentCode,
StartDate = src.StartDate,
IsActive = src.IsActive,
IsDeleted = src.IsDeleted
WHEN NOT MATCHED BY TARGET THEN
INSERT
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
)
VALUES
(src.EmployeeNumber, src.FirstName, src.LastName, src.MiddleIntial, src.YearlySalary, src.DepartmentCode,
src.StartDate, src.IsActive, src.IsDeleted);

SET STATISTICS TIME OFF;
GO
PRINT 'Upsert 1';

SET STATISTICS TIME ON;

UPDATE dest
SET dest.FirstName = src.FirstName,
dest.LastName = src.LastName,
dest.MiddleIntial = src.MiddleIntial,
dest.YearlySalary = src.YearlySalary,
dest.DepartmentCode = src.DepartmentCode,
dest.StartDate = src.StartDate,
IsActive = src.IsActive,
IsDeleted = src.IsDeleted
FROM dbo.EmployeeDestination3 AS dest INNER JOIN dbo.EmployeeSource AS src
ON dest.EmployeeNumber = src.EmployeeNumber -- JOIN Condition
WHERE EXISTS
(
SELECT FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeSource
EXCEPT
SELECT FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination3
);

INSERT INTO dbo.EmployeeDestination3
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
)
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeSource src
WHERE NOT EXISTS
(
SELECT src.EmployeeNumber
FROM dbo.EmployeeDestination3 AS dest
WHERE src.EmployeeNumber = dest.EmployeeNumber
);

SET STATISTICS TIME OFF;
GO
PRINT 'Upsert 2';

SET STATISTICS TIME ON;

UPDATE dest
SET dest.FirstName = src.FirstName,
dest.LastName = src.LastName,
dest.MiddleIntial = src.MiddleIntial,
dest.YearlySalary = src.YearlySalary,
dest.DepartmentCode = src.DepartmentCode,
dest.StartDate = src.StartDate,
IsActive = src.IsActive,
IsDeleted = src.IsDeleted
FROM dbo.EmployeeDestination4 AS dest INNER JOIN dbo.EmployeeSource AS src
ON dest.EmployeeNumber = src.EmployeeNumber; -- JOIN Condition

INSERT INTO dbo.EmployeeDestination4
(
EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
)
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeSource AS src
WHERE NOT EXISTS
(
SELECT src.EmployeeNumber
FROM dbo.EmployeeDestination4 AS dest
WHERE src.EmployeeNumber = dest.EmployeeNumber
);

SET STATISTICS TIME OFF;
GO
SELECT COUNT (*)
FROM dbo.EmployeeDestination1;

SELECT COUNT (*)
FROM dbo.EmployeeDestination2;

SELECT COUNT (*)
FROM dbo.EmployeeDestination3;

SELECT COUNT (*)
FROM dbo.EmployeeDestination4;
GO
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination1
EXCEPT
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination2;

SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination3
EXCEPT
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination4;

SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination2
EXCEPT
SELECT EmployeeNumber,
FirstName,
LastName,
MiddleIntial,
YearlySalary,
DepartmentCode,
StartDate,
IsActive,
IsDeleted
FROM dbo.EmployeeDestination4;

Thursday, March 9, 2023 - 3:32:38 AM - Steen Schlüter Persson Back To Top (90991)
I'm one of the ones, that never have fallen in love with the MERGE statement, so I'm glad to see that individual UPDATE, INSERT and DELETE statements performs better :-).
One of the major reasons for why I prefer the manual statements over the MERGE statements, is that the debugging and manual "stepping through" the statement is easier with individual statements where you can run the various parts one by one. With the MERGE statement, you can't just as easy run the various steps during development and debugging.














get free sql tips
agree to terms