Learn more about SQL Server tools

mssqltips logo

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories


Synchronize Table Data Using a Merge Join in SSIS

By:   |   Read Comments (2)   |   Related Tips: More > Integration Services Data Flow Transformations

SQL Server Indexing Tips and Tricks --- Free MSSQLTips Webcast!


We have a need to synchronize data in tables across SQL Server instances. There are multiple ways this can be done such as using link servers, replication, TableDiff utility, etc., but in this tip I am going to demonstrate one-way synchronization using an SSIS Merge Join.


I have two SQL Server instances: serverA and serverB. I also have two different databases, TestingA on serverA and TestingB on serverB. Both of these databases have the same data in table Customer_Detail with a unique key of custID.   Then we changed some of the data on serverA and now I want the data on serverB to be the same for reporting purposes, so based on that I will demonstrate how to synchronize this data from serverA to serverB using a Merge Join in SSIS.

Below is what my SSIS Package looks like. I will discuss each of the steps in the package.

SSIS Package Synchronization

Step 1: Initiate SQL Connections

I have two OLE DB Sources, one for serverA (source) and the other for serverB (destination). Below are how these connections are setup.

Source connection: serverA

serverA - Description: serverA connection

Source connection: serverB

serverB - Description: serverB connection

Step 2: Apply sorting on tables

When applying a Merge Join, the data needs to be sorted for both inputs, hence I am applying a sort operation to both sides. An alternative to this step is to pre-sort the data in a query and use a SQL command instead of table. This needs to be done for both inputs.

SSIS Data sorting - Description: apply sorting

Step 3: Applying Merge Join

I have applied a full outer join on the unique column custid in the Merge Join. I need to get NULL records from the right and left tables where the custid does not match and this will determine if we need to perform a delete or insert operation.

Full Outer Join as Merge Join Transformation Editor - Description: apply merge join

Step 4: Split the Records

In the next step, I have applied a conditional split.  I have added three different conditions based on the full outer join result. These are the three conditions:

  1. UPDATE (custID match): If the custID exists on both the source and destination then we do an update.
  2. INSERT (custid not found in destination): If custID is found only on the left (source) then we need to insert this data into the destination table.
  3. DELETE (custID is not found in source): If custID is not found in the source table then we need to delete this data from the destination table.
Conditional Split Transformation Editor - Description: split operations

Step 5: Find Updated Record

As mentioned above the UPDATE (custID Match) condition, it returns the data where custID matches. We don't know if the data has changed on the source, so to minimize updating every record I have added a conditional split that compares each column to see if any data is different.  If it is, then we will do the update and if not we will not update the record.  Only the records that are different will go to Step 6.


(custName! = [custName (1)]) || (custAddress!= [custAddress (1)]) || (pincode != [pincode (1)]) || (buaddress != [buaddress (1)]) || (varmobile != [varmobile (1)]) || (dtentrydate != [dtentrydate (1)]) || (ref_userid != [ref_userid (1)])
finding updated records in the Conditional Split Transformation Editor - Description: finding updated records

Step 6: Performing Insert, Update and Delete to Destination Table

So we now have three streams of data to do either an Insert, Update or Delete.

Insert Records

I have an OLE DB Destination connection object Customer_detail and mapped the input columns as follows.

Insert the data
Mapping data

Update Records

For updating the data, I have an OLE DB Destination connection and added a sqlcommand as follows using the Advanced Editor.

updataing data - Description: Conditional base 
Updating the records
updating the data - Description: Updating the data

After initiating the command with parameter ?, I then applied the input column mapping with available parameter destination columns based on the destination table.

Mapping columns - Description: Mapping columns

DELETE Records

For deleting data, I have an OLE DB Destination connection and added a sqlcommand as follows using the Advanced Editor.

Deleting the record - Description: Delete the records
configure command - Description: delete the records

For the delete record the column mapping is as follows.

Mapping the data - Description: Mapping columns

Running the Package

After setting up all of the above steps, randomly changing some of the data, I then run the package and get the following.

Synchronization  - Description: Package execution


As per the above package synchronization, I have performed insert, update and delete operations with data from the source to the destination. In SSIS, there are also other methods available for synchronization data such as using the tablediff utility in SSIS. I suggest you find the method that works best for your business needs.

Next Steps

Last Update:

signup button

next tip button

About the author
MSSQLTips author Bhavesh Patel Bhavesh Patel has been working with SQL Server since 2007 as a Senior DBA.

View all my tips

Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Notify for updates 

Send me SQL tips:


Sunday, December 03, 2017 - 12:20:50 AM - Bhavesh Patel Back To Top

Thanks Gerardo Arevalo for this advice. some times this sorting operatior cost created problem in server so with respect of this, I have mentioned alernate in step2. It is better to use T-SQL with Order statement instead of sorting component. for more clearification I have added reference of such discussion..









Tuesday, November 21, 2017 - 10:06:07 AM - Gerardo Arevalo Back To Top

Nice post, well documented. But, a  notice of caution before trying to implement that on a production server: sort and merge join components run entirely on memory, the SSIS server must have plenty enoung free memory to sort (twice) all the data, and then compare between them (join). I have seen to fail implementations like this while in dev boxes ran succesfuly as the (T/SQL) load in the prod server usually is ten times the dev load, letting less usuable memory for SSIS operations.

Learn more about SQL Server tools