Sync SQL Server Change Tracking Tables without Changing Data
How do you identify SQL Server database records that are new or changed in order to copy/synchronize to a secondary table/system, as part of a data load process? Typically, we use columns such as LastModified with datetime datatype in SQL Server to identify the records that changed. What if you must re-synchronize some of the data? Often if these systems get out of sync, we might update the LastModified value to a current datetime to resynchronize the data to the secondary system. However, changing the record’s LastModified date value may not be ideal because doing so, in this circumstance assigns a new LastModified datetime value resulting in the inability to accurately tell when the record really changed. So, in this article I will show you how to mark a record so it can be re-synchronized to a secondary system without changing the data values of the original record, thus maintaining data integrity.
To mark records so they can be identified to move or resynchronize to a secondary system in a load process without changing any data values, I will use the SQL Change Tracking (CT) feature and the WITH CHANGE_TRACKING_CONTEXT clause to mark and identify the source data that I want to re-sync. For this example, I will use a copy of the AdventureWorksLT (Lite) sample database in Azure. My database is named AWLTDB. SQL Change Tracking was introduced in SQL Server 2008. You can test this in any version of SQL Server 2008 or newer.
Step 1 – Turn on SQL Server Change Tracking at the Database Level
First, turn on Change Tracking at the Database Level. In SQL Server Management Studio, right click your database and click Properties. In the Properties window click Change Tracking. In the right pain set Change Tracking to TRUE. It is important to note that by default Auto Cleanup is on with a default retention Period of 2 Days. We will touch on this again later.
Step 2 – Turn on SQL Server Change Tracking at the Table Level
For this demo I will use the SalesLT.Customer table. Right click on the table you want to track changes. Click Properties, click Change Tracking, then in the right pane set Change Tracking to TRUE. In a later article I will discuss the Track Columns Updates option.
Step 3 – Review Change Tracking Functions
In this exercise we will be using the following Change Tracking functions:
- The CHANGETABLE (CHANGES) function: Returns tracking information since the version specified. The parameter are the table name and the change version number.
- The CHANGETABLE (VERSION) function: Returns the latest Change Tracking information for the current rows in a table including the change version number and the change context if used. We will include this as a bonus.
- The WITH CHANGE_TRACKING_CONTEXT (context): Used to mark the changed records to easily be identified or select later. The context is a string converted to varbinary (128).
Step 4 – Update Records to be Re-Synced
Previously to force records back to the secondary system we had to change the ModifiedDate value. But with SQL Change Tracking, we can mark the records for change without jeopardizing the integrity of the data! I refer to this as a Dummy Update. Below I show both methods. I will run these in the AdventureWorksLT database on my Azure SQL instance.
--Option 1: Update the ModifiedDate to trigger the ETL process to pickup the records as changes. UPDATE SalesLT.Customer SET [ModifiedDate] = GETDATE() WHERE CustomerID = 1; GO --Option 2: Dummy Update - Pick a column to update with itself thus not changing value and marking the transaction using CHANGE_TRACKING_CONTEXT as 'SpecialUpdate', maintaining the integrity of the record. DECLARE @originator_id varbinary(128); SET @originator_id = CAST('SpecialUpdate' AS varbinary(128)); -- Set Your value WITH CHANGE_TRACKING_CONTEXT (@originator_id) UPDATE SalesLT.Customer SET [ModifiedDate] = [ModifiedDate] WHERE CustomerID = 2; GO
Step 5 – Identify Records to Move to the Secondary System with Function CHANGETABLE (CHANGES)
Next, I show how to identify changed records to be reprocessed to the secondary system. Results 5a with be the traditional approach. Results 5b will use the function CHANGETABLE (CHANGES) to select the records that were marked for change. In this example I pass the change version number of 0 to the CHANGETABLE function to retrieve all records within the change retention period marked as ‘SpecialUpdate’.
NOTE: Previously, I pointed out the Change Tracking Auto Clean Up setting of 2 days. You must retrieve the marked transactions before the records fall out of the Auto Clean Up period. In a production system the 2-day default may need to be adjusted to meet your system requirements.
--Results A: Retrieve Results based on change to ModifiedDate. SELECT CustomerID, [EmailAddress], CompanyName, [ModifiedDate] FROM SalesLT.Customer WHERE [ModifiedDate] >= DATEADD(MINUTE,-5,GETDATE()); --Just Updates a few minute ago --Results B: Retrieve Results with Join to the CHANGETABLE (CHANGES) Function. SELECT cust.CustomerID, cust.[EmailAddress], cust.CompanyName, cust.[ModifiedDate], c.SYS_CHANGE_VERSION, c.SYS_CHANGE_OPERATION, CAST(c.SYS_CHANGE_CONTEXT as varchar(255)) as 'SYS_CHANGE_CONTEXT' FROM CHANGETABLE (CHANGES SalesLT.Customer, 0) AS c INNER JOIN SalesLT.Customer AS cust ON cust.CustomerID = c.CustomerID WHERE c.SYS_CHANGE_CONTEXT = CAST('SpecialUpdate' AS varbinary(128));
Result 5a – Shows the results based on changes made to the ModifiedDate value.
Results 5b – Shows the results of the dummy update done with CHANCE_TRACKING_CONTEXT. Here no values changed.
Step 6 – Build the Option into the Process Sync Code
In this example the original Load Process sync code might be modified as shown to always look for ad-hoc updates along with normal record updates and inserts by combining the ModifiedDate logic of the Load Process with the Change Tracking function CHANGETABLE (CHANGES) and SYS_CHANGE_CONTEXT usage.
-- Build into your ELT Process with Or condition and in the future if you have to resync data -- Just do the Update WITH CHANGE_TRACKING_CONTEXT ('SpecialUpdate'). DECLARE @LastRunDate as Datetime = DATEADD(DAY,-1,GETDATE()) SELECT cust.CustomerID, cust.[EmailAddress], cust.CompanyName, cust.[ModifiedDate], c.SYS_CHANGE_VERSION, c.SYS_CHANGE_OPERATION, CAST(c.SYS_CHANGE_CONTEXT as varchar(255)) as 'SYS_CHANGE_CONTEXT' FROM CHANGETABLE (CHANGES SalesLT.Customer, 0) AS c INNER JOIN SalesLT.Customer AS cust ON cust.CustomerID = c.CustomerID WHERE [ModifiedDate] >= @LastRunDate OR c.SYS_CHANGE_CONTEXT = CAST('SpecialUpdate' AS varbinary(128));
Note: Again, in this example I pass the change version number of 0 to the CHANGETABLE function. If your Load Process was implemented using Change Tracking Versioning you would pass the actual version number used to track data changes since the last Data Sync in place of 0.
Results 6: Show record that may have been modified thru normal system DML and manual dummy updates done with CHANGE_TRACKING_CONTEXT.
Step 7 – Exclude Scenario Built into Process Sync Code
In the case that you want to modify data but do not want the rows to be picked up in your Load Process, this example shows how to use WITH CHANGE_TRACKING_CONTEXT marking an update as ‘DONOTProcess’ and how to exclude it in the Select query.
--Option 3: Use CHANGE_TRACKING_CONTEXT to Mark a Record Not to be Processed. DECLARE @originator_id varbinary(128); SET @originator_id = CAST('DONotProcess' AS varbinary(128)); ---Your value WITH CHANGE_TRACKING_CONTEXT (@originator_id) UPDATE SalesLT.Customer SET [Title] = 'Sr.', [ModifiedDate] = GETDATE() WHERE CustomerID = 130; GO --Retrieve all Updates except for ones where CHANGE_TRACKING_CONTEXT as 'DONotProcess'. DECLARE @LastRunDate as Datetime = DATEADD(DAY,-1,GETDATE()) SELECT cust.CustomerID, cust.[EmailAddress], cust.CompanyName, cust.[ModifiedDate], c.SYS_CHANGE_VERSION, c.SYS_CHANGE_OPERATION, c.SYS_CHANGE_COLUMNS, CAST(c.SYS_CHANGE_CONTEXT as varchar(255)) as 'SYS_CHANGE_CONTEXT' FROM CHANGETABLE (CHANGES SalesLT.Customer, 0) AS c INNER JOIN SalesLT.Customer AS cust ON cust.CustomerID = c.CustomerID WHERE ([ModifiedDate] >= @LastRunDate OR c.SYS_CHANGE_CONTEXT = CAST('SpecialUpdate' AS varbinary(128)) ) AND (c.SYS_CHANGE_CONTEXT <> CAST('DONotProcess' AS varbinary(128)) OR c.SYS_CHANGE_CONTEXT IS NULL ); GO
Results 7: Show that the updates done with CHANGE_TRACKING_CONTEXT set to 'DONotProcess' are excluded.
Step 8 - Show CHANGETABLE (VERSION) Function
The final example shows the use of the CHANGETABLE (VERSION) function which returns all rows from a Change Tracking enabled table and shows the change version number for Inserts and Updates and the SYS_CHANGE_CONTEXT when used. For this we use Cross Apply, the change table name and the primary key column.
--Retrieve All rows with CROSS APPLY Join to the CHANGETABLE (VERSION) Function, returning the row VERSION number and the SYS_CHANGE_CONTEXT if used for updates. SELECT c.*, CT.*, CAST(ct.SYS_CHANGE_CONTEXT as varchar(255)) as 'SYS_CHANGE_CONTEXT' FROM SalesLT.Customer AS c CROSS APPLY CHANGETABLE(VERSION SalesLT.Customer, (CustomerID), (c.CustomerID)) AS CT Order by 1
Results 8: Show the SYS_CHANGE_VERSION and SYS_CHANGE_CONTEXT for each of the change in this demo.
Change Tracking is a nice light weight feature that is more accurate than relying on a last modified date time column to tell when data changes. It can be turned on and implemented across all tables without having to modify your schema. Using Change_Tracking_Context provides the only way I know to mark records in a dummy update to identify them later without changing any data values. If you use Change Tracking in any other way, please leave a comment and share your experience. Also, remember that these functions are affected by the Change Tracking retention or Auto Clean up setting.
- Review Microsoft Documents on Change Tracking
- Read about Using Change Tracing in SQL Server 2008
- Find other articles on Change Tracking – MSSQLTips Search
- Look for an upcoming tip on Change Tracking column updates
About the author
View all my tips
Article Last Updated: 2020-08-05