SQL Server Change Tracking to Optimize Data Refreshes for Reporting

Problem

You’re a developer responsible for maintaining a SQL Server database used to feed data to Power BI. You realize that some tables don’t have a modifiedDate column and some do. Also, you know the data is updated by the application as well as other processes which makes the modifiedDate unreliable. This is because it may not always be updated when data changes. You need a way to track data changes to ensure only changed rows are updated on reports in Power BI.

Solution

SQL Server has a feature called Change Tracking which can be used to keep track of changes in your database regardless of a modifiedDate column. Change Tracking uses a versioning mechanism to keep track of changing rows of selected database tables. We are going to explore how we can setup and use Change Tracking tables for Power BI reports easily and report on just changes that have occurred.

Populating Sample Data

If you don’t have an existing database, you can create a sample database and populate it using the code below.

This code will create a database named “SalesDB” and a table named “Customer”, then populate the table the 10 million rows of sample data to simulate real world data.

CREATE DATABASE SalesDB;
GO
USE SalesDB;
GO
 
SET NOCOUNT ON;
 
CREATE TABLE dbo.Customer (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    FirstName NVARCHAR(50),
    LastName NVARCHAR(50),
    PhoneNumber NVARCHAR(20),
    Email NVARCHAR(100),
    City NVARCHAR(50),
    Status CHAR(1), -- A = Active, I = Inactive (deleted)
    ModifiedDate DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
WITH N AS ( 
SELECT TOP (10000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn FROM sys.all_objects a CROSS JOIN sys.all_objects b ) 
INSERT INTO dbo.Customer (FirstName, LastName, PhoneNumber, Email, City, Status, ModifiedDate) 
SELECT 
   CONCAT('FN', rn), 
   CONCAT('LN', rn), 
   CONCAT('555-', (rn % 10000)),
   CONCAT('user', rn, '@email.com'),
   CASE WHEN rn % 5 = 0 THEN 'Johannesburg'
      WHEN rn % 5 = 1 THEN 'Cape Town' 
      WHEN rn % 5 = 2 THEN 'Durban'
      WHEN rn % 5 = 3 THEN 'Pretoria'
      ELSE 'Bloemfontein' 
   END, 
   'A',  
   DATEADD(MINUTE, -ABS(CHECKSUM(NEWID())) % 300000, SYSUTCDATETIME())  
FROM N; 
GO  

How Change Tracking (CT) works

Change Tracking works by maintaining a small amount of metadata in SQL Server that records which rows were updated, deleted, inserted, and the primary key value of changed rows with their change version number.

It only keeps track of what changed since the last data request without relying on timestamps, detecting deletes and not requiring any database schema changes.

When CT is enabled, this is what SQL Server does:

  1. SQL Server assigns every committed transaction with a numeric version number.
  2. When a row in a tracked table is inserted, updated, or deleted, SQL Server writes a record into an internal CT table that stores:
    • The primary key
    • Version number when the change happens
    • Type of change (insert, update, delete)

Enable Change Tracking on SQL Server

In SSMS, you can enable Change Tracking using the dialog box or open a new query and use the following T-SQL code to enable Change Tracking at the database level and table level.

-- Enable change tracking at DB level 
ALTER DATABASE [YourDBName] 
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 5 DAYS, AUTO_CLEANUP = ON);
 
-- Enable on table 
ALTER TABLE [YourTableName]
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

The change_retention value is used to specify how long change tracking information is kept before it is wiped periodically. You should relate this to what the maximum time is needed for your application to sync to your database.

You can capture the starting version using this query.

DECLARE @LastSyncVersion BIGINT = CHANGE_TRACKING_CURRENT_VERSION(); 

After enabling CT on your table, run the following query to simulate exactly 35,718 rows being changed to match the number of rows affected when querying the indexed table.

UPDATE TOP (35718) dbo.Customer 
SET City = CONCAT('CT_Update_', ABS(CHECKSUM(NEWID()) % 1000)), 
    ModifiedDate = SYSUTCDATETIME();
GO

Then, you can query Change Tracking by only using the keys on your table.

USE SalesDB 
SET STATISTICS IO, TIME ON; 
GO 
 
DECLARE @LastSyncVersion BIGINT = CHANGE_TRACKING_CURRENT_VERSION() - 1;
 
SELECT CT.CustomerID, CT.SYS_CHANGE_VERSION, CT.SYS_CHANGE_OPERATION 
FROM CHANGETABLE(CHANGES dbo.Customer, @LastSyncVersion) AS CT; 
GO

The performance for this is shown below:

MetricChange Tracking results with Full row joinChange Tracking results using keys onlyindexed results
CPU time483 ms32 ms~ 0 ms
Elapsed time1,210 ms744 ms2 002 ms
Logical reads111,848625561
Read-ahead reads~ 411~ 0557
Rows returned35,71835,71835,718

Disabling Change Tracking

To disable Change Tracking on the Database level, you must first start with disabling it on your tables:

ALTER TABLE [YourTableName] DISABLE CHANGE_TRACKING;

After disabling CT on the table level, you can now disable in on the database level:

ALTER DATABASE [YourDBName] SET CHANGE_TRACKING = OFF

Using Change Tracking Data with Power BI

First, you should create a table to keep track of changing versions. This will help with the report we will build below.

USE SalesDB;
GO
 
IF NOT EXISTS (SELECT 1 FROM sys.tables WHERE name = 'ETL_SyncVersion')
BEGIN
    CREATE TABLE dbo.ETL_SyncVersion (
        TableName SYSNAME PRIMARY KEY,
        LastSyncVersion BIGINT NOT NULL
    );
 
    INSERT INTO dbo.ETL_SyncVersion (TableName, LastSyncVersion)
    VALUES ('Customer', 0);
END

Create a Parameterized Change Tracking Function

Now, you can create a function to allow Power BI to dynamically get the LastSyncVersion value from the database.

CREATE OR ALTER FUNCTION dbo.ufn_Customer_Changes(@LastSyncVersion BIGINT) 
RETURNS TABLE 
AS 
RETURN ( SELECT C.CustomerID, C.FirstName, C.Email, CT.SYS_CHANGE_VERSION, CT.SYS_CHANGE_OPERATION FROM CHANGETABLE(CHANGES dbo.Customer, @LastSyncVersion) 
AS CT LEFT JOIN dbo.Customer C ON C.CustomerID = CT.CustomerID ); 
GO

Now to create a view that Power BI can use:

CREATE OR ALTER VIEW dbo.vw_Customer_Current 
AS 
SELECT CustomerID, CustomerName, Email, SYS_CHANGE_VERSION, SYS_CHANGE_OPERATION 
FROM dbo.ufn_Customer_Changes( (SELECT LastSyncVersion FROM dbo.ETL_SyncVersion WHERE TableName = 'Customer') ); 
GO

After connecting Power BI to the database, you can see the database objects we can use:

connecting to dataset sources.

Create a parameter “LasySyncVersionNum” for the vw_Customer_Current function we created earlier. On your Power BI top ribbon, navigate to the “Modeling” tab and on the “Parameters” tab click on the New Parameter button and create a new parameter as shown below:

creating parameter

Then edit your vw_Customer_Current query using the Advanced Editor.

= 
let Source = Sql.Database("DESKTOP-GTCV10B\SQL2025EVA", "SalesDB",
 [ Query = "SELECT * FROM dbo.ufn_Customer_Changes(" & Number.ToText(LastSyncVersionNum) & ")" ] ) 
in 
Source

Then simulate an update for the database table. This will update the top 350 customers data.

UPDATE TOP (350) dbo.Customer 
SET City = CONCAT('MMSQL_CT_Update_101', ABS(CHECKSUM(NEWID()) % 1000)), 
ModifiedDate = SYSUTCDATETIME(); 
GO

After updating the data, update the ETL_SyncVersion to get the latest version.

DECLARE @NewVersion BIGINT;
SELECT @NewVersion = CHANGE_TRACKING_CURRENT_VERSION();
UPDATE dbo.ETL_SyncVersion SET LastSyncVersion = @NewVersion WHERE TableName = 'Customer';

Filtering Data in Power BI

In the Power BI report we can then use the LastSyncVersion to filter the data based on that version of the data as shown below. As you can see with the counter at the bottom of the report, the version “2” data has only the 350 updated rows.

Example Power BI report.

Why use Change Tracking

This article uses Power BI as the example analytical tool that integrates with easily with SQL Server. The SQL Server will check if there are Views, Stored procedures or filters included in the sent query to narrow down the searches and if your report connects to a Change Tracking enabled table (or a SQL view that uses CHANGETABLE()), the SQL engine automatically filters only rows that have changed since the last tracked version.

To use timestamp filters, you must ensure that the tables you are working with have a reliable timestamp column, and the logic that updates these columns is always correct. This means that you must be in control of all the application sources that write to this database, and you would most probably change your database schema for it to work.

It is an option to use custom change tracking mechanisms, but these mechanisms require a lot of work such as reliable timestamp columns, combinations of triggers, new tables to store tracking information and custom cleanup process, which tells your mechanism how far back it should start deleting metadata.

On the other hand, Change Tracking does not require any changes to the database schema, it works automatically across the entire database. It can detect insert, update and delete activity from the database and it can guarantee you data consistency even if the timestamps are not reliable. Change tracking also comes with an automated cleanup process that is enabled by default, this new cleanup feature allows you do choose how deep and how frequent your cleanups are.

SQL Server Change Tracking Interfaces

This SQL Server feature can be interfaced and integrated with different software tools:

  1. SQL Server: The first and most natural place to use this is natively on your SQL Server using T-SQL code.
  2. Power BI: Power BI does not natively support Change Tracking, but you can write custom SQL views and procedures that can help filter imported data and simulate a form of a more customized incremental refresh system.
  3. SQL Server Integration Services: There is built-in support for Change Tracking using its components and T-SQL queries. This offers a great way for scheduled ETL jobs to only grab changed rows into data warehouses or stores.

Conclusion

The built-in SQL Server change tracking queries can detect which rows of a specified table were altered since the last query. This feature is highly useful for analytics tools, real-time data applications, and general queries. It provides the option to query only the last updated version of data while keeping the history unchanged.

Change Tracking works by tracking row keys, rather than relying solely on timestamps, making it ideal when you want to identify updated or deleted rows without building complex timestamp logic in your application.

Using indexed tables can outperform a Change Tracking setup if it is used effectively, this means that if you have a solid timestamp setup on your database with additional columns to track other changes like deletes and have logic to verify timestamps, using an indexed table can be ideal for you.

The Change Tracking feature is ideal if you do not want to keep track of all the timestamp logic on your applications and using addition table columns to keep track of changes that cannot fully be captured like deletes.

Next Steps

Leave a Reply

Your email address will not be published. Required fields are marked *