Synchronizing SQL Server data using rowversion


By:   |   Updated: 2016-11-09   |   Comments (11)   |   Related: More > Database Administration

Problem

For SQL Server database professionals a common task is synchronizing data between two tables. There are different ways of implementing a solution in SQL Server such as Change Data Capture or Change Data Tracking, replication, using triggers and so on. However, there are some situations when the above solutions are not applicable. For example source and destination tables are located in database systems from different vendors (for example Sybase and SQL Server).

Suppose, we have two tables in two different databases and the data in these tables needs to be fully synchronized. We can compare the data in these tables, update existing rows and add new rows. However, comparing all of the data in the tables is not an effective way and it would be better if we can detect changes in the source table and only apply these changes to the target. This problem can be solved by using the SQL Server rowversion data type.

Solution

In this article we'll look at the rowversion data type and see how it can be used in data synchronization.

The values for a rowversion data type are automatically generated binary numbers which take 8 bytes of storage and are unique within a database. A non-nullable rowversion column is semantically equivalent to a binary(8) column and a nullable rowversion column is semantically equivalent to a varbinary(8) column. Columns with a rowversion data type are generally used for detecting changes in the row. You can have only one rowversion column in a table. When data is inserted or updated in a table that has a rowversion column, the database counter called the database rowversion is incremented. This incremented value is set as the value for the rowversion column for updated or inserted rows.

SQL Server rowversion Overview

Let's create a sample database:

USE master
GO

CREATE DATABASE TestDB
GO
 

We can get the database's current rowversion value with this query:

USE TestDB
GO

SELECT @@DBTS AS CurrentRowversionForTestDB
 

The result is the following:

Result for SELECT @@DBTS AS CurrentRowversionForTestDB

Here are a few things to note.  A timestamp data type is a synonym for rowversion. When we create a table with a column using a rowversion data type using T-SQL and then look at that column in SQL Server Management Studio (SSMS), we can see that the data type of column is timestamp:

USE TestDB
GO

CREATE TABLE TestTableA
(
 ID INT,
 Value INT,
 LastChange ROWVERSION
)
GO
 

with a column using a rowversion data type using T-SQL and then look at that column in SQL Server Management Studio, we can see that the data type of column is timestamp

If we try to add a column with a rowversion data type in SSMS, we will notice that there is no rowversion in the data type list as shown below. The only choice is timestamp.

Data Types in SQL Server

It is possible to create a table with a timestamp column without mentioning the column name and it will be automatically generated:

USE TestDB
GO

CREATE TABLE TestTableB
(
 ID INT,
 Value INT,
 TIMESTAMP
)
GO
 

create a table with a timestamp column without mentioning the column name and it will be automatically generated

However, if we try this with rowversion we get an error:

USE TestDB
GO

CREATE TABLE TestTableC
(
 ID INT,
 Value INT,
 ROWVERSION
)
 

Error - The definition for column ROWVERSION must include a data type.

Microsoft recommends using rowversion instead of timestamp in DDL statements, because the timestamp will be deprecated in future versions.

Example Using SQL Server rowversion

Letís use the above created TestTableA to demonstrate the rowversion data type behavior. We will insert data into TestTableA table and we will see the values in the LastChange column:

USE TestDB
GO

SELECT @@DBTS AS DatabaseRowversion

INSERT INTO TestTableA (ID, Value)
VALUES(1, 2), (2, 2), (3, 3)

SELECT * FROM TestTableA

SELECT @@DBTS AS DatabaseRowversion
 

As we can see, the values have automatically been generated and the last one is the same as the databases current rowversion:

Review the values for TestTableA

Now when we update the row, we can see that the value in the "LastChange" column for that row has changed:

USE TestDB
GO

SELECT @@DBTS AS DatabaseRowversion

SELECT * FROM TestTableA WHERE ID=2

UPDATE TestTableA 
SET Value=5
WHERE ID=2

SELECT * FROM TestTableA WHERE ID=2

SELECT @@DBTS AS DatabaseRowversion
 

when we update the row, we can see that the value in the "LastChange" column for that row has changed

It is also important to note that after an update where the values have not been changed, the rowversion does still change:

USE TestDB
GO

SELECT *
FROM TestTableA
WHERE ID=2 

SELECT @@DBTS AS DatabaseRowversion

UPDATE TestTableA 
SET Value=Value
WHERE ID=2


SELECT *
FROM TestTableA
WHERE ID=2

SELECT @@DBTS AS DatabaseRowversion
 

important to note that after an update where the values have not been changed, the rowversion does still change

Detecting changes and updating target table

Now it's is time to explore data synchronization using the rowversion data type using a simple example. Letís assume that we have two tables - source and target - data on the target table needs to be periodically synchronized with data on the source table (it is assumed that the data in source table does not get deleted):

USE master
GO

--Creating databases
CREATE DATABASE SourceDB
GO

CREATE DATABASE TargetDB
GO


--Creating tables
USE SourceDB
GO

CREATE TABLE SourceTable
(
 ID INT IDENTITY(1,1),
 Value INT,
 LastChange ROWVERSION NOT NULL,
 CONSTRAINT PK_SourceTable_ID PRIMARY KEY CLUSTERED (ID)
)


USE TargetDB
GO

CREATE TABLE TargetTable
(
 ID INT,
 Value INT,
 LastChange BINARY(8) NOT NULL,
 CONSTRAINT PK_TargetTable_ID PRIMARY KEY CLUSTERED (ID)
)
 

We can see that "LastChange" column in the source table is created as a rowversion data type and the corresponding column in the target table is created as a binary(8) data type. The following code inserts data into the source table:

--Inserting data into the source database
USE SourceDB
GO

INSERT INTO SourceTable(Value)
VALUES(1),(2),(3),(4),(5)

SELECT *
FROM SourceTable


SELECT @@DBTS AS SourceDBRowversion

SELECT MAX(LastChange) AS MaxValueOfLastChange FROM SourceTable
 

Data from the SourceTable

Now we will synchronize data in the target and source tables. In order to not compare all of the data in the source and target tables, we will insert only changed data after the last synchronization from the source to the target table. To identify these changes, we will use the data in the "LastChange" column in the following way: first, we will find the maximum value in the "LastChange" column in the target table. After that we will choose only the rows from the source table where the value in the "LastChange" column is greater than the maximum value of the "LastChange" column in the target table and finally we will update the target table by using these rows from the source table. In this way we do not update rows on the target which have not changed in the source table after the last synchronization. We will use the MERGE command to implement this data synchronization process:

USE TargetDB
GO

--Data in TargetTable before update
SELECT *
FROM TargetTable

DECLARE @lastChange AS BINARY(8)

SET @lastChange=(SELECT MAX(LastChange) FROM TargetTable)

--Updating data in TargetTable table, merging by SourceTable table
MERGE TargetTable AS target
USING ( SELECT ID, Value, LastChange
  FROM SourceDB.dbo.SourceTable o
  WHERE  LastChange > ISNULL(@lastChange,0)
  ) AS source 
ON target.ID = source.ID 
WHEN MATCHED AND ISNULL(target.Value,0)<>ISNULL(source.Value,0)
    THEN UPDATE SET target.Value = source.Value, target.LastChange=source.LastChange
WHEN NOT MATCHED 
    THEN INSERT (ID, Value, LastChange)
      VALUES(source.ID, source.Value, source.LastChange)
;

--Data in TargetTable after update
SELECT *
FROM TargetTable
 

After running this command for the first time in our test environment, all of the data in the source table will be inserted into the target table, because the target table is empty:

all of the data in the source table will be inserted into the target table, because the target table is empty

Now letís do some updates in the source table:

--Inserting data into the source database
USE SourceDB
GO

INSERT INTO SourceTable(Value)
VALUES(6),(7),(8),(9),(10)

UPDATE SourceTable
SET Value=55
WHERE ID=5

After that letís run the merge command again:

USE TargetDB
GO

DECLARE @lastChange AS BINARY(8)

SET @lastChange=(SELECT MAX(LastChange) FROM TargetTable)

--Updating data in TargetTable table, merging by SourceTable table
MERGE TargetTable AS target
USING ( SELECT ID, Value, LastChange
  FROM SourceDB.dbo.SourceTable o
  WHERE  LastChange > ISNULL(@lastChange,0)
  ) AS source 
ON target.ID = source.ID 
WHEN MATCHED AND ISNULL(target.Value,0)<>ISNULL(source.Value,0)
    THEN UPDATE SET target.Value = source.Value, target.LastChange=source.LastChange
WHEN NOT MATCHED 
    THEN INSERT (ID, Value, LastChange)
      VALUES(source.ID, source.Value, source.LastChange)
;
 

We will see that all changes made in source table are applied to the target and now the SourceTable and TargetTable tables are fully synchronized:

USE SourceDB
GO

SELECT * FROM SourceTable

USE TargetDB
GO

SELECT * FROM TargetTable
 

all changes made in source table are applied to the target and now the SourceTable and TargetTable tables are fully synchronized

Conclusion

To automate this data synchronization process you can create a synchronization script, stored procedure or SSIS package that runs on a predefined schedule.

All in all, the rowversion (timestamp) data type is very useful to detect data changes. They are synonymous data types, however as we can see in this tip, there are some differences between choosing one of them during implementation. Also, Microsoft recommends to use rowversion instead of timestamp, because it will be deprecated in future versions. Using the rowversion data type to track changes can be the best solution, especially when other solutions are not applicable to the given task. This method is often used in loading data warehouses.

Next Steps

Review this related information:



Last Updated: 2016-11-09


get scripts

next tip button



About the author
MSSQLTips author Sergey Gigoyan Sergey Gigoyan is a database professional with more than 10 years of experience, with a focus on database design, development, performance tuning, optimization, high availability, BI and DW design.

View all my tips
Related Resources




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.





Thursday, August 01, 2019 - 8:15:44 AM - Mi casa Back To Top

Good article and good explanation and good code.

This is very good for new or updated records that happen with records that are deleted, how do you synchronize them?


Tuesday, September 11, 2018 - 6:56:54 AM - Sergey Gigoyan Back To Top

Hi Amir,

Two-way synchronization can be a bit complicated as compared to one-way and the approaches to implementing it can be quite different depending on your synchronization criteria.

Thanks,

Sergey


Sunday, September 09, 2018 - 11:49:30 AM - Amir Back To Top

Hi Sergey,

Thank you for your good and clean code sample.

Your example demonstrates a one-way synchronization. However, in my case, both tables can be changed and need to be synched. The solution that comes to my mind is to have two columns in each table, one binary and one rowversion, and treat each table as both source and destination. Is it the right solution?

thanks


Thursday, August 16, 2018 - 11:40:37 AM - Anil K Javali Back To Top

Hey Sergey,

rowversion is getting incremented for me unlike you are showing in the example above. any thoughts? any flags to enable?

print @@dbts

update one set two = two where id = 11

print @@dbts

result

----

0x000000000008AD0D

(1 row(s) affected)

0x000000000008AD0E


Tuesday, October 03, 2017 - 11:21:18 AM - ZigoZago Back To Top

 Hi Sergey,i want use your example for a small project.I need to have 2 table equal in two different PC.Can i use your example? Do you have some helo for me.Thanks.Regards.

 


Wednesday, November 23, 2016 - 12:31:10 PM - Sergey Gigoyan Back To Top

Hi rainmaker,

Using rowversion for data synchronization is more convenient than using datetime. For example, every time when row is updated, the value in rowversion column is incremented automatically, but in case of using datetime, the developer needs to manually update the last date for change. Also, table can be updated directly (not from code), and it's more likely that one may forget to change the value in that 'LastChange' column. However in case of rowversion we do not need to care about updating it, it will be done automatically.
 
Thanks

Tuesday, November 22, 2016 - 9:56:58 PM - rainmaker Back To Top

Sir,

   I have a question. If i use datetime to replace rowversion, still achieve the same result.

What's the advantage to use rowversion? 

Thanks.


Friday, November 11, 2016 - 9:33:50 AM - sushil agarwal (bansal) Back To Top

sir,

for existing table/ database adding in target table is giving errro

IF COL_LENGTH('['[email protected]+'].['[email protected]+']','LastChange') IS NULL     

begin

      EXEC ('ALTER TABLE [' + @schemaname + '].[' + @tablename +

            '] ADD LastChange BINARY(8) not NULL ')

end 

teherefore i think here in while creating target table lastchange binary(8) null would be a good choice.

sorry , i am not an expert but a newbe , i was just trying the solution as it appeared great for data importing and faced the problem 

so for benifit of all it is my suggestion

eprts like you can still provide better option and suggetions


Friday, November 11, 2016 - 6:41:18 AM - Ruslan K. Back To Top

Hi, Sergey

Sorry, I forget to mention that my example is for database where READ_COMMITTED_SNAPSHOT is ON.


Thursday, November 10, 2016 - 11:13:55 AM - Sergey Gigoyan Back To Top

Hi Ruslan,

While data is not committed in source, target table can not be updated,
because we read data from source to do an update. MERGE command waits until changes are committed in source and then reads data for update. Please execute the following command :
 
--Inserting data into the source database
USE SourceDB
GO
 
 
INSERT INTO SourceTable(Value)
VALUES(6),(7),(8),(9),(10)
 
BEGIN TRANSACTION 
 
UPDATE SourceTable
SET Value=55
WHERE ID=5
 
After that, in another query window, execute the MERGE command:
 
USE TargetDB
GO
 
DECLARE @lastChange AS BINARY(8)
 
SET @lastChange=(SELECT MAX(LastChange) FROM TargetTable)
 
--Updating data in TargetTable table, merging by SourceTable table
MERGE TargetTable AS target
USING ( SELECT ID, Value, LastChange
  FROM SourceDB.dbo.SourceTable o
  WHERE  LastChange > ISNULL(@lastChange,0)
  ) AS source 
ON target.ID = source.ID 
WHEN MATCHED AND ISNULL(target.Value,0)<>ISNULL(source.Value,0)
    THEN UPDATE SET target.Value = source.Value, target.LastChange=source.LastChange
WHEN NOT MATCHED 
    THEN INSERT (ID, Value, LastChange)
      VALUES(source.ID, source.Value, source.LastChange)
;
 
You will see, that the second query waits for the first transaction to be committed.
Now execute COMMIT command in the first query window to commit the transaction.
After completion of the MERGE command tables will be fully synchronized.
 
Thanks
 
 

Thursday, November 10, 2016 - 2:15:44 AM - Ruslan K. Back To Top

Rowversion generated when INSERT/UPDATE statement proceed, not when commited. For example:

1 0x000000000001  commited
2 0x000000000002  not commited yet
3 0x000000000003  commited
4 0x000000000004  commited


So your predicate LastChange > ISNULL(@lastChange,0), where @lastChange = 0x000000000001 will read only 3 rows instead of 4 rows, and after data synchronizated @lastChange become to 0x000000000004. So second row NEVER will be synchronized.



download

























get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools