Handling Data Consistency Errors in SQL Server Transactional Replication

By:   |   Comments (6)   |   Related: > Replication


Problem

Recently at a client assessment, the chief complaint that every time a "Violation Constraint, Cannot insert duplicate key error...." occurs replication breaks and this causes significant downtime on their production environment while they need to setup replication again, reinitialize and resynchronize. Unfortunately with transactional replication, this is totally normal behavior.

The common data consistency errors that can occur are:

  • 2601 Cannot insert duplicate key row in object '%.*ls' with unique index '%.*ls'
  • 20598 The row was not found at the Subscriber when applying the replicated command.
  • 2627 Violation of PRIMARY KEY constraint 'PK__A'. Cannot insert duplicate key in object 'dbo.A'.

Since real-time data replication is synchronous, it is an all-or-none proposition, meaning that in order to guarantee transactional consistency and atomicity, what ever transactions are committed to the publisher, must be committed to the subscriber. For example, a write operation (INSERT, UPDATE, DELETE) either completes on both sides or not at all. These are not considered complete until acknowledgement from the subscriber. Therefore, when there is a data consistency error, and the replication engine cannot guarantee consistency - the distribution agent stops.

Here is a typical violation of a Primary Key constraint error as shown by Replication Monitor:

violation of a primary key constraint error

Ultimately, you must investigate further as to the cause of these consistency errors, and fix the problem. However, there are some options that can allow Replication to continue while you find out the root cause which we will cover in this tip.

Solution

By default, when the Distribution Agent encounters any of the above-mentioned errors, the agent stops. However, there are some workarounds, that will prevent these consistency errors from interfering with replication and let it continue running. SQL Server will log these errors, which is very important, so you can come back to them, understand what the error condition is, why it is occurring, and resolve the issue.

There are some caveats of course, and it is recommended to use the workaround cautiously, as it is better to know why the error occurred or why the specific transaction needs to be skipped rather than resolved. We'll talk about some of these scenarios later on.


Skipping These Errors

To have Replication "ignore" these errors, Microsoft provides us with a set of predefined replication agent files, that are installed on the Distributor. The one we are discussing is the Distribution Agent Profile, which defines parameters in the profiles for the Distribution Agent. Here we can find the -SkipErrors parameter that will skip errors 2601, 2627, and 20598 as described above.

To change the default profile that the Distributor is using, you can access the Agent Profiles dialog box from Replication Monitor.

distribution agent profile

Once Replication Monitor is launched, you'll then need to drill-down to the publisher under "My Publishers", select the appropriate Publication, right-click and select "Agent Profiles" as shown here:

launch replication monitor

When the Agent Profiles window is displayed, make sure the "Distribution Agents" page is selected. Then in the Agent profiles section, click and select the "Default for New" checkbox for "Continue on data consistency errors".

make sure the distribution agents page is selected

You can click on the ellipsis button next to this profile, to view its parameters. The properties dialog box will open, and if you scroll down a bit, you will see the -SkipErrors Parameter, as highlighted below:

skip errors parameter

Finally, click , and then to set the new default profile for the Distribution Agent.


Enabling the Profile Changes

Before the new profile takes affect, you MUST restart the Distribution Agent. There are two ways to do this. I think this confuses most people, because there are various replication agents, but none specifically say "Distribution Agent".

  1. You can drill-down in SSMS to your Replication Folder --> Local Subscriptions
  2. Select your subscriber, and right click "View Synchronization Status" You will see the START/STOP buttons. This is your distribution agent.
  3. Click . A message will prompt you, "Are you sure you want to stop synchronizing? Data that has already been synchronized will not be rolled back"
  4. Click .
  5. Once the agent is stopped, then click to start synchronizing again.

drill-down in ssms to yor replication folder

To restart the distribution agent in T-SQL, you can run the following commands from a SQL Query Window, and must provide the specified parameter info:

-- To STOP the Distribution Agent:
sp_MSstopdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db
 
--To START the Distribution Agent:
sp_MSstartdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db

The Continue on Data Consistency Errors profile is now active.


Reasons Why Consistency Errors Occur in Replication

Now that we have told Replication to "skip" these errors, and carry on, this band-aid approach still needs to be looked at more closely. We need to answer, why are these errors occurring? Here are some of the reasons why:

  • Your subscriber should be "read-only". In other words, the only thing that should be writing to your Subscriber, is the Publisher. You need to check if you are doing ad-hoc updates, inserts, deletes of your own. (Applies to 1-way transactional replication)
  • Check to see if you have any triggers enabled at the subscriber. If triggers are fired, there could be INSERTS, UPDATES and DELETES putting duplicate rows in a table(s).
  • Check if you are replicating identity columns which can cause "duplicate key" errors, and primary key collisions.
  • Deferred Updates - UPDATE Statements May be Replicated as DELETE/INSERT Pairs - can cause constraint errors.
  • Make sure the publisher and subscriber are in-sync - Run Validations, or any data compare tool between the two databases.

You can view the logged consistency errors by executing the following query against the distribution database:

Use Distribution
go

select * from dbo.MSrepl_errors
where error_code in ('2601','2627','25098')

Important information to help troubleshoot, such as the time, error details, and the xact_seqno, resides in this table. You will see similar output to this:

select from dbo.msrepl_errors

In my next article on this topic, I will expand and provide ways of actually resolving these consistency errors, and ensure that your publisher and subscriber are in perfect harmony.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Robert Pearl Robert Pearl is a SQL MVP, and President of Pearl Knowledge Solutions, Inc., offering SQLCentric monitoring and DBA services.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Friday, December 26, 2014 - 12:36:13 AM - Pejman Back To Top (35764)

Hi Rob,

 

Could you please mention the next step and articles as pair to this?

 

Regards,

Pejman

 


Thursday, October 2, 2014 - 1:01:57 AM - Bill Back To Top (34815)

Be careful: Skipping errors can cause additional replication errors to occur. Under certain circumstances (which are perhaps best understood by the database's and the client DML's designers), the number of new replication errors (caused by running -skiperrors) can be greater than the number of replication errors that were skipped. Those who use -skiperrors can thus end up causing a even more critical need to reinitialize a subscription. There are some systems where a reinitialization is considered onerous, so when -skiperrors is used, the need to reinitialize the subscriber can be increased.... Use skiperrors carefully -  do not blindly try it. You must know why you are using -skiperrors, and you must understand the consequences:). It may be better to validate first.


Wednesday, May 8, 2013 - 6:16:41 AM - Srikanth Back To Top (23803)

**********Hello Robert,

Thanks for providing such a good article.

It helped me to some extent but we require a permanant solution to reduce these kind of errors.

Please post an article on the permanat solution.

 

Regards

Srikanth********

 

 

 

 


Thursday, May 10, 2012 - 11:00:12 AM - deepak Back To Top (17400)

 

Hi Robert,

This is indeed a good article. Please write some articles on all the 3 consistency error. We are eager to hear something on these from your side.\\

Thanks

Deepak


Monday, August 8, 2011 - 10:46:55 PM - Abi Chapagai Back To Top (14346)

Good article Robert. This really helped me.


Monday, August 8, 2011 - 10:12:17 AM - Mohammed Moinudheen Back To Top (14341)

Very nice article, looking forward for your next tips on replication.















get free sql tips
agree to terms