Learn more about SQL Server tools

   
   















































Troubleshooting Slow SQL Server Replication issue due to Distributor Database Growth

MSSQLTips author Kun Lee By:   |   Read Comments (10)   |   Related Tips: More > Replication

Problem
My company has a very replication intensive environment (mostly transactional replication) and recently I started to see that replicating data was taking longer and longer.  In this tip I will show you some of the techniques that I used to troubleshoot the issue as well as what the final resolution was.

Solution
In addition to providing the solution, let me provide the symptoms to my issue and provide how I started troubleshooting the issue.


Basic Symptoms

Here are list of symptoms that I had.

  • Taking longer to replicate data.
  • High CPU on Distributor server
  • High Disk IO on Distributor server
  • High growth rate on Distribution database

Troubleshooting High CPU and High Disk IO

First, I wanted to know why I was getting high CPU and high Disk IO on the distributor server, so I setup Performance Monitor (also known as perfmon) on the server. You can read this article "Creating SQL Server performance based reports using Excel" on how I setup perfmon to collect data and how I found high CPU/Disk IO for about 7~8min every 10min. That means, we had a job running every 10min with a duration of 7~8min. 

I was able to found out there are two jobs that are causing the issue and they were related to replication clean up.

  • Agent history clean up: distribution
  • Distribution clean up: distribution

So, I disabled these two jobs and ran them manually to see if that was causing the high CPU/Disk IO and I was able to find out the "Distribution clean up: distribution" job that runs "EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 48" stored procedure was causing the problem and when I ran it manually, it ran for more than 7min and returned the below message.

Removed 0 replicated transactions consisting of 0 statements in 451 seconds (0 rows/sec).

 


Troubleshooting Distribution cleanup issue

From previous research, I was able to find out why we were having high CPU issues and why the distributor database was getting bigger. The issue was that the clean up job was not removing the commands and I was able to see that table "dbo.MSrepl_commands" was 27GB and kept growing everyday.

So, I took a look at the Distributor Properties and I confirmed that retention was set correctly as you can see below.

 


Fixing Retention days for Each Publication

The next step is to check the retention days for each publication and that is where I found the problem. Many of our publication properties subscription expiration setup was set to "never expire" as you can see from the screen capture below.

 

Now, I can open each publication to see how it is setup, but as I mentioned we have many publications so I didn't want to waste my time opening each publication using SQL Server Management Studio.  So instead,  I used the query that I wrote in this tip "SQL Server Replication Scripts to Pull Replication Configuration Information" and added one more column to the script to pull retention days as shown below.

set transaction isolation level read uncommitted
select distinct 
srv.srvname publication_server 
, a.publisher_db
, p.publication publication_name
, p.retention
, ss.srvname subscription_server
, s.subscriber_db
from MSArticles a 
join MSpublications p on a.publication_id = p.publication_id
join MSsubscriptions s on p.publication_id = s.publication_id
join master..sysservers ss on s.subscriber_id = ss.srvid
join master..sysservers srv on srv.srvid = p.publisher_id
join MSdistribution_agents da on da.publisher_id = p.publisher_id 
and da.subscriber_id = s.subscriber_id
ORDER BY p.retention 

 

 


Confirmation

After I identified and changed the retention days to 120 hours and ran the above job, the distributor database space used was reduced from 27GB to 300MB.  After that I shrunk the database to 500MB and then I re-indexed all tables in the distribution database and ran the clean up job again. Now, it takes less 3 sec to run the clean up job.  Also, replication as a whole is a lot faster.

Next Steps

  • Setup an Alert to notify if any new publication is created where retention day is set to never expire
  • Implement additional replication monitoring
  • To shrink databases or database files look at DBCC SHRINKDATABASE and DBCC SHRINKFILE


Last Update: 8/25/2009


About the author
MSSQLTips author Kun Lee
Kun Lee is a database administrator and his areas of interest are database administration, architecture, data modeling and development.

View all my tips
Related Resources


print tip Print  
Become a paid author





join MSSQLTips for free SQL Server tips     



Learn more about SQL Server tools
Post a comment or let the author know this tip helped you.

       All comments are reviewed, so stay on subject or we may delete your comment.

*Name   *Email Notify for updates



       Note: your email address is not published. Required fields are marked with an asterisk (*)


Get free SQL tips:

*Enter Code refresh code     



Friday, October 24, 2014 - 5:19:56 PM - Marios Philippopoulos Read The Tip

Hi, thank you for this informative article.

How can we decide how many hrs to set the subscription expiration period to?

What factors would dictate setting it to 5 days (120 hrs) vs., say, 30 days (720 hrs)?

Thanks again,

Marios Philippopoulos

 

 


Sunday, September 08, 2013 - 8:27:30 AM - Muhanned Maayeh Read The Tip

Question for Transactional Replication, do you really set the subscription to expire after 336 or 120 hours.  I thought I read documentation per Microsoft that the recommendation is to never expire but for merge replication it recommends expiration of 336 hours.  Can you please clarify this point?  I have a transactional replication at a customer site and when the Distribution Clean job ran it produced slowness in our system where the job needed to be stopped and disabled.  Can you please advise on the proper course of action with a similar problem as described.  (note, using SQL 2008 R2 engine).


Tuesday, April 30, 2013 - 4:52:44 PM - Kun Lee Read The Tip

Salu,

I assume that you are looking at the Distributor SQL server? I haven't seen that case. Let me research for you but I haven't come accross that issue before.

Kun


Tuesday, April 30, 2013 - 4:52:29 PM - Kun Lee Read The Tip

Vijay,

To make long story short, even if it is simple mode, the log file won't flush out the log until Log Reader reads the data and mark it. You can easily reproduce the situation (please don't do on production db), you can create a publisher with simple recovery mode and setup replication and stop the log reader and start loading data to the db and watch the open transaction and log space by using

- DBCC SQLPERF(LOGSPACE)

- use <db name>

DBCC OPENTRAN

And you will see that. As matter of fact, you can search many issue because of the log reader wasn't mark it soon due to heavy load if you google it..

 


Friday, April 26, 2013 - 5:57:16 PM - Salu Read The Tip

I am not seeing these two jobs on my SQL Server and I have transactional replication set up. Do you know why?

  • Agent history clean up: distribution
  • Distribution clean up: distribution

Thanks


Sunday, August 26, 2012 - 2:40:41 AM - Vijay Read The Tip

In replication ,when Backup is SIMPLE RECOVERY MODE. In simple recovry mode. t-log will be treuncate. then,how wll be working replication when, back up is SIMPLE RECOVERY MODE.

 

Vijay


Friday, May 04, 2012 - 11:58:50 AM - sam Read The Tip

Thanks! Saved the busy day!

 


Thursday, April 12, 2012 - 8:21:25 PM - Daniel Read The Tip

Kun,

Fantasic article.  One of our Distribution DBs grew out of control just today.  Your solution has helped me greatly to resolve this issue and to learn more about how replication works!   

Thank you so much!

 


Monday, May 03, 2010 - 4:43:31 PM - drsloat Read The Tip

That helped me zero in on my problem very quickly.  My cleanup job had been running for 15 minutes.  Thanks for the write up!


Tuesday, November 17, 2009 - 2:49:06 PM - mictiu Read The Tip

Hi,

Thanks for the excellent tips .. but want to understand something more in details. In quite new to replication but from ur tips here .. i still can't get what ur meaning of

  • Setup an Alert to notify if any new publication is created where retention day is set to never expire
  • your retention day is set to never expire here refer to Subscription expiration in publication properties or Transaction Retention in Distribution properties. Sorry to ask such a question but just to make myself clear :) ....

    Can this setting help to prevent my replication to become inactive and need to be reinitialized ? Coz im also having problem where replication didn't run for numbers of days like 3 days ... then it become inactive. Please advice and if can notify me through email .. mictiu@mafrica.com.my




     
    Sponsor Information