Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Scripts          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Tracking Down SQL Server Replication Distribution Subsystem Agent (null) Failed Errors


By:   |   Last Updated: 2012-06-05   |   Comments (6)   |   Related Tips: More > Replication

Problem

We have a very complicated replication environment and we have a lot of subscribers as well as distributors to support the environment. Once in a while, we get replication errors like below from our distributor server via replication alerts when we didn't drop a replication component cleanly. Since we have so many subscribers and distributors, it is not that easy to find where it is coming from, especially when there is no agent name. Keep in mind that this gets even harder when you are using "Pull Transactional Replication", since you need to track down the server where the job is being run. For push replication, you can find the orphan job on the distributor, but even this requires detective work.

Here is the error message:

find the orphan job from the distributor

Typically, if you use the below query you get the information that you need most of the time, but once in a while you don't get what is needed.

use distribution
SELECT TOP 5 *
FROM msdb.dbo.sysreplicationalerts
WHERE alert_error_text like 'Replication-Replication Distribution Subsystem: agent (null) failed%'
ORDER BY time desc
Solution

There are many ways to troubleshoot this, like using the Central Management Servers feature and writing a multi-server query to get all failed jobs. We can also try to query the MSpublications table, but often that information doesn't have what is needed. So, I used Profiler to track where the error is coming from and I'd like to share what Events and Filters I used to track it down. I also included the Profiler template file for you to download and import at the end of article.

Setup SQL Profiler

For Events you need to add the below events:

Errors and Warnings

  • EventClass: User Error Message
  • ErrorLog
  • EventLog

Store Procedures

  • RPC:Completed

For columns you need to add "HostName"

Here is what the Trace Properties look like.

Here is how the Trace Properties looks like

By the way, I use "Organize Columns" to move "HostName" up in the list as shown below.

I use "Organize Columns" to move "HostName" to be up like below

Modify "Column Filters..." to add "ApplicationName: Replication Distribution History" under "ApplicationName" like below.

add "ApplicationName: Replication Distribution History" under "ApplicationName" like below

Profiler Output

Here is a sample output from the trace events that were set. (Note I masked the HostName to hide this data from my server)

Profiler Output

*Note: defending how busy your distributor is, you may get more than what you need, but my recommendation is to set it up this way to get more data and later use the "Find" feature to search for words like "Replication Distribution Subsystem: agent (null) failed".

Steps to Track Down Failing Replication Component

First, you need to pay attention to "HostName" and get the server name where it comes from.

Next, if you click on the "RPC:Completed" line, you will see something like this.

exec sp_MShelp_distribution_agentid 6,N'<Publisher dbname>',N'<publisher name>',10,N'<Subscriber db name>',1

So, based on this command you can collect the below information:

  • Publisher DB name
  • Subscriber DB name
  • Publication name
  • Host Name

Then use the above information to connect to the server provided by the host name.  You can then find the job on that server and delete the unneeded jobs.

Next Steps
  • Download the Profiler template based on the Events that were used in this tip
  • You can setup alerts to provide the above information automatically.
  • You can write a script to drop and clean up replication in an automated why, so we can eliminate any human errors.
  • Stay tuned for this article "Replication - Clean up Orphan Publisher on Distributor Database" for more techniques on how to clean up orphaned replication components.


Last Updated: 2012-06-05


get scripts

next tip button



About the author
MSSQLTips author Kun Lee Kun Lee is a database administrator and his areas of interest are database administration, architecture, data modeling and development.

View all my tips
Related Resources




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Wednesday, May 28, 2014 - 12:25:14 PM - Bhanu Back To Top

Good Job Kun.  Very clear and straight forward explanation.

 

Keep it up Buddy.


Sunday, September 22, 2013 - 10:56:32 AM - Kun Lee Back To Top

Thanks Dominic!


Thursday, September 19, 2013 - 9:59:12 PM - Dominic Back To Top

Brilliant well written and concise article. Look forward to reading more from you Kun.


Tuesday, September 10, 2013 - 3:56:55 PM - Kun Lee Back To Top

Thanks Andy and Ant!


Tuesday, September 10, 2013 - 3:04:36 AM - Ant Back To Top

Well written article. Helped me resolve the issue in no time.


Friday, May 24, 2013 - 4:38:20 AM - Andy Back To Top

Fantastic Article found my problem in 2 mins


Learn more about SQL Server tools