By: Kun Lee | Comments (2) | Related: > Replication
Problem
Once in a while, we notice there are huge SQL Server replication delays, but we do not easily notice when “Undistributed Commands” from replication monitor says “0” commands left while there is still a huge delay on the subscriber server(s). The problem was latency on the SQL Server Log Reader Agent. There are many ways to monitor the distribution agent once data is loaded to the distribution database, but I haven’t see any good alerts kick off due to Log Reader Agent Latency. So in this tip we will cover how this can be done.
Solution
Detecting Issue via Replication Monitor
This is how you can check via Replication Monitor. Below is a screenshot of what it looks like when you see the Log Reader Latency causing issues while “Number of commands” shows zero.
This is Log Reader Agent status and you will see “Latency” as 16100 (ms) below.
Solution I – General T-SQL Code to show status
Here is the basic code that you can run on the distribution database. You can also download the code here.
USE distribution go SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED SELECT ma2.publisher_db, mh1.delivery_latency / ( 1000 * 60 ) AS delivery_latency_Minutes, mh1.agent_id , mh1.time, CAST(mh1.comments AS XML) AS comments, CASE mh1.runstatus WHEN 1 THEN 'Start' WHEN 2 THEN 'Succeed.' WHEN 3 THEN 'In progress.' WHEN 4 THEN 'Idle.' WHEN 5 THEN 'Retry.' WHEN 6 THEN 'Fail' END AS Status, mh1.duration, mh1.xact_seqno, mh1.delivered_transactions, mh1.delivered_commands, mh1.average_commands, mh1.delivery_time, mh1.delivery_rate, ma2.name as jobname FROM mslogreader_history mh1 JOIN ( SELECT mh1.agent_id, MAX(mh1.time) as maxtime FROM mslogreader_history mh1 JOIN MSlogreader_agents ma on ma.id = mh1.agent_id GROUP BY mh1.agent_id) AS mh2 ON mh1.agent_id = mh2.agent_id and mh1.time = mh2.maxtime JOIN MSlogreader_agents ma2 on ma2.id = mh2.agent_id ORDER BY mh1.delivery_latency desc
NOTE: Before you drill down, please make sure to watch the “comments”. If it says “No replicated transactions are available”, you are not having Log Reader Latency issues despite the “delivery_latency_Minutes” says otherwise. I noticed that “delivery_latency_Minutes” takes time to reset back to 0. Below is an example of the screenshot you can ignore.
Here is what it looks like when you have a potential issue.
Solution II – Sample code for setting up a SQL Server Agent Job
Here is an example code you can utilize the code. Basically, the code has two parts. I am doing a different, more complicated way, but this is a simple way to show how you can easily apply the code to your environment.
- Part 1: Generate uspGetLogReaderAgentStatus
- Part 2: Code to setup a SQL Server Agent Job
The area that you need to watch is the threshold parameter. You can change the value based on your environment needs. Also you can see I filtered out “No replicated transactions are available” for alerts.
/******************************************************************************/ DECLARE @AlertThresholdinMin INT = 10 -- If more than X min, delay, get alerts /******************************************************************************/ SET @AlertThresholdinMin = @AlertThresholdinMin * 1000 * 60 SELECT cast(ma2.publisher_db as varchar(32)) as dbname, cast(mh1.delivery_latency / ( 1000 * 60 ) as int) AS delivery_latency_Minutes, mh1.time INTO ##tmpLogReaderLatencyStatus FROM mslogreader_history mh1 JOIN ( SELECT mh1.agent_id, MAX(mh1.time) as maxtime FROM mslogreader_history mh1 JOIN MSlogreader_agents ma on ma.id = mh1.agent_id GROUP BY mh1.agent_id) AS mh2 ON mh1.agent_id = mh2.agent_id and mh1.time = mh2.maxtime JOIN MSlogreader_agents ma2 on ma2.id = mh2.agent_id WHERE mh1.delivery_latency > @AlertThresholdinMin and mh1.comments not like '%No replicated transactions are available.%'
That’s it. Please feel free to change the code to make it work for your environment. Also, I highly recommend reading the below two articles. You will see more issues on distribution agent delay then Log Reader Agent delay and fully understanding replication latency is critical for replication monitoring in general.
- Troubleshooting transactional replication latency issues in SQL Server by Susantha Bathige
- Determine data latency between Publisher and Subscriber in SQL Server Transactional Replication by Mohammed Moinudheen
Next Steps
- It always better to send notifications with a nice formatted body from the query result.
- I would collect the latency duration and save as history and setup a reporting server to pull last week, month or yesterday's data for trending analysis. This will be good information if you want to optimize the log reader process like putting the log file on faster disk (SSD), removing some high transaction tables to a separate database, etc.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips