Monitoring Jobs on Multiple Servers by Using Multi Server Query
There are often SQL Agent jobs that need to run on multiple servers in a coordinated effort, such as refreshing Development and Test environments from a Production server. This requires that many jobs run on different servers and it requires many different steps. Some of these steps maybe kicking off jobs to delete replication settings, making sure all jobs that need to run are finished, starting jobs that restore databases, etc... Because there are so many jobs and many of these jobs maybe running on different servers it is hard to get an overall status for the entire process. Each job might send an email when it is done, but since there are so many jobs involved it may be difficult to get the overall picture from a bunch of emails. In this tip, I will walk through a process I setup to monitor multiple jobs from multiple servers from one central location.
There are many different ways to get this type of data, but for this tip I am going to utilize Multi Server Query. For those who are new to Multi Server Query, please see my previous article Execute SQL Server query on multiple servers at the same time for more detail.
Setup a Multi Server Group
Before you do anything, you need to setup the server group for servers that you want to query. For this tip, I am going to just setup Local Server Groups, but I highly recommend using Central Management Servers, so that other team members can also share the same groups.
How to Execute a Multi Server Query
First, right click on the group and click "New Query" as shown below.
This should create a connection to your group of servers as shown below. In our example "_MSSQLTIP_SERVER".
Once you are connected, you can copy and paste the script or open the script file from the connection box that you just created and configure it to query the jobs you want to track.
Here is the code that is used for this process.
SET NOCOUNT ON SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED DECLARE @JobList TABLE (Servername varchar(128), Jobname varchar(500), Stage TINYINT); DECLARE @Stage TINYINT
/* == Stage Level == 1 - Part 1 : Put Maintenance Mode Check 2 - Part 2 : Drop Replication Check 3 - Part 3 : Start Restore 4 - Part 4 : Bring Replication Back and full refresh 5 - Part 5 : Bring all online */
SET @Stage = 1
INSERT INTO @JobList SELECT 'MyServer1\SQLA', 'Test_OSQLFailed', 1 UNION ALL SELECT 'MyServer1\SQLA', 'Test job', 1 UNION ALL SELECT 'MyServer2\SQLB', 'Test job', 2 UNION ALL SELECT 'MyServer2\SQLB', 'syspolicy_purge_history', 2 UNION ALL SELECT 'MyServer3\SQLC', 'Expired subscription clean up', 1 UNION ALL SELECT 'MyServer3\SQLC', 'Test job', 1
--Checking for SQL Server version IF CONVERT(tinyint,(SUBSTRING(CONVERT(CHAR(1),SERVERPROPERTY('productversion')),1,1))) <> 8 BEGIN ---This is for SQL 2k5 and SQL2k8 servers SET NOCOUNT ON SELECT j.name AS job_name, CASE j.enabled WHEN 1 THEN 'Enabled' Else 'Disabled' END AS job_status, CASE jh.run_status WHEN 0 THEN 'Error Failed' WHEN 1 THEN 'Succeeded' WHEN 2 THEN 'Retry' WHEN 3 THEN 'Cancelled' WHEN 4 THEN 'In Progress' ELSE 'Status Unknown' END AS 'last_run_status', ja.run_requested_date as last_run_date, CONVERT(VARCHAR(10),CONVERT(DATETIME,RTRIM(19000101)) +(jh.run_duration * 9 + jh.run_duration % 10000 * 6 + jh.run_duration % 100 * 10) / 216e4,108) AS run_duration, ja.next_scheduled_run_date FROM (msdb.dbo.sysjobactivity ja LEFT JOIN msdb.dbo.sysjobhistory jh ON ja.job_history_id = jh.instance_id) join msdb.dbo.sysjobs_view j on ja.job_id = j.job_id join @JobList jl on jl.Servername = @@servername and jl.Jobname = j.name and jl.Stage = @Stage WHERE ja.session_id=(SELECT MAX(session_id) from msdb.dbo.sysjobactivity) ORDER BY job_name,job_status END ELSE BEGIN --This is for SQL2k servers SET NOCOUNT ON DECLARE @SQL VARCHAR(8000)
--Getting information from sp_help_job to a temp table SET @SQL=' SELECT job_id,name AS job_name,CASE enabled WHEN 1 THEN ''Enabled'' ELSE ''Disabled'' END AS job_status, CASE last_run_outcome WHEN 0 THEN ''Error Failed'' WHEN 1 THEN ''Succeeded'' WHEN 2 THEN ''Retry'' WHEN 3 THEN ''Cancelled'' WHEN 4 THEN ''In Progress'' ELSE ''Status Unknown'' END AS last_run_status, CASE RTRIM(last_run_date) WHEN 0 THEN 19000101 ELSE last_run_date END last_run_date, CASE RTRIM(last_run_time) WHEN 0 THEN 235959 ELSE last_run_time END last_run_time, CASE RTRIM(next_run_date) WHEN 0 THEN 19000101 ELSE next_run_date END next_run_date, CASE RTRIM(next_run_time) WHEN 0 THEN 235959 ELSE next_run_time END next_run_time, last_run_date AS lrd, last_run_time AS lrt INTO ##jobdetails FROM OPENROWSET(''sqloledb'', ''server=(local);trusted_connection=yes'', ''set fmtonly off exec msdb.dbo.sp_help_job'')' exec (@SQL)
--Merging run date & time format, adding run duration and adding step description SELECT distinct jd.job_name,jd.job_status,jd.last_run_status, CONVERT(DATETIME,RTRIM(jd.last_run_date)) +(jd.last_run_time * 9 + jd.last_run_time % 10000 * 6 + jd.last_run_time % 100 * 10) / 216e4 AS last_run_date, CONVERT(VARCHAR(10),CONVERT(DATETIME,RTRIM(19000101))+(jh.run_duration * 9 + jh.run_duration % 10000 * 6 + jh.run_duration % 100 * 10) / 216e4,108) AS run_duration, CONVERT(DATETIME,RTRIM(jd.next_run_date)) +(jd.next_run_time * 9 + jd.next_run_time % 10000 * 6 + jd.next_run_time % 100 * 10) / 216e4 AS next_scheduled_run_date FROM (##jobdetails jd LEFT JOIN msdb.dbo.sysjobhistory jh ON jd.job_id=jh.job_id AND jd.lrd=jh.run_date AND jd.lrt=jh.run_time JOIN @JobList jl on jl.Jobname = jd.job_name and jl.Servername = @@servername and jl.Stage = @Stage) WHERE step_id=0 or step_id is null ORDER BY jd.job_name,jd.job_status
--dropping the temp table DROP TABLE ##jobdetails END
Let me explain how this code is used.
For this sample code, I defined stages 1 through 5. This is used as a grouping for the various parts of my process. I have multiple jobs that run under each stage, so this helps me narrow down what is returned in the query. Other than that, the stage has no real meaning.
The key part of this process is the jobs that I am inserting into my temp table @JobList.
There are three values that are used:
- SQL Server Instance Name - this is the SQL instance that I want to check for the SQL Agent Job status
- Job Name - this is the name of the SQL Agent Job that I want to check
- Stage - this is an arbitrary grouping of the jobs. You can see how I classified them in the Stage Level comments below
Another key part of this code is to set the @Stage variable. This is basically setting a filter which determines what data is returned in my query. In this case I am setting the @Stage = 1 which will only return the three jobs that have a Stage = 1 as shown below. This example is pretty simple, but if you have 10 or 20+ jobs that run for one process, you can see how this filtering could be very useful.
So when the code is run, as you can see below, it only returns the three jobs that have a Stage = 1.
Notes About the SQL Code
I recommend downloading the script here just in case you have issues with copy the code.
- This code has been tested on SQL 2000 to SQL 2008 R2.
- Some of the code I've had for a long time and for SQL 2000 I did not write this originally. This is a very smart way to pull information for SQL 2000, so that it works on other versions.
- The script has limitations that if a job has only one step and it is still running, the last_run_status will return as "Status Unknown", which you can see in my above example.
The code uses a combination of OPENROWSET and "SET FMTONLY OFF" to make sure this works for both SQL Server 2000 and SQL 2005 / SQL 2008. For more details, I highly recommend reading Using SQL Servers OPENROWSET to break the rules. The rest of the code is just formatting the output to be more readable.
There you have it. Obviously, there are so many other ways to use this script and like someone that I know always says, "The sky's the limit" when it comes to using Multi Server Query.
- Download the script
- Enhance the query to pull job status information more accurately by using xp_sqlagent_enum_jobs
- Enhance the query to include the steps if there are multiple steps that run
- Enhance the query to include the job log information
Last Updated: 2012-07-03
About the author
View all my tips