There are often SQL Agent jobs that need to run on multiple servers in a coordinated effort, such as refreshing Development and Test environments from a Production server. This requires that many jobs run on different servers and it requires many different steps. Some of these steps maybe kicking off jobs to delete replication settings, making sure all jobs that need to run are finished, starting jobs that restore databases, etc... Because there are so many jobs and many of these jobs maybe running on different servers it is hard to get an overall status for the entire process. Each job might send an email when it is done, but since there are so many jobs involved it may be difficult to get the overall picture from a bunch of emails. In this tip, I will walk through a process I setup to monitor multiple jobs from multiple servers from one central location.
Before you do anything, you need to setup the server group for servers that you want to query. For this tip, I am going to just setup Local Server Groups, but I highly recommend using Central Management Servers, so that other team members can also share the same groups.
How to Execute a Multi Server Query
First, right click on the group and click "New Query" as shown below.
This should create a connection to your group of servers as shown below. In our example "_MSSQLTIP_SERVER".
Once you are connected, you can copy and paste the script or open the script file from the connection box that you just created and configure it to query the jobs you want to track.
Here is the code that is used for this process.
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
DECLARE @JobList TABLE (Servername varchar(128), Jobname varchar(500), Stage TINYINT);
DECLARE @Stage TINYINT
== Stage Level ==
1 - Part 1 : Put Maintenance Mode Check
2 - Part 2 : Drop Replication Check
3 - Part 3 : Start Restore
4 - Part 4 : Bring Replication Back and full refresh
5 - Part 5 : Bring all online
SET @Stage = 1
INSERT INTO @JobList
SELECT 'MyServer1\SQLA', 'Test_OSQLFailed', 1
SELECT 'MyServer1\SQLA', 'Test job', 1
SELECT 'MyServer2\SQLB', 'Test job', 2
SELECT 'MyServer2\SQLB', 'syspolicy_purge_history', 2
SELECT 'MyServer3\SQLC', 'Expired subscription clean up', 1
SELECT 'MyServer3\SQLC', 'Test job', 1
--Checking for SQL Server version
IF CONVERT(tinyint,(SUBSTRING(CONVERT(CHAR(1),SERVERPROPERTY('productversion')),1,1))) <> 8
---This is for SQL 2k5 and SQL2k8 servers
SET NOCOUNT ON
SELECT j.name AS job_name,
CASE j.enabled WHEN 1 THEN 'Enabled' Else 'Disabled' END AS job_status,
CASE jh.run_status WHEN 0 THEN 'Error Failed'
WHEN 1 THEN 'Succeeded'
WHEN 2 THEN 'Retry'
WHEN 3 THEN 'Cancelled'
WHEN 4 THEN 'In Progress' ELSE
'Status Unknown' END AS 'last_run_status',
ja.run_requested_date as last_run_date,
+(jh.run_duration * 9 + jh.run_duration % 10000 * 6 + jh.run_duration % 100 * 10) / 216e4,108)
LEFT JOIN msdb.dbo.sysjobhistory jh ON ja.job_history_id = jh.instance_id)
join msdb.dbo.sysjobs_view j on ja.job_id = j.job_id
join @JobList jl on jl.Servername = @@servername and jl.Jobname = j.name and jl.Stage = @Stage
WHERE ja.session_id=(SELECT MAX(session_id) from msdb.dbo.sysjobactivity)
ORDER BY job_name,job_status
--This is for SQL2k servers
SET NOCOUNT ON
DECLARE @SQL VARCHAR(8000)
--Getting information from sp_help_job to a temp table
SELECT job_id,name AS job_name,CASE enabled WHEN 1 THEN ''Enabled'' ELSE ''Disabled'' END AS job_status,
CASE last_run_outcome WHEN 0 THEN ''Error Failed''
WHEN 1 THEN ''Succeeded''
WHEN 2 THEN ''Retry''
WHEN 3 THEN ''Cancelled''
WHEN 4 THEN ''In Progress'' ELSE
''Status Unknown'' END AS last_run_status,
CASE RTRIM(last_run_date) WHEN 0 THEN 19000101 ELSE last_run_date END last_run_date,
CASE RTRIM(last_run_time) WHEN 0 THEN 235959 ELSE last_run_time END last_run_time,
CASE RTRIM(next_run_date) WHEN 0 THEN 19000101 ELSE next_run_date END next_run_date,
CASE RTRIM(next_run_time) WHEN 0 THEN 235959 ELSE next_run_time END next_run_time,
last_run_date AS lrd, last_run_time AS lrt
FROM OPENROWSET(''sqloledb'', ''server=(local);trusted_connection=yes'', ''set fmtonly off exec msdb.dbo.sp_help_job'')'
--Merging run date & time format, adding run duration and adding step description
+(jd.last_run_time * 9 + jd.last_run_time % 10000 * 6 + jd.last_run_time % 100 * 10)
/ 216e4 AS last_run_date,
CONVERT(VARCHAR(10),CONVERT(DATETIME,RTRIM(19000101))+(jh.run_duration * 9
+ jh.run_duration % 10000 * 6 + jh.run_duration % 100 * 10) / 216e4,108) AS run_duration,
CONVERT(DATETIME,RTRIM(jd.next_run_date)) +(jd.next_run_time * 9
+ jd.next_run_time % 10000 * 6 + jd.next_run_time % 100 * 10) / 216e4 AS next_scheduled_run_date
FROM (##jobdetails jd
LEFT JOIN msdb.dbo.sysjobhistory jh ON jd.job_id=jh.job_id AND jd.lrd=jh.run_date AND jd.lrt=jh.run_time
JOIN @JobList jl on jl.Jobname = jd.job_name and jl.Servername = @@servername and jl.Stage = @Stage)
WHERE step_id=0 or step_id is null
ORDER BY jd.job_name,jd.job_status
--dropping the temp table
DROP TABLE ##jobdetails
Let me explain how this code is used.
For this sample code, I defined stages 1 through 5. This is used as a grouping for the various parts of my process. I have multiple jobs that run under each stage, so this helps me narrow down what is returned in the query. Other than that, the stage has no real meaning.
The key part of this process is the jobs that I am inserting into my temp table @JobList.
There are three values that are used:
SQL Server Instance Name - this is the SQL instance that I want to check for the SQL Agent Job status
Job Name - this is the name of the SQL Agent Job that I want to check
Stage - this is an arbitrary grouping of the jobs. You can see how I classified them in the Stage Level comments below
Another key part of this code is to set the @Stage variable. This is basically setting a filter which determines what data is returned in my query. In this case I am setting the @Stage = 1 which will only return the three jobs that have a Stage = 1 as shown below. This example is pretty simple, but if you have 10 or 20+ jobs that run for one process, you can see how this filtering could be very useful.
So when the code is run, as you can see below, it only returns the three jobs that have a Stage = 1.
Notes About the SQL Code
I recommend downloading the script here just in case you have issues with copy the code.
This code has been tested on SQL 2000 to SQL 2008 R2.
Some of the code I've had for a long time and for SQL 2000 I did not write this originally. This is a very smart way to pull information for SQL 2000, so that it works on other versions.
The script has limitations that if a job has only one step and it is still running, the last_run_status will return as "Status Unknown", which you can see in my above example.
The code uses a combination of OPENROWSET and "SET FMTONLY OFF" to make sure this works for both SQL Server 2000 and SQL 2005 / SQL 2008. For more details, I highly recommend reading Using SQL Servers OPENROWSET to break the rules. The rest of the code is just formatting the output to be more readable.
There you have it. Obviously, there are so many other ways to use this script and like someone that I know always says, "The sky's the limit" when it comes to using Multi Server Query.