Find SQL Server Downtime Window After Reboot

By:   |   Comments (9)   |   Related: > High Availability


Problem

Each time we get a production SQL Server reboot incident, we need to investigate and prepare an "Incident Report" in which the DBA Team must report the duration that the SQL Server service was unavailable.  Is there an easy way to determine the downtime window after a reboot?  Check out this tip to learn more.

Solution

First, for a SQL Server service restart notification there are two ways (assuming Database Mail is setup correctly) to determine the downtime:

  • Send a notification via a startup stored procedure, see sp_procoption

  • Send a notification via a SQL Server Agent Job, whose schedule is set to start when the SQL Server Agent service starts. Here we assume SQL Server Agent is set to auto-start, i.e. when the SQL Server service starts, the agent service starts too.

Second, to determine the SQL Server service downtime window there are two ways to determine the downtime as well:

  • Finding the time for the last entry in the previous SQL Server Error Log and the first entry of the current SQL Server Error Log
  • Finding the time for the last entry in the previous default SQL Server Trace file and the first entry of the current default Trace file

Note: these two methods may not give accurate information in some scenarios, such as a sudden power outage (assuming there is not a UPS), but in the majority of cases these two methods work fine.  I once had a case whrere a managed clustered node failed over to another node. The whole failover downtime window was only around 14 seconds and since it was so short, nobody noticed this incident except for the DBA team.

SQL Server Error Log Entries Indicating Downtime

Here is what the SQL Server Error Log entries look like:

SQLError_Log

Query the SQL Server Default Trace to Determine Downtime

If we check the previous default trace file using the code below:

-- need to change to the trace file path to your own:
SELECT f.EventClass, f.EventSubClass, te.name, v.SubClass_Name, f.StartTime
FROM fn_trace_gettable('C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Log\log_509.trc', 1) f 
inner join sys.trace_events te
on te.trace_event_id = f.eventclass
inner join sys.trace_subclass_values v
on v.trace_event_id = f.eventclass
and f.eventsubclass = v.subclass_value
where f.eventclass=18
ORDER BY f.STARTTIME DESC

We will get:

trace info when sql service stops

Determining SQL Server Downtime from the Default SQL Server Trace and Email DBAs

I will provide two methods, one is a stored procedure (checking the default trace) to be used as a startup procedure and another a PowerShell script (checking the SQL Server Error Log) to be used in a SQL Server Agent Job which is scheduled to run when the SQL Server Agent service starts. These scripts assume the default SQL Server Trace is not disabled and PowerShell 2.0 is installed.

/*
Created: Jun 1, 2014 by Jeffrey Yao

Function: Send email notification about the sql service restart
and report the down time window.
It is applicable to sql server 2005 and above version only.

Assumption: The database mail is already configured and workable on the sql server instance.
*/

use master
if object_id('dbo.uspRestartAlert', 'P') is not null
 drop proc dbo.uspRestartInfo;
go
create proc dbo.uspRestartAlert
as
begin
 set nocount on;
 declare @curr_trc_file varchar(256), @prev_trc_file varchar(256);
 declare @trc_file_name varchar(128), @trc_file_folder varchar(256);
 declare @trc_num int, @slash_pos int;
 declare @stopTime datetime, @startTime datetime;
 declare @msg varchar(500), @subject varchar(100), @crlf char(2);
 declare @recipients varchar(100), @sqlcmd varchar(max);

 set @recipients = '<your email account>'; -- change to the proper account

 set @crlf = nchar(0x0d) + nchar(0x0a);
 select @curr_trc_file = '', @msg='';

 -- check downtime window details
 select @curr_trc_file = [path] from sys.traces
 where id =1 and status =1;

 set @subject= 'Server ' + quotename(@@servername, '[]') + ' has restarted';

 if @curr_trc_file <> ''
 begin
 -- the following is to try to get the number in the current default trace file name,
 -- default trace file has a naming convention like <path>\log_<num>.trc
  set @curr_trc_file = REVERSE(@curr_trc_file);
  set @slash_pos = CHARINDEX('\', @curr_trc_file);
  set @trc_file_folder = reverse(SUBSTRING(@curr_trc_file, @slash_pos, 256));
  set @trc_file_name = reverse(SUBSTRING(@curr_trc_file, 1, @slash_pos-1));
 
  set @trc_num = cast(SUBSTRING(@trc_file_name, 5, len(@trc_file_name)-8) as int) -- 8 = length of "log_" plus ".trc"

  set @curr_trc_file = REVERSE(@curr_trc_file);
  set @prev_trc_file = @trc_file_folder + 'log_' + CAST((@trc_num-1) as varchar(12)) + '.trc'
  select @stopTime=max(starttime) from fn_trace_gettable(@prev_trc_file, 1) -- get the last StartTime of the previous trace 
  select @startTime=min(starttime) from fn_trace_gettable(@curr_trc_file, 1) -- get the first StartTime of the current trace

  set @msg =  'The downtime window is from ' + CONVERT(varchar(30), @stopTime, 120) + ' to ' +  CONVERT(varchar(30), @startTime, 120) + ';  ' + @crlf + @crlf
  + 'The total down time duration is: ['
  + cast(DATEDIFF(second, @stopTime, @startTime)/60 as varchar(20)) + ' minutes, '  
  + cast(DATEDIFF(second, @stopTime, @startTime)%60 as varchar(20)) + ' seconds]';
  exec msdb.dbo.sp_send_dbmail @recipients = @recipients, @subject = @subject, @body = @msg--, @body_format='HTML'
 end  
 else -- there is no trace file
  exec msdb.dbo.sp_send_dbmail @recipients = @recipients, @subject = @subject, @body = 'Please check';
end
go

if exists (select * from sys.procedures where name='uspRestartAlert')
 exec sp_procoption @ProcName='dbo.uspRestartAlert', @OptionName='startup', @OptionValue='on';
go

List_1 - t-sql Code

The notification email will be similar to the following:

The down time window is from 2014-06-03 16:50:37 to 2014-06-03 17:09:05;
The total down time duration is: [18 minutes, 28 seconds]

Use PowerShell to Determine SQL Server Downtime

Next is the PowerShell script.  In the script there is additional information about who initiated the shutdown or reboot of the computer. This additional information shows the advantage of the PowerShell script over the T-SQL approach because PowerShell can easily access the Windows System Log as well as the SQL Server Error Log, while the T-SQL solution generally can only access the SQL Server Error Log.

add-type -AssemblyName "microsoft.sqlserver.smo, version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91";#version should be changed according to your installed sql version, sql2008=10.0.0.0
 
$info=get-eventlog -logname System -entrytype information -source User32 -computer . -after (get-date).addhours(-12) -message "The process * has initiated * on behalf of *"  | select timegenerated, message -first 1
 
if ($info -ne $null)
{
    $info.message -match "on behalf of user (.*?) for"
    [string]$user=$matches[1];
 
    $svr = new-object 'microsoft.sqlserver.management.smo.server' 'SQLInstance_Name' # change <SQLInstance_Name> accordingly
    $curr=$svr.ReadErrorLog(0) | Where-Object {$_.Text -match 'Recovery is complete'} | select LogDate -last 1;
 
    [string]$msg='';
    if ($curr -ne $null)
    {
        $prev = $svr.ReadErrorLog(1) | select LogDate -last 1;
 
        $d=$curr.logdate - $prev.LogDate;
        $msg += "The downtime window is from $(($prev.LogDate).ToString('yyyy-MM-dd HH:mm:ss')) to $(($curr.Logdate).toString('yyyy-MM-dd HH:mm:ss')) `r`n";
        $msg += "Total downtime is $($d.hours) hours, $($d.Minutes) minutes and $($d.Seconds) seconds `r`n`r`n";
    }
 
    $msg +=  $info.Message;
 
    [string]$qry=@"
    exec msdb.dbo.sp_send_dbmail @recipients='<your_email_acct>', @subject='Server is rebooted at $($info.TimeGenerated) by $($user)'
    , @body= '$($msg)';
"@;
 
    $svr.databases['msdb'].ExecuteNonQuery($qry);
   
}

List_2 - PowerShell script

The notification email will be similar to the following:

The downtime window is from 2014-06-03 16:50:37 to 2014-06-03 17:09:30 Total downtime is 0 hours, 18 minutes and 53 seconds

The process C:\WINDOWS\system32\winlogon.exe <computer_name> has initiated the power off of computer <computer_name> on behalf of user <domain_user_name> for the following reason: No title for this reason could be found

Reason Code: 0x500ff

Shutdown Type: power off

Comment:

Next Steps
  • For the auto-start procedure implementation, you just run the T-SQL script in List_1 on your target SQL Server to generate the auto startup procedure.

  • For the SQL Server Agent Job implementation, you can save List_2 as a PS file, for example c:\temp\ReportReboot.ps1, and then, create a new job, with one job step as follows (i.e. powershell.exe -file c:\temp\ReportReboot.ps1), you also need to set up the job schedule type to be "Start automatically when SQL Server Agent Starts" :

    job_step_ps.jpg


You can read a few other tips related to SQL Server startup procedures/jobs here:

Note: The scripts were tested for SQL Server 2008 and SQL Server 2012.



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jeffrey Yao Jeffrey Yao is a senior SQL Server consultant, striving to automate DBA work as much as possible to have more time for family, life and more automation.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Wednesday, June 14, 2023 - 3:36:12 PM - jeff_yao Back To Top (91289)
@avinash, you may need to check whether your database email setup is done correctly. Please first try to send yourself a test email.

Wednesday, June 14, 2023 - 8:47:29 AM - avinash Back To Top (91286)
ExecuteNonQuery" with "1" argument(s): "ExecuteNonQuery failed for Database 'msdb'

Wednesday, June 14, 2023 - 5:32:28 AM - avinash Back To Top (91284)
I have some issue running it on 2019, will it work on 2019?

Friday, March 22, 2019 - 8:15:51 AM - Ayushi Srivastava Back To Top (79369)

Hi,

If I want to publish same data in SSRS Report, then what should be the procedure ?


Friday, September 26, 2014 - 4:01:40 PM - jeff_yao Back To Top (34734)

@ann, yes, as I said that with the PS script, we can find who initiated the shutdown from the notification message

The process C:\WINDOWS\system32\winlogon.exe <computer_name> has initiated the power off of computer <computer_name> on behalf of user <domain_user_name> for the following reason: No title for this reason could be found 

 

In a sudden crash case, usually we do not see such info.

 

Hope this helps.

 

Thanks,

JY


Friday, September 26, 2014 - 12:47:47 PM - ann Back To Top (34733)

Thanks, nice article and tips.

Is there a way to know who shut down the server? or it is system crash?


Wednesday, June 25, 2014 - 3:37:42 PM - jeff_yao Back To Top (32404)

Thanks Nagesh and Sajal for your comments.

Sajal, I am not quite sure about your question "is there any other way we can give reboot window?" In this article, I mentioned two ways, one is to check the default trace, and another is to check sql error log. I actually use both in my different environments and they are automated of course. Honestly, there can be other ways, like if you have a dedicated monitoring server. But the easiest should be the two methods I mentioned above.


Kind regards,

JY 


Wednesday, June 25, 2014 - 2:26:35 PM - Sajal Bagchi Back To Top (32402)

Very nice article Jeffrey. In our enviornment we provide reboot window by manually checking the job schedules. I created a query that gives us the detail for the same. For instance, the time between the two consecutive jobs. Is there any other way we can give reboot window? My motto is to automate the entire process..thanks in advance. 


Wednesday, June 25, 2014 - 2:30:57 AM - nagesh Back To Top (32386)

Very useful article, Thanks alot Jeffrey Yao.



Regards,

Nagesh















get free sql tips
agree to terms