By: Ben Snaidero | Comments (4) | Related: > Monitoring
Problem
You administer a large number of SQL Server instances and you need a way to automatically be notified of any errors that have occurred on these servers as it's too time consuming to have to login to every machine and check windows event logs and SQL Server error log files manually.
Solution
To solve the above problem we are going to use the system_health Extended Events session. This session is included by default with SQL Server and starts automatically whenever the database engine is started. The link above outlines everything that is configured for monitoring within this session, but for this tip we are just going to look at any errors that are captured.
Gathering errors from system_health session
The data for the system_health session is stored in a ring buffer target that can be queried from the sys.dm_xe_session_targets DMV. The data is actually stored here in XML format so we will also need to flatten this data into a relational format to make it easier for us to filter and query only the data that we are interested in. In case you're not familiar with how to flatten XML data, a good explanation on how to do this can be found here. Below is the query we can use to extract any errors that have occurred in the last 24 hours.
SELECT CAST(xet.target_data as XML) as XMLData INTO #RingBufferData FROM sys.dm_xe_session_targets xet INNER JOIN sys.dm_xe_sessions xe ON (xe.address = xet.event_session_address) WHERE xe.name = 'system_health' SELECT e.query('.').value('(/event/@timestamp)[1]', 'datetime') as "TimeStamp", e.query('.').value('(/event/data/value)[1]', 'int') as "Number", e.query('.').value('(/event/data/value)[2]', 'int') as "Severity", e.query('.').value('(/event/data/value)[3]', 'int') as "State", e.query('.').value('(/event/data/value)[5]', 'varchar(max)') as "Message" FROM #RingBufferData CROSS APPLY XMLData.nodes('/RingBufferTarget/event') AS Event(e) WHERE e.query('.').value('(/event/@name)[1]', 'varchar(255)') = 'error_reported' AND e.query('.').value('(/event/@timestamp)[1]', 'datetime') > GETDATE()-1 DROP TABLE #RingBufferData
Here is a sample output for this query.
TimeStamp | Number | Severity | State | Message |
---|---|---|---|---|
2014-02-03 00:54:22.377 | 17806 | 20 | 2 | SSPI handshake failed with error code 0x80090311 while establishing a connection with integrated security; the connection has been closed. [CLIENT: xxx.xxx.xxx.xxx] |
Scheduling query using PowerShell
Now that we have our query defined we can move on to scheduling this query using PowerShell. In PowerShell 3.0 there are a couple of new cmdlets that we will use to accomplish this task, New-JobTrigger and Register-ScheduledJob. The first cmdlet is used to define the schedule that triggers the job to execute and the second cmdlet contains the job definition. First let's take a look at the script and then I will explain exactly what each step does.
$trigger = New-JobTrigger -Daily -At "8:00 AM" Register-ScheduledJob -Name CheckExtendedEvents_SystemHealth -Trigger $trigger -ScriptBlock ` { Add-PSSnapin SqlServerCmdletSnapin100 $errorlist = Invoke-Sqlcmd -InputFile "C:\checkextendedevent.sql" ` -ServerInstance "SQLSERVER\INSTANCE" ` -U "username" -P "password" | Out-String if ($errorlist) { Send-MailMessage -To "[email protected]" ` -Subject "##ServerName## - system_health extended events error messages" ` -From "[email protected]" -Body $errorlist ` -SmtpServer "smtp.domain.com" } }
The New-JobTrigger cmdlet is pretty self explanatory. We are simply creating a trigger that will fire everyday at 8am. The Register-ScheduledJob is pretty straightforward as well. We are using 3 parameters in this cmdlet. The first one contains the name that we will use to identify this job. The second is the trigger (or really the schedule) that defines when this job will execute. And finally the script block definition which details the cmdlets this job will run.
Let's now take a look at the script block in detail. The first cmdlet, Add-PSSnapin SqlServerCmdletSnapin100, has to be run in order for us to be able to use any of the database engine cmdlets as these are not loaded by default. Once that has been run we can use the Invoke-Sqlcmd cmdlet to run the sql script we created in the first section of this tip. A couple things to note with this cmdlet. First, if you can't use SQL authentication in your environment you can remove the -U and -P parameters and the cmdlet will run using Windows Authentication. Second, since we are connecting to a SQL Server instance this job could be created and run from any machine in your environment provided it can connect to the database instance we are trying to monitor. With that said we take the output of this cmdlet and convert it to a string so we can use the details as an input to our next cmdlet. The next cmdlet, Send-MailMessage, sends the details of any events returned from the checkextendedevent.sql script by including the $errorlist string in the body of the email. Note that an if statement is used so that when the $errorlist string is empty, that is the query did not return anything, an email is not sent.
One thing I should also mention is before executing these cmdlets to create our PowerShell job we need to launch PowerShell with administrator privileges since some of the cmdlets we are using require these privileges. In case you are not familiar with this you can read more here.
After executing the cmdlets above we should see the following output which confirms that our job has been created successfully.
Id Name JobTriggers Command Enabled -- ---- ----------- ------- ------- 1 CheckExtende... 1 ... True
All that's left to do now is check that it's running successfully. To do this we can use the Get-Job cmdlet. Here is a example command (and sample output) that would list all previous job executions and their status.
Get-Job |Select Id,Name,PSBeginTime,State Id Name PSBeginTime State -- ---- ----------- ----- 1 CheckExtende... 2/11/2014 8:00:01 AM Completed
One final thing to note with this command is by default Windows will keep the results of the last 32 executions of each job. Once this is reached older results are overwritten by subsequent executions. If you need to keep more results than is configured by default you can update the number of results saved for each scheduled job by using the MaxResultCount parameter of the Set-ScheduledJob cmdlet. For example the following command sets this parameter to 100.
Get-ScheduledJob -Name ##JobName## | Set-ScheduledJob -MaxResultCount 100
Next Steps
- Extend this to monitor other events from system_health session, i.e. wait_info or deadlocks
- Create customized Extended Events sessions
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips