Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Whitepapers          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 
Untangle TempDB Performance with SQL Diagnostic Manager - Free Webinar
 

Automate Alerting for SQL Server Suspect Database Pages


By:   |   Last Updated: 2016-01-25   |   Comments (1)   |   Related Tips: More > Monitoring

Problem

We suspect there are some issues with our disk subsystem and would like to be alerted by email if SQL Server encounters any corruption in any of the data files that reside on the disk in question. How can we add this automated monitoring?

Solution

Luckily for us, SQL Server 2005 introduced a new table in the msdb database called suspect_pages which logs any pages which it suspects might be bad. Any time the database engine comes across a page id that it thinks might be bad, whether through a query reading a page, a DBCC operation or a backup/restore, the page id is added/updated in this table along with some details about the event. The details can be found in the event_type column in this table and a description of these types is below. Notice that event_types 4, 5 and 7 actually show that the issue has been resolved.

suspect_pages error description

It's important to note that suspect_pages has to be maintained by a DBA or someone with permissions to make updates in the msdb database. This table is limited to 1000 rows, so if there are records in the table that are no longer needed they should be deleted so new rows can be added if required. More on managing this table can be found here.

Monitoring Setup

The first thing we'll need is a query to pull information from the suspect_pages table. Since this table mainly contains non-descriptive ids, let's join it with sys.databases and sys.master_files so our alert will contain everything we need to track down and resolve the issue. Below is the TSQL for this query.

select sp.database_id "Database ID",
       d.name "Database",
       sp.file_id "File ID",
       mf.physical_name "File",
       sp.page_id "Page ID",
       case when sp.event_type = 1 then '823 or 824 error other than a bad checksum or a torn page'
            when sp.event_type = 2 then 'Bad checksum'
            when sp.event_type = 3 then 'Torn Page'
            when sp.event_type = 4 then 'Restored (The page was restored after it was marked bad)'
            when sp.event_type = 5 then 'Repaired (DBCC repaired the page)'
            when sp.event_type = 7 then 'Deallocated by DBCC'
       end as "Event Desc",
       sp.error_count "Error Count",
       sp.last_update_date "Last Updated"
from msdb.dbo.suspect_pages sp
inner join sys.databases d on d.database_id=sp.database_id
inner join sys.master_files mf on mf.database_id=sp.database_id and mf.file_id=sp.file_id

Now that we have a query with all the information we need we just have to wrap this query with some logic to send out an email only when it actually finds data. Using database mail and some html formatting, samples of each can be found here, we can produce a nicely formatted email alert. The TSQL code for this is below and can be added to a new or existing SQL job to monitor your instance.

declare @count integer
declare @tableHTML  nvarchar(MAX);
declare @subj nvarchar(100)

select @count=count(1)
from msdb.dbo.suspect_pages;

set @subj = 'Suspect Pages Found in ' + @@SERVERNAME;

set @tableHTML =
    N'<H1>Suspect Pages Found in ' + @@SERVERNAME + ', details are below.</H1>' +
    N'<table border="1">' +
    N'<tr><th>Database ID</th><th>Database</th>' +
    N'<th>File ID</th><th>File</th><th>Page ID</th>' +
    N'<th>Event Desc</th><th>Error Count</th><th>Last Updated</th></tr>' +
    cast ( ( select td = sp.database_id,       '',
       td = d.name,       '',
       td = sp.file_id,       '',
       td = mf.physical_name,       '',
       td = sp.page_id,       '',
       td = case when sp.event_type = 1 then '823 or 824 error other than a bad checksum or a torn page'
            when sp.event_type = 2 then 'Bad checksum'
            when sp.event_type = 3 then 'Torn Page'
            when sp.event_type = 4 then 'Restored (The page was restored after it was marked bad)'
            when sp.event_type = 5 then 'Repaired (DBCC repaired the page)'
            when sp.event_type = 7 then 'Deallocated by DBCC'
       end,       '',
       td = sp.error_count,       '',
       td = sp.last_update_date
from msdb.dbo.suspect_pages sp
inner join sys.databases d on d.database_id=sp.database_id
inner join sys.master_files mf on mf.database_id=sp.database_id and mf.file_id=sp.file_id
              for xml path('tr'), TYPE 
    ) as nvarchar(max) ) +
    N'</table>' ;

IF @count > 0
  exec msdb.dbo.sp_send_dbmail
    @recipients=N'[email protected]',
    @body= @tableHTML, 
    @subject = @subj,
    @body_format = 'HTML',
    @profile_name ='testprofile'

As an added benefit, assuming you would be running this at some interval whether it be daily/weekly/monthly whatever you decide is best for your environment, we could also piggy back on this job and do the cleanup of this table at the same time. Adding another step to this job that contains the following TSQL we can purge anything in this table that is older than 90 days. Note that you can change this purge logic however you see fit.

delete from msdb.dbo.suspect_pages
where last_update_date < getdate()-90

Sample Output

Hopefully you check the suspect_pages table in all of your instances and you don't find anything. Rather than have you go through the process of corrupting a page in one of your databases to see what the report looks like I'll include a sample of one below for reference.

sample email
Next Steps


Last Updated: 2016-01-25


next webcast button


next tip button



About the author
MSSQLTips author Ben Snaidero Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

View all my tips
Related Resources





Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Monday, January 25, 2016 - 10:30:08 PM - Jeff Moden Back To Top

 Awesome tip, Ben.  I never knew this table existed.  The year is starting out very well for learning new things thanks to you.  Thank you for taking the time to post this rather complete tip.


Learn more about SQL Server tools