Find all stopped SQL Server services for all instances


By:   |   Updated: 2018-03-21   |   Comments (4)   |   Related: More > Monitoring


Is the Database the Culprit of Your Application Issues?

Free MSSQLTips Webinar: Is the Database the Culprit of Your Application Issues?

When you're troubleshooting application performance issues, have you ever found the problem residing a few layers deep in the database or not at all? How long did it take you to find and fix the issue? Don't worry if you said "a while". Learn how to solve performance problems fast.


Problem

As a proactive measure, you need to know when any of your SQL Server services are not running. It could be due to a failure and it's better to know about the issues before the users start to complain. Also, it is possible the SQL Server Agent stops working and you won’t notice it until you check the service manually because jobs didn’t run as scheduled.

Solution

Here is a script that checks the status of all SQL Server services across all of your instances and reports when a service is not running.

As it is regular on my scripts, it checks the servers in parallel, so in less than 5 seconds I can get the results for around 170 instances. Also, the script is able to identify active-active clusters and determine if the service is up in one of the nodes, to not report false information. You can run this on demand or schedule it as a regular job, so you can be alerted of issues.

In the script there are some variables you need to modify:

  • $server: the name of the server where you have stored your server inventory
  • $database: the database where you have created the inventory table
  • $query: modify it depending on the names of your columns, this one queries the servers that aren’t clusters
  • $query2: modify depending on the names of your columns, this one queries the clusters
  • $Maxthreads: query XX servers at a time; if you’re tight on resources you can lower this number, but the results may take longer to appear.

The script works as follows:

  1. Query the database “Inventory” in the table “Inventory”, which contains one entry for each instance with the details:
    1. Environment
    2. ServerName
    3. Cluster
    4. ClusterName
  2. Use the library “FailoverClusters” with the method “Get-ClusterGroup” to identify which node the services are running. In case it is an active-active configuration, both nodes will be returned.
  3. Parallelize the work, one thread for each server. If there is only one server or one active node in the cluster, it is going to store the services that are stopped. If it is an active-active cluster, it is going to get all stopped services, and then filter the services that are running on another node.
  4. For all identified entries, use the WMI service to check if the service is disabled or not, to filter the results and remove unneeded entries. In case it can’t find the details using WMI, the StartMode column will be empty, meaning you need to check the service manually.
  5. Show any error (like lack of permissions to query the services) and show the list of stopped services.

Sample Output from Monitoring Script

After running the script, here is the output.

We can see we have an error for a server that cannot be connected to.  Then we see a list of all of the services that are stopped for each server.

PS C:\Users\PabloEcheverria\Desktop> .\GetServicesStatus.ps1
Production PRODUCTIONSERVER2:  Cannot open Service Control Manager on computer 'PRODUCTIONSERVER2'. This operation might require other privileges.
 
Environment ServerName     Service                                          Status StartMode
----------- ----------     -------                                          ------ ---------
Production  PROD04         SQL Server Agent (SQLEXPRESS)                    Stopped
Test        TEST01         SQL Server Browser                               Stopped
Development DEV01          SQL Active Directory Helper Service              Stopped
Production  PROD05         MSSQLServerADHelper                              Stopped Manual
Production  prod06node01   SQL Active Directory Helper Service              Stopped
Production  prod06node01   SQL Server CEIP service (MSSQLSERVER)            Stopped
Production  prod06node01   SQL Server Analysis Services CEIP (MSSQLSERVER)  Stopped
Production  prod07node01   SQLDIAG                                          Stopped Manual
Production  prod07node02   SQLDIAG                                          Stopped Manual
			

Script to create Instance Inventory Table and Insert Records

This will create the table and some example records.  You will need to put in the correct name for your servers.

CREATE TABLE [Inventory] (
  [Environment] VARCHAR(128),
  [ServerName] VARCHAR(128),
  [Cluster] INT,
  [ClusterName] VARCHAR(128) )

-- insert sample records into table 
INSERT INTO [Inventory] VALUES ('Development', 'DEV01', 0, NULL)
INSERT INTO [Inventory] VALUES ('Test', 'TEST01', 0, NULL)
INSERT INTO [Inventory] VALUES ('Production', 'PROD04', 0, NULL)
INSERT INTO [Inventory] VALUES ('Production', 'PROD05', 0, NULL)
INSERT INTO [Inventory] VALUES ('Production', 'PROD06', 1, 'WIN06') --Active-Passive cluster
INSERT INTO [Inventory] VALUES ('Production', 'PROD07', 1, 'WIN07') --Active-Active cluster
 
-- query table results
SELECT DISTINCT [Environment], [ServerName]
FROM [Inventory]
WHERE [Cluster] = 0
 
SELECT DISTINCT [Environment], [ClusterName]
FROM [Inventory]
WHERE [Cluster] = 1	

PowerShell Script to Check SQL Server Service Status

This is the complete PowerShell script.  As noted above, you will need to adjust some of the variables for your environment.

$ErrorActionPreference = "Stop" #stop when an error is encountered
# Declare variables
$server = "."
$database = "Inventory"
$query = @"
SELECT DISTINCT [Environment], [ServerName]
  FROM [Inventory]
WHERE [Cluster] = 0
"@
$query2 = @"
SELECT DISTINCT [Environment], [ClusterName]
  FROM [Inventory]
WHERE [Cluster] = 1
"@
$SleepTimer = 1000 #after X milliseconds, check if the jobs have finished. 1000 is every second.
$MaxResultTime = 300 #after X seconds, all jobs are killed. 300 is 5 minutes.
$Maxthreads = 20 #number of parallel jobs
# import modules
Import-Module SqlPs -DisableNameChecking
Import-Module FailoverClusters
$error.clear() #clear error generated by last command
# get list of instances
$objects = @(Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $query)
$clusters = @(Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $query2)
# get node(s) for the clusters and add to the list of servers
foreach ($cluster in $clusters) {
   $services = Get-ClusterGroup -Cluster $cluster.ClusterName
   foreach ($service in $services) {
      if (!($service.Name -eq "Cluster Group") -And !($service.Name -eq "Available Storage")) {
         $found = 0
         for ($i = 0; $i -lt $objects.length; $i++) {
            if ($objects[$i].ServerName.StartsWith($cluster.ClusterName)) {
               $found = 1
               if (!$objects[$i].ServerName.contains($service.OwnerNode.Name)) {
                  $objects[$i].ServerName = $objects[$i].ServerName + " " + $service.OwnerNode.Name
               }
               break
            }
         }
         if ($found -eq 0) {
            $obj = New-Object -TypeName PSObject
            $obj | Add-Member -MemberType NoteProperty -Name Environment -Value $cluster.Environment
            $obj | Add-Member -MemberType NoteProperty -Name ServerName -Value "$($cluster.ClusterName) $($service.OwnerNode.Name)"
            $objects += $obj
         }
      }
   }
}
# environment setup
$output = @()
$errors = ""
$Jobs = @()
$ISS = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxThreads, $ISS, $Host)
$RunspacePool.Open()
$script = {
   Param ([string]$environment, [string]$serverName)
   $ErrorActionPreference = "Stop"
   try {
      $firstSet = @()
      $final = @()
      if (!$serverName.contains(" ")) { # one server
         $firstSet = Get-Service -computername $serverName | where {$_.DisplayName -like "*SQL*" -and $_.Status -eq "Stopped"} | select @{Name="Environment";Expression={$environment}}, @{Name="ServerName";Expression={$serverName}}, @{Name="Service";Expression={$_.DisplayName}}, Status
      }
      else {
         $parts = $serverName.split(" ")
         if ($parts.count -eq 2) { # one active node in the cluster
            $firstSet = Get-Service -computername $parts[1] | where {$_.DisplayName -like "*SQL*" -and $_.Status -eq "Stopped"} | select @{Name="Environment";Expression={$environment}}, @{Name="ServerName";Expression={$parts[1]}}, @{Name="Service";Expression={$_.DisplayName}}, Status
         }
         else { # more than one active node in the cluster
            for ($i = 1; $i -lt $parts.count; $i++) {
               $secondSet += Get-Service -computername $parts[$i] | where {$_.DisplayName -like "*SQL*"} | select @{Name="Environment";Expression={$environment}}, @{Name="ServerName";Expression={$parts[$i]}}, @{Name="Service";Expression={$_.DisplayName}}, Status
            }
            $thirdSet = $secondSet | where {$_.Status -ne "Running"}
            for ($i = 0; $i -lt $thirdSet.count; $i++) {
               $running = $secondSet | where {$_.Service -eq $thirdSet[$i].Service -and $_.Status -eq "Running"} | measure
               if ($running.count -eq 0) {
                  $firstSet += $thirdSet[$i];
               }
            }
         }
      }
      $m = $firstSet | measure
      if ($m.count -gt 0) {
         foreach ($row in $firstSet) {
            $props = Get-WmiObject win32_service -computer $row.ServerName -filter "Name='$($row.Service)'"
            if (!$props -or ($props.StartMode -ne "Disabled")) {
               $final += $row | select Environment, ServerName, Service, Status, @{Name="StartMode";Expression={$props.StartMode}}
            }
         }
      }
      "" #indicate there was no error
      $final #return results
   }
   catch {
      return $environment + " " + $serverName + ": " + $LastExitCode + " " + $_ | foreach { $_.ToString() } | Out-String #Get-WmiObject error, maybe permissions
   }
}
function CreateThread() {
   param ([string]$environment, [string]$serverName, [ref]$Jobs)
   $PowershellThread = [powershell]::Create().AddScript($script) #scriptToRun
   $PowershellThread.AddArgument($environment) | out-null
   $PowershellThread.AddArgument($serverName) | out-null
   $PowershellThread.RunspacePool = $RunspacePool
   $Handle = $PowershellThread.BeginInvoke()
   $Job = "" | select Handle, Thread, object
   $Job.Handle = $Handle; $Job.Thread = $PowershellThread
   $Jobs.value += $Job
}
$ResultTimer = Get-Date #start time
#start processing first task for each instance
for ($i=0; $i -lt $objects.length; $i++) {
   CreateThread $objects[$i].Environment $objects[$i].ServerName ([ref]$Jobs)
}
while (@($Jobs | where {$_.Handle -ne $Null}).count -gt 0) {
   #update completed jobs, get errors and result, and dispose them
   foreach ($Job in @($Jobs | where {$_.Handle -ne $Null -and $_.Handle.IsCompleted -eq $True})) {
      $results = $Job.Thread.EndInvoke($Job.Handle)
      if ($results[0] -and $results[0] -ne "") {
         $errors += $results[0]
      }
      for ($i = 1; $i -lt $results.count; $i++) {
         $output += $results[$i]
      }
      #end thread
      $Job.Thread.Dispose()
      $Job.Thread = $Null
      $Job.Handle = $Null
   }
   #show progress
   $inProgress = @($Jobs | where {$_.Handle.IsCompleted -eq $False}).count
   Write-Progress `
      -Activity "Gathering data" `
      -PercentComplete (($objects.length - $inProgress) * 100 / $objects.length) `
      -Status "$inProgress pending"
   #exit on timeout
   $currentTime = Get-Date
   if (($currentTime - $ResultTimer).totalseconds -gt $MaxResultTime) {
      Write-Error "Child script appears to be frozen, try increasing MaxResultTime"
      break
   }
   #sleep
   Start-Sleep -Milliseconds $SleepTimer
}
#dispose thread pools
$RunspacePool.Close() | Out-Null
$RunspacePool.Dispose() | Out-Null
$errors #return errors
$output | Format-Table -Auto #return results
			

Notes:

If you are running this on a machine that does not have the Failover Clustering features installed you may get an error. 

There are two options to fix this error:

  1. You can comment out this line "Import-Module FailoverClusters" in the PowerShell script using a # in front of the line.  If you take this approach you should mark all of your SQL Server instances as non-clustered in the Inventory table.
  2. You can install the Remote Server Administration Tools.  Here are two articles about this:
Next Steps
  • Modify the script for your inventory table to see which services are stopped.
  • Tweak the process to only check for certain services like the database engine and SQL Agent.
  • Tweak the process to allow you to ignore certain processes for specific instances.
  • You can set this script to run as a scheduled job and notify you periodically on stopped services. Check my other tip on sending a formatted email here.
  • You can learn more about Windows services for SQL Server here.
  • You can learn more about services isolation here.


Last Updated: 2018-03-21


get scripts

next tip button



About the author
MSSQLTips author Pablo Echeverria I've worked for more than 10 years as a software programmer and analyst. In 2016, I switched jobs to a DBA position, where I have implement new processes, created better monitoring tools and grown my data scientist skills.

View all my tips
Related Resources





Comments For This Article




Tuesday, April 23, 2019 - 12:47:58 PM - Pablo Echeverria Back To Top (79660)

Hi Waquar Ahmad you're welcome, please check this other tip I made where SQL commands are joined and executed at the end: https://www.mssqltips.com/sqlservertip/5408/check-current-patch-levels-for-all-sql-servers-in-enviornment/


Tuesday, April 23, 2019 - 3:34:44 AM - Waquar Ahmad Back To Top (79649)

 

This is a fantastic article. Thank you for sharing it Pablo. 

Can you pls help in inserting the powersheel output into a sql table. 


Wednesday, March 21, 2018 - 12:48:45 PM - Pablo Echeverria Back To Top (75495)

Hi Thomas, thanks for your comment! That may help others with the same issue.


Wednesday, March 21, 2018 - 11:59:54 AM - Thomas Flatley Back To Top (75491)

 

 Nice script - i had to change the import-module SqlPS to SqlServer to get around the error

SQL Server PowerShell provider error: Path SQLSERVER:\check_services.ps1 does not exist.

thanks

 



download





Recommended Reading

How to Read Log File in SQL Server using TSQL

How to setup SQL Server alerts and email operator notifications

SQL Server Wait Stats Monitoring with PowerShell

Posting SQL Server Notifications to Slack

SQL Server High CPU Query Use Monitoring with PowerShell








get free sql tips
agree to terms


Learn more about SQL Server tools