By: Tim Smith | Comments | Related: > Monitoring
Problem
We have a windows service that runs on several servers and interacts with a remote SQL Server database and during some deployments, we've noticed that the service will continue running after the deployment, but then stop or fail sometime afterwards without reporting any errors. Whether on the database side or on the machine running the service itself, do we have options to help us troubleshooting a windows service that suddenly stops without reporting an error so that we can track the specific time window?
Solution
In this tip, we'll look at a couple of solutions for this - using PowerShell to track if the service is currently running and alerting if not or saving information to SQL Server at regular intervals with the information we obtain from PowerShell, or logging information to the SQL Server directly from the service. Our solution to this problem will depend on how frequently the service interacts with the SQL Server and also what we find if the service fails. For an example, if the service fails when communicating to the database because it's unavailable, we might not see that immediately with our logging. If we have an email alert sent out when a service fails and we know it failed before an insert into the database that might help us uncover if the communication to the database caused the failure.
In the first example, we'll look at an email alert if the windows service returns a status other than running, which will be the only acceptable state. If the status is starting, then we should only receive an alert briefly if the status moves to a running state. While the individual logic may be different, I would want to know if the status is starting in case the server restarts and we're not getting messages from the server, provided that we're monitoring this on the machine itself (PowerShell allows the parameter -ComputerName for remote access, if the user running the script has remote permission).
$name = "OurService" $service = Get-Service -Name $name -ErrorAction SilentlyContinue if (!($service.Status -eq "Running")) { $message = "$name is not running!" Send-MailMessage -SmtpServer "" -To "" -From "" -Subject "$name Status" -Body $message }
This also handles the situation where a service may have been removed erroneously and the status would normally return nothing, since there is no service. We could set the above script as a scheduled task either on the server itself or on a remote server (with a user who has permissions to access the remote server) and monitor the service. If we want to avoid email and track this information by saving the status, we could log this information to a table:
$name = "OurService" $service = Get-Service -Name $name -ErrorAction SilentlyContinue if (!($service.Status -eq "Running")) { $saveservicestatus = "INSERT INTO tblServices VALUES ('$name','" + $service.Status + "',GETDATE())" Invoke-Sqlcmd -ServerInstance "OURINSTANCE\OURSERVER" -Database "OurDatabase" -Query $saveservicestatus }
Keep in mind that we're only monitoring the status in both of these examples; one with email alerts and the other by saving the information to a table. If the service does not log any information itself, we may have to go through the event logs to compare what may have happened - such as a database server restarted causing the service to crash, an update to the server required a restart, or another event. In general, I would recommend the service provide custom log information as well, as we may be able to identify what the service was running or doing before it crashed. However, there may be situations where you are only diagnosing why a service failed and are unable to develop the service, so this may not be an option.
Finally, we can scale this using a function for repeatability if we want. Using the second example of saving the information to a table:
Function Save-Service { Param( [Parameter(Mandatory=$true)][string]$name ) Process { $service = Get-Service -Name $name -ErrorAction SilentlyContinue if (!($service.Status -eq "Running")) { $saveservicestatus = "INSERT INTO tblServices VALUES ('$name','" + $service.Status + "',GETDATE())" Invoke-Sqlcmd -ServerInstance "OURINSTANCE\OURSERVER" -Database "OurDatabase" -Query $saveservicestatus } } } Save-Service -name "OurProcess"
We could then call the same function multiple times if we wanted to monitor multiple windows' services:
Save-Service -name "OurFirstProcess" Save-Service -name "OurSecondProcess" Save-Service -name "OurThirdProcess"
In situations where you may not have logging, or may have limited logging where you must interpret the events, these tools should help you track the specific time window of when a windows service moves to a separate state than running. I would recommend at most spacing out each time at five minutes, otherwise the window of time may be too large. Monitoring the service every minute may be required in some cases and for services that should never be disabled, stopped or removed, monitoring them frequently may be preferred.
Next Steps
- In troubleshooting this multiple times, I've seen issues such as failures due to connections to the database, failures due to code changes in the application, and failures due to firewall changes. You may discover other possible issues; identifying the specific time will help make solving this problem easier.
- If you are starting out, I would recommend using two methods from the above tip because the time alone will help, but knowing what the service was doing and what the service was about to do will also help in some situations.
- If the account running the SQL Server Agent Job has access to the status of Windows Services, as well as email permissions, these PowerShell scripts could also be run as a SQL Server Agent Job.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips