By: Jeremy Kadlec | Last Updated: 2016-06-13 | Comments (1) | Performance Tuning
My team needs to quickly pinpoint the performance issues in my SQL Server environment. With all of the demands at work, our team needs to find and understand the issues, then review the code and take corrective actions. We need to ensure our production SQL Server environment is running smoothly, because when it is not, the entire company feels it.
The team is growing and we have SQL Server Professionals with a variety of skill levels. Everyone on the team needs to pitch in and help address the performance issues, not just our senior staff. For some team members, often it is very difficult to determine the root cause of the performance issue. Even the most senior performance focused DBAs struggle with intermittent problems consuming large amounts of their time. Sometimes it is a single process, but more often there is a chain of events that needs to be reviewed and correlated to find the answer. We are also faced with supporting multiple versions of SQL Server with a slightly varying tool set. Some of the servers are on premise and others are in the cloud. Here too the challenge is the need to manage both physical and virtual machines, which adds another wrinkle to the equation.
Based on these issues, do you have any suggestions to help us solve this problem?
Time is certainly of the essence when there are SQL Server performance issues that impact the entire organization. Many environments are similar to yours with numerous versions of SQL Server, physical and virtual servers in multiple locations as well as having the need for everyone on the team to be able to quickly pinpoint, dissect, correlate and correct the issue at hand.
There are a number of ways to address your SQL Server performance issues. SQL Server ships with a number of native tools such as the Dynamic Management Objects, Extended Events, Performance Monitor and more to help identify the issues. These tools are very helpful, but can become overwhelming with large amounts of data. The performance issues are in the data; it is just a matter of being able to train the team to find them in a rapid manner and communicate with your peers.
In some respects, there is a build vs. buy decision as well as an investment decision in training to manage a diverse environment. The small, yet growing team needs to support core business needs of the organization and conquer the performance issues. Trying to accomplish both with time constraints is a large challenge.
Based on the time constraints, business needs, build vs. buy decision and building your team, selecting a performance monitoring and tuning product may make the most sense. One product to consider is SolarWinds Database Performance Analyzer for SQL Server. Here is how this product can help you:
- Resolution – Pinpoint poorly performing code in a few clicks
- Simplicity – Understand poorly performing code in a graphical manner with supporting metrics to prevent finger pointing, but rather focus the team on solving the problem
- Recommendations – Valuable recommendations to correct problematic code
- Focus – Daily Advisors outlining the top five errors, warnings and informational messages to prioritize efforts to benefit the business
- Analysis – Real time and historical data to correct issues and determine trends, with the ability to drill down to data in one minute increments
- Consistency – Intuitive browser based interface to dissect performance issues against numerous versions of SQL Server, VMWare and other database platforms to share with Developers, Admins, QA and Management without compromising production security
- Reporting – Inherit reporting on each interface with the ability to send the results via email as well as a customizable reporting and alerting interface
- Low Impact – DPA places less than 1% load on the monitored servers
Let’s demonstrate how SolarWinds Database Performance Analyzer can solve SQL Server performance issues.
SolarWinds Database Performance Analyzer Enterprise Dashboard
When you login to SolarWinds Database Performance Analyzer (DPA), you begin with an enterprise wide dashboard providing insight across all of your monitored instances. This dashboard provides the following metrics:
- Key charts
- Instances with the Highest Wait Time
- Greatest Upward Wait Time Trends
- Environment wide metrics
- Monitored Instances
- Wait Time
- Query Advice
- Database instances list
- Name – Click on to directly access the server
- Action – Current Activity, Performance Trend, etc.
- Waits in the aggregate
- Current status (No issues, warnings or critical issues) for the following metrics:
- Instance version
A few noteworthy items on the DPA enterprise dashboard in the Database Instance section include:
- The ability to click on instance name to directly access the instance level performance metrics, which we will cover in the next section.
- The ability to click on the icon in the metric section (i.e. Queries, CPU, Memory, etc.) which will launch the specific interface based on the metric. This feature makes it easy to begin to troubleshoot the issue. As a point of reference, one of the instances has the “Red X” for memory, so when that icon is clicked on we see the screen shot below.
As a quick aside, as the mouse hovers over the chart with the critical error, you receive valuable pop-up information. Here is an example, related to the O/S Memory Utilization Percentage.
- A final aspect of this interface that cannot be overlooked is the Action menu, shown below. This tiny menu is another launching pad to dig into the performance details at the database instance level.
- Access to the three main interfaces at the instance level are the Current Activity, Performance Trend and Resources.
- There are also opportunities to troubleshoot performance issues with the Real Time Sessions, Reports and Alerts interfaces.
- From an administrative perspective, links are available to manage the application monitors, review the monitor log, manage blackout periods, configure advanced options and setup some virtual machine settings.
SolarWinds Database Performance Analyzer for SQL Server Instance Dashboard
Moving to the next level of detail with DPA for SQL Server, provides the data to correct performance issues. The instance level data is really where most SQL Server Professionals will spend most of their time analyzing, troubleshooting and correcting performance issues. At this level, there are three key data sets:
- Trend – Overall status for the last 30 days by default
- Current – Real time data over the last 60 minutes
- Resources – Charts with key metrics over the last hour, day, week, month, 6 months or year
The navigation for these data sets can be found on the top right of any of the instance level interfaces in DPA for SQL Server.
Let’s dive into each of these data sets to demonstrate how quickly performance issues can be diagnosed.
Trend Interface in the SolarWinds Database Performance Analyzer for SQL Server
The Trend interface in DPA provides SQL Server Professionals with the ability to:
- Intuitively determine issues based on the bar charts and a prioritized list of action items i.e. Advisors, Resources and SQL Text.
- Drill down from high level metrics to the actual code and query plan. At the query plan level recommendations are provided to correct the code.
- Fine tune the data exploration by using the drop down box on the top left of the interface. By default the last 30 days of data are displayed, but you have the ability to configure a custom date or select a particular date.
- Slice the data by T-SQL commands, Wait Statistics, Programs, Databases, Machines, Database Users and Query Plans. There is also a Timeslice option to review a cross section of this data to drill down to the minute to understand the performance implications.
Let’s review three of the tabs on the Trend interface: SQL, Waits and Plans that are pretty interesting.
SQL Tab of the Trend Interface in the SolarWinds DPA for SQL Server
When you launch the Trend interface, the first item that catches your eye is the bar chart for the T-SQL Statements. Intuitively, the tallest bar chart with the largest single color is the T-SQL code requiring the most attention. Each color corresponds to a single query plan and these are color coded across the days in the chart to easily see the impact. As you hover your mouse the name of the code, wait time, total wait time, percentage of wait time, average executions in seconds, executions and actual code are displayed as shown below.
Once one of the sets of code is clicked, the Top SQL Statements interface loads the data for the specific date which was clicked on in the chart to provide a visualization of how the code ran by the hour for the specific day. At the bottom of the screen there are also three tabs to check out the advisors (i.e. recommendations) for the day, the resource utilization and a prioritized list of T-SQL code to review. The combination of the visualization and the three tabs at the bottom of the interface help you prioritize the performance issues.
If you click on any of the bars, the Top SQL Statements interface loads. The data is now sliced based on the code on the left, duration at the top and wait stats in the center of the chart. This interface gives a prioritized list of code, the waits types impacting the code and the percentage based on the color coding of the bar.
Waits Tab of the Trend Interface in the SolarWinds DPA for SQL Server
Since we had a glimpse of the wait stats, let’s learn more. Another slice of the data that is very interesting is related to the instance level Wait Statistics on the Waits tab of the Trend interface. On this tab, the data is once again prioritized by the largest bars and the wait types are color coded to see the impact over the last 30 days by default. This data provides insight into the resources SQL Server is waiting on as well as provides a means to understand which code is responsible for the waits.
In this example, the red portions of the chart correspond to CPU usage. If we click on one of the red portions of a bar, the interface below loads which shows the Top Wait Statistics across the entire day on a per hour basis.
Clicking on one of these bars, you can see the aggregate wait stats over the time period to have a clear picture of which wait stats your SQL Server is waiting on.
Plans Tab of the Trend Interface in the SolarWinds DPA for SQL Server
As is the case with the previous tabs demonstrated, the Plan tab displays the last 30 days of data in a color coded manner with the focus being the largest bars. The difference on this tab is the data is all related to the query plans and the aggregate wait time for the time slice. The value of the query plan is that it is the breakdown of your code as the SQL Server query optimizer processes it. With this view of the code, you can see which portions that are using the most resources and make a determination if they can be improved.
Once a plan is clicked on at the 30 day summary level, the data is sliced by the day and hour for a more fine-tuned view of the data as shown below. As was the case with the previous tabs, you can hover your mouse over the graph to get additional details related to the wait time.
When you click on any of the query plans, the actual query plan loads as shown below. As you hover your mouse over each component (not shown) you are able to get additional information related to the Physical Operator, Logical Operator, IO and CPU statistics, etc. At this point you can also focus on the portions of the query plan with the highest cost which is shown on the bottom line of each box. Each box represents a physical operator in the query plan. In the example below, a red highlight was added to the image to show the worst performing portion of the code and a blue highlight was added to the image to show the recommendation to correct the issue. With these two pieces of information, you can begin to improve your code by adding an index as recommended in the Plan Advice section to stop table scanning and have SQL Server process the data as an index seek or scan.
As a final note, keep in mind we explored only three of the tabs available on the Trend Interface. There is a great deal of value with the other tabs as well. If you are having an issue with an application, machine(i.e. web server/middle tier), user or database you can drill into the issue with those tabs. There is even a Time Slice tab available as you drill into the data in order to fine tune the troubleshooting down to the minute. All of these tabs deliver a great deal of value based on how you want to slice the data and approach the performance problem. Be sure to explore all that they have to offer!
Current Interface in the SolarWinds Database Performance Analyzer for SQL Server
The Current interface is intended to report on performance data for the last hour and is divided into three main sections:
- First is the queries data with data from the last hour for the Highest Total Execution Time, Average Long Running queries over the last hour and an aggregate count of the number of executions for a query.
- The second section relates to the Real-Time sessions which includes the Currently Active, Currently Blocked and a URL to review all of the sessions with the spid, login name, application name, status, etc.
- The third portion of the interface are graphs with the Round-Trip Time, CPU Utilization, Memory Paging Rate and Disk Queue Length. There is also a URL i.e. more… in this section to review a complete list of resource graphs which we be demonstrated in the next section.
Similar to the Trend interface demonstrated earlier, there is the ability to review additional information when you hover your mouse over the bars in the Queries graph. Also, when you click on any of the bars in this graph the Top SQL Statements interface loads, which is an intersection of the query, time and wait stats.
From the Top SQL Statements interface you can click on any of the wait statistics to see the corresponding code as well as some basic metrics related to the number of executions, reads, writes and Historical Charts.
The Historical Charts URL on the top right is fairly interesting because you can review the following data:
- Total Wait Time for the statement over the last 30 days including the duration and wait statistic per day in a bar chart.
- Counts for the number of Executions, Physical Reads, Logical Reads and more as line graphs.
- Average Wait Time per Execution over the last 30 days with the ability to see the dates where the values exceeded the norm.
- Typical Day of Wait Statistics for the SQL Statement to show the average peak usage over the course of a day in order to get an understanding of when the code waits the most.
Resources Interface in the SolarWinds Database Performance Analyzer for SQL Server
The Resources Interface consists of five tabs with numerous line graphs to quickly determine the status of your SQL Server instance and be able to compare the results over the last hour, day, week, month or longer. The graphs include the following metrics:
- Signal Waits Percentage
- Instance CPU Utilization
- O/S CPU Utilization Percentage
- O/S CPU Queue Length
- Page Life Expectancy
- O/S Memory Utilization Percentage
- Pace Cache Size (MB)
- Buffer Cache Hit Ratio
- Buffer Cache Size (MB)
- Procedure Cache Hit Ratio
- Memory Paging Rate (Pages / Second)
- Log Bytes Flushed / Second
- Log Flushes / Second
- SQL Compilations / Second
- SQL Re-Compilations / Second
- Total I/O Wait Time
- Total Read I/O Wait Time
- Total Write I/O Wait Time
- O/S Disk Queue Length
- Physical I/O Rate (KB / Second)
- Physical Write Rate (KB / Second)
- Physical Read Rate (KB / Second)
- SQL Disk Read Latency
- SQL Read Write Latency
- Round-Trip Time (Milliseconds)
- Transaction Rate (Transactions / Second)
- Blocked Sessions
- Active Sessions
- Batch Requests / Second
Included on the graphs above are Warnings and Critical Alerts. These help draw your attention to issues in the environment that need to be addressed. These threshold values are pre-configured based on industry best practices, but each counter is fully customizable for the Critical and Warning thresholds based on your needs. See the screen shot below as a point of reference. This configuration can be accessed by clicking on the “Settings” URL from any graph as on the Resources Tab.
Similarly, if you are not familiar with the counter, you can click on the “Information” URL per counter to learn more about it as shown below for Page Life Expectancy.
Virtualization Performance Tuning in SolarWinds Database Performance Analyzer
As SQL Server DBAs, we are at the intersection of the database, code, operating system, storage, infrastructure, etc. and need to speak the proper language with the associated groups that manage each of these resources. Based on this situation we also need to understand the underlying infrastructure supporting SQL Server. With virtualization being the norm for so many environments, having insight into the VM and Host supporting our SQL Server instance is critical to put the pieces of the performance puzzle together.
With Database Performance Analyzer, the Virtualization and Performance metrics can easily be toggled by pressing either the Virtualization or Performance buttons at the top of the interface as shown below. The functionality of the Virtualization interface is very similar to the Performance interface with the SQL Server Metrics listed above. The performance issue discovery process and correlation is similar to the earlier descriptions with the SQL Server metrics and code serving as the foundation to troubleshoot the performance issues. The Virtualization data is divided into VM Layers, Current, VM Config and Resources on the top right of the interface similar to the SQL Server data in the previous section.
As you research data in this interface, keep in mind that although the data is related to the VM and Host, it is all in the context of the SQL Statements, Wait Statistics, Programs, Databases, etc. that we learned about in the previous sections. All of the metrics can help us as SQL Server Professionals to correlate the code being issued in SQL Server with the impact to the VM. In many respects DPA provides us with an accurate means to slice the data on virtual machines providing more insight to troubleshoot the issue that is difficult to capture with the native tools. The underlying infrastructure with a virtual environment is fluid as compared to a dedicated physical server where the resource allocation does not change. DPA provides a top-down approach when troubleshooting Top SQL statements: Database Instance, VM / Operating System, Physical Host and Storage.
As you browse the remainder of the DPA Virtualization interface it will be a familiar and intuitive process similar to the SQL Server Performance Tuning explained earlier and give you the confidence to correct performance issues on both virtualized and physical servers.
While DPA works for database running on all popular virtualization and cloud hosting providers, the extended virtualization capabilities are currently only available only for VMware. The virtualization capability is sold as an add on option, but is fully integrated within DPA.
Reporting with DPA
Although a large portion of the SolarWinds Database Performance Analyzer for SQL Server can be considered a reporting tool, there is a reporting module to build custom reports to meet your specific needs. The custom reports can be scheduled on a regular basis for delivery to particular team members. These can help for daily health checks as well as serve as a report to technical management to understand the overall status for the environment. The reports help to focus on the most resource intensive queries, databases, applications, users, machines, plans, etc. There are also options for average waits and typical waits for a day to obtain some trending data.
Below is a sample Top SQLs report with the top graph providing a visual representation of code over the time period and the bottom portion of the report including the detailed code.
DPA for SQL Server includes 20 predefined reports with the ability to customize a number of the settings based on the audience receiving the report to deliver the highest value.
Alerts in SolarWinds Data Performance Analyzer
Beyond the data collection, monitoring, analysis, recommendations and reporting from SolarWinds Database Performance Analyzer for SQL Server, alerts can also be configured for items outside the scope of just performance. These items include:
- Total Database Instance Wait Time
- Total Wait Time for a Single SQL
- Average Wait Time for a Single SQL
- Total SQL Wait Time for Single Wait
- Total SQL Wait Time for a Program, Database User, O/S User, Machine or Database
- Total Blocking Wait Time
- Database Instance Availability
- Database Freespace
- Database Instance Parameter Changes
- Transaction Log Freespace
- SQL Server not running
- SQL Server Error Log Alert
- SQL Server Ineffective Statistics
- SQL Server Job Failure
- SQL Server Log has Many Virtual Logs
- SQL Server Long Running Jobs
- Custom SQL Alert with either a Single Numeric, Multiple Numeric or Single Boolean Result
- Custom Procedure Alert with either a Single Numeric, Multiple Numeric or Single Boolean Result
Below is a sample Alert to give you a sense of the functionality:
With the time constraints your team faces, building an internal tool to match the functionality of SolarWinds Database Performance Analyzer for SQL Server would add a tremendous amount of work to your team’s plate. DPA for SQL Server is an end to end tool to collect, manage, report and recommend corrections based on your performance data. This is accomplished with less than 1% load on the monitored servers from DPA to collect the performance metrics, whether your SQL Server instance is on premise, in the cloud, virtualized or a physical server. The browser based interface is simple and consistent providing opportunities to slice and dice the data as needed for all of your team members i.e. Developers, Operations, QA, Management, etc. not just DBAs without compromising production security to troubleshoot a performance issue. The intuitive interface guides you to the issue and encourages exploration of the data to gain new insights for your environment. Speaking of the data, it is not just raw data, it is aggregated, consolidated and prioritized to see trending overtime with the flexibility to be able to break the data down to 1 minute granularity.
Consider SolarWinds Database Performance Analyzer for SQL Server to save your team time to efficiently resolve your SQL Server performance issues.
- Learn more about SolarWinds Database Performance Analyzer for SQL Server and get your free evaluation version to address your SQL Server needs.
- From a single installation of DPA you can also monitor VMware, Oracle, DB2, Sybase and AWS. Consider DPA as your first step to address these issues.
- Additional resources:
MSSQLTips.com Product Spotlight sponsored by SolarWinds, makers of Database Performance Analyzer for SQL Server.
Last Updated: 2016-06-13
About the author
View all my tips