By: Tim Smith | Comments | Related: > TSQL
Problem
We're in a smaller environment with multiple SQL Server's where space is limited for many of our servers. Unfortunately, most performance measuring tools are too much for the environment, but we know that we need to track and keep some of these metrics, especially ones we've identified as weak points in our system. We're trying to balance the challenge of measuring how much we're improving with our system capabilities. What are some ways that we can retain these metrics without using too much space, while also providing us a useful assessment for what's normal for our environment so that we can detect issues?
Solution
I heard Joe Polish once say that "What's measured improves, and what's measured and reported improves exponentially." Metrics tell us how well we're progressing to our goal, but some environments can't keep a large amount of metrics or may not have the software ability to support these metrics. For this reason, consolidating what we have into meaningful numbers that we can use for tracking normal behavior helps. Let's suppose that we collect 5000 data values for a day; what are some measurements that can retain a summary of those 5000 values?
About a decade ago, I stored market and demographic information and faced a similar problem where the information became too big for the software tool I used at the time, so I had to devise a way to keep meaningful data. More often than not, aggregating data helps, provided that the aggregates provide useful behavioral tracking. When I moved into database administration, I was pleasantly surprised that many of these measurements were just as helpful when benchmarking performance (among many other things too). So I've used the below measurements to really track how well an environment (server or database) is improving overall. Relative to your environment, a daily summary of these points saved in a table may be enough, but you may also want to keep a semi-daily summary.
Median. One of my favorite statistical measures, as it's one of the most robust measurements in statistics - it specifies what is the middle point of any data set. Unlike average (which is a weighed measure), it's seldom influenced by outlier values and we know the exact percentile of the value (50th percentile). Unfortunately, the function median doesn't exist in SQL Server, so you have to create a custom one yourself and this is relative to the data set. This article covers a few ways to calculate median.
Average. A measure that looks at a numeric summary of a data set based on the aggregate divided by the data points, though heavily influenced by outliers. While average can be helpful, be careful about the influence of outliers on its value. For instance, average household income and median household income in the United States are very different values (about $70,000 vs. $50,500), but not as different when you exclude the top 1% of household incomes. The same is true if you had 60 data points in one hour for CPU where you had 10 99% CPU values, and 50 1% or less values. The median for the data set would be 1%, while the average would be over 16%. The percentile of average may or may not be close to the 50th, relative to the data set.
SELECT AVG(DataColumn) FROM OurTrackingTable
Maximum and minimum. The highest and lowest values provide a great range for a day summary. When you know the lowest possible value - such as 0% disk space used and you see that the lowest value was 5% disk space used for the day, it provides a relative range of where you stand. With some measurements, the maximum and minimum become the official range of a particular database or server, even if that differs from other environments. For instance, a small replication environment's minimum replication transaction count may be 5,194, which would be different from a theoretical minimum of 0.
SELECT MAX(DataColumn), MIN(DataColumn) FROM OurTrackingTable
80th/20th percentile. A popular business rule is the 80-20 rule and I like to track the 80th-20th percentile thresholds for some measurements and in some environments. For an example, with Page Life Expectancy, tracking that 80% of the page life expectancy values for the day were above a certain value, such as 4300. Tracking this may not be helpful in other environments, especially considering that monitoring creates costs to an environment. The below code is an example of what I use to get this and allows me to get other values by percentile if I need:
;WITH EightTwenty AS( SELECT ROW_NUMBER() OVER (ORDER BY DataColumn ASC) ID , CAST(((SELECT COUNT(DataColumn) FROM OurTrackingTable)*0.80) AS INT) AS Eighty , CAST(((SELECT COUNT(DataColumn) FROM OurTrackingTable)*0.20) AS INT) AS Twenty , DataColumn FROM OurTrackingTable ) SELECT DataColumn FROM EightTwenty WHERE ID = Eighty OR ID = Twenty
Above A Limit. Like the 80th-20th percentile, in some situations, it's good to know another percentile, like, "95% of my queries used 2 minutes in time or less" which can help you benchmark an area where you want to improve. I generally will retain these special values in an extra note column if they differ in case from other measures.
Outliers. I like to track how many outliers appear each day - if at all - on daily values (noting here that I could compare it to the historical average as well), using the standard outlier equation of average added to three multiplied by the standard deviation.
DECLARE @outlier DECIMAL(9,4) SELECT @outlier = ((SELECT AVG(DataColumn) FROM OurTrackingTable)+(3*(SELECT STDEV(DataColumn) FROM OurTrackingTable))) SELECT COUNT(*) Outliers FROM OurTrackingTable WHERE DataColumn >= @outlier
Do the above measurements track every possible measure to summarize a daily data set? No, relative to environment, application, or process, you may want to look at other measurements, or leave out some of these measurements. For an example, you may find that median provides more meaning to your environment than average and decide to remove it. The key with consolidating these daily metrics is that I have a table that summarizes the day that is a fraction of the entire data set.
Next Steps
- In smaller environments, which of the above measures would be most useful?
- Using a SQL Server Stored Procedure or SQL Server Agent Job, use those measurements to consolidate the data points into a summary table.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips