Problem
SQL Server query performance is something all database professionals are looking to improve. Many SQL Server DBAs and Developers focus on correctly indexed tables. This ensures queries perform as fast as possible. While that is very important many overlook the importance of statistics. Statistics affect query optimization, cardinality and performance issues. In this tutorial we will try to answer what, why, where, when of SQL Server Statistics.
Solution
All examples below are based on the AdventureWorks2017 database using Microsoft SQL Server 2019 (15.0.2070.41).
UPDATE STATISTICS in SQL Server
Statistics store information about the data distribution. This is based on column value(s) in your tables as well as the total number of rows. For indexes it stores the distribution of key values. SQL Server generates histograms for the first column in each statistics object. This maps the data distribution or cardinality. For multi-column statistics objects, information is also stored on the correlation of the values within the object. This creates a density vector for each column, which stores information about the number of duplicate values for each column.
SQL UPDATE STATISTICS Usage
The query optimizer uses existing statistics in order to generate the most efficient query plan. This determines index usage for query optimization and how to access indexes, how best to join tables, etc…
SQL Server Statistics Creation
Statistics are created a couple of different ways. The first way is when an index is created on a tables column(s). When this happens the Microsoft SQL Server database automatically creates a statistics object. This is for the column(s) that make up the index. If we look at the HumanResources.Department table, we can see there is one statistics object for each index on the table.

Auto Create Statistics
“AUTO CREATE STATISTICS” is another way to automatically create statistics. To check this setting run the following T-SQL.
SELECT name,is_auto_create_stats_on FROM sys.databases WHERE name = 'AdventureWorks2017';

Enable AUTO CREATE STATISTICS by running the following T-SQL. This setting should be enabled by default.
ALTER DATABASE AdventureWorks2017 SET AUTO_CREATE_STATISTICS ON;
When AUTO CREATE STATISTICS is enabled, anytime a column is used as a query predicate the Microsoft SQL Server engine will automatically generate statistics for the referenced column.
If we run the below query, SQL Server should create column statistics for the ModifiedDate column.
SELECT * FROM HumanResources.Department WHERE ModifiedDate > getdate()-3;
Running the query generates a new statistics object. If you right click on the statistics and select Properties, you can see the column is ModifiedDate.

Single and Multi-Column Statistics
It is important to note that SQL Server only creates single column statistics. To generated multi column statistics use the CREATE STATISTICS command.
Here is a simple example of how you can manually create statistics for the JobTitle column in the HumanResources.Department table.
CREATE STATISTICS STATS_Employee_JobTitle on HumanResources.Employee (JobTitle);

To create multi-column statistics, use the syntax below.
CREATE STATISTICS STATS_SalesTerritory_Group_CountryRegionCode on
Sales.SalesTerritory ([Group],CountryRegionCode);

You can also create filtered statistics if your data set has a subset of rows with a unique data distribution. To create filtered statistics, use a WHERE clause as shown below.
CREATE STATISTICS STATS_Address_StateProvinceIDFiltered_City
ON Person.Address ( City ) WHERE StateProvinceID = 79;

Options for manually created SQL Server Statistics
Manually creating statistics offers the option for sampling data. Sampling can have a big impact on the amount of time it takes to generate statistics for very large tables. The obvious drawback of this though is that the less data you sample the less accurate your statistics become. For large tables though, it is sometimes the best option as a full scan would take too much time.
Examples of the T-SQL for both of these options is below as well as the two options available when sampling data. These options include specifying the number of rows or a percentage of the current table size to sample.
CREATE STATISTICS STATS_Employee_JobTitle_fullscan
on HumanResources.Employee (JobTitle) WITH FULLSCAN;
CREATE STATISTICS STATS_Employee_JobTitle_sample10rows
on HumanResources.Employee (JobTitle) WITH SAMPLE 10 ROWS;
CREATE STATISTICS STATS_Employee_JobTitle_sample50pct
on HumanResources.Employee (JobTitle) WITH SAMPLE 50 PERCENT;
SQL Server Statistics Storage
SQL Server stores statistics in binary large objects (BLOBs). Access statistics data using the following system catalog views and DMVs.
- sys.stats
- sys.stats_columns
- sys.dm_db_stats_properties
- sys.dm_db_stats_histogram
A much easier method for view object statistics, however, is to use the DBCC SHOW_STATISTICS command. Below are two examples for looking at statistics created first for an index and second for a column on the Sales.SalesOrderDetail table.
DBCC SHOW_STATISTICS ("Sales.SalesOrderHeader",IX_SalesOrderHeader_CustomerID)
DBCC SHOW_STATISTICS ("Sales.SalesOrderHeader",OrderDate)
Explaining DBCC SHOW_STATISTICS output
Using the output from one of the examples from above we can see that the output is divided into 3 sections. The Statistics Header, Density Vector, and Histogram. Note: For this histogram I have only shown the first 5 buckets to save space. I won’t go into any more detail on each individual section as there is a very good tip which provides this information. How to Interpret DBCC SHOW_STATISTICS in SQL Server Management Studio.

SQL Server Statistics Updates
SQL Server provides two ways to update statistics, manually and automatically.
Manually Update SQL Server Statistics
Manually update statistics using the UPDATE STATISTICS command or the built-in stored procedure sp_updatestats.
The sp_updatestats stored procedures updates statistics for every user defined table in the database. The UPDATE STATISTICS command includes the same sampling options as the CREATE STATISTICS command.
Run the below examples in SQL Server Management Studio:
sp_updatestats
UPDATE STATISTICS HumanResources.Employee STATS_Employee_JobTitle_fullscan;
UPDATE STATISTICS HumanResources.Employee STATS_Employee_JobTitle_sample10rows WITH FULLSCAN;
UPDATE STATISTICS Person.Address;
Note: When statistics are updated, queries in the plan cache referencing these objects will recompile the next time they are called so you do have to be careful not to update statistics too often as the performance impact of frequently recompiling queries can hurt your overall system performance.
Auto Update Statistics
The second way statistics can be updated is automatically when the “AUTO UPDATE STATISTICS” database option is set. If this option is set SQL Server will check how many rows have been modified in the table and if a certain threshold is exceeded it will update the statistics and create a new execution plan for the plan cache. The rules for this threshold can be found in the SQL Server documentation and are as follows:

By default, with this option on, the statistics are updated synchronously. That means that when SQL Server detects out of date statistics it will first update the statistics, generate a new execution plan, and then execute the query. In some cases, you don’t really want your query waiting for the new statistics to be generated as the time it would take for the statistics to be generated and the query to execute with the new plan would take longer than just running it with the old query plan. To avoid this behavior, you can set the “AUTO UPDATE STATISTICS ASYNCHRONOUSLY” database option by running the following command in your database.
ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS_ASYNC ON
With this option set your query will run with the currently cached query plan and the statistics will be updated in the background. Once the statistics update is complete the next query executed will generate and use a new query plan using the updated statistics.
It is also possible to update statistics with SQL Server Maintenance Plans.
Are SQL Server Statistics out of date
Determining whether statistics are out of date is a really difficult question to answer. There are many different factors that can affect table, index or column statistics and cause them to generate poor performing query plans. You could have a table with many inserts, updates or deletes but depending on which values were affected it might not have much if any impact on the statistics.
For example, let’s take a table with an index on an identity column with say 10,000 rows that has statistics generated using a full scan. Now we could delete 1000 rows from the table (every 5th value, 5,10,15,…) You can see how with this example even though we updated 20% of the table based on the values we have removed; the table statistics really wouldn’t be affected. The opposite could also be true. If we have a table with an identity column that has 1,000,000 rows and has had a statistics update done and then we insert 10,000 new rows we have only changed the table by 1%. If queries are only looking for new values (e.g. values not in the histogram) it could have a big impact on query plans and performance.
I guess to answer the question, how to tell if statistics are out of date, the best answer one can give is you need to understand how DML queries access your tables and base your decision on that information. With that said we can use the sys.dm_db_stats_properties DMV view to see when our statistics were last updated and how many modifications have been made since that last update. Below is the TSQL for this query as well as sample output.
SELECT name AS StatsName, last_updated, rows, modification_counter
FROM sys.stats AS stat
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp
ORDER BY modification_counter desc

SQL Server Update Statistics Script
As mentioned earlier in this tip you can use the sp_updatestats stored procedure to update the statistics for all the tables and indexes in your database but this brute force method really isn’t a good use of the resources on your system as there are many tables/indexes that do not have any DML executed against them and do not need their statistics updated. A better option is to use a query similar to the one we identified above when searching for outdated statistics and add some logic to only update the statistics when you deem it’s required. In the example below we add a cursor to find statistics that have either not been updated in a week (either automatically or by this procedure) or have had more than 10% of the table modified.
DECLARE @cmd NVARCHAR(2000)
DECLARE stats_cursor CURSOR FOR
SELECT 'UPDATE STATISTICS ' + OBJECT_SCHEMA_NAME(stat.object_id) + '.' + OBJECT_NAME(stat.object_id) + ' ' + name + ';' AS command
FROM sys.stats AS stat
CROSS APPLY sys.dm_db_stats_properties(stat.object_id, stat.stats_id) AS sp
WHERE last_updated < GetDate()-7 OR modification_counter > rows*.10
OPEN stats_cursor
FETCH NEXT FROM stats_cursor INTO @cmd
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC sp_executesql @cmd
FETCH NEXT FROM stats_cursor INTO @cmd
END
CLOSE stats_cursor
DEALLOCATE stats_cursor
Using either of the above methods, combined with sp_msforeachdb to run these commands on all the databases in your instance,, you can create a SQL Server job that can be used to build yourself a scheduled statistics maintenance task to ensure statistics are always up to date.
Next Steps
- Read more on SQL Server Query Plan Tips
- Read more tips on statistics