Correct SQL Server TempDB Spills in Query Plans Caused by Outdated Statistics
Statistics are an integral part of SQL Server and query performance. In short, the query optimizer uses statistics to create query plans that will improve the overall performance of the queries ran. Each statistic object is created on a list of one or more table columns and includes a histogram displaying the distribution of values in the first column. The histogram can have up to 200 steps, but no more regardless of the number of rows in the column or index. Unfortunately, I have seen a warning in my query plan that reads: "Operator used tempdb to spill data during execution with spill level 1". How do I address this warning?
In this tip we’ll take a look at one specific performance issue that you might find in an execution plan of a query which is "Operator used tempdb to spill data during execution with spill level 1". If you’ve ever noticed the following warning, then this tip is for you:
Within the AdventureWorks2014 DB, I’ll use the following query for my example:
SELECT BusinessEntityID, FirstName, LastName, EmailPromotion FROM [AdventureWorks2014].[Person].[Person] WHERE EmailPromotion > 0 ORDER BY LastName
Looking at this query I can already tell contention may be present so I’ll go ahead and add a covering index:
CREATE NONCLUSTERED INDEX [IX_Person_EmailPromotion_INCLUDES] ON [Person].[Person] ([EmailPromotion]) INCLUDE ([BusinessEntityID],[FirstName],[LastName])
When adding the index above, statistics were automatically created and updated. Since the addition of this index I’ve added a few thousand rows to the Person table.
Let’s run the query and make sure the “Include Actual Execution Plan” button is selected.
After the query executes let’s take a look at the execution plan by clicking on the tab in the Results pane:
One of the first things you’ll probably notice is the warning sign on the Sort operation. If we hover our mouse over this operator you’ll see more information including the “Warnings: Operator used tempdb to spill data during the execution with spill level 1”
These warnings were added to SQL Server Management Studio 2012, so if you’re using an older version you may not see this. The spill data to TempDB warning means that the query was not granted enough memory to finish the operation and spilled over into the TempDB to complete the operation. We all know reading from memory is much faster than reading from disk and this is exactly what is happening here. The query read as much as it could from memory before moving over to the TempDB disk.
Let’s look into this further. Hover over the arrow leading to the Sort operation to view the Actual Number of Rows and Estimated Number of Rows:
Aha! The SQL Server database engine only estimated this query would return 568,136 rows, but we actually returned 1,128,192 rows. Seeing this type of discrepancy usually means that statistics are out of date.
Let’s run DBCC SHOW_STATISTICS against our new index to view the histogram:
DBCC SHOW_STATISTICS ('Person.Person', IX_Person_EmailPromotion_INCLUDES)
As you can see from the previous screenshot, DBCC SHOW_STATISTICS returns 3 different result sets.
- General information about the statistics object
- Density vector
We’re going to concentrate on the general information and histogram result sets for this tip. As you can see from the general information set, rows equal 319,552 and rows sampled equals 68,991
When we run the following query you’ll notice that there are a lot more records than the general information results are displaying:
SELECT COUNT(EmailPromotion) FROM [AdventureWorks2014].[Person].[Person]
Also, you can see from the column RANGE_HI_KEY, there are 177,518 records where the EmailPromotion column equals 0.
If you run the following query, however, you’ll see that 1,428,224 records exist where EmailPromotion is equal to 0:
SELECT COUNT(*) FROM [AdventureWorks2014].[Person].[Person] WHERE EmailPromotion = 0
Let’s update this statistic with FULLSCAN by running the following statement:
UPDATE STATISTICS Person.Person IX_Person_EmailPromotion_INCLUDES WITH FULLSCAN
Now when we run DBCC SHOW_STATISTICS we’ll notice that rows and rows sampled in the general information results are correct. We’ll also see that 1,428,224 records exist for the RANGE_HI_KEY 0. These are the results we were expecting to get.
Let’s run our query again and look at the execution plan:
Yes! The warning is gone.
- When a query is compiled SQL Server does not look at the data in the table or index. It uses statistics to estimate the number of rows that will be returned and that will determine the execution plan. This is why it’s important to keep statistics updated on a regular basis and included as part of your regular maintenance schedule.
- View more tips on SQL Server statistics here
About the author
View all my tips