By: Marios Philippopoulos | Comments (2) | Related: > Indexing
Problem
Index-usage statistics get re-initialized when the SQL Server instance in question gets restarted and, starting with SQL Server 2012, after an index rebuild. This can obscure the true impact of indexes in day-to-day operations and lead to false determinations of used vs. unused indexes. How can one persist this valuable data over a longer period of time and obtain a meaningful repository of information for performance tuning and troubleshooting?
Solution
In this tip I use the MERGE statement to maintain the incremental nature of index-usage statistics between SQL Server instance restarts and index rebuilds. This solution also accounts for newly added/deleted indexes. The end result is a reliable source of data that can serve as a tool for index analysis.
Setup
To demonstrate I use a copy of the AdventureWorks2012 database residing on a SQL Server 2012 instance. Two tables on a separate database, [IndexUsageStatsDB], are used to store the index-usage statistics: [IndexUsageStats] and [IndexUsageStatsSnap]. The [IndexUsageStats] table holds the long-term values persisted between instance restarts and index rebuilds, whereas the [IndexUsageStatsSnap] table records "snapshot" data obtained at the moment of collection. The [IndexUsageStatsSnap] table would need to get re-populated at scheduled intervals (eg. every hour) and is the basis for updating the [IndexUsageStats] table through a MERGE statement.
The two tables are listed below:
----------- --Script 1: ----------- USE [IndexUsageStatsDB]; GO CREATE TABLE [dbo].[IndexUsageStats] ( [IndexUsageStatsID] INTEGER IDENTITY(1,1) NOT NULL, [dbid] SNALLINT NOT NULL, [table_name] SYSNAME NOT NULL, [index_id] INTEGER NULL, [index_name] NVARCHAR(128) NOT NULL, [user_scans] BIGINT NOT NULL, [user_seeks] BIGINT NOT NULL, [user_lookups] BIGINT NOT NULL, [user_updates] BIGINT NOT NULL, [last_user_scan] DATETIME NULL, [last_user_seek] DATETIME NULL, [last_user_lookup] DATETIME NULL, [last_user_update] DATETIME NULL, Constraint [PK_IndexUsageStats] PRIMARY KEY ([IndexUsageStatsID]) ON [PRIMARY] ) ON [PRIMARY] GO CREATE TABLE [dbo].[IndexUsageStatsSnap] ( [IndexUsageStatsSnapID] INTEGER IDENTITY(1,1) NOT NULL, [dbid] SNALLINT NOT NULL, [table_name] SYSNAME NOT NULL, [index_id] INTEGER NULL, [index_name] NVARCHAR(128) NOT NULL, [user_scans] BIGINT NOT NULL, [user_seeks] BIGINT NOT NULL, [user_lookups] BIGINT NOT NULL, [user_updates] BIGINT NOT NULL, [last_user_scan] DATETIME NULL, [last_user_seek] DATETIME NULL, [last_user_lookup] DATETIME NULL, [last_user_update] DATETIME NULL, Constraint [PK_IndexUsageStatsSnap] PRIMARY KEY ([IndexUsageStatsSnapID]) ON [PRIMARY] ) ON [PRIMARY] GO
The following statement is used to re-populate the table [IndexUsageStatsSnap] at scheduled intervals:
----------- --Script 2: ----------- USE [AdventureWorks2012]; GO TRUNCATE TABLE [IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap]; INSERT INTO [IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap] ( [dbid] , [table_name] , [index_id] , [index_name] , [user_scans] , [user_seeks] , [user_lookups] , [user_updates] , [last_user_scan] , [last_user_seek] , [last_user_lookup] , [last_user_update] ) SELECT DB_ID() AS [dbid] , OBJECT_NAME(i.[object_id], DB_ID()) AS [table_name] , i.index_id , i.[name] , COALESCE(ddius.user_scans, 0) , COALESCE(ddius.user_seeks, 0) , COALESCE(ddius.user_lookups, 0) , COALESCE(ddius.user_updates, 0) , ddius.last_user_scan , ddius.last_user_seek , ddius.last_user_lookup , ddius.last_user_update FROM sys.indexes i LEFT OUTER JOIN sys.dm_db_index_usage_stats ddius ON ddius.index_id = i.index_id AND ddius.[object_id] = i.[object_id] WHERE (ddius.database_id = DB_ID() OR ddius.database_id IS NULL) -- this DB only AND OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1 AND i.index_id > 0 -- filter out heaps ORDER BY OBJECT_NAME(i.[object_id], DB_ID()) , i.index_id; GO
The MERGE statement
As mentioned earlier, a MERGE statement is used to update the [IndexUsageStats] table on the basis of data from table [IndexUsageStatsSnap]. In the first part of this section I describe the logic (pseudo code) behind the MERGE statement. In the second part I show the actual T-SQL code.
Logic and Pseudo code
The terms used in the pseudo code below are defined as follows:
- Counter1: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStats] table.
- Counter2: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStatsSnap] table.
- LastOccurence1: last time Counter1 changed in [IndexUsageStats] table (eg. last_user_seek value).
- LastOccurence2: last time Counter2 changed in [IndexUsageStatsSnap] table (eg. last_user_seek value).
Here is the pseudo code and associated logic:
----------- --Script 3: ----------- If Index In Both [IndexUsageStats] And [IndexUsageStatsSnap] Tables Then If LastOccurence2 More Recent Than LastOccurence1 And (Counter1 < Counter2 OR LastOccurence1 IS NULL) Then Set Counter1 Equal To Counter2 And Set LastOccurence1 Equal To LastOccurence2 (Logic: there was likely no instance restart/index rebuild since the last time the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated, as evidenced by the higher value of Counter2.) If LastOccurence2 More Recent Than LastOccurence1 And Counter1 >= Counter2 Then Increment Counter1 By Counter2 And Set LastOccurence1 Equal To LastOccurence2 (Logic: an instance restart/index rebuild since the last time the [IndexUsageStats and [IndexUsageStatsSnap] tables were updated must have caused the database engine to re-initialize the index-usage statistics, as evidenced by the lower value of Counter2.) If ( (LastOccurence2 Equal To LastOccurence1) Or (LastOccurence2 IS NULL) ) Then Do Nothing (Logic: index in question has not been used since the last time the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated. No change to the Counter1 and LastOccurence1 values is required.) Else If Index Only In [IndexUsageStatsSnap] Table Then Insert Record Into [IndexUsageStats] Table (Logic: newly created index; add to [IndexUsageStats] table.) If Index Only In [IndexUsageStats] Table Then Delete Record From [IndexUsageStats] Table (Logic: newly deleted index; remove from [IndexUsageStats] table.)
T-SQL Code
Here is the T-SQL code of the MERGE statement:
----------- --Script 4: ----------- USE [IndexUsageStatsDB]; MERGE [dbo].[IndexUsageStats] AS [is] USING [dbo].[IndexUsageStatsSnap] AS [iss] ON [is].[dbid] = [iss].[dbid] AND [is].[table_name] = [iss].[table_name] AND [is].[index_id] = [iss].[index_id] AND [is].[index_name] = [iss].[index_name] WHEN MATCHED THEN UPDATE SET [is].[user_scans] = CASE WHEN (([iss].[last_user_scan] > [is].[last_user_scan] OR [is].[last_user_scan] IS NULL) AND [iss].[user_scans] > [is].[user_scans]) THEN [iss].[user_scans] WHEN ([iss].[last_user_scan] > [is].[last_user_scan] AND [iss].[user_scans] <= [is].[user_scans]) THEN [is].[user_scans] + [iss].[user_scans] ELSE [is].[user_scans] -- DO NOTHING END , [is].[user_seeks] = CASE WHEN (([iss].[last_user_seek] > [is].[last_user_seek] OR [is].[last_user_seek] IS NULL) AND [iss].[user_seeks] > [is].[user_seeks]) THEN [iss].[user_seeks] WHEN ([iss].[last_user_seek] > [is].[last_user_seek] AND [iss].[user_seeks] <= [is].[user_seeks]) THEN [is].[user_seeks] + [iss].[user_seeks] ELSE [is].[user_seeks] -- DO NOTHING END , [is].[user_lookups] = CASE WHEN (([iss].[last_user_lookup] > [is].[last_user_lookup] OR [is].[last_user_lookup] IS NULL) AND [iss].[user_lookups] > [is].[user_lookups]) THEN [iss].[user_lookups] WHEN ([iss].[last_user_lookup] > [is].[last_user_lookup] AND [iss].[user_lookups] <= [is].[user_lookups]) THEN [is].[user_lookups] + [iss].[user_lookups] ELSE [is].[user_lookups] -- DO NOTHING END , [is].[user_updates] = CASE WHEN (([iss].[last_user_update] > [is].[last_user_update] OR [is].[last_user_update] IS NULL) AND [iss].[user_updates] > [is].[user_updates]) THEN [iss].[user_updates] WHEN ([iss].[last_user_update] > [is].[last_user_update] AND [iss].[user_updates] <= [is].[user_updates]) THEN [is].[user_updates] + [iss].[user_updates] ELSE [is].[user_updates] -- DO NOTHING END , [is].[last_user_scan] = COALESCE([iss].[last_user_scan], [is].[last_user_scan]) , [is].[last_user_seek] = COALESCE([iss].[last_user_seek], [is].[last_user_seek]) , [is].[last_user_lookup] = COALESCE([iss].[last_user_lookup], [is].[last_user_lookup]) , [is].[last_user_update] = COALESCE([iss].[last_user_update], [is].[last_user_update]) WHEN NOT MATCHED BY TARGET THEN INSERT ( [dbid] , [table_name] , [index_id] , [index_name] , [user_scans] , [user_seeks] , [user_lookups] , [user_updates] , [last_user_scan] , [last_user_seek] , [last_user_lookup] , [last_user_update] ) VALUES ( [iss].[dbid] ,[iss].[table_name] ,[iss].[index_id] ,[iss].[index_name] ,[iss].[user_scans] ,[iss].[user_seeks] ,[iss].[user_lookups] ,[iss].[user_updates] ,[iss].[last_user_scan] ,[iss].[last_user_seek] ,[iss].[last_user_lookup] ,[iss].[last_user_update] ) WHEN NOT MATCHED BY SOURCE THEN DELETE; GO
Procedure
Here are the two steps needed to maintain the data in table [IndexUsageStats]; this procedure would need to be repeated at regular intervals during the day, especially prior to scheduled maintenance work that includes SQL Server instance restarts or index rebuilds:
- Repopulate table [IndexUsageStatsSnap] to capture a snapshot of the statistics at that moment in time - script 2.
- Execute the MERGE statement to update the [IndexUsageStats] table with the latest data from [IndexUsageStatsSnap] - script 4.
Results
To demonstrate how this works I will run queries against the [AdventureWorks2012].[Sales].[SalesOrderDetail] table. I first restart my SQL Server instance, to reset all index usage statistics to 0, and then apply the following query to perform 10 seeks on index IX_SalesOrderDetail_ProductID and 10 lookups on index PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID:
----------- --Script 5: ----------- SELECT [SalesOrderID] , [SalesOrderDetailID] , [CarrierTrackingNumber] , [OrderQty] , [UnitPrice] , [LineTotal] FROM [AdventureWorks2012].[Sales].[SalesOrderDetail] WHERE [ProductID] = 897; GO 10
Next, I populate table IndexUsageStatsSnap (script 2) and execute the MERGE statement (script 4) to update table IndexUsageStats; the result, for both tables, is this:
index_name |
user_seeks |
user_lookups |
last_user_seek |
last_user_lookup |
---|---|---|---|---|
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID | 0 | 10 | NULL | 2015-04-29 06:54:51.740 |
AK_SalesOrderDetail_rowguid | 0 | 0 | NULL | NULL |
IX_SalesOrderDetail_ProductID | 10 | 0 | 2015-04-29 06:54:51.740 | 0 |
If I now rerun script 5 and repopulate table IndexUsageStatsSnap (script 2) the data in table IndexUsageStatsSnap will look like this:
index_name |
user_seeks |
user_lookups |
last_user_seek |
last_user_lookup |
---|---|---|---|---|
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID | 0 | 20 | NULL | 2015-04-29 07:21:11.147 |
AK_SalesOrderDetail_rowguid | 0 | 0 | NULL | NULL |
IX_SalesOrderDetail_ProductID | 20 | 0 | 2015-04-29 07:21:11.147 | 0 |
According to the rules outlined in the above pseudo code (script 3) the data in table [IndexUsageStats] would need to be updated to match the values in [IndexUsageStatsSnap]. So both tables should be made to show 20 lookups for index [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] and 20 seeks for index [IX_SalesOrderDetail_ProductID]. True enough, after running MERGE (script 4) both tables now hold the following values:
index_name |
user_seeks |
user_lookups |
last_user_seek |
last_user_lookup |
---|---|---|---|---|
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID | 0 | 20 | NULL | 2015-04-29 07:21:11.147 |
AK_SalesOrderDetail_rowguid | 0 | 0 | NULL | NULL |
IX_SalesOrderDetail_ProductID | 20 | 0 | 2015-04-29 07:21:11.147 | 0 |
Let me now rebuild index [IX_SalesOrderDetail_ProductID]:
----------- --Script 6: ----------- USE AdventureWorks2012; GO ALTER INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] REBUILD; GO
After refreshing table [IndexUsageStatsSnap] (script 2) the data in that table will show zeroed values for index [IX_SalesOrderDetail_ProductID] (highlighted in yellow):
index_name |
user_seeks |
user_lookups |
last_user_seek |
last_user_lookup |
---|---|---|---|---|
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID | 0 | 20 | NULL | 2015-04-29 07:21:11.147 |
AK_SalesOrderDetail_rowguid | 0 | 0 | NULL | NULL |
IX_SalesOrderDetail_ProductID | 0 | 0 | NULL | 0 |
This is expected behavior with index rebuilds in SQL Server 2012. Issuing the MERGE statement, I end up with the following data in table [IndexUsageStats]:
index_name |
user_seeks |
user_lookups |
last_user_seek |
last_user_lookup |
---|---|---|---|---|
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID | 0 | 20 | NULL | 2015-04-29 07:21:11.147 |
AK_SalesOrderDetail_rowguid | 0 | 0 | NULL | NULL |
IX_SalesOrderDetail_ProductID | 20 | 0 | 2015-04-29 07:21:11.147 | 0 |
The data is unchanged, as expected. The data in table [IndexUsageStats] has been shielded from the earlier index rebuild.
Rerunning the query against the [SalesOrderDetail] table (script 5), then refreshing table [IndexUsageStatsSnap] (script 2) and lastly running the MERGE statement (script 4), I end up with the following values in table [IndexUsageStats]:
index_name |
user_seeks |
user_lookups |
last_user_seek |
last_user_lookup |
---|---|---|---|---|
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID | 0 | 30 | NULL | 2015-04-30 14:52:20.713 |
AK_SalesOrderDetail_rowguid | 0 | 0 | NULL | NULL |
IX_SalesOrderDetail_ProductID | 30 | 0 | 2015-04-30 14:52:20.713 | 0 |
The counters for seeks and lookups have been incremented correctly and the times of the last seek and lookup have also been updated with the latest values.
Conclusion
In this tip I have shown how to use the MERGE statement to maintain index usage statistics information for a SQL Server instance even after the instance restarts and index rebuilds. The end result is a robust set of data that can be relied on to draw conclusions on used vs. unused indexes over long time spans. Using this methodology one can be certain that low/zero usage values for an index are truly an indication that this index is not needed and should probably be dropped, and not an artifact of frequent rebuilds of that index. New and dropped indexes are also dealt with in the same MERGE statement.
An implicit assumption (and limitation) of this approach is that higher values in the [IndexUsageStatsSnap] table, relative to the [IndexUsageStats] table, necessarily mean that no re-initialization of the index-usage statistics has taken place in the database engine since the last time the two tables were refreshed (see first If statement in pseudo code - script 3). This may lead to inaccuracies early on, when values in the 2 tables are of comparable magnitude. However, as the data in the [IndexUsageStats] table gradually spans a longer time range and individual data points increase in magnitude, this assumption will become increasingly more accurate.
Next Steps
- Apply the solution outlined in this tip to your environment to obtain a reliable repository of index usage statistical information.
- Review these related links:
- Using MERGE in SQL Server to insert, update and delete at the same time
- Comparing performance for the MERGE statement to SELECT, INSERT, UPDATE or DELETE
- SQL Server 2008 MERGE More than UPSERT
- Use Caution with SQL Server's MERGE Statement
- T-SQL Programming Part 9 - Using the MERGE Statement to Perform an UPSERT
- Collecting Index Usage Statistics
- Rebuilding Indexes Will Reset Index Usage Statistics in SQL Server 2012
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips