Persisting SQL Server Index-Usage Statistics with MERGE

Problem

Index-usage statistics get re-initialized when the SQL Server instance in question gets restarted and, starting with SQL Server 2012, after an index rebuild. This can obscure the true impact of indexes in day-to-day operations and lead to false determinations of used vs. unused indexes. How can one persist this valuable data over a longer period of time and obtain a meaningful repository of information for performance tuning and troubleshooting?

Solution

In this tip I use the MERGE statement to maintain the incremental nature of index-usage statistics between SQL Server instance restarts and index rebuilds. This solution also accounts for newly added/deleted indexes. The end result is a reliable source of data that can serve as a tool for index analysis.

Setup

To demonstrate I use a copy of the AdventureWorks2012 database residing on a SQL Server 2012 instance. Two tables on a separate database, [IndexUsageStatsDB], are used to store the index-usage statistics: [IndexUsageStats] and [IndexUsageStatsSnap]. The [IndexUsageStats] table holds the long-term values persisted between instance restarts and index rebuilds, whereas the [IndexUsageStatsSnap] table records “snapshot” data obtained at the moment of collection. The [IndexUsageStatsSnap] table would need to get re-populated at scheduled intervals (eg. every hour) and is the basis for updating the [IndexUsageStats] table through a MERGE statement.

The two tables are listed below:



———–
–Script 1:
———–

USE [IndexUsageStatsDB];
GO

CREATE TABLE [dbo].[IndexUsageStats]
(
[IndexUsageStatsID] INTEGER IDENTITY(1,1) NOT NULL,
[dbid] SNALLINT NOT NULL,
[table_name] SYSNAME NOT NULL,
[index_id] INTEGER NULL,
[index_name] NVARCHAR(128) NOT NULL,
[user_scans] BIGINT NOT NULL,
[user_seeks] BIGINT NOT NULL,
[user_lookups] BIGINT NOT NULL,
[user_updates] BIGINT NOT NULL,
[last_user_scan] DATETIME NULL,
[last_user_seek] DATETIME NULL,
[last_user_lookup] DATETIME NULL,
[last_user_update] DATETIME NULL,
Constraint [PK_IndexUsageStats] PRIMARY KEY ([IndexUsageStatsID]) ON [PRIMARY]
)
ON [PRIMARY]
GO

CREATE TABLE [dbo].[IndexUsageStatsSnap]
(
[IndexUsageStatsSnapID] INTEGER IDENTITY(1,1) NOT NULL,
[dbid] SNALLINT NOT NULL,
[table_name] SYSNAME NOT NULL,
[index_id] INTEGER NULL,
[index_name] NVARCHAR(128) NOT NULL,
[user_scans] BIGINT NOT NULL,
[user_seeks] BIGINT NOT NULL,
[user_lookups] BIGINT NOT NULL,
[user_updates] BIGINT NOT NULL,
[last_user_scan] DATETIME NULL,
[last_user_seek] DATETIME NULL,
[last_user_lookup] DATETIME NULL,
[last_user_update] DATETIME NULL,
Constraint [PK_IndexUsageStatsSnap] PRIMARY KEY ([IndexUsageStatsSnapID]) ON [PRIMARY]
)
ON [PRIMARY]
GO

The following statement is used to re-populate the table [IndexUsageStatsSnap] at scheduled intervals:



———–
–Script 2:
———–

USE [AdventureWorks2012];
GO

TRUNCATE TABLE [IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap];

INSERT INTO
[IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap]
(
[dbid]
,[table_name]
,[index_id]
,[index_name]
,[user_scans]
,[user_seeks]
,[user_lookups]
,[user_updates]
,[last_user_scan]
,[last_user_seek]
,[last_user_lookup]
,[last_user_update]
)
SELECT
DB_ID() AS [dbid]
,OBJECT_NAME(i.[object_id], DB_ID()) AS [table_name]
,i.index_id
,i.[name]
,COALESCE(ddius.user_scans, 0)
,COALESCE(ddius.user_seeks, 0)
,COALESCE(ddius.user_lookups, 0)
,COALESCE(ddius.user_updates, 0)
,ddius.last_user_scan
,ddius.last_user_seek
,ddius.last_user_lookup
,ddius.last_user_update
FROM
sys.indexes i
LEFT OUTER JOIN
sys.dm_db_index_usage_stats ddius
ON
ddius.index_id = i.index_id
AND ddius.[object_id] = i.[object_id]
WHERE
(ddius.database_id = DB_ID() OR ddius.database_id IS NULL) — this DB only
AND OBJECTPROPERTY(i.[object_id], ‘IsUserTable’) = 1
AND i.index_id > 0 — filter out heaps
ORDER BY
OBJECT_NAME(i.[object_id], DB_ID())
,i.index_id;
GO

The MERGE statement

As mentioned earlier, a MERGE statement is used to update the [IndexUsageStats] table on the basis of data from table [IndexUsageStatsSnap]. In the first part of this section I describe the logic (pseudo code) behind the MERGE statement. In the second part I show the actual T-SQL code.

Logic and Pseudo code

The terms used in the pseudo code below are defined as follows:

  • Counter1: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStats] table.
  • Counter2: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStatsSnap] table.
  • LastOccurence1: last time Counter1 changed in [IndexUsageStats] table (eg. last_user_seek value).
  • LastOccurence2: last time Counter2 changed in [IndexUsageStatsSnap] table (eg. last_user_seek value).

Here is the pseudo code and associated logic:


———–
–Script 3:
———–
If Index In Both [IndexUsageStats] And [IndexUsageStatsSnap] Tables
Then
If LastOccurence2 More Recent Than LastOccurence1
And
(Counter1 < Counter2 OR LastOccurence1 IS NULL)
Then Set Counter1 Equal To Counter2
And Set LastOccurence1 Equal To LastOccurence2
(Logic: there was likely no instance restart/index rebuild since
the last time the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated,
as evidenced by the higher value of Counter2.)

If LastOccurence2 More Recent Than LastOccurence1
And
Counter1 >= Counter2
ThenIncrement Counter1 By Counter2
And Set LastOccurence1 Equal To LastOccurence2
(Logic: an instance restart/index rebuild since the last time
the [IndexUsageStats and [IndexUsageStatsSnap] tables were updated must have
caused the database engine to re-initialize the index-usage statistics,
as evidenced by the lower value of Counter2.)

If ( (LastOccurence2 Equal To LastOccurence1) Or (LastOccurence2 IS NULL) )
Then Do Nothing
(Logic: index in question has not been used since the last time
the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated.
No change to the Counter1 and LastOccurence1 values is required.)

Else
If Index Only In [IndexUsageStatsSnap] Table
Then Insert Record Into [IndexUsageStats] Table
(Logic: newly created index; add to [IndexUsageStats] table.)

If Index Only In [IndexUsageStats] Table
Then Delete Record From [IndexUsageStats] Table
(Logic: newly deleted index; remove from [IndexUsageStats] table.)


T-SQL Code

Here is the T-SQL code of the MERGE statement:



———–
–Script 4:
———–

USE [IndexUsageStatsDB];

MERGE [dbo].[IndexUsageStats] AS [is]
USING [dbo].[IndexUsageStatsSnap] AS [iss]
ON
[is].[dbid] = [iss].[dbid]
AND [is].[table_name] = [iss].[table_name]
AND [is].[index_id] = [iss].[index_id]
AND [is].[index_name] = [iss].[index_name]
WHEN MATCHED
THEN UPDATE SET
[is].[user_scans] =
CASE WHEN (([iss].[last_user_scan] > [is].[last_user_scan] OR [is].[last_user_scan] IS NULL) AND [iss].[user_scans] > [is].[user_scans]) THEN [iss].[user_scans]
WHEN ([iss].[last_user_scan] > [is].[last_user_scan] AND [iss].[user_scans] <= [is].[user_scans]) THEN [is].[user_scans] + [iss].[user_scans] ELSE [is].[user_scans] -- DO NOTHING END , [is].[user_seeks] = CASE WHEN (([iss].[last_user_seek] > [is].[last_user_seek] OR [is].[last_user_seek] IS NULL) AND [iss].[user_seeks] > [is].[user_seeks]) THEN [iss].[user_seeks]
WHEN ([iss].[last_user_seek] > [is].[last_user_seek] AND [iss].[user_seeks] <= [is].[user_seeks]) THEN [is].[user_seeks] + [iss].[user_seeks] ELSE [is].[user_seeks] -- DO NOTHING END , [is].[user_lookups] = CASE WHEN (([iss].[last_user_lookup] > [is].[last_user_lookup] OR [is].[last_user_lookup] IS NULL) AND [iss].[user_lookups] > [is].[user_lookups]) THEN [iss].[user_lookups]
WHEN ([iss].[last_user_lookup] > [is].[last_user_lookup] AND [iss].[user_lookups] <= [is].[user_lookups]) THEN [is].[user_lookups] + [iss].[user_lookups] ELSE [is].[user_lookups] -- DO NOTHING END , [is].[user_updates] = CASE WHEN (([iss].[last_user_update] > [is].[last_user_update] OR [is].[last_user_update] IS NULL) AND [iss].[user_updates] > [is].[user_updates]) THEN [iss].[user_updates]
WHEN ([iss].[last_user_update] > [is].[last_user_update] AND [iss].[user_updates] <= [is].[user_updates]) THEN [is].[user_updates] + [iss].[user_updates] ELSE [is].[user_updates] -- DO NOTHING END , [is].[last_user_scan] = COALESCE([iss].[last_user_scan], [is].[last_user_scan]) , [is].[last_user_seek] = COALESCE([iss].[last_user_seek], [is].[last_user_seek]) , [is].[last_user_lookup] = COALESCE([iss].[last_user_lookup], [is].[last_user_lookup]) , [is].[last_user_update] = COALESCE([iss].[last_user_update], [is].[last_user_update]) WHEN NOT MATCHED BY TARGET THEN INSERT ( [dbid] , [table_name] , [index_id] , [index_name] , [user_scans] , [user_seeks] , [user_lookups] , [user_updates] , [last_user_scan] , [last_user_seek] , [last_user_lookup] , [last_user_update] ) VALUES ( [iss].[dbid] ,[iss].[table_name] ,[iss].[index_id] ,[iss].[index_name] ,[iss].[user_scans] ,[iss].[user_seeks] ,[iss].[user_lookups] ,[iss].[user_updates] ,[iss].[last_user_scan] ,[iss].[last_user_seek] ,[iss].[last_user_lookup] ,[iss].[last_user_update] ) WHEN NOT MATCHED BY SOURCE THEN DELETE; GO

Procedure

Here are the two steps needed to maintain the data in table [IndexUsageStats]

  1. Repopulate table [IndexUsageStatsSnap] to capture a snapshot of the statistics at that moment in time – script 2.
  2. Execute the MERGE statement to update the [IndexUsageStats] table with the latest data from [IndexUsageStatsSnap] – script 4.

Results

To demonstrate how this works I will run queries against the [AdventureWorks2012].[Sales].[SalesOrderDetail] table. I first restart my SQL Server instance, to reset all index usage statistics to 0, and then apply the following query to perform 10 seeks on index IX_SalesOrderDetail_ProductID and 10 lookups on index PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID:



———–
–Script 5:
———–

SELECT
[SalesOrderID]
,[SalesOrderDetailID]
,[CarrierTrackingNumber]
,[OrderQty]
,[UnitPrice]
,[LineTotal]
FROM
[AdventureWorks2012].[Sales].[SalesOrderDetail]
WHERE
[ProductID] = 897;
GO 10

Next, I populate table IndexUsageStatsSnap (script 2) and execute the MERGE statement (script 4) to update table IndexUsageStats

index_name

user_seeks

user_lookups

last_user_seek

last_user_lookup

PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID010NULL2015-04-29 06:54:51.740
AK_SalesOrderDetail_rowguid00NULLNULL
IX_SalesOrderDetail_ProductID1002015-04-29 06:54:51.7400

If I now rerun script 5 and repopulate table IndexUsageStatsSnap (script 2) the data in table IndexUsageStatsSnap will look like this:

index_name

user_seeks

user_lookups

last_user_seek

last_user_lookup

PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID020NULL2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid00NULLNULL
IX_SalesOrderDetail_ProductID2002015-04-29 07:21:11.1470

According to the rules outlined in the above pseudo code (script 3) the data in table [IndexUsageStats] would need to be updated to match the values in [IndexUsageStatsSnap]. So both tables should be made to show 20 lookups for index [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] and 20 seeks for index [IX_SalesOrderDetail_ProductID]. True enough, after running MERGE (script 4) both tables now hold the following values:

index_name

user_seeks

user_lookups

last_user_seek

last_user_lookup

PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID020NULL2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid00NULLNULL
IX_SalesOrderDetail_ProductID2002015-04-29 07:21:11.1470

Let me now rebuild index [IX_SalesOrderDetail_ProductID]:



———–
–Script 6:
———–
USE AdventureWorks2012;
GO
ALTER INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] REBUILD;
GO

After refreshing table [IndexUsageStatsSnap] (script 2) the data in that table will show zeroed values for index [IX_SalesOrderDetail_ProductID] (highlighted in yellow):

index_name

user_seeks

user_lookups

last_user_seek

last_user_lookup

PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID020NULL2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid00NULLNULL
IX_SalesOrderDetail_ProductID00NULL0

This is expected behavior with index rebuilds in SQL Server 2012. Issuing the MERGE statement, I end up with the following data in table [IndexUsageStats]:

index_name

user_seeks

user_lookups

last_user_seek

last_user_lookup

PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID020NULL2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid00NULLNULL
IX_SalesOrderDetail_ProductID2002015-04-29 07:21:11.1470

The data is unchanged, as expected. The data in table [IndexUsageStats] has been shielded from the earlier index rebuild.

Rerunning the query against the [SalesOrderDetail] table (script 5), then refreshing table [IndexUsageStatsSnap] (script 2) and lastly running the MERGE statement (script 4), I end up with the following values in table [IndexUsageStats]:

index_name

user_seeks

user_lookups

last_user_seek

last_user_lookup

PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID030NULL2015-04-30 14:52:20.713
AK_SalesOrderDetail_rowguid00NULLNULL
IX_SalesOrderDetail_ProductID3002015-04-30 14:52:20.7130

The counters for seeks and lookups have been incremented correctly and the times of the last seek and lookup have also been updated with the latest values.

Conclusion

In this tip I have shown how to use the MERGE statement to maintain index usage statistics information for a SQL Server instance even after the instance restarts and index rebuilds. The end result is a robust set of data that can be relied on to draw conclusions on used vs. unused indexes over long time spans. Using this methodology one can be certain that low/zero usage values for an index are truly an indication that this index is not needed and should probably be dropped, and not an artifact of frequent rebuilds of that index. New and dropped indexes are also dealt with in the same MERGE statement.

An implicit assumption (and limitation) of this approach is that higher values in the [IndexUsageStatsSnap] table, relative to the [IndexUsageStats] table, necessarily mean that no re-initialization of the index-usage statistics has taken place in the database engine since the last time the two tables were refreshed (see first If statement in pseudo code – script 3). This may lead to inaccuracies early on, when values in the 2 tables are of comparable magnitude. However, as the data in the [IndexUsageStats] table gradually spans a longer time range and individual data points increase in magnitude, this assumption will become increasingly more accurate.

Next Steps

Leave a Reply

Your email address will not be published. Required fields are marked *