Persisting SQL Server Index-Usage Statistics with MERGE

Problem

Index-usage statistics get re-initialized when the SQL Server instance in question gets restarted and, starting with SQL Server 2012, after an index rebuild. This can obscure the true impact of indexes in day-to-day operations and lead to false determinations of used vs. unused indexes. How can one persist this valuable data over a longer period of time and obtain a meaningful repository of information for performance tuning and troubleshooting?
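
As a quick illustration of the problem, the query below is a minimal sketch that shows how much usage history the DMV can currently hold at most. It assumes SQL Server 2008 or later, where sys.dm_os_sys_info exposes sqlserver_start_time, and a login with VIEW SERVER STATE permission. The column alias is made up for this example.

```sql
-- Usage counters in sys.dm_db_index_usage_stats cannot be older than the
-- last instance start, so this gives an upper bound on the available history.
SELECT
    sqlserver_start_time,
    DATEDIFF(HOUR, sqlserver_start_time, GETDATE()) AS hours_of_usage_history
FROM sys.dm_os_sys_info;
```

An index rebuilt after that start time may hold even less history, which this query cannot show.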

Solution

In this tip I use the MERGE statement to maintain the incremental nature of index-usage statistics between SQL Server instance restarts and index rebuilds. This solution also accounts for newly added/deleted indexes. The end result is a reliable source of data that can serve as a tool for index analysis.

Setup

To demonstrate I use a copy of the AdventureWorks2012 database residing on a SQL Server 2012 instance. Two tables in a separate database, [IndexUsageStatsDB], store the index-usage statistics: [IndexUsageStats] and [IndexUsageStatsSnap]. The [IndexUsageStats] table holds the long-term values persisted between instance restarts and index rebuilds, whereas the [IndexUsageStatsSnap] table records "snapshot" data obtained at the moment of collection. The [IndexUsageStatsSnap] table needs to be repopulated at scheduled intervals (e.g. every hour) and is the basis for updating the [IndexUsageStats] table through a MERGE statement.

The two tables are listed below:



-----------
--Script 1:
-----------
USE [IndexUsageStatsDB];
GO

-- Long-term values, persisted across instance restarts and index rebuilds
CREATE TABLE [dbo].[IndexUsageStats]
(
    [IndexUsageStatsID] INTEGER IDENTITY(1,1) NOT NULL,
    [dbid] SMALLINT NOT NULL,
    [table_name] SYSNAME NOT NULL,
    [index_id] INTEGER NULL,
    [index_name] NVARCHAR(128) NOT NULL,
    [user_scans] BIGINT NOT NULL,
    [user_seeks] BIGINT NOT NULL,
    [user_lookups] BIGINT NOT NULL,
    [user_updates] BIGINT NOT NULL,
    [last_user_scan] DATETIME NULL,
    [last_user_seek] DATETIME NULL,
    [last_user_lookup] DATETIME NULL,
    [last_user_update] DATETIME NULL,
    CONSTRAINT [PK_IndexUsageStats] PRIMARY KEY ([IndexUsageStatsID]) ON [PRIMARY]
)
ON [PRIMARY];
GO

-- Snapshot of the DMV values at the moment of collection
CREATE TABLE [dbo].[IndexUsageStatsSnap]
(
    [IndexUsageStatsSnapID] INTEGER IDENTITY(1,1) NOT NULL,
    [dbid] SMALLINT NOT NULL,
    [table_name] SYSNAME NOT NULL,
    [index_id] INTEGER NULL,
    [index_name] NVARCHAR(128) NOT NULL,
    [user_scans] BIGINT NOT NULL,
    [user_seeks] BIGINT NOT NULL,
    [user_lookups] BIGINT NOT NULL,
    [user_updates] BIGINT NOT NULL,
    [last_user_scan] DATETIME NULL,
    [last_user_seek] DATETIME NULL,
    [last_user_lookup] DATETIME NULL,
    [last_user_update] DATETIME NULL,
    CONSTRAINT [PK_IndexUsageStatsSnap] PRIMARY KEY ([IndexUsageStatsSnapID]) ON [PRIMARY]
)
ON [PRIMARY];
GO

The following statement is used to re-populate the table [IndexUsageStatsSnap] at scheduled intervals:



-----------
--Script 2:
-----------
USE [AdventureWorks2012];
GO

-- Start from an empty snapshot table on every collection
TRUNCATE TABLE [IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap];

INSERT INTO [IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap]
(
    [dbid]
    ,[table_name]
    ,[index_id]
    ,[index_name]
    ,[user_scans]
    ,[user_seeks]
    ,[user_lookups]
    ,[user_updates]
    ,[last_user_scan]
    ,[last_user_seek]
    ,[last_user_lookup]
    ,[last_user_update]
)
SELECT
    DB_ID() AS [dbid]
    ,OBJECT_NAME(i.[object_id], DB_ID()) AS [table_name]
    ,i.index_id
    ,i.[name]
    ,COALESCE(ddius.user_scans, 0) -- indexes with no DMV row have not been used
    ,COALESCE(ddius.user_seeks, 0)
    ,COALESCE(ddius.user_lookups, 0)
    ,COALESCE(ddius.user_updates, 0)
    ,ddius.last_user_scan
    ,ddius.last_user_seek
    ,ddius.last_user_lookup
    ,ddius.last_user_update
FROM
    sys.indexes i
LEFT OUTER JOIN
    sys.dm_db_index_usage_stats ddius
ON
    ddius.index_id = i.index_id
    AND ddius.[object_id] = i.[object_id]
    -- this DB only; filtering in the join (not in WHERE) keeps unused indexes
    -- whose object_id/index_id happen to match a DMV row from another database
    AND ddius.database_id = DB_ID()
WHERE
    OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1
    AND i.index_id > 0 -- filter out heaps
ORDER BY
    OBJECT_NAME(i.[object_id], DB_ID())
    ,i.index_id;
GO

The MERGE statement

As mentioned earlier, a MERGE statement is used to update the [IndexUsageStats] table on the basis of data from table [IndexUsageStatsSnap]. In the first part of this section I describe the logic (pseudo code) behind the MERGE statement. In the second part I show the actual T-SQL code.

Logic and Pseudo code

The terms used in the pseudo code below are defined as follows:

Counter1: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStats] table.
Counter2: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStatsSnap] table.
LastOccurrence1: last time Counter1 changed in the [IndexUsageStats] table (e.g. the last_user_seek value).
LastOccurrence2: last time Counter2 changed in the [IndexUsageStatsSnap] table (e.g. the last_user_seek value).

Here is the pseudo code and associated logic:

-----------
--Script 3:
-----------
If Index In Both [IndexUsageStats] And [IndexUsageStatsSnap] Tables
Then
    If LastOccurrence2 More Recent Than LastOccurrence1
        And (Counter1 < Counter2 OR LastOccurrence1 IS NULL)
    Then Set Counter1 Equal To Counter2
        And Set LastOccurrence1 Equal To LastOccurrence2
    (Logic: there was likely no instance restart/index rebuild since
    the last time the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated,
    as evidenced by the higher value of Counter2.)

    If LastOccurrence2 More Recent Than LastOccurrence1
        And Counter1 >= Counter2
    Then Increment Counter1 By Counter2
        And Set LastOccurrence1 Equal To LastOccurrence2
    (Logic: an instance restart/index rebuild since the last time
    the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated must have
    caused the database engine to re-initialize the index-usage statistics,
    as evidenced by the lower value of Counter2.)

    If ( (LastOccurrence2 Equal To LastOccurrence1) Or (LastOccurrence2 IS NULL) )
    Then Do Nothing
    (Logic: index in question has not been used since the last time
    the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated.
    No change to the Counter1 and LastOccurrence1 values is required.)
Else
    If Index Only In [IndexUsageStatsSnap] Table
    Then Insert Record Into [IndexUsageStats] Table
    (Logic: newly created index; add to [IndexUsageStats] table.)

    If Index Only In [IndexUsageStats] Table
    Then Delete Record From [IndexUsageStats] Table
    (Logic: newly deleted index; remove from [IndexUsageStats] table.)
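
To make the three "index in both tables" cases concrete, here is a small stand-alone sketch that applies the same CASE expression used in script 4 to made-up values. The scenario labels, counts and timestamps are hypothetical; hist_* stands for the [IndexUsageStats] side (Counter1, LastOccurrence1) and snap_* for the [IndexUsageStatsSnap] side (Counter2, LastOccurrence2).

```sql
-- Hypothetical sample values only; no tables are read or modified.
SELECT
    v.scenario,
    CASE
        -- snapshot is newer and higher: no reset assumed, take the snapshot value
        WHEN (v.snap_last > v.hist_last OR v.hist_last IS NULL)
             AND v.snap_count > v.hist_count
            THEN v.snap_count
        -- snapshot is newer but not higher: counters were reset, so add them up
        WHEN v.snap_last > v.hist_last AND v.snap_count <= v.hist_count
            THEN v.hist_count + v.snap_count
        -- same or missing timestamp: nothing new to record
        ELSE v.hist_count
    END AS new_count
FROM (VALUES
    ('no reset, more activity',      20, CAST('2015-04-29T07:00:00' AS DATETIME), 30, CAST('2015-04-29T08:00:00' AS DATETIME)),
    ('reset, then new activity',     20, CAST('2015-04-29T07:00:00' AS DATETIME), 10, CAST('2015-04-29T08:00:00' AS DATETIME)),
    ('no activity since last merge', 20, CAST('2015-04-29T07:00:00' AS DATETIME), 20, CAST('2015-04-29T07:00:00' AS DATETIME)),
    ('reset, no activity yet',       20, CAST('2015-04-29T07:00:00' AS DATETIME),  0, CAST(NULL AS DATETIME))
) AS v (scenario, hist_count, hist_last, snap_count, snap_last);
-- new_count per scenario, in the order listed: 30, 30, 20, 20
```

The last row shows why a NULL snapshot timestamp falls through to the ELSE branch: both comparisons against NULL evaluate to unknown, so the stored counter is left alone.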

T-SQL Code

Here is the T-SQL code of the MERGE statement:



-----------
--Script 4:
-----------
USE [IndexUsageStatsDB];
MERGE [dbo].[IndexUsageStats] AS [is]
USING [dbo].[IndexUsageStatsSnap] AS [iss]
ON
    [is].[dbid] = [iss].[dbid]
    AND [is].[table_name] = [iss].[table_name]
    AND [is].[index_id] = [iss].[index_id]
    AND [is].[index_name] = [iss].[index_name]
-- Index exists in both tables: apply the counter rules from script 3
WHEN MATCHED
THEN UPDATE SET
[is].[user_scans] =
CASE WHEN (([iss].[last_user_scan] > [is].[last_user_scan] OR [is].[last_user_scan] IS NULL) AND [iss].[user_scans] > [is].[user_scans]) THEN [iss].[user_scans]
 WHEN ([iss].[last_user_scan] > [is].[last_user_scan] AND [iss].[user_scans] <= [is].[user_scans]) THEN [is].[user_scans] + [iss].[user_scans]
 ELSE [is].[user_scans] -- DO NOTHING
END
, [is].[user_seeks] =
CASE WHEN (([iss].[last_user_seek] > [is].[last_user_seek] OR [is].[last_user_seek] IS NULL) AND [iss].[user_seeks] > [is].[user_seeks]) THEN [iss].[user_seeks]
 WHEN ([iss].[last_user_seek] > [is].[last_user_seek] AND [iss].[user_seeks] <= [is].[user_seeks]) THEN [is].[user_seeks] + [iss].[user_seeks]
 ELSE [is].[user_seeks] -- DO NOTHING
END
, [is].[user_lookups] =
CASE WHEN (([iss].[last_user_lookup] > [is].[last_user_lookup] OR [is].[last_user_lookup] IS NULL) AND [iss].[user_lookups] > [is].[user_lookups]) THEN [iss].[user_lookups]
 WHEN ([iss].[last_user_lookup] > [is].[last_user_lookup] AND [iss].[user_lookups] <= [is].[user_lookups]) THEN [is].[user_lookups] + [iss].[user_lookups]
 ELSE [is].[user_lookups] -- DO NOTHING
END
, [is].[user_updates] =
CASE WHEN (([iss].[last_user_update] > [is].[last_user_update] OR [is].[last_user_update] IS NULL) AND [iss].[user_updates] > [is].[user_updates]) THEN [iss].[user_updates]
 WHEN ([iss].[last_user_update] > [is].[last_user_update] AND [iss].[user_updates] <= [is].[user_updates]) THEN [is].[user_updates] + [iss].[user_updates]
 ELSE [is].[user_updates] -- DO NOTHING
END
-- Keep the stored timestamp when the snapshot has none (e.g. after a reset)
, [is].[last_user_scan] = COALESCE([iss].[last_user_scan], [is].[last_user_scan])
, [is].[last_user_seek] = COALESCE([iss].[last_user_seek], [is].[last_user_seek])
, [is].[last_user_lookup] = COALESCE([iss].[last_user_lookup], [is].[last_user_lookup])
, [is].[last_user_update] = COALESCE([iss].[last_user_update], [is].[last_user_update])
WHEN NOT MATCHED BY TARGET
THEN INSERT 
(
[dbid]
, [table_name]
, [index_id]
, [index_name]
, [user_scans]
, [user_seeks]
, [user_lookups]
, [user_updates]
, [last_user_scan]
, [last_user_seek]
, [last_user_lookup]
, [last_user_update]
)
VALUES
(
[iss].[dbid]
,[iss].[table_name]
,[iss].[index_id]
,[iss].[index_name]
,[iss].[user_scans]
,[iss].[user_seeks]
,[iss].[user_lookups]
,[iss].[user_updates]
,[iss].[last_user_scan]
,[iss].[last_user_seek]
,[iss].[last_user_lookup]
,[iss].[last_user_update]
)
WHEN NOT MATCHED BY SOURCE
THEN DELETE;
GO

Procedure

Here are the two steps needed to maintain the data in table [IndexUsageStats]:

1. Repopulate table [IndexUsageStatsSnap] to capture a snapshot of the statistics at that moment in time (script 2).
2. Execute the MERGE statement to update the [IndexUsageStats] table with the latest data from [IndexUsageStatsSnap] (script 4).
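
One way to run these two steps on a schedule is a SQL Server Agent job. The following is only a sketch under several assumptions: the job, step and schedule names are made up, and the two procedures it calls are hypothetical wrappers (usp_RefreshIndexUsageStatsSnap holding script 2 and created in AdventureWorks2012, usp_MergeIndexUsageStats holding script 4 and created in IndexUsageStatsDB). Neither procedure is part of the scripts in this tip.

```sql
USE [msdb];
GO
-- Hypothetical job name; adjust to local conventions.
EXEC dbo.sp_add_job
     @job_name = N'Collect index usage stats';

-- Step 1: refresh the snapshot (assumes script 2 saved as a procedure).
EXEC dbo.sp_add_jobstep
     @job_name = N'Collect index usage stats',
     @step_name = N'Refresh snapshot',
     @subsystem = N'TSQL',
     @database_name = N'AdventureWorks2012',
     @command = N'EXEC [dbo].[usp_RefreshIndexUsageStatsSnap];',
     @on_success_action = 3; -- 3 = go to the next step

-- Step 2: merge the snapshot into the long-term table (assumes script 4 saved as a procedure).
EXEC dbo.sp_add_jobstep
     @job_name = N'Collect index usage stats',
     @step_name = N'Merge into IndexUsageStats',
     @subsystem = N'TSQL',
     @database_name = N'IndexUsageStatsDB',
     @command = N'EXEC [dbo].[usp_MergeIndexUsageStats];';

-- Run every hour, every day.
EXEC dbo.sp_add_schedule
     @schedule_name = N'Hourly index usage collection',
     @freq_type = 4,            -- daily
     @freq_interval = 1,        -- every 1 day
     @freq_subday_type = 8,     -- unit: hours
     @freq_subday_interval = 1; -- every 1 hour

EXEC dbo.sp_attach_schedule
     @job_name = N'Collect index usage stats',
     @schedule_name = N'Hourly index usage collection';

-- Target the local server so the Agent actually runs the job.
EXEC dbo.sp_add_jobserver
     @job_name = N'Collect index usage stats';
GO
```

Running both steps in one job keeps the snapshot and the MERGE in lockstep, so the MERGE never works from a stale snapshot.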

Results

To demonstrate how this works I will run queries against the [AdventureWorks2012].[Sales].[SalesOrderDetail] table. I first restart my SQL Server instance, to reset all index usage statistics to 0, and then run the following query 10 times (GO 10) to perform 10 seeks on index IX_SalesOrderDetail_ProductID and 10 lookups on index PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID:

-----------
--Script 5:
-----------

SELECT
[SalesOrderID]
,[SalesOrderDetailID]
,[CarrierTrackingNumber]
,[OrderQty]
,[UnitPrice]
,[LineTotal]
FROM
[AdventureWorks2012].[Sales].[SalesOrderDetail]
WHERE
[ProductID] = 897;
GO 10

Next, I populate table [IndexUsageStatsSnap] (script 2) and execute the MERGE statement (script 4) to update table [IndexUsageStats]. The rows for the [SalesOrderDetail] indexes in [IndexUsageStats] then look like this:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID	0	10	NULL	2015-04-29 06:54:51.740
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	10	0	2015-04-29 06:54:51.740	0
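
The article does not show the query behind these result sets. A query along the following lines, written against the columns defined in script 1, would return the same columns for the three indexes; treat it as an illustrative sketch.

```sql
-- Show the persisted usage values for the indexes on SalesOrderDetail
SELECT
    [index_name]
    ,[user_seeks]
    ,[user_lookups]
    ,[last_user_seek]
    ,[last_user_lookup]
FROM
    [IndexUsageStatsDB].[dbo].[IndexUsageStats]
WHERE
    [dbid] = DB_ID(N'AdventureWorks2012')
    AND [table_name] = N'SalesOrderDetail'
ORDER BY
    [index_id];
```

Pointing the same query at [IndexUsageStatsSnap] shows the snapshot side for comparison.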

If I now rerun script 5 and repopulate table IndexUsageStatsSnap (script 2) the data in table IndexUsageStatsSnap will look like this:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID	0	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	20	0	2015-04-29 07:21:11.147	0

According to the rules outlined in the above pseudo code (script 3) the data in table [IndexUsageStats] would need to be updated to match the values in [IndexUsageStatsSnap]. So both tables should be made to show 20 lookups for index [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] and 20 seeks for index [IX_SalesOrderDetail_ProductID]. True enough, after running MERGE (script 4) both tables now hold the following values:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID	0	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	20	0	2015-04-29 07:21:11.147	0

Let me now rebuild index [IX_SalesOrderDetail_ProductID]:

-----------
--Script 6:
-----------
USE AdventureWorks2012;
GO
ALTER INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] REBUILD;
GO

After refreshing table [IndexUsageStatsSnap] (script 2) the data in that table will show zeroed values for index [IX_SalesOrderDetail_ProductID]:

index_name	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	NULL	NULL
IX_SalesOrderDetail_ProductID	0	NULL	0

This is expected behavior with index rebuilds in SQL Server 2012. Issuing the MERGE statement, I end up with the following data in table [IndexUsageStats]:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID	0	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	20	0	2015-04-29 07:21:11.147	0

The data is unchanged, as expected. The data in table [IndexUsageStats] has been shielded from the earlier index rebuild.

Rerunning the query against the [SalesOrderDetail] table (script 5), then refreshing table [IndexUsageStatsSnap] (script 2) and lastly running the MERGE statement (script 4), I end up with the following values in table [IndexUsageStats]:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID	0	30	NULL	2015-04-30 14:52:20.713
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	30	0	2015-04-30 14:52:20.713	0

The counters for seeks and lookups have been incremented correctly and the times of the last seek and lookup have also been updated with the latest values.

Conclusion

In this tip I have shown how to use the MERGE statement to maintain index usage statistics for a SQL Server instance across instance restarts and index rebuilds. The end result is a data set that can be used to draw conclusions on used vs. unused indexes over long time spans. With this methodology one can be much more confident that low or zero usage values for an index indicate that the index is not needed and is a candidate for dropping, rather than being an artifact of frequent rebuilds of that index. New and dropped indexes are also dealt with in the same MERGE statement.
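
As an example of the kind of analysis this enables, the sketch below lists indexes in [IndexUsageStats] that have never been read but are still being maintained. The choice of filter and sort order is an assumption for illustration, not a rule from this tip.

```sql
-- Candidate unused indexes: no reads recorded, ordered by write cost
SELECT
    DB_NAME([dbid]) AS [database_name]
    ,[table_name]
    ,[index_name]
    ,[user_updates]
FROM
    [IndexUsageStatsDB].[dbo].[IndexUsageStats]
WHERE
    [user_seeks] = 0
    AND [user_scans] = 0
    AND [user_lookups] = 0
ORDER BY
    [user_updates] DESC;
```

Indexes that back primary key or unique constraints can appear in this list too; they enforce data integrity and should not be dropped on usage figures alone.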

An implicit assumption (and limitation) of this approach is that higher values in the [IndexUsageStatsSnap] table, relative to the [IndexUsageStats] table, necessarily mean that no re-initialization of the index-usage statistics has taken place in the database engine since the last time the two tables were refreshed (see the first If statement in the pseudo code, script 3). This may lead to inaccuracies early on, when values in the two tables are of comparable magnitude. However, as the data in the [IndexUsageStats] table gradually spans a longer time range and individual data points increase in magnitude, this assumption becomes increasingly accurate.
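
One possible way to narrow this gap for instance restarts (though not for index rebuilds) is to compare the instance start time with the most recent activity already recorded in [IndexUsageStats]. This is a heuristic sketch and not part of the solution above; the derived column names are made up, and sys.dm_os_sys_info.sqlserver_start_time requires SQL Server 2008 or later.

```sql
-- Returns 1 when the instance started after the latest activity that has
-- been persisted, i.e. the DMV counters were reset since that activity.
SELECT
    CASE WHEN si.sqlserver_start_time > h.last_recorded_activity
         THEN 1 ELSE 0
    END AS restarted_since_last_recorded_activity
FROM
    sys.dm_os_sys_info AS si
CROSS JOIN
    (
        -- Latest of the four last_user_* timestamps across all stored rows
        SELECT MAX(v.last_used) AS last_recorded_activity
        FROM [IndexUsageStatsDB].[dbo].[IndexUsageStats] AS s
        CROSS APPLY (VALUES
            (s.last_user_scan)
            ,(s.last_user_seek)
            ,(s.last_user_lookup)
            ,(s.last_user_update)
        ) AS v (last_used)
    ) AS h;
```

A result of 1 before a MERGE run suggests that snapshot counters should be added to the stored values rather than compared with them, even where the snapshot value happens to be higher.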

Next Steps

Apply the solution outlined in this tip to your environment to obtain a reliable repository of index usage statistical information.
Review these related links:
Using MERGE in SQL Server to insert, update and delete at the same time
Comparing performance for the MERGE statement to SELECT, INSERT, UPDATE or DELETE
SQL Server 2008 MERGE More than UPSERT
Use Caution with SQL Server’s MERGE Statement
T-SQL Programming Part 9 – Using the MERGE Statement to Perform an UPSERT
Collecting Index Usage Statistics
Rebuilding Indexes Will Reset Index Usage Statistics in SQL Server 2012

Marios Philippopoulos

Marios Philippopoulos has been a SQL Server DBA and developer for over 10 years. In a previous life he was a computational chemist conducting simulations of protein dynamics. He is based in the Toronto area, Ontario, Canada.