By: Ben Snaidero | Comments (6) | Related: > Indexing
Problem
Although not as much of a SQL Server performance issue as missing indexes where they could be used, having too many indexes defined on a table can cause a lot of unnecessary IO on your SQL Server database as well as use up additional disk space. This tip will provide a query that will identify any tables that have indexes that share leading columns so you can decide which ones are not actually needed and remove them.
Solution
Sample SQL Server Table Setup
In order to provide you with some test cases I've included some T-SQL to create a couple of sample tables. These two tables provide examples of the following test cases:
- Exact duplicate indexes
- Indexes that shares leading columns
- Indexes that share first column, but subsequent columns are different
Of course you can always skip this step and test the queries below in one of your own development environments. Either way here is the T-SQL to setup these sample tables.
-- Create sample table and indexes CREATE TABLE testtable1 ([col1] [int] NOT NULL primary key clustered, [col2] [int] NULL, [col3] [int] NULL, [col4] [varchar](50) NULL); CREATE INDEX idx_testtable1_col2col3 on testtable1 (col2 asc, col3 asc); CREATE INDEX idx_testtable1_col2col4 on testtable1 (col2 asc, col4 asc); CREATE INDEX idx_testtable1_col3 on testtable1 (col3 asc); CREATE INDEX idx_testtable1_col3col4 on testtable1 (col3 asc, col4 asc); CREATE TABLE testtable2 ([col1] [int] NOT NULL primary key clustered, [col2] [int] NULL, [col3] [int] NULL, [col4] [varchar](50) NULL); CREATE INDEX idx_testtable2_col3col4 on testtable2 (col3 asc, col4 asc); CREATE INDEX idx_testtable2_col3col4_1 on testtable2 (col3 asc, col4 asc);
Build Script
In an ideal world you would have a naming convention for your SQL Server indexes that would identify which columns make up the index. Since these types of standards don't always exist it often happens that duplicate indexes can get created on tables and in order to identify these indexes we'll have to use the following catalog views:
To build this query we will break the problem down into two steps. The first thing we will need is a list of all the tables and indexes with their associated columns. For the columns that make up the index we will create a comma separated list of values as is described in this link using the STUFF function and the FOR XML PATH clause. Here is the T-SQL to retrieve this listing. Note the WHERE clause which excludes any tables that are not user created.
select distinct object_name(i.object_id) tablename,i.name indexname, (select distinct stuff((select ', ' + c.name from sys.index_columns ic1 inner join sys.columns c on ic1.object_id=c.object_id and ic1.column_id=c.column_id where ic1.index_id = ic.index_id and ic1.object_id=i.object_id and ic1.index_id=i.index_id order by index_column_id FOR XML PATH('')),1,2,'') from sys.index_columns ic where object_id=i.object_id and index_id=i.index_id) as columnlist from sys.indexes i inner join sys.index_columns ic on i.object_id=ic.object_id and i.index_id=ic.index_id inner join sys.objects o on i.object_id=o.object_id where o.is_ms_shipped=0
Now that we have a complete list of all our tables and indexes (including their columns) we can use that output and join it to itself comparing the columnlist output column. All we have to do is join on table name and compare the columnlist column from the first result set with the substring of the columnlist column from the second result set using the length of the first columnlist column. Of course we will omit exact matches except where the index name is also different. Here is the T-SQL for this query.
select t1.tablename,t1.indexname,t1.columnlist,t2.indexname,t2.columnlist from (select distinct object_name(i.object_id) tablename,i.name indexname, (select distinct stuff((select ', ' + c.name from sys.index_columns ic1 inner join sys.columns c on ic1.object_id=c.object_id and ic1.column_id=c.column_id where ic1.index_id = ic.index_id and ic1.object_id=i.object_id and ic1.index_id=i.index_id order by index_column_id FOR XML PATH('')),1,2,'') from sys.index_columns ic where object_id=i.object_id and index_id=i.index_id) as columnlist from sys.indexes i inner join sys.index_columns ic on i.object_id=ic.object_id and i.index_id=ic.index_id inner join sys.objects o on i.object_id=o.object_id where o.is_ms_shipped=0) t1 inner join (select distinct object_name(i.object_id) tablename,i.name indexname, (select distinct stuff((select ', ' + c.name from sys.index_columns ic1 inner join sys.columns c on ic1.object_id=c.object_id and ic1.column_id=c.column_id where ic1.index_id = ic.index_id and ic1.object_id=i.object_id and ic1.index_id=i.index_id order by index_column_id FOR XML PATH('')),1,2,'') from sys.index_columns ic where object_id=i.object_id and index_id=i.index_id) as columnlist from sys.indexes i inner join sys.index_columns ic on i.object_id=ic.object_id and i.index_id=ic.index_id inner join sys.objects o on i.object_id=o.object_id where o.is_ms_shipped=0) t2 on t1.tablename=t2.tablename and substring(t2.columnlist,1,len(t1.columnlist))=t1.columnlist and (t1.columnlist<>t2.columnlist or (t1.columnlist=t2.columnlist and t1.indexname<>t2.indexname))
And there you have it. A query that will provide a complete listing of all indexes that share leading columns. Please note that the way the query is written if you have two indexes that are exact duplicates it will show up twice in the result set. The only thing left to do now is to decide which indexes we should remove so that we can reclaim some space and reduce the IO required for any DML statements executed against these tables. The following tips can be used to help you determine which indexes you should remove.
- How to get index usage information in SQL Server
- Deeper insight into unused indexes for SQL Server
- What indexes are used most often or least often in my server
Next Steps
- Extend this query to also compare the indexes included columns (if any)
- Check you database for any possible missing indexes:
- Read other tips on indexing
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips