By: Ben Snaidero | Comments (10) | Related: > Indexing
Problem
There is a delicate balance on performance when it comes to setting up the indexes on a table. Too many indexes and your INSERT / UPDATE / DELETE performance will suffer, but not enough indexing will impact your SELECT performance. This tip will look at the order of the columns in your index and how this order impacts query plans and performance.
Solution
Sample SQL Server Table and Data Population
For this example we will setup two sample tables and populate each of the tables with data. Here is the code:
-- Table creation logic CREATE TABLE [dbo].[TABLE1] ([col1] [int] NOT NULL,[col2] [int] NULL,[col3] [int] NULL,[col4] [varchar](50) NULL) GO CREATE TABLE [dbo].[TABLE2] ([col1] [int] NOT NULL,[col2] [int] NULL,[col3] [int] NULL,[col4] [varchar](50) NULL) GO ALTER TABLE dbo.TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY CLUSTERED (col1) GO ALTER TABLE dbo.TABLE2 ADD CONSTRAINT PK_TABLE2 PRIMARY KEY CLUSTERED (col1) GO -- Populate tables DECLARE @val INT SELECT @val=1 WHILE @val < 1000 BEGIN INSERT INTO dbo.Table1(col1, col2, col3, col4) VALUES(@val,@val,@val,'TEST') INSERT INTO dbo.Table2(col1, col2, col3, col4) VALUES(@val,@val,@val,'TEST') SELECT @val=@val+1 END GO -- Create multi-column index on table1 CREATE NONCLUSTERED INDEX IX_TABLE1_col2col3 ON dbo.TABLE1 (col2,col3) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO
Before we run the following commands include the actual execution plan (Ctrl + M) and also turn on Statistics IO by running "SET STATISTICS IO ON" in the query window.
Single Table Query Example
For our first example let's take a look at a simple query with one column in the WHERE clause. The first query uses the second column (col3) of the index in the WHERE clause and the second query uses the first column (col2) of the index in the WHERE clause. Note we are using "DBCC DROPCLEANBUFFERS" to ensure the cache is empty before each query is executed. Here is the code:
DBCC DROPCLEANBUFFERS GO SELECT * FROM dbo.TABLE1 WHERE col3=88 GO DBCC DROPCLEANBUFFERS GO SELECT * FROM dbo.TABLE1 WHERE col2=88 GO
Before running the command turn on the execution plan. Looking at the explain plans for these two queries we can see that the query that uses the second column (col3) of the index in the WHERE clause is performing an index scan on PK_TABLE1 and not even using the index we created on the column. The second query which has the first column (col2) of the index in the WHERE clause is performing a seek on the index we created and even on this small table uses fewer resources, 6 reads as compared to 4 reads, to execute this query. You can probably guess that this increase in performance only increases as the table becomes larger.
Table 'TABLE1'. Scan count 1, logical reads 6, physical reads 0
-- query 2
Table 'TABLE1'. Scan count 1, logical reads 4, physical reads 0
Two Table Join Query Example
For our next example let's take a look a query that has the same WHERE clause but also adds an inner join to another table. We will again have two queries. The first query has the second column (col3) of the index in the WHERE clause and the first column (col2) of the index is used to join the table. The second query has the first column (col2) of the index in the WHERE clause and second column (col3) of the index is used to join the table. Again we are using "DBCC DROPCLEANBUFFERS" to ensure the cache is empty before each query is executed. Here is the code:
DBCC DROPCLEANBUFFERS GO SELECT * FROM dbo.TABLE1 INNER JOIN dbo.TABLE2 ON dbo.TABLE1.col2 = dbo.TABLE2.col1 WHERE dbo.TABLE1.col3=255 GO DBCC DROPCLEANBUFFERS GO SELECT * FROM dbo.TABLE1 INNER JOIN dbo.TABLE2 ON dbo.TABLE1.col3 = dbo.TABLE2.col1 WHERE dbo.TABLE1.col2=255 GO
Looking at the explain plans for these two queries we can see that when the column that the tables are joined on appears first in the index the query does an index scan of this table (as it did in the first example). The second query which has the first column of the index in the WHERE clause is performing a seek on the index we created. Again this second query uses fewer resources to complete, 6 reads as compared to the 8 reads required by the first query doing the scan. This increase in performance will also become greater as more data is added to these tables, as was the case in the previous example.
Table 'TABLE2'. Scan count 0, logical reads 2, physical reads 0
Table 'TABLE1'. Scan count 1, logical reads 6, physical reads 0
-- query 2
Table 'TABLE2'. Scan count 0, logical reads 2, physical reads 0
Table 'TABLE1'. Scan count 1, logical reads 4, physical reads 0
Summary
You can see from these simple examples above that the ordering of the columns in your index does have an impact on how any queries against the tables will behave. A best practice that I've followed that has been quite helpful when creating indexes is to make sure you are always working with the smallest result set possible. This means that your indexes should start with any columns in your WHERE clause followed by the columns in your JOIN condition. A couple more steps that also help query performance is to include any columns in an ORDER BY clause as well as making the index a covering index by including the columns in your SELECT list, this will remove the lookup step to retrieve data. One thing I mentioned earlier that I want to re-interate here is that by adding more indexes to tables as well as adding more columns to an existing index it will require more resources to update these indexes when inserting, updating and deleting data, so finding a balance is important.
Next Steps
- Read more tips on indexing
- Follow these best practices for index creation
- Make sure you know how to interpret explain plans correctly
- Increase the number of rows inserted into these tables from 1,000 to 100,000 and take a look at how the reads increase
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips