By: Ben Snaidero
Overview
The order that the columns are specified in your indexes has an effect on whether or not the entire index can be used when the SQL Optimizer parses your query.
Explanation
When looking at an explain plan for a query you'll notice that the SQL Optimizer first parses the WHERE clause, then the JOIN clause, followed by the ORDER BY clause and finally it processes the data being selected. Based on this fact it makes sense that you would need to specify the columns in your index in this order if you want the entire index to be used. This is especially true if you are trying to create a covering index. Let's look at the following simple query as an example.
SELECT P.ParentID,C.ChildID,C.IntDataColumn,C.VarcharDataColumn FROM [dbo].[Parent] P INNER JOIN [dbo].[Child] C ON P.ParentID=C.ParentID WHERE C.IntDataColumn=32433 ORDER BY ChildID
And we'll use the following index statement to show how progessively adding columns to the index in the order we mentioned above, WHERE-JOIN-ORDER BY-SELECT, will improve the queries performance. A couple things to note. First, I've included the entire index statement here but you can add the columns one at a time to see the difference in each step. Second, the second create index statement is just an alternative to adding the SELECT columns directly to the index, instead they are part of an INCLUDE clause.
CREATE NONCLUSTERED INDEX idxChild_JOINIndex ON [dbo].[Child] ([IntDataColumn],[ParentID],[ChildID],[VarcharDataColumn]) CREATE NONCLUSTERED INDEX idxChild_JOINIndex ON [dbo].[Child] ([IntDataColumn],[ParentID],[ChildID]) INCLUDE ([VarcharDataColumn]) -- cleanup statements DROP INDEX Child.idxChild_JOINIndex
Let's first take a look at the explain plans for each of these queries as we progessively add columns to the index.
No Index
WHERE Index
WHERE,JOIN Index
WHERE,JOIN,ORDER BY Index
WHERE,JOIN,ORDER BY, SELECT Index
WHERE,JOIN,ORDER BY, INCLUDE Index
It's hard to tell just from the explain plans if each step will see an improvement or not except for maybe just adding the initial index which eliminated the index scan so let's take a look at the SQL Profiler results to see the actual performance benefit.
Table Type | CPU | Reads | Writes | Duration |
---|---|---|---|---|
No Index | 110 | 14271 | 0 | 103 |
WHERE Index | 0 | 129 | 0 | 2 |
WHERE, JOIN Index | 0 | 117 | 0 | 0 |
WHERE, JOIN, ORDER BY Index | 0 | 117 | 0 | 0 |
WHERE, JOIN, ORDER BY, SELECT Index | 0 | 60 | 0 | 0 |
WHERE, JOIN, ORDER BY, INCLUDE Index | 0 | 60 | 0 | 0 |
We can see from these results that as we add each column we do see the SQL engine has to perform less reads to execute the query thereby executing a little faster. The only exception to this is the step where we added the ORDER BY to the index but this can be attributed to the fact that we are ordering by ChildID which is a primary key so it's already sorted. The other thing we should note is that there isn't really a performance difference between adding the SELECT column directly to the index vs. using the INCLUDE clause.