![]() |
|
|
By: Armando Prato | Read Comments (5) | Print Armando has over 24 years of industry experience and has been working with SQL Server since version 6.5. Related Tips: More |
|
Problem
I'm trying to design indexes on a new Customer table and I have three separate query searches to satisfy. One query search is by lastname. The 2nd query search is by last name and first name. The last query search is by lastname and an active flag. How should I create my indexes?
Solution
I've come across more instances than I'd like of data models where the original database developer would solve this problem by creating three separate indexes as follows:
| create index ix_customer_lastname on dbo.Customer(lastname) GO create index ix_customer_lastname_firstname on dbo.Customer(lastname, firstname) create index ix_customer_lastname_activesw on dbo.Customer(lastname, activesw) |
The reality is, in this case you only need one!
The following queries
| select * from dbo.Customer where lastname = 'Washington' GO select * from dbo.Customer where lastname = 'Washington' and activesw = 1 select * from dbo.Customer where firstname = 'George' and lastname = 'Washington' select * from dbo.Customer |
can all be satisfied using the following single index
| create index ix_customer_lastname on dbo.Customer(lastname, firstname, activesw) GO |
The lead column of each index is lastname and this is the most important one. SQL Server keeps frequency distribution statistics in the form of a histogram on the lead column of every index. It will also keep secondary selectivity information for the additional index column permutations.
Having redundant indexes in your database wastes SQL Server system resources because the database engine has to maintain more data structures than it needs to. Consider a new Customer being added to the database. The new last name inserted would have to be accounted for in the logical ordering of each index resulting in unnecessary overhead to maintain each of these indexes. Furthermore, redundant indexes waste disk space.
Let's create a Customer table and some indexes to illustrate these concepts. This script will create the table and 10,000 customer rows.
| set nocount on go create table dbo.Customer (customerid int identity primary key, firstname char(40), lastname char(40), address1 char(500), city char(40), state char(20), activesw bit) declare @counter int, @id varchar(4) select @counter = 1 while (@counter <= 10000) begin select @id = cast(@counter as varchar) insert into customer(firstname, lastname, address1, city, activesw) select 'f'+@id, 'l'+@id, 'a'+@id, 'c'+@id, 1 select @counter = @counter + 1 end create index ix_customer_lastname on dbo.Customer(lastname) create index ix_customer_lastname_firstname on dbo.Customer(lastname, firstname) create index ix_customer_lastname_activesw on dbo.Customer(lastname, activesw) go |
Now let's examine the space used by the table's indexes
We see that the indexes take up 2kb of disk space
Now let's run the following queries and examine the optimizer's index selections
| set showplan_text on go select customerid from dbo.Customer where lastname = 'l22' select customerid from dbo.Customer where lastname = 'l22' and activesw = 1 select customerid from dbo.Customer where firstname = 'f22' and lastname = 'l22' select customerid from dbo.Customer where firstname = 'f22' and lastname = 'l22' and activesw = 1 |
Your query plans may look similar to the following:
Now these are interesting. The first and second queries used index ix_customer_lastname_activesw. The third query used index ix_customer_lastname_firstname. For the last query, the optimizer decided to use ix_customer_lastname_firstname as well and decided to do a lookup to get the activesw value. If you haven't noticed yet, index ix_customer_lastname was not used to satisfy any of these queries. It's just sitting around, literally taking up space.
Now let's drop these indexes and add a single index to satisfy the same four queries
| drop index dbo.Customer.ix_customer_lastname drop index dbo.Customer.ix_customer_lastname_firstname drop index dbo.Customer.ix_customer_lastname_activesw create index ix_customer_lastname on dbo.Customer(lastname, firstname, activesw) |
If we now re-run sp_spaceused, we'll see that the index space used by the table has been reduced by over 100%
Now let's re-run the queries we ran earlier and examine the new execution plans
As we now see, the single index satisfies all four queries. In addition, the last query that performed a lookup for the activesw now reads it directly from the index and does not resort to searching the clustered table for the value.
By carefully defining your indexes you can save the database engine some work, save some space on your disks, and still have high performing queries.
Next Steps
| Share: | Share | Tweet |
|
![]() |
![]() |
Free SQL Server Learning |
| Thursday, June 19, 2008 - 8:32:47 PM - ROd1 | Read The Tip |
|
I definitely agree on your post, the less redundancy the better --- Now, dealing with deadlocks i've read that sometimes when several transactions/queries are being run at once (concurrent users), certain table operations can lock the indexes of tables, causing every other query wanting to write to that table to wait for getting exclusive access to that index (as they need to be also updated). Then, i remember the tip to reduce this kind of waits was to have yet an extra index (smaller) including some of the most selective columns of that table as an extra path for SQL Server to use in case the first index is locked. What do you think? ROd@Mx
|
|
| Sunday, June 22, 2008 - 6:36:27 PM - aprato | Read The Tip |
|
I think I know to what you're referring.... covering indexes? Using a covering index, you can eliminate a deadlock in READ COMMITTED isolation by covering a query such that a bookmark lookup is not needed against a clustered table that is in process of being updated. Higher isolation levels, may have a different result, however. If the logical ordering is not important, you could also use the new index INCLUDE feature of SQL Server 2005. |
|
| Monday, June 23, 2008 - 6:37:57 AM - grobido | Read The Tip |
|
Take a look at this tip, I think this is what you are referring to. http://www.mssqltips.com/tip.asp?tip=1485
|
|
| Saturday, March 14, 2009 - 4:40:36 AM - ranjan | Read The Tip |
|
I have a table with three colum as
If this is correct can u send any link for the proof. |
|
| Tuesday, March 17, 2009 - 7:39:21 PM - aprato | Read The Tip |
|
I'm not sure why you'd need to query on both id and date? Can you elaborate to why? I would think you would only need to use one or the other. If the clustered index is id, then the id will be located at the leaf level of the non-clustered index. You wouldn't need to specify id in the index definition if it's just to be returned in the output because you're already getting it for free a query like select id would become a covered query where just the non-clustered index structure is read. If you read this tip, you'll be able to drill into the internals of the pages which can http://www.mssqltips.com/tip.asp?tip=1578
|
|
|
privacy | disclaimer | copyright | advertise | about authors | contribute | feedback | giveaways | user groups Some names and products listed are the registered trademarks of their respective owners. Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com |