solving sql server problems for millions of dbas and developers since 2006



SQL Server DBA Tips SQL Server Developer Tips SQL Server Business Intelligence Tips SQL Server Career Tips SQL Server Tip Categories SQL Server Tutorials SQL Server Webcasts SQL Server Whitepapers SQL Server Tools SQL Server Questions and Answers MSSQLTips Authors About MSSQLTips SQL Server User Groups MSSLQTips Giveaways MSSQLTips Advertising Options

MSSQLTips Facebook Page MSSQLTips LinkedIn Page MSSQLTips RSS Feed MSSQLTips Twitter Page MSSQLTips Google+ Page

SQL Product Highlight

Idera - SQL safe backup

Idera's SQL safe provides a high-performance backup and recovery solution for Microsoft SQL Server. SQL safe saves money by reducing database backup time by up to 50% over native backups and reducing backup disk space requirements by up to 95%. SQL safe also enables complete "hands-free" automated backup of your entire SQL Server infrastructure and ensures compliance with your organization's backup and recovery policies. From implementations with tens of SQL servers to enterprises with hundreds of servers spread around the globe, SQL safe is the only SQL Server backup and recovery solution that scales to meet the challenge.

Learn more!








Avoid Index Redundancy in SQL Server Tables

By: | Read Comments (5) | Print

Armando has over 24 years of industry experience and has been working with SQL Server since version 6.5.

Related Tips: More

Problem
I'm trying to design indexes on a new Customer table and I have three separate query searches to satisfy. One query search is by lastname. The 2nd query search is by last name and first name. The last query search is by lastname and an active flag. How should I create my indexes?

Solution
I've come across more instances than I'd like of data models where the original database developer would solve this problem by creating three separate indexes as follows:

create index ix_customer_lastname on dbo.Customer(lastname)
GO

create index ix_customer_lastname_firstname on dbo.Customer(lastname, firstname)
GO

create index ix_customer_lastname_activesw on dbo.Customer(lastname, activesw)
GO

The reality is, in this case you only need one!

The following queries

select * from dbo.Customer where lastname = 'Washington'
GO

select * from dbo.Customer where lastname = 'Washington' and activesw = 1
GO

select * from dbo.Customer where firstname = 'George' and lastname = 'Washington'
GO

select * from dbo.Customer
where firstname = 'George' and lastname = 'Washington' and activesw = 1
GO


can all be satisfied using the following single index

create index ix_customer_lastname on dbo.Customer(lastname, firstname, activesw)
GO

The lead column of each index is lastname and this is the most important one. SQL Server keeps frequency distribution statistics in the form of a histogram on the lead column of every index. It will also keep secondary selectivity information for the additional index column permutations.

Having redundant indexes in your database wastes SQL Server system resources because the database engine has to maintain more data structures than it needs to. Consider a new Customer being added to the database. The new last name inserted would have to be accounted for in the logical ordering of each index resulting in unnecessary overhead to maintain each of these indexes. Furthermore, redundant indexes waste disk space.

Let's create a Customer table and some indexes to illustrate these concepts. This script will create the table and 10,000 customer rows.

set nocount on
go
create table dbo.Customer (customerid int identity primary key,
firstname char(40), lastname char(40),
address1 char(500), city char(40), state char(20), activesw bit)

declare @counter int, @id varchar(4)
select @counter = 1
while (@counter <= 10000)
begin
select @id = cast(@counter as varchar)
insert into customer(firstname, lastname, address1, city, activesw)
select 'f'+@id, 'l'+@id, 'a'+@id, 'c'+@id, 1

select @counter = @counter + 1
end

create index ix_customer_lastname on dbo.Customer(lastname)
create index ix_customer_lastname_firstname on dbo.Customer(lastname, firstname)
create index ix_customer_lastname_activesw on dbo.Customer(lastname, activesw)
go

Now let's examine the space used by the table's indexes


We see that the indexes take up 2kb of disk space

Now let's run the following queries and examine the optimizer's index selections

set showplan_text on
go
select customerid from dbo.Customer where lastname = 'l22'
select customerid from dbo.Customer where lastname = 'l22' and activesw = 1
select customerid from dbo.Customer where firstname = 'f22' and lastname = 'l22'
select customerid from dbo.Customer where firstname = 'f22' and lastname = 'l22' and activesw = 1

Your query plans may look similar to the following:


Now these are interesting. The first and second queries used index ix_customer_lastname_activesw. The third query used index ix_customer_lastname_firstname. For the last query, the optimizer decided to use ix_customer_lastname_firstname as well and decided to do a lookup to get the activesw value.  If you haven't noticed yet, index ix_customer_lastname was not used to satisfy any of these queries. It's just sitting around, literally taking up space.

Now let's drop these indexes and add a single index to satisfy the same four queries

drop index dbo.Customer.ix_customer_lastname
drop index dbo.Customer.ix_customer_lastname_firstname
drop index dbo.Customer.ix_customer_lastname_activesw

create index ix_customer_lastname on dbo.Customer(lastname, firstname, activesw)

If we now re-run sp_spaceused, we'll see that the index space used by the table has been reduced by over 100%

Now let's re-run the queries we ran earlier and examine the new execution plans

As we now see, the single index satisfies all four queries. In addition, the last query that performed a lookup for the activesw now reads it directly from the index and does not resort to searching the clustered table for the value.

By carefully defining your indexes you can save the database engine some work, save some space on your disks, and still have high performing queries.

Next Steps



Related Tips: More | Become a paid author


Last Update: 3/6/2008

Share: Share 






Comments and Feedback:

Thursday, June 19, 2008 - 8:32:47 PM - ROd1 Read The Tip

I definitely agree on your post, the less redundancy the better --- Now, dealing with deadlocks i've read that sometimes when several transactions/queries are being run at once (concurrent users), certain table operations can lock the indexes of tables, causing every other query wanting to write to that table to wait for getting exclusive access to that index (as they need to be also updated).  Then, i remember the tip to reduce this kind of waits was to have yet an extra index (smaller) including some of the most selective columns of that table as an extra path for SQL Server to use in case the first index is locked. What do you think? ROd@Mx

 


Sunday, June 22, 2008 - 6:36:27 PM - aprato Read The Tip

I think I know to what you're referring.... covering indexes?   Using a covering index, you can eliminate a deadlock in READ COMMITTED isolation by covering a query such that a  bookmark lookup is not needed against a clustered table that is in process of being updated.  Higher isolation levels, may have a different result, however.  If the logical ordering is not important, you could also use the new index INCLUDE feature of SQL Server 2005. 



Monday, June 23, 2008 - 6:37:57 AM - grobido Read The Tip

Take a look at this tip, I think this is what you are referring to.

http://www.mssqltips.com/tip.asp?tip=1485

 


Saturday, March 14, 2009 - 4:40:36 AM - ranjan Read The Tip

I have a table with three colum as
id,date1,date2

Three index is created as
ClusterIndex1 = id
Nonclusterindex1 = date1,id
Nonclusterindex2 = date2,id

I have written 2 different query,in that id, date1 is in one where clause id , date2 is another where clause.

I suggest index creation should be

ClusterIndex1 = id
Nonclusterindex1 = date1
Nonclusterindex2 = date2

Unnecessary adding some columns in index creation will affect performance. But the index space used by both scenario is same. What would be the different between those two creation of index.

 

If this is correct can u send any link for the proof.


Tuesday, March 17, 2009 - 7:39:21 PM - aprato Read The Tip

I'm not sure why you'd need to query on both id and date?  Can you elaborate to why?  I would think you would only need to use one or the other.

If the clustered index is id,  then the id will be located at the leaf level of the non-clustered index.   You wouldn't need to specify id in the index definition if it's just to be returned in the output because you're already getting it for free

 a query like

select id
from mytable
where date1 = <some date>

would become a covered query where just the non-clustered index structure is read.
The engine will grab the id right from the leaf.

 If you read this tip, you'll be able to drill into the internals of the pages which can
be an eye opener.

http://www.mssqltips.com/tip.asp?tip=1578

 



Post a Comment or Question

Keep it clean and stay on the subject or we may delete your comment.
Your email address is not published. Required fields are marked with an asterisk (*)

*Name   *Email   Notify for updates
Comments
*Enter Code refresh code


 
Sponsor Information
Try the award winning SQL diagnostic manager as a free 14-day trial!

SQL Monitor, server monitoring so easy, your boss could do it. Try it online.

Need SQL Server help and not sure where to turn? Reach out to expert consultants for a Health Check.

Free Trial: Get Proactive Insight with Spotlight® for SQL Server Enterprise.

Solving SQL Server problems for millions of DBAs and Devs since 2006. Join now.

The SQL Server Security THREAT - It’s Closer Than You Think


Copyright (c) 2006-2012 Edgewood Solutions, LLC All rights reserved
privacy | disclaimer | copyright | advertise | about
authors | contribute | feedback | giveaways | user groups
Some names and products listed are the registered trademarks of their respective owners.


Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com