When SQL Server Nonclustered Indexes Are Faster Than Clustered Indexes

By: Tim Wiseman   |   Comments (6)   |   Related: > Indexing


Problem

SQL Server clustered indexes can have enormous implications for the performance of operations on a table. But are there times when a SQL Server nonclustered index would perform better than a clustered index for the same operation?  Are there any trade-offs to consider?  Check out this tip to learn more.

Solution

Effective clustered indexes can often improve the performance of many operations on a SQL Server table. However, there are times when a nonclustered index can improve the performance of certain queries more than a clustered index. Because of that, there are some situations where the performance of some queries can be improved by creating a nonclustered index that is identical to the clustered index.

To be clear, having a nonclustered index along with the clustered index on the same columns will degrade the performance of updates, inserts, and deletes, and it will take additional space on disk. These drawbacks need to be weighed before adding any new nonclustered index.
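
To get a feel for how much extra disk space such a duplicate index costs, one option (a minimal sketch, not part of the test script below) is to compare the space used by each index on the table. For example, against the dbo.bothIdx table created later in this tip:

SELECT i.name AS index_name,
       i.type_desc AS index_type,
       SUM(ps.used_page_count) * 8 / 1024.0 AS used_space_mb
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
    ON ps.object_id = i.object_id
   AND ps.index_id = i.index_id
WHERE i.object_id = OBJECT_ID('dbo.bothIdx')  -- table with both indexes, created below
GROUP BY i.name, i.type_desc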

Some Background on SQL Server Clustered vs. NonClustered Indexes

Clustered indexes logically organize the entire table based on the indexing key (there is a common belief that the table is physically organized by the indexing key of a clustered index, but the reality is more complicated). When a clustered index exists, it affects the way the table is stored; if there is no clustered index, the data is stored in an unordered heap.
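
If you are not sure whether a given table is a heap or is ordered by a clustered index, sys.indexes will tell you. This is just an illustrative query against one of the test tables created later in this tip:

SELECT OBJECT_NAME(object_id) AS table_name,
       index_id,   -- 0 = heap, 1 = clustered index, > 1 = nonclustered
       name,
       type_desc   -- HEAP, CLUSTERED, NONCLUSTERED
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.pkIndex')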

Nonclustered indexes, on the other hand, exist separately from the table. They store pointers to the rows that contain the complete data. So there are times when it can be faster to get the needed information from that smaller index rather than resorting to the clustered index, which organizes the entire underlying table.
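
When a query asks for a column that is not in the nonclustered index, SQL Server has to follow those pointers back to the base data (a key lookup against a clustered table, or a RID lookup against a heap). One way around that, shown only as an illustrative sketch and deliberately not part of the tests below since it would change their results, is to add the extra column to the index with INCLUDE so the index covers the query:

-- Hypothetical covering index, for illustration only
CREATE NONCLUSTERED INDEX IX_pkIndex_covering
ON dbo.pkIndex (n, singleChar)
INCLUDE (bigIntData)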

SQL Server Performance Testing for SQL Server Clustered vs. NonClustered Indexes

In order to demonstrate the performance differences between the different types of indexes, I set up a few tests. I created four tables with identical junk data. The tables are:

  • No indexes
  • Clustered index created along with the primary key
  • Only a non-clustered index
  • Both clustered and non clustered indexes

Here is my test script:

/*
This script will generate the tables used for testing clustered and nonclustered indexes and
populate them.  This can take quite a while to run.  Almost 20 minutes on my test machine.
*/

use TestDb
GO

-- First, create a table with no indexes as a control.

if OBJECT_ID('noIndex', 'U') is not NULL
 drop table dbo.noIndex

CREATE TABLE dbo.noIndex(
 n int NOT NULL, -- part of index
 singleChar char(1) NOT NULL, -- part of index
 stringData char(2000) NOT NULL,
 bigIntData bigint NOT NULL,
 decimalData decimal(18, 0) NOT NULL
) 

GO

--Populate the table with data
insert into dbo.noIndex(n, singleChar, stringData, bigIntData, decimalData)
select top 1000000 --1 Million, large enough to take some time on searches
 row_number() over (order by s1.name) as n,
 CHAR((row_number() over (order by s1.name) % 89) + 33) as singleChar,
 REPLICATE(CHAR((row_number() over (order by s1.name) % 89) + 33), 2000) as stringData,
 row_number() over (order by s1.name) * 1000000000 as bigIntData,
 row_number() over (order by s1.name) *1.1 as decimalData
from master.dbo.syscolumns s1,
 master.dbo.syscolumns  s2

------
--Now create one with a primary key, which will automatically create the clustered index

GO

if OBJECT_ID('pkIndex', 'U') is not NULL
 drop table dbo.pkIndex

CREATE TABLE dbo.pkIndex(
 n int NOT NULL,
 singleChar char(1) NOT NULL,
 stringData char(2000) NOT NULL,
 bigIntData bigint NOT NULL,
 decimalData decimal(18, 0) NOT NULL,
 constraint PK_pkIndextable primary key clustered (n, singleChar) 
) 

GO

--Populate the table with data
insert into 
 dbo.pkIndex(n, singleChar, stringData, bigIntData, decimalData)
select 
 n, singleChar, stringData, bigIntData, decimalData
from 
 dbo.noIndex


--------------
--Now a table with just the nonclustered index 
GO

if OBJECT_ID('nonclusteredIdx', 'U') is not NULL
 drop table dbo.nonclusteredIdx

CREATE TABLE dbo.nonclusteredIdx(
 n int NOT NULL,
 singleChar char(1) NOT NULL,
 stringData char(2000) NOT NULL,
 bigIntData bigint NOT NULL,
 decimalData decimal(18, 0) NOT NULL
) 

GO

--Populate the table with data
insert into 
 dbo.nonclusteredIdx(n, singleChar, stringData, bigIntData, decimalData)
select 
 n, singleChar, stringData, bigIntData, decimalData
from 
 dbo.pkIndex

 
create unique nonclustered index nonclusteredIdx_n
on dbo.nonclusteredIdx (n, singleChar)

--------------
--Finally, a table with both indexes
GO

if OBJECT_ID('bothIdx', 'U') is not NULL
 drop table dbo.bothIdx

GO

CREATE TABLE dbo.bothIdx(
 n int NOT NULL,
 singleChar char(1) NOT NULL,
 stringData char(2000) NOT NULL,
 bigIntData bigint NOT NULL,
 decimalData decimal(18, 0) NOT NULL,
 constraint PK_bothIdx primary key clustered (n, singleChar) 
) 

--Populate the table with data
insert into 
 dbo.bothIdx(n, singleChar, stringData, bigIntData, decimalData)
select 
 n, singleChar, stringData, bigIntData, decimalData
from 
 dbo.pkIndex

create unique nonclustered index both_nonclusteredIdx_n
on dbo.bothIdx (n, singleChar)

--------------------------------------------
--Playing around with some queries

select n, singleChar
from dbo.bothIdx
where n % 10 = 0

select n, singleChar
from dbo.bothIdx
where singlechar = 'a'

select n, singleChar
from dbo.noIndex
where n % 10 = 0

select n, singleChar, bigIntData
from dbo.nonclusteredIdx
where n % 10 = 0

select n, singleChar, bigIntData
from dbo.pkIndex
where n % 10 = 0
order by n desc

SQL Server Execution Plans for SQL Server Clustered vs. NonClustered Indexes

With those tables available, I found it helpful to look at some of the execution plans for queries run on those tables.

For instance, SQL Server estimates that running a simple count on the table with the clustered index is substantially more expensive than running it on the table with only the nonclustered index:

[Execution plans: estimated cost of a simple COUNT(*) on the clustered index table vs. the nonclustered index table]
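
If you want to reproduce that comparison yourself, a simple sketch is to run both counts with I/O and time statistics on. The nonclustered index has far fewer pages to read because it does not carry the 2000-character column in every row:

SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT COUNT(*) FROM dbo.pkIndex          -- scans the clustered index (full rows)
SELECT COUNT(*) FROM dbo.nonclusteredIdx  -- can scan the much narrower nonclustered index

SET STATISTICS IO OFF
SET STATISTICS TIME OFF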

This is borne out by the fact that the optimizer will use the nonclustered index when this query is run against a table with both, as well as by some time trials I ran on my machine with Python that compared the execution times.  Here is that sample code followed by some of the results:

# -*- coding: utf-8 -*-
"""
@author: TimothyAWiseman

Not using timeit here since we want to clear out the cache before 
every single run.  Given the need to clear out the cache each time
and the desire to not include the time to clear the cache in the measurements
timeit is less than optimal.  Therefore using a timing wrapper instead.

There is more on clearing the cache for SQL Server at:
    http://www.mssqltips.com/sqlservertip/1360/clearing-cache-for-sql-server-performance-testing/
    
Not all of these tests made it into the tip.

"""

#import useful modules
import pyodbc #to connect to SQL Server
import matplotlib.pyplot as plt #graph the results
import numpy as np
import time

##########################################################
#Establish parameters that will be used throughout the running of the script
#These are, in effect, settings.  If this were meant as a complete program
#I would shut these off into a configuration file or make them user 
#selectable.  As a custom, test script it makes sense to just include them here.


#Must configure the conn string for your instance of SQL Server
sqlConnStr = ''  # Insert your SQL connection string here

#Leaving off the schema from the table names for ease of
#labeling graphs.

tableNames = ['noIndex', 'pkIndex', 'nonclusteredIdx', 'bothIdx']

#########################################################
#Create the utility functions

def time_wrapper(func):
    def wrapper(*arg, **kw):
        start = time.time()
        result = func(*arg, **kw)
        end = time.time()
        return (end - start), result
    return wrapper
    
def clearCache(curs):
    curs.execute('checkpoint')
    curs.execute('dbcc dropcleanbuffers')
    
def getRowCount(curs, tableName):
    """Gets the rowcount of the table named by tableName."""
    sql = """select count(*) 
            from {}""".format(tableName)
    curs.execute(sql)
    return curs.fetchone()
    
def getInfoFromIndex(curs, tableName):
    """Gets information from that is entirely contained within the index
    key.
    
    The where clause is included to reduce the amount of raw data returned to
    keep the execution times reasonable and to prevent the network trips
    from swamping out the execution time of the query on the server.
    """
    sql = """select n, singlechar
            from {}
            where n%1000 = 0""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()
    
def getInfoOutsideIndex(curs, tableName):
    """This pulls in a column that requires information that is not in the 
    index."""
    
    sql = """select n, singleChar, bigIntData, decimalData
            from {}
            where n%1000 = 0""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()
    
def getTop100(curs, tableName):
    """This pulls in a column that requires information that is not in the 
    index."""
    
    sql = """select top 100 *
            from {}""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()
    
def getTop100Order(curs, tableName):
    """This pulls in a column that requires information that is not in the 
    index."""
    
    sql = """select top 100 *
            from {}
            order by n""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()
    
    
def getTop100OrderDesc(curs, tableName):
    """This pulls in a column that requires information that is not in the 
    index."""
    
    sql = """select top 100 *
            from {}
            order by n desc""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()
    
    
#Originally I tried to do a lookup for just one value, but the results
#were essentially random as other factors swamped out the time for SQL to 
#execute, and most results were essentially 0.  This worked better,
#but I found I needed to do more reps to get anything reliable.   Also,
#this comes across best if the no index table is left out since it is
#so dramatically slower than any of the others. 
def specificValueLookups(curs, tableName):
    """Looks for just one row"""
    sql = """select n, singleChar, stringData, bigIntData, decimalData
            from {}
            where n in (1234, 456, 789, 1, 101112, 2, 131415, 5)""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()
    
def getRangeIndexOnly(curs, tableName):
    """Gets a range looking for only values that are part of the index.
    Expect this to favor the nonclustered index."""
    sql = """select n, singleChar
            from {}
            where n between 100 and 20000
            and singleChar between 'A' and 'Z'""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()    
    
def getRangeAllValues(curs, tableName):
    """Gets a range looking for only values that are part of the index.
    Expect this to favor the nonclustered index."""
    sql = """select *
            from {}
            where n between 100 and 20000
            and singleChar between 'A' and 'Z'""".format(tableName)
    curs.execute(sql)
    return curs.fetchall()  
    
    
def makeGraphForTimes(funcToTime, curs, tableNames, graphFileName, graphTitle='', numReps = 6):
    """Times the repeated iteration of the function and makes a graph out of it.
    The function needs to take the parameters (cursor, tableName) or this will generate an error."""
    timedFunc = time_wrapper(funcToTime)
    resultsDict = {}
    for tableName in tableNames: 
        resultsDict[tableName] = []
        for i in range(numReps):
            clearCache(curs)
            thisIterTime, result = timedFunc(curs, tableName)
            resultsDict[tableName].append(thisIterTime)
    #print resultsDict #used for testing
    avgs = {x: np.mean(resultsDict[x]) for x in resultsDict}
    plt.figure()
    width = .6
    plt.bar(np.arange(len(avgs)), list(avgs.values()), width, align = 'center')
    plt.xticks(np.arange(len(avgs)), list(avgs.keys()), rotation = 17, size = 'small')
    plt.ylabel('Time in Seconds')
    plt.title(graphTitle)
    plt.savefig(graphFileName)
    plt.close('all')
    
    
            


########################################################
#Execute the main script

if __name__ == '__main__':
    #autocommit 
    sqlConn = pyodbc.connect(sqlConnStr, autocommit = True)
    curs = sqlConn.cursor()
    makeGraphForTimes(getRowCount, curs, tableNames, 'RowCountTest.jpg', 'Row Count Execution Time')
    makeGraphForTimes(getInfoFromIndex, curs, tableNames, 'DataFromIndex.jpg', 'Query Covered by Index')
    makeGraphForTimes(getInfoOutsideIndex, curs, tableNames, 'DataOutOfIndex.jpg', 'Query not covered by Index')
    
    #Removing the noIndex for some of these because it makes the graph hard to read
    tnMinusNoIndex = [x for x in tableNames if x != 'noIndex']
    
    
    makeGraphForTimes(getTop100, curs, tnMinusNoIndex, 'Top100.Jpg', 'Top 100 No Conditions', numReps = 20)
    makeGraphForTimes(getTop100Order, curs, tnMinusNoIndex, 'Top100Order.Jpg', 'Top 100 Ordered By', numReps = 20)
    makeGraphForTimes(getTop100OrderDesc, curs, tnMinusNoIndex, 'Top100OrderDesc.Jpg', 'Top 100 Descending', numReps = 20)
    
    makeGraphForTimes(specificValueLookups, curs, tnMinusNoIndex, 'SpecificValueLookup.Jpg', 'Lookup One Row', numReps = 500)    
    makeGraphForTimes(getRangeIndexOnly, curs, tnMinusNoIndex, 'RangeIdxOnly.Jpg', 'Lookup Range for Values in Index', numReps = 50)
    makeGraphForTimes(getRangeAllValues, curs, tnMinusNoIndex, 'RangeAllValues.Jpg', 'Lookup Range for All Values', numReps = 100)

 

Let's start digging into the results.  You can see with the row count example, the table with only the nonclustered index performed the best.

[Graph: row count execution time across the four test tables]

Similarly, nonclustered indexes do better when the select statement is entirely covered by the index.

[Graph: execution time for the query entirely covered by the index]
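
One simple way to see why, sketched here with the same test tables, is to run the covered query from the Python tests with SET STATISTICS IO ON and compare the logical reads reported for each table:

SET STATISTICS IO ON

SELECT n, singleChar FROM dbo.pkIndex WHERE n % 1000 = 0          -- reads the wide clustered index
SELECT n, singleChar FROM dbo.nonclusteredIdx WHERE n % 1000 = 0  -- reads only the narrow index pages

SET STATISTICS IO OFF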

Of course, clustered indexes shine in retrieving the entire row for specific values, as the optimizer shows if we look at the execution plans for a simple query like "select * from {tablename} where n = 5":

[Execution plans: SELECT * lookups for specific values, where the clustered index shines]
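
To see both access paths side by side on the table that has both indexes, one option is to force each index with a hint; this is purely for experimentation, not something to leave in production code:

SELECT * FROM dbo.bothIdx WITH (INDEX(PK_bothIdx)) WHERE n = 5              -- clustered index seek
SELECT * FROM dbo.bothIdx WITH (INDEX(both_nonclusteredIdx_n)) WHERE n = 5  -- nonclustered seek plus key lookups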

Conclusion

Clustered indexes and nonclustered indexes often perform best under different circumstances. Therefore it can be beneficial for certain types of operations to create a nonclustered index which is identical to the clustered index on a table. However, the benefits in retrieval need to be balanced against the additional overhead that maintaining that index will cause.

Next Steps




About the author
Tim Wiseman is a SQL Server DBA and Python Developer in Las Vegas.

This author pledges the content of this article is based on professional experience and not AI generated.




Comments For This Article




Thursday, January 15, 2015 - 8:33:27 PM - Geoff Back To Top (35961)

"It is pretty obvious that a covering non cluster will out perform a cluster on a subset of columns especially when you are leaving the big columns out of the projection list."  Really?  How so?  This implies that creating a non-clustered index on the clustering key should be a best practice for the purpose of improving queries that rely on the clustering key, where the data being retrieved is on a subset of columns ...  Queries like those cover an awful lot of ground, but I have never heard of any such best practice, and I've been a SQL-Server DBA for more years than I care to admit (was a MSFT employee / software engineer from the earliest days of SQL-Server). 

However, I don't necessarily disagree, Here is what I have, which brought me to this web page:

  • A table whose clustered index is just an identity int. 
  • The table is a very wide table containing millions of rows. 
  • Someone else created (for reasons not known to me) a non-clustered index on the clustering key with one include.  
  • I noticed that a particular query in a sproc was responsible for the lion's share of the sproc's run-time, so I looked at the query plan.
  • The plan revealed that it was doing an index seek on the non-clustered index with a Key lookup.  53% was dedicated to the seek, and 7 % to the key lookup (other stuff is happening in the plan too).  This was part of a loop-join.

I was struck by the fact that it was choosing to use this non-clustered index with a key lookup.  The non-clustered index did not cover the query.  Otherwise, it could have just used the clustered index (and obviously no key lookup). 

I gave it an index hint to force it to use the clustered index, and that did speed it up some (3 min 30 seconds to 3 min 15 seconds).  The plan changed to a hash join.

I decided to see what would happen if I adjusted the index to make it perfectly cover the query, and added the column as an additional include for which it was using the key lookup.  The runtime went from 3 min 30 seconds to just 10 seconds!!!!!  That's for the entire sproc.

That's a huge difference.  The change in the query plan did not make sense either.  Remember, without the  extra included column it was doing a loop-join with an index SEEK and key lookup.  With the included column, it did a hash join with an index SCAN and no key lookup.  I would expect this to be slower, not MUCH FASTER.  Remember, the key lookup was only 7%, and the index seek was 53%. 

If you or anyone else has any insight to the behavior and results I am seeing, I would love to hear it. 

Specifically, I would love to know:

1. Under what circumstances does it generally make sense to create a non-clustered index on the clustering key?  This is not a topic covered by books online or anywhere else, and I really don't think there is anything obvious about it.  I'm hoping for some sort of rule of thumb here, and WHY.

2. Why would a covering non-clustered index be sooooo much faster than the clustered index, where the clustered index also completely covers the query? 

Thanks.


Thursday, October 10, 2013 - 7:08:28 AM - Len Esterhuyse Back To Top (27098)

select n, singleChar--, bigIntData
from dbo.nonclusteredIdx
where n % 10 = 0

(100000 row(s) affected)
Table 'nonclusteredIdx'. Scan count 1, logical reads 1983, physical reads 0, read-ahead reads 260, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

select n, singleChar--, bigIntData
from dbo.pkIndex
where n % 10 = 0

(100000 row(s) affected)
Table 'pkIndex'. Scan count 3, logical reads 335288, physical reads 6004, read-ahead reads 333490, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

select n, singleChar, bigIntData
from dbo.nonclusteredIdx
where n % 10 = 0

(100000 row(s) affected)
Table 'nonclusteredIdx'. Scan count 1, logical reads 101983, physical reads 2, read-ahead reads 18398, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

select n, singleChar, bigIntData
from dbo.pkIndex
where n % 10 = 0

(100000 row(s) affected)
Table 'pkIndex'. Scan count 1, logical reads 334654, physical reads 5744, read-ahead reads 333356, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)


Monday, September 30, 2013 - 10:48:47 PM - TimothyAWiseman Back To Top (26998)

RUwin  Yes, I see what you are saying.  Making all the non-key columns included would for all intents and purposes recreate the clustered index.  That would create a new B-tree which included the entire contents of the table in the structure of the B-tree...which is in a sense what a clustered index turns the table into.  

I suppose I could have been clearer in saying that it is precisely because the non-clustered index is not encumbered by all the other columns that it is faster in some cases.  That is one of those intrinsic differences between clustered and non-clustered indexes I was talking about.  

And I didn't mean to imply you were being condescending.  It is fairly obvious if you have a deep knowledge of how indexes work and take the time to think about it.  But a lot of DBAs, especially those in the "accidental DBA" category don't have that, at least not at first.  I had been a DBA for a decent amount of time before I stumbled over indications of this kind of thing experimentally and I didn't understand why until I started reading about how indexes worked. It is a topic I am still learning more about.  I haven't read that particular book, but I highly recommend Itzik Ben-Gan as an author, I have been a fan for a long time.  Gail Shaw also has some great pieces on SQL Server indexes.

As for the "SET STATISTICS IO ON", I played with some of that with these queries while I was writing this article (and use it often when I try to test and optimize queries).  But I thought the graphs made for a better presentation in this article, especially since I wanted to be able to repeat the trials many times and average the results.  Python let me do that fairly easily by directly measuring the time for the entire query.  The trends play out about the same way in either method, though of course the exact numbers you get are somewhat difference since the method I chose includes the time for Python to send over the query and receive the results.

Thanks again for the feedback. 


Monday, September 30, 2013 - 4:53:31 PM - RUnwin Back To Top (26997)

Sorry I didn't mean to sound condescending or anything by saying it was obvious. Yes I understand indexing quite well.

If you want to mirror the cluster you would need to include all the non-key columns which would yield an index structure that mirrors the cluster and therefore not generate any performance benefit because your reads would be identical, as would the page structures. If you haven't used it I suggest using SET STATISTICS IO ON; at the top of the queries and looking at the output in messages. You'll see the actual number of reads done in the query. You'll then see why the cluster or non-cluster was used.

You will benefit even more from making the non clustered indexes fully covering in which case you'll see even more benefits, no lookups into the heap or cluster. If you're interested then T-SQL Querying by Itzic Ben Gan (Not sure on his names spelling) is a book that looks at this in some detail in one of the chapters - I think you can now download the 2008 version for free. Indexes broadly speaking are most to least efficient in the following order (Although it does depend on the projection list and predicates obviously):-

non-clustered seek (Covering index)

non-clustered seek with lookup (Not covering, not clustered key predicate)

clustered seek

non-clustered seek and scan (Covering index)

non-clustered seek and scan (Not covering with lookups, not clustered key predicate - before you get to the flipping point of scanning the cluster)

clustered scan

Note:- Once the non-cluster partial scan and lookup (Which goes through the full depth of the cluster for every record retrieved from the non-cluster) will do more reads (Not strictly true due to how SQL Server does costing but close enough :) ) than just scanning the cluster, then a cluster scan is preferred and used over a non-clustered scan and lookup. The bit about it not being strictly true is that SQL Server estimates IO costs differently for consecutive reads and random IO so it's not quite as simple as I made it out to be.

At least that's how I remember it :) Anyone who wants to correct me feel free. It's just a high level overview though and the details can get complicated very quickly. Even more so with filtered indexes and the likes which I didn't mention at all.


Monday, September 30, 2013 - 3:25:35 PM - TimothyAWiseman Back To Top (26994)

Hello, RUnwin.  Thank you for the constructive feedback.

You are right that there is a huge difference between indexes with the same key columns and the same index.  Here, I tried to create a nonclustered index that was comparable to the clustered index.  I used the same key columns in the same order, added the unique requirement, and left the sort direction to the default in all cases (which is ASC).  The intrinsic differences between clustered and nonclustered indexes (with clustered indexes logically ordering the table and changing it from a heap where nonclustered indexes create a separate b-tree that is stored separately) is part of the point of the article.  If you see any other differences or any ways to make the non-clustered index more comparable to my clustered index I would love to hear about it and I would work with the fantastic editors here to include any updates it might yield.

I'm glad you find it obvious.  I take that to indicate that you are a smart person who has researched how indexes work.  Unfortunately, it wasn't obvious to me when I first found out and I thought it was surprising.  I stumbled over it a couple of years ago when I was doing performance testing to optimize a routine.  I used non-clustered indexes extensively, but I used them on non-key columns to supplement the clustered one.   Before I stumbled over that, I would never have thought to create a non-clustered index that was almost identical to the clustered one.  It wasn't until I took some courses on database theory and the theory of database design that I understood why it was true.  And it wasn't until quite a while after that I got around to doing systematic testing and wrote up this article.  If it is obvious to you, hopefully it will help someone who is a bit less experienced.

Thanks again for the feedback and please let me know if you have any more thoughts.


Monday, September 30, 2013 - 10:44:45 AM - RUnwin Back To Top (26992)

Those non clusters are not identical to the cluster. If they were you would see no performance benefit. There is a huge difference between the same key columns and the same index. I can create many different indexes on those tables with the same key columns.

Barring that obvious flaw it is pretty obvious that a covering non cluster will out perform a cluster on a subset of columns especially when you are leaving the big columns out of the projection list.

Bit simplistic and needs to explain those points but otherwise a nice explanation. Also you should at least provide stats io and show why it's better to have decent non clusters.














