SQL Server Indexing Basics
I have seen your tips on indexing and I think they are great. I am relatively new to SQL Server and need to understand the basics. I need some additional background information to understand what sorts of indexing options are available. Hopefully that can lead me into using some of the additional tips. Can you help me?
As you begin to design and develop databases, be sure to include indexing in that effort. A well thought out indexing strategy can significantly improve the data access layer and overall user experience. However, caring for the database indexes is not a one time event. As your application changes as well as when users start to use the application in new and different ways, your indexes need to reflect these changes. In addition, as with anything else, indexes need to be maintained. So be sure to have your indexes included in the regularly scheduled maintenance process.
With all of this being said, let's start to cover some of the basic indexing information. In this tip, we will try to simplify a number of complex and interdependent topics. As such, the goal of this tip is to introduce the topics and provide additional references for an expanded information. If you are looking for more intermediate and advanced information, check out the indexing category. In addition, check out these resources if you are looking into Full Text Search or XML indexes as opposed to traditional relational engine indexes. With that being said, let's jump in!
A clustered index stores the data for the table based on the columns defined in the create index statement. As such, only one clustered index can be defined for the table because the data can only be stored and sorted one way per table. Due to the storage and sorting impacts, be sure to carefully determine the best column for this index.
Although many implementations only have a single column for the clustered index, in reality a clustered index can have multiple columns. Just be careful to select the correct columns based on how the data is used. The number of columns in the clustered (or non clustered) index can have significant performance implications with heavy INSERT, UPDATE and DELETE activity in your database.
An age old question is whether or not a table must have a clustered index. The answer is no, but in most cases, it is a good idea to have a clustered index on the table to store the data in a specific order. If a table only has non-clustered indexes it is called a heap. For more information check out Clustered Tables vs Heap Tables.
Another age old question is whether or not the Primary Key for the table must be the clustered index. The answer once again is no. In many implementations the Primary Key is also the clustered index, but it does not have to be.
Non Clustered Indexes
A non clustered index can consist of one or more columns, but the data storage is not dependent on this create index statement as is the case with the clustered index. For a table without a clustered index, which is called a heap, the non clustered index points the row (data). In the circumstance where the table has a clustered index, then the non clustered index points to the clustered index for the row (data).
In terms of the number of non clustered indexes, a single table can have up to 249 non clustered indexes. Although, too much of a good thing can become bad. Keep in mind that SQL Server needs to keep the indexes updated as you INSERT, UPDATE and/or DELETE data. As such, it is necessary to strike a balance with the number of indexes created for each table based on the activity on the table i.e. SELECT, INSERT, UPDATE and/or DELETE transactions.
When you create an index the fill factor option indicates how full the leaf level pages are when the index is created or rebuilt. Valid values are 0 to 100. These values represent a percentage of the leaf level pages being used when the index is created or rebuilt. A fill factor of 0 means that all of the leaf level pages are full. The same is true with a fill factor equal to 100.
Here are a few aspects to take into consideration when you select your fill factor:
- Depending on how the data is inserted, updated and deleted in the table dictates how full the leaf level pages should be in the table. To fine tune this setting typically takes some testing and analysis. This could be critical for large active tables in your database.
- If data is always inserted at the end of the table, then the fill factor could be between 90 to 100 percent since the data will never be inserted into the middle of a page. UPDATE and DELETE statements may expand (UPDATE) or decrease (DELETE) the space needed for each leaf level page. This should be fine tuned based on testing.
- If the data can be inserted anywhere in the table then a fill factor of 60 to 80 percent could be appropriate based on the INSERT, UPDATE and DELETE activity. However, it is necessary to conduct some testing and analysis to determine the appropriate settings for your environment.
- With all things being equal i.e. table size, SQL Server versions, options, etc., the lower the fill factor percentage the more storage that could be needed as compared to a higher fill factor where the pages are more compact.
- Another aspect to take into consideration is your index rebuild schedule. If you cannot rebuild your indexes on a regular schedule and if you have a high level of INSERT, UPDATE and DELETE activity throughout the table, one consideration may be to have a lower fill factor to limit the fragmentation. The trade-off may be that more storage is needed.
For more information on fill factor considerations, check out:
- Fragmentation Station - Stop #4 - How to avoid it
- SQL Server script to rebuild all indexes for all tables and all databases
- SQL Server Index Checklist
Covering indexes typically consist of 2 or more columns and have all of the indexes needed to fulfill a query. One scenario where covering indexes are created is to prevent a bookmark lookup.
The index order can either be in ascending or descending order. Depending on how your queries return the data dictates how the index should be created. Having the correct index order can improve the performance of queries in many circumstances especially when you have a covering index. Check out eight examples in the Building SQL Server Indexes in Ascending vs Descending Order tip for detailed performance metrics.
Indexes, just like just about everything else in the world, need maintenance. Indexes need maintenance because they become fragmented. Fragmentation is caused by INSERT, UPDATE and DELETE activity on a table splitting pages so that the logical and physical order of the pages do not match. To resolve this issue, fragmented indexes should be rebuilt on a regular basis.
At MSSQLTips, we care a great deal about fragmentation, so check out these tips:
- Identify Database Fragmentation in SQL Server 2000 vs SQL Server 2005
- Index Fragmentation Report in SQL Server 2005 and 2008
- Managing SQL Server Database Fragmentation
- SQL Server Index Maintenance Checklist
- Index Rebuilds in SQL Server 2000 vs SQL Server 2005
- Custom Index Defrag / Rebuild Procedures
- Rebuilding Clustered Indexes Efficiently with the DROP_EXISTING command in SQL Server
- Fixing Index Fragmentation in SQL Server 2005 and SQL Server 2008
- Performing maintenance tasks in SQL Server
- Fragmentation Station - 8 part series by Chad Boyd
Although we have only scratched the surface with indexes, we also want to make sure you are aware of the following indexing related topics:
- Dynamic Management Views - Introduced with SQL Server 2005, the DMV's provide insight into index usage, missing indexes, index operations, etc.
- Additional Information
- Included columns - Introduced with SQL Server 2005, previously un-indexable columns can now be indexed.
- Additional Information - Improve Performance with SQL Server 2005 Covering Index Enhancements
- Filtered Indexes - Introduced with SQL Server 2008, a WHERE clause can be used to specify the index definition.
- Additional Information - SQL Server Filtered Indexes - What They Are, How to Use and Performance Advantages
- File Groups - Non clustered indexes can be created in separate file groups which can reside on separate disk drives to improve the data access i.e. I/O operations.
- Additional Information - Hard Drive Configurations for SQL Server
- Disabling Indexes - After you have gone through all of the trouble to design you indexes, you have the option to disable them as opposed to just dropping them.
- Additional Information - Disabling Indexes in SQL Server 2005 and SQL Server 2008
- Once you have a baseline set of knowledge about indexes, one of the first steps you should take is building a plan to either build, rebuild or re-design your indexes. This plan should include expanding your knowledge before just jumping into managing your database indexes. Indexing can easily make or break a user experience due to poor application performance. Indexing can also become expensive in terms of new hardware. Our team has been in situations where properly designed and maintained indexes can extend the life of a hardware platform.
- As we indicated at the beginning of this tip, this information was not intended to cover all of the aspects of indexing, but rather serve as a baseline set of information. As such, here are some additional tips to continue to learn about indexing:
- Do you have more questions above indexing? Did we miss a topic you were expecting? If so, please visit the link to the forums below and post your question. Your question may become a future tip. Happy indexing!
Last Updated: 2009-07-14
About the author
View all my tips