Improve Query Performance when SQL Server Ignores Nonclustered Index
Despite our efforts to optimize query performance by creating a non-clustered index, SQL Server seems to prioritize a clustered index scan over a non-clustered index seek followed by a key lookup. Surprisingly, the clustered index scan proves to be more efficient for SQL Server in this particular situation. However, we can still utilize a method to improve the query performance?
Indexing is one of the most important ways to improve query performance. A non-clustered index in a database is designed to enhance query performance by providing a fast lookup mechanism. It differs from a clustered index in that it doesn't determine the physical order of data in a table. Instead, it establishes a separate structure that holds the indexed columns and a pointer leading to the corresponding data rows. By indexing the columns, you create a data structure that allows for efficient searching and retrieval of information from a database.
Non-clustered indexes use a B-Tree structure. A B-tree index is a powerful tree-based data structure employed in databases to optimize search operations. The name "B-tree" stems from its balanced nature, where all leaf nodes reside at the same level, and the branching factor is intentionally kept low to expedite access to leaf nodes.
Once we have covered the basics of non-clustered indexes and the B-tree structure, it is essential to explore another significant concept: the tipping point. The tipping point represents a critical threshold in query planning, where the query optimizer switches from utilizing a non-covering, non-clustered index to performing a scan on the clustered index or heap. In this article, I will demonstrate a method to enhance query performance in cases where SQL Server disregards the non-clustered index.
Set Up Test Environment
I'll use the StackOverflow database and create an index on the Location column in the Users table. StackOverflow database is an open-source database from Stack Overflow.com.
Use StackOverflow GO Alter Database Current Set Compatibility_Level = 160 GO Create Index IX_Location On dbo.Users (Location) GO
To get IO statistics, use the command below:
SET STATISTICS IO ON GO
By executing the following query, you can retrieve a list of users who reside in India and Germany. To view the actual execution plan, simply press Ctrl + M.
Select * From dbo.Users u Where u.Location In (N'India', N'Germany') GO
In the image below, you can see that SQL Server has chosen to ignore the non-clustered index and has instead performed a scan on the clustered index.
As depicted in the image below, the number of logical reads exceeds 141000.
I want to test a solution to reduce the number of logical reads. The following script generates a temporary table and populates it with values for India and Germany.
Drop Table If Exists #Location Create Table #Location ([Location] Nvarchar(100) Collate SQL_Latin1_General_CP1_CI_AS) Insert Into #Location Values (N'India'), (N'Germany') GO Select u.* From dbo.Users u Inner Join #Location l On l.Location = u.Location GO
The image below shows that SQL Server employed a non-clustered index seek operation followed by a key lookup.
Take note of the number of logical reads depicted in the image below.
The number of logical reads is more than 224000. The number of logical reads has been increased. It has a detrimental effect on performance when dealing with concurrency.
In SQL Server, you can use the SET STATISTICS TIME statement to display the time it takes to execute a T-SQL statement.
SET STATISTICS TIME ON GO
The elapsed time for the clustered index scan operation is displayed in the image below. As indicated in the image, the elapsed time exceeds 2 seconds, and the CPU time exceeds 7 seconds.
Below we can see the elapsed time when SQL Server performs a non-clustered index seek operation followed by a key lookup. As indicated in the image, the elapsed time exceeds 1300 milliseconds, and the CPU time exceeds 700 milliseconds.
One approach results in lower elapsed time and CPU time but higher logical reads, while the other leads to higher elapsed time and CPU time but lower logical reads.
So, what is the solution?
It is important to avoid using SELECT * and fetching all columns whenever possible. While it is possible to include all columns of a table in a covered index, it is uncommon for this practice to be adopted. It negatively impacts the performance of DML operations and increases the overall size of the table, which is not wise to do. By retrieving fewer columns from a table, the likelihood of utilizing a covered index increases. I am going to modify the non-clustered index:
Drop Index IX_Location On dbo.Users Create Index IX_Location_Include On dbo.Users (Location) Include (DisplayName, Reputation, LastAccessDate) GO
I want to change the query as follows:
Select u.Location, u.DisplayName, u.Reputation, u.LastAccessDate From dbo.Users u Where Location in (N'India', N'Germany') GO
In the image below, you can observe that SQL Server utilized a non-clustered index seek.
The image below indicates that the elapsed time is 362 milliseconds, and the CPU time is 78 milliseconds. That is a speedup factor of about 4x.
Pay attention to the number of logical reads. It's only 541. That is amazing!
By including all columns returned in a query, a covering index avoids additional reads for data retrieval. This effectively reduces IO operations and enhances query performance. Again, including too many columns in a covering index should be avoided. Monitoring an index after creating it on a table is essential, as SQL Server is not guaranteed to utilize it. At times, SQL Server may employ a non-clustered index, but it can lead to a high number of logical reads. In general, avoid using SELECT * solely for the sake of convenience in writing code.
Indexing is one of the best ways to improve query performance and reduce IO. Sometimes, despite our efforts to enhance query performance by creating a non-clustered index, SQL Server disregards the index and fails to utilize it. Occasionally, SQL Server may use a non-clustered index, which can significantly increase logical reads. One possible solution in this scenario is to reduce the number of columns in the select list and utilize a covered index.
- Improve SQL Server Performance with Covering Index Enhancements
- SQL Server Indexes with Key and Non-Key Columns as Covering Indexes to Improve Performance
- SQL IN Operator
About the author
View all my tips
Article Last Updated: 2023-10-18