![]() |
|
|
By: Arshad Ali | Read Comments (3) | Print Arshad is a SQL and BI Developer focusing on Data Warehousing projects for Microsoft. Related Tips: More |
|
In this tip, I am going to walk through what a Filtered Index is, how it differs from other indexes, its usage scenario, its benefits and limitations.
Solution
A Filtered Index, which is an optimized non-clustered index, allows us to define a filter predicate, a WHERE clause, while creating the index. The B-Tree containing rows from the filtered index will contain only those rows which satisfy the filter criteria used while creating the index. This optimized index offers several benefits over a full table non-clustered index as follows:
How it differs from Indexed views
In prior versions of SQL Server, to get a similar benefit we had an option to use indexed views or partitioning the table. Though the approach of indexed view looks similar to a filtered index there are some significant differences between these two concepts, I have summarized some of them in the below table:
| Criteria | Filtered Index | Indexed Views |
| Only One Table | A Filtered Index is created on column(s) of a particular table. |
Index Views can be created on column(s) from multiple base tables. |
| Simple WHERE criteria | A Filtered Index can not use complex logic in its WHERE clause, for example the LIKE clause is not allowed, only simple comparison operators are allowed. | This limitation does not apply to indexed views and you can design your criteria as complex as you want. |
| Can do Online Rebuild | A Filtered Index can be rebuilt online. | Indexed views cannot be rebuilt online. |
| Non-Unique or Unique | You can create your Filtered Index as a non-unique index. | Indexed views can only be created as unique index. |
Usage Scenario
The benefit of using a filtered index is apparent in the scenario when you only select a subset of records from a huge table, for example I am providing some usage scenarios below where it will be of beneficial use:
Example
In Script #1, I am creating an Employee table which will have two columns HireDate and DOJ of data type DATE (a new data type in SQL Server 2008 to store only the date component). I am creating a full table non-clustered index on the HireDate column and a non-clustered filtered index on the DOJ column. Further I am adding 10,000 records into this table, the HireDate and DOJ columns will have 90% NULL values and only 10% NOT NULL values (both columns will have same data for demonstration purposes).
|
Script #1 |
-- Create Employee table with a clustered index on primary key CREATE TABLE Employee ( EmpID INT PRIMARY KEY CLUSTERED, EmpName VARCHAR(100) NOT NULL, HireDate DATE NULL, --DATE is a new data type in SQL Server 2008 DOJ DATE NULL, --DATE is a new data type in SQL Server 2008 ) GO -- Creating a non clustered index on HireDate column CREATE NONCLUSTERED INDEX NC_Employee_HireDate ON Employee(HireDate) GO -- Creating a non clustered Filetered index on DOJ column CREATE NONCLUSTERED INDEX FI_Employee_DOJ ON Employee(DOJ) WHERE DOJ IS NOT NULL --Notice here the filter criteria for the index GO DECLARE @intCounter INT = 1 --With SQL 2008, you can assign value in variable declaration as well WHILE @intCounter <= 10000 BEGIN IF (@intCounter % 100) = 0 BEGIN INSERT INTO Employee(EmpID, EmpName, HireDate, DOJ) VALUES (@intCounter, 'ABC' + CAST(@intCounter AS VARCHAR), CAST((GETDATE() - @intCounter) AS DATE) , CAST((GETDATE() - @intCounter) AS DATE)) END ELSE BEGIN INSERT INTO Employee(EmpID, EmpName, HireDate, DOJ) VALUES (@intCounter, 'ABC' + CAST(@intCounter AS VARCHAR), NULL, NULL) END SET @intCounter = @intCounter + 1 END |
Script #2 will show you, how the two columns of the same data type, having same amount of data need different amount of storage space for the full table non-clustered index vs the filtered non-clustered index. If you look closely at the image below, you will notice the filtered index has only 100 rows (only those rows matching the filter criteria) whereas the full table non clustered index has all rows of the table; in other words the filtered index contains only those rows which meet the filter criteria and since the filtered index contains only rows matching the filter criteria it requires less space (index pages) to store those records compared to the full table non-clustered index.
|
Script #2 |
SELECT ind.index_id, ind.name, ind.type_desc, par.reserved_page_count, par.used_page_count,
par.row_count, ind.filter_definition FROM sys.dm_db_partition_stats par
INNER JOIN sys.indexes ind ON par.object_id = ind.object_id AND par.index_id = ind.index_id
WHERE par.object_id = OBJECT_ID('Employee')Query Result: |
Script #3 will show you the performance benefits of using a filtered index. First it will execute a checkpoint and a DBCC DROPCLEANBUFFERS to flush the changes to disk and clear the buffers. Further, to see IO statistics I am setting STATISTICS IO to ON. The next two queries are the same except the index name is used as a query hint. The first query uses the query hint to instruct the Query optimizer to use the full table non-clustered index whereas the second query uses a query hint to instruct the Query optimizer to use the filtered index, by providing the index name as input to the index query hint.
If you look at the IO statistics result of these queries, you will notice that the query which uses the full table non-clustered index has high IO operations compared with the query which uses the filtered index. Further if you look at the execution plan for these queries, you will notice the first query (which uses the full table non-clustered index) has 93% relative query cost compared to only 7% for the second query (which uses the filtered index).
|
Script #3 |
CHECKPOINT DBCC DROPCLEANBUFFERS -- It clears out the SQL Server buffer cache GO SET STATISTICS IO ON SELECT * FROM Employee WITH(INDEX(NC_Employee_HireDate)) WHERE DOJ IS NOT NULL SELECT * FROM Employee WITH(INDEX(FI_Employee_DOJ)) WHERE DOJ IS NOT NULL SET STATISTICS IO OFF IO Statistics result: Execution Plan: |
The execution plan generated for Script #3 above shows that both the queries use Key Lookups to pull additional columns requested in the SELECT statement which are not part of the indexes.
Script #4 shows you how you can INCLUDE additional columns in your index itself and improve the query performance if your index key does not have all the columns queried within the SELECT statement. First you need to drop the indexes we created above and recreate the indexes, as shown in Script #4, with the INCLUDE clause to include remaining columns (Please note, I am not including the primary key column, because the primary key is automatically included in all full table non-clustered and filtered indexes). Once you have created the indexes with the INCLUDE, run the next query scripts (same as Script #3) and observe the IO statistics results and execution plans. You will notice a significant improvement this time compared to the previous run without the INCLUDE option. The IO and CPU cost dramatically reduce with the inclusion of the additional columns in the index, though on the maintenance side of things it would be a bit costlier, so a trade-off is needed.
|
Script #4 |
-- Creating a non clustered index on DOJ column CREATE NONCLUSTERED INDEX NC_Employee_HireDate ON Employee(HireDate) INCLUDE(EmpName, DOJ) --Including remaining columns in the index GO -- Creating a non clustered Filetered index on DOJ column CREATE NONCLUSTERED INDEX FI_Employee_DOJ ON Employee(DOJ) INCLUDE(EmpName, HireDate) --Including remaining columns in the index WHERE DOJ IS NOT NULL --Notice here the filter criteria for the index GO --------------------------------------------------------------------------------------- CHECKPOINT DBCC DROPCLEANBUFFERS -- It clears out the SQL Server buffer cache GO SET STATISTICS IO ON SELECT * FROM Employee WITH(INDEX(NC_Employee_HireDate)) WHERE DOJ IS NOT NULL SELECT * FROM Employee WITH(INDEX(FI_Employee_DOJ)) WHERE DOJ IS NOT NULL SET STATISTICS IO OFF IO Statistics result: Execution Plan: |
Notes:
| Share: | Share | Tweet |
|
![]() |
![]() |
Free SQL Server Learning |
| Sunday, June 12, 2011 - 11:53:13 AM - Amar | Read The Tip |
|
Great introductory article on Filetered Index that covered the topic from several angles. I found the layout and explanation well thought out as well. Thanks for taking the time to write about this!
|
|
| Thursday, December 29, 2011 - 3:31:44 PM - Jeff Roughgarden | Read The Tip |
|
Excellent explanation and demonstration of the benefits of filtered indexes. Thank you. |
|
| Thursday, March 08, 2012 - 10:14:33 AM - Vardan | Read The Tip |
|
Great article, thank you very much! |
|
|
privacy | disclaimer | copyright | advertise | about authors | contribute | feedback | giveaways | user groups Some names and products listed are the registered trademarks of their respective owners. Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com |