By: Aaron Bertrand | Updated: 2021-11-03 | Comments | Related: > Indexing
Problem
Index hints should be a last resort, but we all know how that tends to go in SQL Server. If you have committed to using index hints, be aware that at some point you may have to update those explicit references. A specific index may get renamed or dropped, making the hints invalid, or a new and better index may be created. How do we find these references to remove, update, or at least document them? I see a lot of queries out there that will search the plan cache for index hints (often using fuzzy matching), or that just happen to use a specific index, but what about queries with explicit index hints that aren't currently in the plan cache, and is index usage really what we're after?
Solution
It is important to be able to identify queries with index hints because, if the index is renamed or dropped, those queries will fail. Note that using a specific index is different from hinting a specific index. If the index is dropped, say, both queries will need to get new plans. The query that was merely using the dropped index will just get a new plan; the query that explicitly hints that index will fail:
Msg 308, Level 16, State 1
Index 'index_name' on table 'dbo.table_name' (specified in the FROM clause) does not exist.
Many of the queries out there return all of the plan cache entries that use a specific index, which is potentially more noise and less useful; we want to be sure to focus on only the queries that use explicit hints.
To set up an example, let's say we have this database and set of objects:
CREATE DATABASE IndexHintsTest;
GO USE IndexHintsTest;
GO CREATE TABLE dbo.t1(keycol int PRIMARY KEY, INDEX IX_keycol (keycol));
GO INSERT dbo.t1(keycol)
SELECT TOP (100) ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM sys.all_objects AS o
CROSS JOIN sys.all_objects i
ORDER BY o.object_id;
GO CREATE TABLE dbo.t2(keycol int PRIMARY KEY, INDEX IX_t2(keycol));
GO INSERT dbo.t2(keycol)
SELECT TOP (20) keycol
FROM dbo.t1
ORDER BY NEWID();
GO CREATE PROCEDURE dbo.t1_get_rare
AS
SELECT TOP (5) keycol FROM dbo.t1 WITH (INDEX (IX_keycol));
GO CREATE PROCEDURE dbo.t1_get_common
AS
SELECT TOP (6) keycol FROM dbo.t1 WITH (INDEX (IX_keycol)) ORDER BY keycol;
GO
Now, let's generate a few plans referencing these tables to give us something in the plan cache to go after:
DBCC FREEPROCCACHE; /* please not in production */
GO /* this should show up: */
GO
SELECT TOP (1) keycol
FROM dbo.t1 WITH (INDEX(IX_keycol))
ORDER BY keycol;
GO /* false positive that might show up from parsing text: */
GO
SELECT TOP (2) keycol FROM dbo.t1 AS IX_keycol /* not an INDEX */ ORDER BY keycol;
GO /* this should show up: */
GO
SELECT TOP (3) t1.keycol
FROM dbo.t1 WITH (INDEX(IX_keycol), FORCESEEK)
INNER JOIN dbo.t2 ON t1.keycol = t2.keycol
ORDER BY t1.keycol;
GO /* false positive that might show up if just looking for specific keywords
and the hinted index is not the one you're looking for */
GO
SELECT TOP (4) t1.keycol
FROM dbo.t1 WITH (FORCESEEK /*, INDEX(IX_keycol) */)
INNER JOIN dbo.t2 WITH (INDEX(IX_t2))
ON t1.keycol = t2.keycol ORDER BY t1.keycol;
GO /* this should show up: */
GO
EXEC dbo.t1_get_common;
GO
Note I'm being careful to not pollute the plan cache with the query from
t1_get_rare
to demonstrate how relying on just the
plan cache is unreliable. If we think about how we track queries, there are two
main sources: modules (views, procedures, triggers, functions) and ad hoc queries
that come from outside SQL Server (which I'll refer to as "Queries in
app code," but really this means any query from any source other than a module
inside SQL Server). Queries that have run recently are available in the plan cache,
and the set of all queries can be represented by this diagram:
We can use a query like this (a mash-up of queries by Jonathan Kehayias here and Kendra Little here) to locate those entries in the plan cache with an index hint specifically for IX_keycol:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
DECLARE @IndexName nvarchar(128) = QUOTENAME(N'IX_keycol'); ;WITH XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT Statement = LTRIM(SUBSTRING(sqltext.text, (qs.statement_start_offset / 2) + 1,
(CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(sqltext.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset) / 2 + 1)),
FullStatement = stmt.value(N'(@StatementText)[1]', N'nvarchar(max)'),
[Database] = obj.value (N'(@Database)[1]', N'nvarchar(128)'),
[Schema] = obj.value (N'(@Schema)[1]', N'nvarchar(128)'),
[Table] = obj.value (N'(@Table)[1]', N'nvarchar(128)'),
[Index] = obj.value (N'(@Index)[1]', N'nvarchar(128)'),
SourceObject = qp.objectid,
QueryPlan = qp.query_plan
FROM sys.dm_exec_query_stats as qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY query_plan.nodes
(N'/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt)
CROSS APPLY stmt.nodes
(N'.//IndexScan[@ForcedIndex=1]/Object[@Index=sql:variable("@IndexName")]') AS idx(obj)
CROSS APPLY sys.dm_exec_sql_text(qs.plan_handle) AS sqltext
OPTION(MAXDOP 1, RECOMPILE);
The key differentiator from many of the queries out there is that the expression
against stmt.nodes
is performing two different filters
- it is making sure that we only return the index with the given name and that has
a parent index operation that was forced. Here are the results:
If we want to be more generic and capture all queries with explicit index hints, we can just change this one line with the dual filter:
CROSS APPLY stmt.nodes
(N'.//IndexScan[@ForcedIndex=1]/Object[@Index=sql:variable("@IndexName")]') AS idx(obj)
To the following, which removes the filter against the specific index name:
CROSS APPLY stmt.nodes(N'.//IndexScan[@ForcedIndex=1]/Object') AS idx(obj)
And here are the new results, which now include the hint against IX_t2:
That's great for the ad hoc queries that are currently in the cache, and procedures that have been called recently. However, in addition to ad hoc queries and queries from procedures and other modules that are actively in the plan cache, there are also two other categories of queries that might use index hints:
- queries in procedures or other modules that haven't been executed
recently (e.g.
t1_get_rare
); and, - queries from user application code or ad hoc query windows that haven't been executed yet.
You could represent what we have solved so far with a slight modification to the earlier diagram:
For procedures that contain index hints but haven't been called recently enough to be in the plan cache, you could consider parsing all of the bodies to find them. However, as I've pointed out before on this site, this is cumbersome and unreliable. Instead, we can turn to the TSqlFragmentVisitor pattern, and build a list of objects and queries from the plan cache, using PowerShell.
Prerequisites, aside from PowerShell:
- sp_ineachdb, a stored procedure to help execute a command across all databases
- Microsoft.SqlServer.TransactSql.ScriptDom.dll, from SqlPackage
- If you want Out-GridView on Mac or Linux, you'll need something like PowerShell.GraphicalTools
I won't get into all of the details of everything going on here. The code is lengthy and lacks error handling, but gets the job done as a rough first cut:
$global:server = "127.0.0.1";
$global:index_hint_to_find = ""; Add-Type -Path "<path>/Microsoft.SqlServer.TransactSql.ScriptDom.dll"; $Parser = [Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser]($true)::New();
$Errors = [System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]]::New(); $filter = ".//IndexScan[@ForcedIndex=1]/Object[@Index=sql:variable(`"@IndexName`")]"; if ($global:index_hint_to_find -eq "")
{
$filter = ".//IndexScan[@ForcedIndex=1]/Object";
} $query = @"
DECLARE @sql nvarchar(max) = N'
DECLARE @IndexName nvarchar(128) = N''$global:index_hint_to_find''; CREATE TABLE #queries
(
dbname nvarchar(128),
schemaname nvarchar(128),
tablename nvarchar(128),
indexname nvarchar(128),
modulename nvarchar(600),
moduleid int,
handle varchar(64),
statement nvarchar(max),
fullstatement nvarchar(max)
); DECLARE @sql nvarchar(max) = N''
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; INSERT #queries
SELECT
[Database] = QUOTENAME(DB_NAME()),
[Schema] = NULL,
[Table] = NULL,
[Index] = NULL,
SourceModule = s.name COLLATE DATABASE_DEFAULT
+ N''''.'''' + o.name COLLATE DATABASE_DEFAULT
+ N'''' ('''' + RTRIM(o.type) + N'''')'''',
SourceModuleID = m.object_id,
handle = NULL,
Statement = NULL,
FullStatement = m.definition
FROM sys.sql_modules AS m
INNER JOIN sys.objects AS o
ON m.object_id = o.object_id
INNER JOIN sys.schemas AS s
ON o.[schema_id] = s.[schema_id]
WHERE m.definition LIKE N''''%INDEX%(%'' + @IndexName + N''%'''';''; EXEC master.dbo.sp_ineachdb @sql; SET @IndexName = QUOTENAME(@IndexName); SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; ;WITH XMLNAMESPACES(DEFAULT ''http://schemas.microsoft.com/sqlserver/2004/07/showplan'')
INSERT #queries
SELECT [Database] = COALESCE(obj.value (N''(@Database)[1]'', N''nvarchar(128)''),
QUOTENAME(DB_NAME(sqltext.dbid))),
[Schema] = obj.value (N''(@Schema)[1]'', N''nvarchar(128)''),
[Table] = obj.value (N''(@Table)[1]'', N''nvarchar(128)''),
[Index] = obj.value (N''(@Index)[1]'', N''nvarchar(128)''),
SourceModule = NULL,
SourceModuleID = qp.objectid,
handle = CONVERT(varchar(64), qs.plan_handle, 1),
Statement = LTRIM(SUBSTRING(sqltext.text, (qs.statement_start_offset / 2) + 1,
(CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(sqltext.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset) / 2 + 1)),
FullStatement = stmt.value(N''(@StatementText)[1]'', N''nvarchar(max)'')
FROM sys.dm_exec_query_stats as qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY query_plan.nodes
(N''/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple'') AS batch(stmt)
CROSS APPLY stmt.nodes
(N''$filter'') AS idx(obj)
CROSS APPLY sys.dm_exec_sql_text(qs.plan_handle) AS sqltext
OPTION(MAXDOP 1, RECOMPILE); DECLARE @exec nvarchar(255), @dbname sysname, @c cursor; SET @c = CURSOR FORWARD_ONLY READ_ONLY FOR
SELECT dbname FROM #queries
WHERE modulename IS NULL
AND moduleid IS NOT NULL
GROUP BY dbname; OPEN @c; FETCH NEXT FROM @c INTO @dbname; SET @sql = N''UPDATE q
SET modulename = s.name + N''''.'''' + o.name
+ N'''' ('''' + RTRIM(o.type) + N'''')''''
FROM sys.objects AS o
INNER JOIN sys.schemas AS s
ON o.schema_id = s.schema_id
INNER JOIN #queries AS q
ON q.moduleid = o.object_id
AND q.modulename IS NULL;''; WHILE @@FETCH_STATUS = 0
BEGIN
SET @exec = @dbname + N''.sys.sp_executesql'';
EXEC @exec @sql;
FETCH NEXT FROM @c INTO @dbname;
END SELECT * FROM #queries;'; EXEC sys.sp_executesql @sql;
"@ $Conn = New-Object System.Data.SqlClient.SqlConnection;
$Conn.ConnectionString = "Server=$($global:server);Database=tempdb;";
$Conn.ConnectionString += "Trusted_Connection=Yes; Integrated Security=SSPI;"
$Conn.Open(); $Command = $Conn.CreateCommand();
$Command.CommandText = $query;
$Command.CommandTimeout = 0; $Reader = $Command.ExecuteReader(); $global:table = New-Object system.Data.DataTable;
$col_dbname = New-Object system.Data.DataColumn dbname, ([string]);
$global:table.columns.add($col_dbname);
$col_schemaname = New-Object system.Data.DataColumn schemaname, ([string]);
$global:table.columns.add($col_schemaname);
$col_tablename = New-Object system.Data.DataColumn tablename, ([string]);
$global:table.columns.add($col_tablename);
$col_indexname = New-Object system.Data.DataColumn indexname, ([string]);
$global:table.columns.add($col_indexname);
$col_modulename = New-Object system.Data.DataColumn modulename, ([string]);
$global:table.columns.add($col_modulename);
$col_moduleid = New-Object system.Data.DataColumn moduleid, ([string]);
$global:table.columns.add($col_moduleid);
$col_handle = New-Object system.Data.DataColumn handle, ([string]);
$global:table.columns.add($col_handle);
$col_statement = New-Object system.Data.DataColumn statement, ([string]);
$global:table.columns.add($col_statement);
$col_fullstatement = New-Object system.Data.DataColumn fullstatement,([string]);
$global:table.columns.add($col_fullstatement);
$col_details = New-Object system.Data.DataColumn details, ([string]);
$global:table.columns.add($col_details); while ($Reader.Read())
{
$global:next = $false;
$global:collect_row = $false;
$global:details = "";
[string]$fullstatement = $Reader.GetValue(8).ToString(); $fragment = $parser.Parse([System.IO.StringReader]::New($fullstatement), [ref]$errors);
$visitor = [Visitor]::New();
$fragment.Accept($visitor); if ($global:collect_row -eq $true)
{
$row = $global:table.NewRow();
$row.dbname = $Reader.GetValue(0).ToString();
$row.schemaname = $Reader.GetValue(1).ToString();
$row.tablename = $Reader.GetValue(2).ToString();
$row.indexname = $Reader.GetValue(3).ToString();
$row.modulename = $Reader.GetValue(4).ToString();
$row.moduleid = $Reader.GetValue(5).ToString();
$row.handle = $Reader.GetValue(6).ToString();
$row.statement = $Reader.GetValue(7).ToString();
$row.fullstatement = $fullstatement;
$row.details = $global:details;
$global:table.Rows.Add($row);
}
} $global:table; # on Windows: $global:table | Out-GridView; class Visitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragmentVisitor
{
[void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragment] $fragment)
{
$ftn = $fragment.GetType().Name;
$hint = $global:index_hint_to_find; if ($global:next -eq $true -and $ftn -in ("Identifier", "IdentifierOrValueExpression"))
{
$s = "";
for ($i = $fragment.FirstTokenIndex; $i -le $fragment.LastTokenIndex; $i++)
{
$token = $fragment.ScriptTokenStream[$i];
$s += $token.Text;
}
if ($s.Substring(0,1) -eq "[")
{
$s = $s.Substring(1,$s.Length-2);
}
if ($hint.Length -gt 0)
{
if ($hint.Substring(0,1) -eq "[")
{
$hint = $hint.Substring(1,$hint.Length-2);
}
}
if ($s -ieq $hint -or $global:index_hint_to_find -eq "")
{
$global:collect_row = $true;
$global:details = "Index hint on line $($fragment.StartLine): $($s)";
}
} if ($ftn -eq "IndexTableHint") # next identifier or value expression is our hint name
{
$global:next = $true;
}
else
{
$global:next = $false;
}
}
}
Partial results:
That script identified all index hints. If you want to only find a specific index, you can change the following line 3:
$global:index_hint_to_find = "";
To specify an index name:
$global:index_hint_to_find = "IX_keycol";
This will currently return duplicates for statements in modules that also happen to be in the plan cache, and it may also return unwanted results if there are multiple indexes with the same name. These duplicates shouldn't be difficult to filter out, but I will leave that for a follow-up article. In the meantime, the following diagram represents the scope of what we have solved so far:
There is still the matter of ad hoc queries from applications or other sources. Until they get executed and are present in the plan cache, though, you simply won't know what you don't know. If you can extract queries from your applications and pass them through any sort of test rig, you could process them there with PowerShell. Otherwise, you'll be in the dark – particularly if they are constructed dynamically or fed by user input. On the plus side, ad hoc runtime errors because an index has been dropped or renamed are easy to diagnose by the user manually running the query:
Msg 308, Level 16, State 1
Index 'IX_keycol' on table 'dbo.t1' (specified in the FROM clause) does not exist.
If an application breaks down, though, that's a much more serious scenario, and will require more manual legwork.
Conclusion
Index hint usage can be a difficult problem to solve when things inevitably change. With this PowerShell script, you can identify how pervasive they are in your environment and, at the very least, document where those hints are used. Then you can move on to either tuning the queries so that hints aren't required, creating new indexes that will be used without the hint, or weigh the risk of leaving things as is.
Next Steps
See these tips and other resources involving index hints, exploring the plan cache, and parsing T-SQL:
- Exploring the Plan Cache to Optimize SQL Server Performance
- SQL Server Queries With Hints
- Session State Settings for Cached SQL Server Query Plans
- How to find compiled parameter values for SQL Server cached plans
- Always have a good plan! What's in your SQL Server Plan Cache?
- Identify SQL Server Query Plans with Scans on Nonclustered Hash Indexes
- How to Find Queries Using an Index (and Queries Using Index Hints)
- Finding what queries in the plan cache use a specific index
- Link to problematic NOLOCK part 1 | part 2 | part 3 | part 4
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2021-11-03