By: Matteo Lorini | Updated: 2010-10-12 | Comments (7) | Related: > Performance Tuning
Problem
Batches or store procedures that execute join operations on table variables may experience performance problems if the table variable contains a large number of rows.
Solution
Table variable were introduced in SQL Server with the intention to reduce recompiles, however if they are used in batches or store procedures they may cause a performance issue.
Let's see how this can occur with an example.
First let's create a permanent table and populate it with 100,000 rows.
SET STATISTICS profile OFF GO USE tempdb GO IF OBJECT_ID ('TestPermTable') IS NOT NULL DROP TABLE TestPermTable GO CREATE TABLE TestPermTable (a2 VARCHAR(10),c2 INT) GO CREATE INDEX index_TestPermTable ON TestPermTable(c2) GO --insert 100,000 rows into the perm table SET NOCOUNT ON BEGIN TRAN DECLARE @i INT SET @i = 0 WHILE @i < 100000 BEGIN INSERT INTO TestPermTable VALUES ('asdfghjkli',@i) SET @i = @i + 1 END COMMIT TRAN GO --update stats UPDATE STATISTICS TestPermTable WITH fullscan GO
Let's join the permanent table TestPermTable with a variable table containing 100,000 rows.
SET NOCOUNT ON DECLARE @VariableTable TABLE (c1 INT) BEGIN TRAN DECLARE @i INT SET @i = 0 WHILE @i < 100000 BEGIN INSERT INTO @VariableTable VALUES (@i) SET @i = @i + 1 END COMMIT TRAN SET STATISTICS profile ON SET STATISTICS time ON SELECT @i=TestPermTable.c2 FROM @VariableTable INNER JOIN TestPermTable ON c1=c2 GO SET STATISTICS time OFF SET STATISTICS profile OFF GO
As we can see below, when we started the batch the variable table was empty, so the optimizer estimated a low number of rows. The EstimateRows value for @VariableTable is 1 and therefore the optizmer selected a Nested Loops join.
An under estimation of rows caused the optimizer to pick an inefficient plan, as we can see a Nested Loops join was used with the table variable as the outer table. The Nested Loops join uses one join input as the outer input and one as the inner input. The outer loop consumes the outer input table row by row. The inner loop, executed for each outer row, searches for matching rows in the inner input table. A Nested Loops join is particularly effective if the outer input is small and the inner input is preindexed and large. In many small transactions, such as those affecting only a small set of rows, index Nested Loops joins are superior to both Merge joins and Hash joins. In large queries however, Nested Loops joins are often not the optimal choice.
The solution to our problem, is to add the recompile option to the statement that involves the table variable joining with the tables. By doing this, SQL Server will be able to detect the number of rows at recompile, because the rows have already been populated. This option is only available in SQL Server 2005 and later.
SET NOCOUNT ON DECLARE @VariableTable TABLE (c1 INT) BEGIN TRAN DECLARE @i INT SET @i = 0 WHILE @i < 100000 BEGIN INSERT INTO @VariableTable VALUES (@i) SET @i = @i + 1 END COMMIT TRAN SET STATISTICS profile ON SET STATISTICS time ON SELECT @i=TestPermTable.c2 FROM @VariableTable INNER JOIN TestPermTable ON c1=c2 OPTION (recompile) GO SET STATISTICS time OFF SET STATISTICS profile OFF GO
As you can see, the optimizer knows that our variable table contains 100,000 rows and consequently picks a Hash join which for this example is much faster as the statistics time shows below.
Next Steps
- Read more about Query Hints (Transact-SQL) http://technet.microsoft.com/en-us/library/ms181714(SQL.90).aspx
- Read more about SQL Server Physical Joins
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2010-10-12