How Join Order Can Affect the Query Plan
By: Ben Snaidero
The order in which the tables in your queries are joined can have a dramatic effect on how the query performs. If your query happens to join all the large tables first and then joins to a smaller table later this can cause a lot of unnecessary processing by the SQL engine.
Generally speaking the SQL Server Optimizer will try to first join the tables that allows it to work with the smallest result set possible. To do this the optimizer will try to figure out which join order provides the smallest result set early in the processing but in very complex queries it does not try all possible combinations as this can become quite resource intensive. As a best practice you should try to order your table join so the join that reduces the result set the most is joined first. Before we start let's add an index to the column in the table we'll be using as the join condition (you'll see more on this in a later topic).
CREATE NONCLUSTERED INDEX idxChild_ParentID ON [dbo].[Child] ([ParentID]) -- cleanup statements --DROP INDEX Child.idxChild_ParentID
Since in most cases this issue arises when queries become really complex and the optimizer has a lot of possible plans to evaluate, i.e.. multiple table joins, in order to illustrate this point more clearly we'll use the force order hint with a simple query. Here is the code to illustrate our poor join order.
SELECT P.ParentID,C.ChildID,S.SmallID FROM [dbo].[Parent] P INNER JOIN [dbo].[Child] C ON C.ParentID=P.ParentID INNER JOIN [dbo].[Small] S ON S.SmallID=C.ParentID OPTION (FORCE ORDER)
Looking at the explain plan for this query we can see that the Parent and Child tables are joined first resulting in 1899980 rows which is then joined to the Small table which reduces the final recordset to 95 rows.
And now let's join them in the proper order so the smallest table is joined first. Here is the SQL statement.
SELECT P.ParentID,C.ChildID,S.SmallID FROM [dbo].[Small] S INNER JOIN [dbo].[Parent] P ON S.SmallID=P.ParentID INNER JOIN [dbo].[Child] C ON P.ParentID=C.ParentID
Looking at the explain plan for this query we see that the Parent table is first joined to the Small table resulting in 5 rows which is then joined to the Child table which produces the final recordset of 95 rows (as above).
Just looking at the explain plans should be enough information for us to see that the second query will perform better but let's take a look at the SQL Profiler statistics just to confirm. As we see from below joining the Small table first significantly reduces the amount of data the query has to process therefore reducing the resources required to execute this query.
|Poor Join Order||265||5935||0||309|
|Good Join Order||0||35||0||0|