Problem
In SQL Server 2000, retrieving data that is presented in a hierarchical format means simulating recursion yourself. We normally resort to views, cursors or derived tables and run queries against them. The problem arises as the hierarchy gets deeper, because SQL Server is limited to 32 nesting levels. We need a better way to implement recursive queries in SQL Server 2005. How do we do it?
Solution
The Common Table Expression (CTE) was introduced in SQL Server 2005 and can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. You can think of a CTE as an improved version of the derived tables of SQL Server 2000, one that more closely resembles a non-persistent view. A CTE can be used in many of the same ways you use a derived table, and in place of a view. A CTE can also contain a reference to itself, which lets the developer write complex queries more simply.

CTEs provide two main advantages. The first is that queries with derived table definitions become simpler and more readable. Traditional T-SQL constructs for working with derived data normally require a separate definition, such as a temporary table or a table-valued function, whereas a CTE keeps the definition of the derived table next to the code that uses it. The second is that CTEs significantly reduce the amount of code required for a query that traverses a recursive hierarchy.
To understand what a CTE is all about, let's first take a look at the syntax to create it in SQL Server 2005.
Syntax
In general form, a recursive CTE has the following syntax:

WITH cte_alias (column_aliases)
AS
(
    cte_query_definition    --initialization
    UNION ALL
    cte_query_definition2   --recursive execution
)
SELECT * FROM cte_alias
Following the keyword WITH, you give the CTE an alias and an optional list of aliases for its result columns. You then write the body of the CTE, which defines the derived table based on the query definitions, and refer to the CTE from the outer query.
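As a minimal sketch of that general form, the following statement counts from 1 to 5 without touching any table. The names Numbers and n are made up for illustration; the statement assumes SQL Server 2005 or later.

```sql
-- Hypothetical example: a recursive CTE that generates the numbers 1 to 5.
WITH Numbers (n)
AS
(
    SELECT 1                 --initialization: the starting row
    UNION ALL
    SELECT n + 1             --recursive execution: builds on the previous row
    FROM Numbers
    WHERE n < 5              --stops the recursion once 5 is reached
)
SELECT n FROM Numbers;
```

The WHERE clause in the recursive part is what ends the recursion: once no row satisfies n < 5, the recursive query returns an empty result set and the statement completes.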
To put this in perspective, let's work through a simple example that uses recursion. In the Employees table of the Northwind database, each employee reports to another employee. One question we can ask is, "Who reports to whom?" The Employees table is designed so that the ReportsTo column is a foreign key that refers to the primary key column EmployeeID. Thus, we can create a query to answer our question. A sample query using a CTE looks something like this:
WITH Managers AS
(
    --initialization
    SELECT EmployeeID, LastName, ReportsTo
    FROM Employees
    WHERE ReportsTo IS NULL
    UNION ALL
    --recursive execution
    SELECT e.EmployeeID, e.LastName, e.ReportsTo
    FROM Employees e
    INNER JOIN Managers m ON e.ReportsTo = m.EmployeeID
)
SELECT * FROM Managers
The recursive CTE, Managers, defines an initialization query and a recursive execution query.
The initialization query returns the base result, which is the highest level in the hierarchy. This is identified by a ReportsTo value of NULL, which means that the particular employee does not report to anybody. Depending on how the table is designed, the value can be anything as long as it represents the highest level in the hierarchy.
The recursive execution query is combined with the initialization query using the UNION ALL operator. Its first pass returns the direct subordinates of the employees returned by the initialization query, and these form the next level in the hierarchy. Recursion occurs because the query references the Managers CTE itself: on each pass, the join returns the employees whose managers were returned by the previous pass. The recursive query is repeated until it returns an empty result set.
The final result set is returned by querying the Managers CTE.
The sample query contains all the elements that a recursive CTE must contain. What's more, the code is a lot more readable, which lets developers write complex queries with ease.
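To make the pass-by-pass behavior visible, the sample query can carry a depth counter along with each row. This is a sketch rather than part of the original query; the column name HierarchyLevel is an assumption introduced here, and the statement expects the Northwind Employees table.

```sql
-- Sketch: the Managers CTE with an added depth column (HierarchyLevel is a made-up name).
WITH Managers (EmployeeID, LastName, ReportsTo, HierarchyLevel)
AS
(
    --initialization: top of the hierarchy is level 0
    SELECT EmployeeID, LastName, ReportsTo, 0
    FROM Employees
    WHERE ReportsTo IS NULL
    UNION ALL
    --recursive execution: each pass is one level below the previous pass
    SELECT e.EmployeeID, e.LastName, e.ReportsTo, m.HierarchyLevel + 1
    FROM Employees e
    INNER JOIN Managers m ON e.ReportsTo = m.EmployeeID
)
SELECT EmployeeID, LastName, ReportsTo, HierarchyLevel
FROM Managers
ORDER BY HierarchyLevel, EmployeeID;
```

Rows with the same HierarchyLevel were produced by the same pass of the recursive query, so the column doubles as each employee's distance from the top of the organization.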
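You can also use a query hint to stop a statement after a defined number of recursion levels. This keeps a poorly coded CTE from going into an infinite loop. You do this by adding the MAXRECURSION hint, in an OPTION clause, to the SELECT query that refers to the CTE. If the limit is exceeded, the statement is terminated with an error; when no hint is given, the default limit is 100. To use it in the previous example, replace the final SELECT with: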
SELECT * FROM Managers OPTION (MAXRECURSION 4)
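To see the MAXRECURSION safeguard in action, here is a hedged sketch that recurses forever on purpose. The table variable @Nodes, the CTE name Walk and the limit of 10 are all invented for illustration. The whole script has to run as a single batch, because a table variable only lives for the duration of its batch.

```sql
-- Hypothetical data containing a cycle: node 1 points to node 2, which points back to node 1.
DECLARE @Nodes TABLE (ID INT, ParentID INT);
INSERT INTO @Nodes VALUES (1, 2);
INSERT INTO @Nodes VALUES (2, 1);

WITH Walk AS
(
    --initialization: start at node 1
    SELECT ID, ParentID
    FROM @Nodes
    WHERE ID = 1
    UNION ALL
    --recursive execution: follow the parent link, which never ends for cyclic data
    SELECT n.ID, n.ParentID
    FROM @Nodes n
    INNER JOIN Walk w ON n.ID = w.ParentID
)
SELECT ID, ParentID
FROM Walk
OPTION (MAXRECURSION 10);   --expected to fail with an error after 10 recursion levels
```

Without the hint the statement would still stop, but only after the default of 100 levels. A small explicit limit makes bad data or a bad join condition surface sooner.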
To create a similar yet non-recursive query that produces the same result in SQL Server 2000, you might come up with something similar to this code:
DECLARE @rowsAdded INT

--table variable to hold accumulated results
DECLARE @managers TABLE
    (EmpID INT, MgrID INT, processed INT DEFAULT(0))

--initialize @managers with employees who do not have managers
INSERT @managers
SELECT EmployeeID, ReportsTo, 0
FROM Employees
WHERE ReportsTo IS NULL

--without this, @rowsAdded stays NULL and the loop below never runs
SET @rowsAdded = @@rowcount

--do this while new employees were added in the previous iteration
WHILE @rowsAdded > 0
BEGIN
    --mark the employees whose subordinates will be looked up in this
    --iteration with processed=1
    UPDATE @managers SET processed=1 WHERE processed=0

    --insert employees who report to the employees marked with processed=1
    INSERT @managers
    SELECT EmployeeID, ReportsTo, 0
    FROM Employees e
    INNER JOIN @managers r ON e.ReportsTo = r.EmpID
    WHERE ReportsTo <> EmployeeID AND r.processed = 1

    SET @rowsAdded = @@rowcount

    --mark the employees handled in this iteration as fully processed
    UPDATE @managers SET processed=2 WHERE processed=1
END

SELECT * FROM @managers
As the example above shows, hierarchical data structures, organizational charts and other reports on parent-child table relationships can easily benefit from recursive CTEs. The Common Table Expression is just one of the T-SQL enhancements available in SQL Server 2005. CTEs give us the chance to write much more complex queries while retaining a much simpler syntax. They can also lessen the administrative burden of creating and testing views in situations where the view will not be reused.
I think I would write the SQL 2000 example without an UPDATE query, and make MAXRECURSION more obvious.
DECLARE @rowsAdded INT
DECLARE @Iteration INT
DECLARE @MaxRecursion INT

-- SQL Server 2000 cannot initialize a variable inside DECLARE, so use SET
SET @Iteration = 1
SET @MaxRecursion = 4 -- a CTE defaults to 100 if MAXRECURSION is not specified

--table variable to hold accumulated results
DECLARE @managers TABLE (EmpID INT, MgrID INT, Iteration INT)

-- Anchor query
--initialize @managers with employees who do not have managers
INSERT @managers
SELECT EmployeeID, ReportsTo, @Iteration
FROM Employees
WHERE ReportsTo IS NULL ;

SET @rowsAdded=@@rowcount ;

-- Recursive query
-- Uses only the results from the previous iteration, and continues until
-- no more rows are added (or MAXRECURSION is reached)
WHILE @rowsAdded > 0 AND @Iteration <= @MaxRecursion
BEGIN
    INSERT @managers
    SELECT EmployeeID, ReportsTo, @Iteration + 1
    FROM Employees e
    INNER JOIN @managers r ON e.ReportsTo = r.EmpID
    WHERE ReportsTo <> EmployeeID AND r.Iteration = @Iteration;

    SET @rowsAdded = @@rowcount;
    SET @Iteration = @Iteration + 1;
END

SELECT * FROM @managers
An example I was able to understand and put into action.
I have one table (Inventory) with three columns (ID, Desc, ParentID) where the parentID links back to the ID.
I store an item in House A (ID=1, PID=null), Room B (ID=2, PID=1), box C (ID=3, PID=2), so the chain is House A > Room B > box C.
So when I query the table, I have the box C ID of 3. This allowed me to 'walk' up the chain to get Room B and House A:
WITH InvItems AS
(
    --start at the item itself (box C)
    SELECT a.ID, a.Description, a.ParentLocationID
    FROM InventoryLocations a
    WHERE a.ID = 3
    UNION ALL
    --walk up: fetch the parent of the row found in the previous pass
    SELECT b.ID, b.Description, b.ParentLocationID
    FROM InventoryLocations b
    INNER JOIN InvItems i ON b.ID = i.ParentLocationID
)
SELECT * FROM InvItems
ID   Desc      ParentID
3    box C     2
2    Room B    1
1    House A   NULL
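For anyone who wants to try this without an existing table, here is a self-contained sketch that loads the three rows described above into a table variable and walks up from box C. The table variable @InventoryLocations, its column types and the StepsUp column are assumptions made for illustration; run the script as one batch.

```sql
-- Sample data from the description above, held in a table variable (types are assumed).
DECLARE @InventoryLocations TABLE
    (ID INT, Description VARCHAR(50), ParentLocationID INT);
INSERT INTO @InventoryLocations VALUES (1, 'House A', NULL);
INSERT INTO @InventoryLocations VALUES (2, 'Room B', 1);
INSERT INTO @InventoryLocations VALUES (3, 'box C', 2);

WITH InvItems AS
(
    --start at box C; StepsUp counts how far we have climbed
    SELECT a.ID, a.Description, a.ParentLocationID, 0 AS StepsUp
    FROM @InventoryLocations a
    WHERE a.ID = 3
    UNION ALL
    --walk up: fetch the parent of the row found in the previous pass
    SELECT b.ID, b.Description, b.ParentLocationID, i.StepsUp + 1
    FROM @InventoryLocations b
    INNER JOIN InvItems i ON b.ID = i.ParentLocationID
)
SELECT ID, Description, ParentLocationID, StepsUp
FROM InvItems
ORDER BY StepsUp;
```

The recursion ends at House A because its ParentLocationID is NULL, so the join finds no further parent. Ordering by StepsUp makes the box-to-house order explicit instead of relying on the order in which rows happen to be returned.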
Tuesday, March 12, 2013 - 12:47:48 AM - Rashid Abbas
Excellent example; it saves a lot of time over other methods of doing the same thing. Much simpler than the examples from MS. Personally, I really appreciate it when someone, like this writer, makes such an effort to explain how to do something like this. This will save everyone a lot of time and effort.
Keep up the good work.
Friday, January 07, 2011 - 8:19:46 PM - Randy Edge