By: Rick Dobson | Comments (5) | Related: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | > Temp Tables
Problem
CTE is an abbreviation for Common Table Expression. A CTE is a SQL Server object, but you do not use either create or declare statements to define and populate it. As with other temporary data stores, the code can extract a result set from a relational database. CTEs are highly regarded because many believe they make the code for a temporary data store easier to read and maintain than other types of temporary data stores. You can define a CTE via the with keyword followed by a CTE name. The with keyword and name denotes the beginning of a CTE. After a CTE's name, there are two pair of parentheses. The first set of parentheses can contain the field names exposed through the CTE. The second set of parentheses includes the code for populating the fields. A SELECT statement immediately after the second set of parentheses can extract values from the CTE and optionally combine its contents with other data sources. You can also use a trailing SELECT statement for a CTE to filter and order CTE result set rows.
Solution
For the examples below we are using the AdventureWorks2014 database. Download a copy and restore to your instance of SQL Server to test the below scripts.
An Overview of SQL Server CTE Syntax and Application Use Cases
One major reason for a CTE is the ease of making the code for the temporary data store readily available. The following representation shows some pseudo-code for a CTE. The statement following the T-SQL code for CTE temporary data store does not have to be a SELECT statement, but the scope of the CTE is restricted to whatever statement type immediately follows the CTE. Legitimate following statement types include SELECT, INSERT, UPDATE, DELETE, and MERGE.
with CTE_name (field_name_1...field_name_n) ( T-SQL code for populating field_name_1 through field_name_n ) Select list from CTE_name optionally other select statement clauses as required
You can think of a CTE as an alternative to a derived table or a view. However, the code for the temporary data store does not have to reside in a FROM clause as is the case for a derived table. Also, views are permanent database structures that are designed for use in multiple contexts. CTE developers do not need special permissions to create permanent data structures as is the case for creating a view. On the other hand, re-using a CTE involves copying the defining code to whatever other environment requires its use. Additionally, you have the problem of maintaining consistency from one context to the next.
It is possible to string together two or more CTEs so that a subsequent T-SQL statement can have access to all the CTEs jointly. These CTEs can be from different sources or the same source with different filtering criteria. In the statement following the CTE definition, you can splice the CTE result sets together with a join operator or concatenate the CTE result sets with a union operator.
One especially distinguishing feature of CTEs is their ability to develop a result set recursively. The syntax for a recursive CTE is marginally different than for a non-recursive CTE. Critically, you do not have to program a loop. Instead, the code for a recursive CTE is procedurally based. This tutorial offers some coverage of recursive CTEs and concurrently reviews an application for the SQL Server hierarchyid data type that is richer than a simple parent-child relationship between the rows in a hierarchical set. For more thorough commentary on how to use the hierarchyid data type follow selected links in the Additional Information for this tutorial part.
Recursive CTEs are especially well suited for mapping the relationships among workers in an enterprise or among components in a bill of materials. This tutorial includes a sample for representing the relationships among workers in an organization. You can find a bill of materials CTE example in this source.
Here's some pseudo-code for a recursive CTE. There are two T-SQL SELECT statements within the second set of parentheses for a recursive CTE. These two statements should return the same number and types of columns. The initial T-SQL statement returns the anchor or top element in a hierarchical result set or a subset of a hierarchical set. A UNION ALL operator must separate the anchor SELECT statement from the SELECT statement for the recursive elements. The T-SQL statement after the UNION ALL statement returns the recursive set of hierarchical elements that trail the anchor. You can re-arrange the sequencing of returned rows from a recursive CTE with an order by clause in the SELECT statement trailing a CTE.
with CTE_name (field_name_1...field_name_n) ( T-SQL code for populating field_name_1 through field_name_n in anchor element union all T-SQL code for recursive elements after the anchor ) Select list from CTE_name optionally other select statement clauses as required
A Simple SQL Server CTE
The CTE samples in this tutorial rely on the Employee and Person tables in the AdventureWorks2014 database. The Employee table contains information about employee business entities in the database. This information includes BusinessEntityID, OrganizationNode, OrganizationLevel, and JobTitle.
- The BusinessEntityID field has an int data type value that uniquely identifies an employee.
- The OrganizationNode field has a hierarchyid data type that uniquely identifies
the hierarchical relationship of an employee in the set of all employees; the
information in a hierarchyid field value captures information about the depth
and breadth of relationships among the entities in a set.
- For example, the OrganizationNode field contains information about to whom an employee reports as well as whom, if anyone, reports to an employee.
- The hierarchyid data type also embodies information about sibling relationships, such as whom else reports to the same manager.
- The OrganizationLevel field is computed via the GetLevel method of a hierarchyid data type, such as OrganizationNode in the create table statement for the Employee table. The value of NULL belongs to the top-level employee in an organization. The employees reporting to the top-level employee have an OrganizationLevel value of 1. Each employee reporting to an employee with an OrganizationLevel value of 1 has an OrganizationLevel value of 2, and so forth.
- The JobTitle field has a nvarchar type with a maximum length of 50 characters to display the name for an employee's position, such as Chief Executive Officer, for the top-level employee in a hierarchy.
The Person table contains information about persons with several different kinds of relationships within the AdventureWorks company. Three columns of particular importance to this tutorial are the following ones.
- The BusinessEntityID field has the same int data type value for an employee in both Person and Employee tables. There are other types of business entities in the Person table besides employees.
- The FirstName and LastName fields have nvarchar data types with maximum field lengths of fifty characters.
- By joining the Person table to the Employee table by BusinessEntityID value, you can retrieve first and last names for a set of employees.
The following script displays the code for an inner join between all rows in Employee and Person tables. The join is based on the BusinessEntityID values from both tables. The script and its result set reveal the action of the ToString and GetAncestor methods for the hierarchyid field value of the OrganizationNode column in the Employee table.
-- inner join for all rows in Employee and Person tables -- apply ToString and GetAncestor methods to OrganizationNode -- hierarchyid data type SELECT e.BusinessEntityID, e.OrganizationNode, e.OrganizationNode.ToString() OrganizationNodeString, e.OrganizationNode.GetAncestor(1).ToString() AncestorNodeString, e.OrganizationLevel, e.JobTitle, p.FirstName, p.LastName FROM [AdventureWorks2014].[HumanResources].[Employee] e INNER JOIN [AdventureWorks2014].[Person].[Person] p ON e.BusinessEntityID = p.BusinessEntityID ORDER BY OrganizationNode.ToString()
The following screen shot displays an excerpt from the preceding script. The excerpt shows the first thirty-nine employee rows from the total of 290 employees in the AdventureWorks2014 database.
- The top row is for Keri Sánchez, the Chief Executive Officer.
- The OrganizationNode column value for this row is NULL because Keri does not belong to an organizational unit, but all organizational units report directly or indirectly to the Chief Executive Officer.
- The OrganizationNodeString and AncestorNodeString column values are also NULL because these columns depend on methods for the hierarchyid data type in the OrganizationNode column which has a value of NULL.
- The second row is for Terri Duffy, Vice President of Engineering.
- Terri reports directly to Keri Sánchez, the Chief Executive Officer.
- The value (0x58) for the OrganizationNode column value in the second row has hierarchical information coded within it for Terri Duffy.
- The ToString method for the OrganizationNode value on the second row transforms the OrganizationNode value to a generic representation of the hierarchy value for the row. The /1/ value points to one level below the top level.
- The Ancestor(1) method extracts the OrganizationNode value of the parent for the current row. The parent's row has the implicit hierarchy value of /.
- Terri Duffy has one direct report whose name is Roberto Tamurello and whose
job title is Engineering Manager.
- Information for Roberto appears in the third row of the result set.
- Because Roberto reports directly to Terri, Roberto's OrganizationNodeString value is /1/1/.
- Also, Roberto's AncestorNodeString value is /1/, which points directly to Terri, his parent in the employee hierarchy.
- Rows four through fifteen are for employees reporting directly or indirectly to Terri Duffy. This is confirmed by the fact that AncestorNodeString value for these rows always begins with /1/.
- Row sixteen is for David Bradley, whose job title is Marketing Manager.
- Because David's AncestorNodeString value is /, you can tell that David reports directly to Keri Sánchez, the Chief Executive Officer.
- David's OrganizationNodeString value is /2/. David is a sibling to Terri Duffy in the employee hierarchy; David and Terri both report to Keri.
- The employees represented in rows seventeen through twenty-four all have an AncestorNodeString value of /2/. Therefore, the employees in these rows all report directly to David.
- The employee on row twenty-five points to James Hamilton, the Vice President
of Production.
- James reports directly to the Chief Executive Officer since his AncestorNodeString value is /.
- James' OrganizationNodeString value is /3/. This value indicates he is a sibling along with Terri Duffy and David Bradley in the employee hierarchy.
- All the remaining employees in the excerpt report directly or indirectly to James. This is confirmed by the fact that the OrganizationNodeString value for these rows all start with /3/.
Now that you have a grasp of hierarchical employee data in the AdventureWorks2014 database, you are ready to see your first application of a CTE for the join of the Employee and Person tables.
- The script below extracts a subset of rows from the preceding SELECT statement.
- The with keyword for specifying the beginning of a CTE appears on the first line followed by the CTE's name, simple_cte.
- Next, a pair of parentheses contains the list of the column names for the fields returned by the CTE. These are the column names from the preceding script.
- An as keyword separates the first and second pair of parentheses.
- Within the second set of parentheses is some T-SQL for joining the Employee and Person tables by BusinessEntityID. The columns in the SELECT list match the order of those in the first pair of parentheses. This query returns data for all 290 employees in the Adventureworks2014 database.
- While the CTE code performs the join for the two tables, it does not perform
any row filtering or ordering. These functions are implemented in the
SELECT statement following the CTE.
- The SELECT list item (*) indicates that all columns from the CTE result set should be displayed.
- However, a where clause specifies that not all rows should be exposed.
In particular, the query returns rows for just two employees and those who
report directly or indirectly to them.
- Terri Duffy and employees who report directly or indirectly to her all have an OrganizationNodeString value that begins with /1
- David Bradley and employees who report directly or indirectly to him all have an OrganizationNodeString value that begins with /2.
- As a consequence, the where clause argument is as follows: OrganizationNodeString like '/1%' or OrganizationNodeString like '/2%'
- The ORDER BY clause in the SELECT statement after the CTE ensures that rows appear in hierarchical order. This order happens to be in the same order as the BusinessEntityID values, but there is no requirement in either the CTE or the SELECT statement following the CTE that requires the BusinessEntityID values and OrganizationNode values to have the same order.
-- simple cte WITH simple_cte (BusinessEntityID, OrganizationNode, OrganizationNodeString, AncestorNodeString, OrganizationLevel, JobTitle, FirstName, LastName) AS (SELECT e.BusinessEntityID, e.OrganizationNode, e.OrganizationNode.ToString() OrganizationNodeString, e.OrganizationNode.GetAncestor(1).ToString() AncestorNodeString, e.OrganizationLevel, e.JobTitle, p.FirstName, p.LastName FROM [AdventureWorks2014].[HumanResources].[Employee] e INNER JOIN [AdventureWorks2014].[Person].[Person] p ON e.BusinessEntityID = p.BusinessEntityID ) SELECT * FROM simple_cte WHERE OrganizationNodeString LIKE '/1%' OR OrganizationNodeString LIKE '/2%' ORDER BY OrganizationNode
To help you follow the code, the next screen shot shows the result set from the simple_cte object and its trailing select statement.
- As you can see from the border for the Results tab there are just twenty-three rows in the result set.
- The OrganizationNodeString column values all begin with either /1 for Terri Duffy and those in her report line or /2 for David Bradley and those in his report line.
Using More Than One SQL Server CTE at the Same Time
So, if one CTE can simplify queries and make them more readable, then might using two or more CTEs in a query make a query even easier to read and maintain? The next query sample shows the syntax for using more than one CTE at the same time. You can use the demonstration in this section to answer the question for yourself.
The next sample duplicates and extends the functionality of the preceding simple CTE example. The next sample duplicates the functionality because it returns all rows from the Employee table with an OrganizationNodeString value like '/1%' or '/2%'. Additionally, the following sample adds one more row with an OrganizationNode value of NULL.
When using more than one CTE in a single query, you still need just one with keyword at the beginning of the query. Then, each additional CTE is separated from the preceding CTE by a comma. The following query has two CTEs.
- The first CTE has the name simple_cte_employee. This CTE returns the BusinessEntityID, OrganizationNode, OrganizationNodeString, AncestorNodeString, OrganizationLevel, and JobTitle columns from the Employee table. There is no where clause on the rows extracted from the Employee table so the query returns rows for all employees.
- The second CTE has the name simple_cte_person. This CTE returns BusinessEntityID, FirstName, and LastName columns from the Person table. Again, data is returned for all rows in the Person table whether or not the person is an employee.
The SELECT statement following the pair of CTEs specifies which columns to return from each CTE as well as the filtering determining which rows are returned from the SELECT statement. This SELECT statement builds exclusively on the result sets returned from the simple_cte_employee and simple_cte_person result sets.
- All columns are returned from the simple_cte_employee result set through the specification of simple_cte_employee.* in the SELECT list.
- Only the FirstName and LastName columns are returned from the simple_cte_person result set.
- Two mechanisms are used for filtering rows from each CTE that go into the
result set for the trailing SELECT statement.
- An inner join limits rows from the simple_cte_person result set to those rows with a BusinessEntityID value that matches one in the simple_cte_employee result set. This filtering mechanism retains FirstName and LastName values only for rows that are employees.
- A WHERE clause expression designates three criteria combined via OR
operators for inclusion or rows from the join of the two CTEs.
- The first criterion is for the employee whose OrganizationNode value is NULL. This criterion is not in the preceding CTE sample, so it adds new functionality. The criterion allows the inclusion of the Chief Executive Officer into the result set.
- The second and third criteria are in the preceding CTE sample. The second criterion (OrganizationNodeString like '/1%' ) permits the inclusion of the Vice President of Engineering and those in her report line. The third criterion (OrganizationNodeString like '/2%' ) permits the inclusion of the Marketing Manager and those in his report line.
-- using more than one cte at the same time WITH simple_cte_employee ( BusinessEntityID, OrganizationNode, OrganizationNodeString, AncestorNodeString ,OrganizationLevel, JobTitle) AS (SELECT BusinessEntityID, OrganizationNode, OrganizationNode.ToString() OrganizationNodeString, OrganizationNode.GetAncestor(1).ToString() AncestorNodeString, OrganizationLevel, JobTitle FROM [AdventureWorks2014].[HumanResources].[Employee] ), simple_cte_person (BusinessEntityID, FirstName, LastName) AS (SELECT BusinessEntityID, FirstName, LastName FROM [AdventureWorks2014].[Person].[Person] ) SELECT simple_cte_employee.*, simple_cte_person.FirstName, simple_cte_person.LastName FROM simple_cte_employee INNER JOIN simple_cte_person ON simple_cte_employee.BusinessEntityID = simple_cte_person.BusinessEntityID WHERE OrganizationNodeString IS NULL OR OrganizationNodeString LIKE '/1%' OR OrganizationNodeString LIKE '/2%' ORDER BY OrganizationNode
Here's the result set from the query based on two CTEs.
- This first point to notice is that the result set for the query described above has one additional row relative to the result set from the initial example with just one CTE. This additional row is for the inclusion of the Chief Executive Officer.
- Otherwise, all rows are the same as from the first CTE example. This outcome follows from the fact that the row inclusion criteria are the same, except that the where clause for including rows appears in the trailing SELECT statement instead of the body of the CTE as in the simple_cte example.
Using a Recursive SQL Server CTE with a Parameter
CTEs are especially different from other temporary data stores in that they can return result sets with a recursive relationship among its rows. In particular, you can extract a set of records in a recursive set that tie back to an anchor record. The following script shows one approach to implementing this kind of solution for the employees in the Adventureworks2014 database.
Recall from the overview of CTE syntax that the second set of parentheses for a recursive CTE definition has two SELECT statements separated by a UNION ALL operator. The first SELECT statement specifies the column values for an anchor member in a set of records with a hierarchical relationship among its rows. The second SELECT statement finds all records that are hierarchically related to the anchor record. In the case of Employee table, you can use selected methods for OrganizationNode column values as a basis for recursively finding hierarchically related rows because the OrganizationNode column encodes hierarchical links between the rows in a set.
The following script starts by declaring a local variable named @BusinessEntityID and assigning a value of 2 to the local variable. From the prior examples, you can see that a BusinessEntityID value of 2 points to the Vice President of Engineering. Also, the hierarchyid data type (OrganizationNode) value for employees in Engineering, always starts with 0x5. Therefore, by comparing the ancestor OrganizationNode value to the OrganizationNode value for the Vice President of Engineering, a query can return all other employees in Engineering. The following query after the union all statement shows how to design such a query statement.
The last column from both the first and second SELECT statements reveals the RecursionLevel value for a row in the set of rows returned by the recursive CTE. This value starts at 0 for the anchor member, and it increases by one for the number of organizational levels of each member below the anchor member.
By modifying the value assigned to the @BusinessEntityID local variable, you can change the subset of records returned from the following query. For example, a BusinessEntityID value of 16 points at the Marketing Manager. Therefore, if @BusinessEntityID equals 16, the query's result set contains the Marketing Manager and all the employees who report to the Marketing Manager.
-- recursive cte with a parameter for starting manager DECLARE @BusinessEntityID int SET @BusinessEntityID = 2; with [EMP_cte] ( [BusinessEntityID] ,[OrganizationNode] ,OrganizationNodeString ,AncestorNodeString ,OrganizationLevel ,[JobTitle] ,[FirstName] ,[LastName] ,[RecursionLevel] ) -- CTE name and columns AS ( SELECT e.[BusinessEntityID] ,e.[OrganizationNode] ,e.[OrganizationNode].ToString() OrganizationNodeString ,e.OrganizationNode.GetAncestor(1).ToString() AncestorNodeString ,e.OrganizationLevel ,e.[JobTitle] ,p.[FirstName] ,p.[LastName] ,0 -- Get the initial list of Employees for Manager n FROM [HumanResources].[Employee] e INNER JOIN [Person].[Person] p ON p.[BusinessEntityID] = e.[BusinessEntityID] WHERE e.[BusinessEntityID] = @BusinessEntityID UNION ALL SELECT e.[BusinessEntityID] ,e.[OrganizationNode] ,e.[OrganizationNode].ToString() OrganizationNodeString ,e.OrganizationNode.GetAncestor(1).ToString() AncestorNodeString ,e.OrganizationLevel ,e.[JobTitle] ,p.[FirstName] ,p.[LastName] , [RecursionLevel] + 1 -- Join recursive member to anchor FROM [HumanResources].[Employee] e INNER JOIN [Person].[Person] p ON p.[BusinessEntityID] = e.[BusinessEntityID] INNER JOIN [EMP_cte] ON e.[OrganizationNode].GetAncestor(1) = [EMP_cte].[OrganizationNode] ) SELECT * FROM [EMP_cte] ORDER BY OrganizationNode
Here are two result sets from the preceding query.
- The top result set returns when the @BusinessEntityID value is 2. This set of employees includes the Vice President of Engineering and all the employees that include her as a direct ancestor or an indirect ancestor in OrganizationNode.
- The bottom result set returns when the @BusinessEntityID value is 16. This set of employees includes the Marketing Manager and all employees that include him as an ancestor in OrganizationNode.
Next Steps
Here are some links to resources that you may find useful to help you grow your understanding of content from this section of the tutorial.
- Introduction to Common Table Expressions (CTE's)
- Introduction to Non-Recursive CTEs
- Keeping it simple and how to do multiple CTE in a query
- Recursive CTEs Explained
- SQL Server Common Table Expressions (CTE) usage and examples
- Recursive CTE
- Using MERGE to insert, delete and update all-in-one
- Hierarchical Data (SQL Server)
- ToString (Database Engine)
- GetAncestor (Database Engine)
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips