CTE in SQL Server Examples
By: Ryan Kennedy | Updated: 2021-05-13 | Comments (2) | Related: More > Common Table Expressions
As any developer using SQL, writing long queries containing extensive logic is commonplace. This can include many joins, varying levels of aggregation, and filtering across any number of tables. In some cases, a result set cannot be achieved by a single isolated query (SELECT x FROM y WHERE z) and requires a series of queries to be strung together. Additionally, for more complex queries, the best way to write a query is often to break it down into logical steps and build it up piece by piece.
Traditionally, sub-queries are one method to isolate logic into a single SQL statement which can then be added back to the 'main' query that you are trying to build out incrementally. This can be effective, but it is also difficult to read and understand for others.
CTEs offer a logical and legible approach to writing queries that can break the complex queries down to a series of logical steps, help to improve the legibility of your queries, and achieve more complex result sets.
The below queries will utilize the AdventureWorks database from Microsoft, in addition to SQL Server Management Studio and Microsoft SQL Server Developer Edition.
You can download the latest version of this software for free at the following links:
What is a CTE?
By pure definition, a CTE is a 'temporary named result set'. In practice, a CTE is a result set that remains in memory for the scope of a single execution of a SELECT, INSERT, UPDATE, DELETE, or MERGE statement.
Let's break this down and take a look at the syntax to get a better understanding of what this means and why it is useful.
The basic syntax of a CTE is as follows:
WITH <common_table_expression> ([column names]) AS ( <cte_query_definition> ) <operation>
Broken down – the WITH clause is telling SQL Server we are about to declare a CTE, and the <common_table_expression> is how we are naming the result set for reference later. Hence, a 'named result set'.
The [column names] are where you can alias the column names that will come out of the CTE. This is an optional portion of the syntax, and I often just alias the column within the <cte_query_definition> portion of the syntax.
The <cte_query_definition> is always a SELECT statement. This is where we are defining our result set. You can think of this as a temporary table that can be referenced in a FROM or JOIN clause like any other normal table. However, there are some key differences between a CTE and a temporary table which will be described later. You can also alias your columns in this section to be referenced later on.
The <operation> placeholder is where we will do our actual execution and reference the CTE. As mentioned earlier, this can be a SELECT, INSERT, UPDATE, DELETE, or MERGE T-SQL statement. We will look at some examples of these below.
Let's take a look at a CTE example to drive this home:
WITH Simple_CTE AS ( SELECT dd.CalendarYear ,fs.OrderDateKey ,fs.ProductKey ,fs.OrderQuantity * fs.UnitPrice AS TotalSale ,dc.FirstName ,dc.LastName FROM [dbo].[FactInternetSales] fs INNER JOIN [dbo].[DimCustomer] dc ON dc.CustomerKey = fs.CustomerKey INNER JOIN [dbo].[DimDate] dd ON dd.DateKey = fs.OrderDateKey ) SELECT * FROM Simple_CTE
In the above example, we have created a CTE that contains a few joins, specifically selected column list, and a derived column 'TotalSale'. Our operation in this example is just another SELECT statement, just to show how the CTE can be used in the simplest form.
CTE with SELECT
Let's take the syntax a bit further. Often, you might find yourself needing to do a multi-tiered aggregation. That is, an aggregation of an aggregation. CTEs can be a great way to write this type of query in a readable way.
,SUM(OrderQuantity) AS TotalOrdersByMonth
FROM [dbo].[FactInternetSales] fs
INNER JOIN [dbo].[DimDate] dd ON dd.DateKey = fs.OrderDateKey
GROUP BY ProductKey, EnglishMonthName
SELECT ProductKey, AVG(TotalOrdersByMonth) AS 'Average Total Orders By Month'
GROUP BY ProductKey
ORDER BY ProductKey
In this SQL code, we are taking the sum of OrderQuantity by product per month to see how much of each item was sold per month. Then, we are averaging this aggregate to see for each product, what is the monthly average quantity sold.
CTE with INSERT
To build on the previous example, rather than just selecting the aggregated data, we can pair an INSERT statement with the subquery to insert data into a new table.
First, build this table so that we have a table to insert into:
CREATE TABLE [dbo].[TestTable]( [CalendarYear] [smallint] NOT NULL, [OrderDateKey] [int] NOT NULL, [ProductKey] [int] NOT NULL, [TotalSale] [money] NULL, [FirstName] [nvarchar](50) NULL, [LastName] [nvarchar](50) NULL )
Then, in the following example we can use a CTE to Insert Into this table.
WITH Simple_CTE AS ( SELECT dd.CalendarYear ,fs.OrderDateKey ,fs.ProductKey ,fs.OrderQuantity * fs.UnitPrice AS TotalSale ,dc.FirstName ,dc.LastName FROM [dbo].[FactInternetSales] fs INNER JOIN [dbo].[DimCustomer] dc ON dc.CustomerKey = fs.CustomerKey INNER JOIN [dbo].[DimDate] dd ON dd.DateKey = fs.OrderDateKey ) INSERT INTO dbo.TestTable SELECT * FROM Simple_CTE
Note: we have written this as a SELECT * for simplicity, but often you will want to name each column explicitly in case your underlying schema changes in the future.
CTE with UPDATE
Similar to a CTE with DELETE, we can update a table using the same method.
First, let's create a backup table so that we aren't updating the actual table:
SELECT * INTO dbo.DimCustomerBakUpdate FROM DimCustomer
Next, we can update this table. In the below example, we are updating all of the records where the AddressLine2 is NULL as specified in the WHERE clause to be a blank string instead.
WITH Update_CTE AS ( SELECT * FROM dbo.DimCustomerBakUpdate WHERE AddressLine2 IS NULL ) UPDATE Update_CTE SET AddressLine2 = ''
You can see the result – we created a result set of all records where the AddressLine2 is null, and then applied an update to that specific result set. Again, this is a simplistic example meant to demonstrate the capabilities of the statement.
CTE with DELETE
In addition to selecting, inserting, and updating data, you can use a CTE to delete data from a table.
DELETEs requires that you have only used one table in your CTE, so you can't use any join statements, or you will get an error.
Just like we did the update, let's create another backup table to delete records from:
SELECT * INTO dbo.DimCustomerBak FROM DimCustomer
Now, we can delete from this table using a CTE. In this example, we will delete all records where the AddressLine2 is null.
WITH MissingAddress_CTE AS ( SELECT * FROM dbo.DimCustomerBak WHERE AddressLine2 IS NULL ) DELETE FROM MissingAddress_CTE
When we select from the table again, you can see there are no longer any records with NULL in the AddressLine2 column.
Multiple CTEs in a Single Query
Finally, you are not limited to just writing a single CTE in a query. You can write as many as you want and separate them with a comma:
WITH Sales_Cust_Join_CTE AS ( SELECT fs.OrderDateKey ,fs.ProductKey ,fs.OrderQuantity * fs.UnitPrice AS TotalSale ,dc.FirstName ,dc.LastName FROM [dbo].[FactInternetSales] fs INNER JOIN [dbo].[DimCustomer] dc ON dc.CustomerKey = fs.CustomerKey ) ,Date_CTE AS ( SELECT DateKey ,CalendarYear FROM [dbo].[DimDate] ) SELECT CalendarYear ,ProductKey ,SUM(TotalSale) AS TotalSales FROM Sales_Cust_Join_CTE INNER JOIN Date_CTE ON Date_CTE.DateKey = Sales_Cust_Join_CTE.OrderDateKey GROUP BY CalendarYear ,ProductKey ORDER BY CalendarYear ASC ,TotalSales DESC
In this example, we create two CTEs with some joins, logic, and selecting only a few specific columns. In our final operation, we join the CTEs together and form an aggregate to show total sales for each product by calendar year.
A word of caution, if you find yourself with many CTEs, the code can become more challenging to read. So, while they can be a great tool, try not to go overboard with them.
When to use CTEs, and When not to
Though CTEs are a great way to write cleaner code, they should not be used in every scenario. For example, one major limitation is that a CTE is limited to the scope of a single execution. What that means in practice is that if you need to do more than one operation with this result set, you need to declare the CTE multiple times. This is not performant because we are doing the work of the CTE multiple times.
In this scenario it would probably be better to use a temporary table. Temporary tables can be used across multiple queries, and they can be indexed and store column statistics.
Generally speaking, if you need to create a named result set that will be used by multiple subsequent queries and the data is very large, you are better off using a temporary table. If your data is relatively small, and you only need the result set for the context of a single query, you are best off using a CTE as there is less overhead involved.
Read more about temp tables in this article Types of SQL Server Temporary Tables.
If you are familiar with subqueries, you might be curious what the difference is between a subquery and a CTE. CTEs can be recursive CTEs, whereas subqueries cannot. However, from a performance perspective, there is no real difference. The only real difference is readability, and I would always recommend using a CTE over a subquery over a CTE.
Let's take a look at what I mean:
WITH Simple_CTEAS ( SELECT dd.CalendarYear ,fs.OrderDateKey ,fs.ProductKey ,fs.OrderQuantity * fs.UnitPrice AS TotalSale ,dc.FirstName ,dc.LastName FROM [dbo].[FactInternetSales] fs INNER JOIN [dbo].[DimCustomer] dc ON dc.CustomerKey = fs.CustomerKey INNER JOIN [dbo].[DimDate] dd ON dd.DateKey = fs.OrderDateKey ) SELECT * FROM Simple_CTE SELECT * FROM ( SELECT dd.CalendarYear ,fs.OrderDateKey ,fs.ProductKey ,fs.OrderQuantity * fs.UnitPrice AS TotalSale ,dc.FirstName ,dc.LastName FROM [dbo].[FactInternetSales] fs INNER JOIN [dbo].[DimCustomer] dc ON dc.CustomerKey = fs.CustomerKey INNER JOIN [dbo].[DimDate] dd ON dd.DateKey = fs.OrderDateKey ) Subquery
These two queries return the same data, but which is easier to read? The CTE query allows us to logically arrive at the steps we took to get to our result, whereas the subquery feels backwards and difficult to read.
- Check out these CTE tutorials on MSSQLTips:
- Find a SQL query you wrote using a Subquery, and try converting it to using a CTE.
About the author
View all my tips
Article Last Updated: 2021-05-13