Exploring a Recursive Common Table Expression in SQL Server with Examples

By:   |   Updated: 2023-11-21   |   Comments (2)   |   Related: > Common Table Expressions


Problem

Have you built a recursive CTE lately? Yeah, me either. In the real world, I use a recursive CTE rarely. Along with a PIVOT operator, the syntax for a recursive CTE terrifies me. But when you break it down, there are only two parts. Don't believe me? Keep on reading. Also, why do I get an error message about maximum recursion exhausted at 100?

Solution

In this article, we'll explore what a recursive CTE is and when to use it. We'll cover the two main parts along with an optional third. I'll mention an alternative to using a recursive CTE—spoiler, it's a WHILE loop. We'll also look at getting around that pesky maximum recursion error. By the end of this article, you'll find building a recursive CTE less intimidating. If you don't need to create one today, I hope you bookmark this article for later.

What is a CTE?

You've likely used a Common Table Expression (CTE) if you write T-SQL. They come in two varieties: non-recursive and recursive. I include the former in every query I write that's longer than two lines of code. Microsoft introduced CTEs in SQL Server 2005—along with synonyms, everyone's favorite feature. Oracle was the first RDBMS I remember building a CTE in, and it was love at first sight.

CTEs are like derived tables and hold a temporary result set—don't confuse them with temporary tables. Unlike temporary tables, you can only reference them in a single statement.

People find using CTEs makes reading code easier than nested queries. A developer friend hates using them and says they make reading code harder. He rewrites other people's queries and removes the CTEs. Seriously, what's up with this guy? Style is a matter of personal preference. For me, reading top-down makes everything in life simpler. I prefer the CTE below compared to the derived table.

DROP TABLE IF EXISTS #Hero;
CREATE TABLE #Hero
(
    Id INT NOT NULL,
    [HeroName] VARCHAR(25) NOT NULL,
    [CreatedDate] DATE NOT NULL
);
GO

INSERT INTO #Hero
(
    Id,
    HeroName,
    CreatedDate
)
VALUES
(1, 'Batman', '01-01-2023'),
(2, 'Superman', '01-01-2023'),
(3, 'Wonder Woman', '01-01-2023'),
(4, 'Wonder Woman', '01-03-2023');


-- Find the duplicate superhero using a CTE.
;WITH DupHero
AS (SELECT Id,
           ROW_NUMBER() OVER (PARTITION BY HeroName ORDER BY h.CreatedDate) AS rn
    FROM #Hero AS h)
SELECT h.HeroName
FROM #Hero AS h
    JOIN DupHero AS d
        ON d.Id = h.Id
WHERE d.rn > 1;

-- Find the duplicate superhero using a derived table.
SELECT h.HeroName
FROM #Hero AS h
    JOIN
    (
        SELECT Id,
               ROW_NUMBER() OVER (PARTITION BY HeroName ORDER BY CreatedDate) AS rn
        FROM #Hero
    ) AS d
        ON d.Id = h.Id
WHERE d.rn > 1;

I often find myself using a CTE in combination with a windowing function. For example, I start by generating a row number in the first CTE and reference that number in the next. Years ago, a coworker called this the waterfall effect. Something about that phrase stuck with me. If you are not using CTEs, at least give them a try.

Explore Recursive CTEs

Let's explore the other type of CTE that frightens people: recursive. The Merriam-Webster dictionary defines recursion as a technique involving "…a procedure, subroutine, function, or algorithm that calls itself one or more times until a specified condition is met..." The definition sounds like a WHILE loop or cursor to me.

Have you used the Fibonacci sequence? If you work with an Agile team, they love using it to assign story points. It's a type of recursion where each number is the sum of the prior two numbers—I stopped at 21 below because of blackjack.

0, 1, 1, 2, 3, 5, 8, 13, 21,…

The typical recursive CTE example involves a company hierarchy or family tree. A recursive CTE references a result set multiple times until it meets a condition. In the family tree example, that's all the family members. Without a recursive CTE, you might use a WHILE loop or cursor to obtain the same results. Here is an article by Edwin Sarmiento where he provides an example. Now, let's look at the two parts of a recursive CTE.

Parts of a Recursive CTE

Someone asking me to write a recursive CTE in an interview is what nightmares are made of. At its core, a recursive CTE consists of two parts; people sometimes throw a third one in there, but we'll stick with two for now.

Let's first look at the entire dataset.

DROP TABLE IF EXISTS #Hero;
CREATE TABLE #Hero
(
    Id INT NOT NULL,
    [HeroName] VARCHAR(25) NOT NULL,
    [BossId] INT NULL,
    [BirthYear] DATE NOT NULL
);
GO

INSERT INTO #Hero
(
    Id,
    HeroName,
    BossId,
    BirthYear
)
VALUES
(5, 'Professor X', NULL, '01-01-1932'),
(6, 'Cyclops', 5, '01-01-1977'),
(3, 'Wolverine', 6, '01-01-1832'),
(4, 'Storm', 3, '01-01-1975'),
(2, 'Rogue', 3, '01-01-1981'),
(1, 'Jubilee', 2, '01-01-1989');

;WITH RecursiveCTE
AS (SELECT Id,
           HeroName,
           BossId
    FROM #Hero
    WHERE BossId IS NULL -- Here we start with Professor X

    UNION ALL
    SELECT h.Id,
           h.HeroName,
           h.BossId
    FROM #Hero h
        INNER JOIN RecursiveCTE r
            ON h.BossId = r.Id)
SELECT r.Id,
       r.HeroName,
       r.BossId
FROM RecursiveCTE r;

I didn't place the Ids in order to illustrate the magic. The recursive CTE ordered my structure in the screenshot below.

Query Results: recursive CTE

The Anchor or Parent

In the example of a company hierarchy, the anchor is the CEO. The word anchor always confuses me. I know what an anchor is, but something about saying CEO, parent, or boss simplifies it. Pick whatever word clicks in your brain. It's vital to remember SQL executes this query one time.

What does the query look like to return the boss record? I've included it below, but it's where the BossId is NULL. We'll assume the boss doesn't have a boss.

SELECT Id,
       HeroName,
       BossId
FROM #Hero
WHERE BossId IS NULL; -- Here we start with Professor X
Query Results: Boss record

Another example where we generate a family tree would be the oldest known relative. In my case, it's a great-grandmother.

The Recursive Part

Since we figured out the parent, we need to write the recursive part of the query. SQL Server executes this query until we expose all levels in the hierarchy. Below is an example keeping with our structure above.

SELECT h.Id,
       h.HeroName,
       h.BossId
FROM #Hero h
    INNER JOIN RecursiveCTE r -- Here we join to the CTE
        ON h.BossId = r.Id

Sometimes, you'll see people add in a third part called the break or end condition. Most times, this entails adding a simple WHERE clause to the recursive part. It adds more complexity to my two-part system, so I like ignoring it. However, I've included an example below for you to review. Let's say we only want to return superheroes born before a particular date in our structure.

;WITH RecursiveCTE
AS (SELECT Id,
           HeroName,
           BossId
    FROM #Hero
    WHERE BossId IS NULL -- Here we start with Professor X

    UNION ALL
    SELECT h.Id,
           h.HeroName,
           h.BossId
    FROM #Hero h
        INNER JOIN RecursiveCTE r
            ON h.BossId = r.Id
    WHERE h.BirthYear < '01-01-1985') –- The break condition
SELECT r.Id,
       r.HeroName,
       r.BossId
FROM RecursiveCTE r;

At the center of it all, you combine the boss with the recursion using the UNION ALL operator. What if, for some reason, you wanted to change it to a UNION? Well, SQL doesn't like it and throws an error.

Error Message UNION ALL

Building a Dataset

I often use a recursive CTE to build a specific dataset rather than a hierarchy. In a prior article, I used the code below. In a way, this is a CROSS JOIN of sorts. I know it isn't a CROSS JOIN; there's no need to leave a comment. Can you spot the two parts of the recursive CTE?

DECLARE @StartDate DATE = '2023-02-01';
DECLARE @EndDate DATE = '2023-02-28';

WITH Dates
AS (SELECT @StartDate AS Date
    UNION ALL
    SELECT DATEADD(DAY, 1, Date) AS Date
    FROM Dates
    WHERE Date < @EndDate)
SELECT d.Date
FROM Dates d;

Maximum Recursion

SQL Server is smart and understands our shortcomings as humans. To keep us from entering an infinite loop by default, SQL permits us to scale 100 levels deep. If you execute a recursive CTE with more than 100 levels, SQL returns the error message below.

Error message: Maximum Recursion

If you want to bypass the 100-level restriction, there's a command for that. If you remove the limitation, the query gets out in production, and SQL hangs, your DBA will email you.

DECLARE @StartDate DATE = '2022-02-01';
DECLARE @EndDate DATE = '2023-02-28';

WITH Dates
AS (SELECT @StartDate AS Date
    UNION ALL
    SELECT DATEADD(DAY, 1, Date) AS Date
    FROM Dates
    WHERE Date < @EndDate)
SELECT d.Date
FROM Dates d
OPTION (MAXRECURSION 0);
Setting Max Recursion to 0

Summary

Many things in life are scary, like taxes or running an update in production, and wondering if you included a WHERE clause. But as we saw in this article, recursive CTEs don't need to be one of them. There are only two parts: the boss and the recursion. The older I get, the easier it is to forget syntax. If you're prone to forgetfulness, bookmark this page for your future self. I better do it now, so I don't forget.

Have you constructed a recursive CTE in real life? If so, what other use cases have you encountered?

Key Points

  • Recursive CTEs help traverse recursive structures, like a company hierarchy, bill of material, or family tree.
  • At first glance, the syntax for a recursive CTE is confusing. But it's easier to digest if you remember there are only two parts: the boss and recursion.
  • I prefer a recursive CTE over trying to write a cursor or WHILE loop to accomplish the same thing. I won't hold it against you if you like the loop method.
  • If you trust your code, don't worry about maximum recursion. However, add it to avoid an error in production.
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jared Westover Jared Westover is a passionate technology specialist at Crowe, helping to build data solutions. For the past two decades, he has focused on SQL Server, Azure, Power BI, and Oracle. Besides writing for MSSQLTips.com, he has published 12 Pluralsight courses about SQL Server and Reporting Services. Jared enjoys reading and spending time with his wife and three sons when he's not trying to make queries go faster.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-11-21

Comments For This Article




Thursday, November 23, 2023 - 10:59:28 AM - Jared Westover Back To Top (91772)
@Göran Peterson

Thank you for reading and for the kind words.

Tuesday, November 21, 2023 - 10:32:16 AM - Göran Peterson Back To Top (91769)
Nice post. Competent, clear and to the point. And a good vibe with it.














get free sql tips
agree to terms