Using a calendar table in SQL Server - Part 1


By:   |   Updated: 2021-04-22   |   Comments (3)   |   Related: 1 | 2 | 3 | 4 | 5 | More > Dates


Problem

A while back, I wrote an article called Creating a date dimension or calendar table in SQL Server. I have used this pattern repeatedly and, based on the questions I get from the community, many of you are using it, too. Some of the questions I get are along the lines of "how do I actually use this table in my queries?" and "what are the performance characteristics compared to other approaches?" So, I thought I would put together a collection of use cases and analysis, starting with business day problems.

Solution

As a refresher, here is how I tend to generate a range of dates for a variety of purposes, though I will concede that the particulars of this shift over time. Let's say we want our calendar table to hold 30 years of data, starting from January 1st, 2020:

DECLARE @StartDate date = '20200101', 
        @Years     int  = 30; ;WITH seq(n) AS
(
  SELECT 1 UNION ALL SELECT n + 1 FROM seq
  WHERE n < DATEDIFF(DAY, @StartDate, DATEADD(YEAR, @Years, @StartDate))
),
d(d) AS
(
  SELECT DATEADD(DAY, n - 1, @StartDate) FROM seq
)
SELECT MinDate = MIN(d), MaxDate = MAX(d), CountDates = COUNT(*)
  FROM d OPTION (MAXRECURSION 0);

The answers here are:

MinDate      MaxDate      CountDates
----------   ----------   ----------
2020-01-01   2049-12-31   10958

And here is how I create the basic calendar table I use (again, the bulk of this is described in the earlier tip):

SET DATEFIRST 7, LANGUAGE us_english;
DECLARE @StartDate date = '20200101', 
        @Years     int  = 30; ;WITH seq(n) AS
(
  SELECT 1 UNION ALL SELECT n + 1 FROM seq
  WHERE n < DATEDIFF(DAY, @StartDate, DATEADD(YEAR, @Years, @StartDate))
),
d(d) AS
(
  SELECT DATEADD(DAY, n - 1, @StartDate) FROM seq
),
src AS
(
  SELECT
    TheDate         = CONVERT(date,       d),
    TheDay          = DATEPART(DAY,       d),
    TheDayName      = DATENAME(WEEKDAY,   d),
    TheWeek         = DATEPART(WEEK,      d),
    TheISOWeek      = DATEPART(ISO_WEEK,  d),
    TheDayOfWeek    = DATEPART(WEEKDAY,   d),
    TheMonth        = DATEPART(MONTH,     d),
    TheMonthName    = DATENAME(MONTH,     d),
    TheQuarter      = DATEPART(QUARTER,   d),
    TheYear         = DATEPART(YEAR,      d),
    TheFirstOfMonth = DATEFROMPARTS(YEAR(d), MONTH(d), 1),
    TheLastOfYear   = DATEFROMPARTS(YEAR(d), 12, 31),
    TheDayOfYear    = DATEPART(DAYOFYEAR, d)
  FROM d
),
dim AS
(
  SELECT
    TheDate,
    TheDay,
    TheDaySuffix        = CONVERT(char(2), CASE WHEN TheDay / 10 = 1 THEN 'th' ELSE
                            CASE RIGHT(TheDay, 1) WHEN '1' THEN 'st' WHEN '2' THEN 'nd'
                            WHEN '3' THEN 'rd' ELSE 'th' END END),
    TheDayName,
    TheDayOfWeek,
    TheDayOfWeekInMonth = CONVERT(tinyint, ROW_NUMBER() OVER
                            (PARTITION BY TheFirstOfMonth, TheDayOfWeek ORDER BY TheDate)),
    TheDayOfYear,
    IsWeekend           = CASE WHEN TheDayOfWeek IN (CASE @@DATEFIRST
                            WHEN 1 THEN 6 WHEN 7 THEN 1 END,7)
                            THEN 1 ELSE 0 END,
    TheWeek,
    TheISOweek,
    TheFirstOfWeek      = DATEADD(DAY, 1 - TheDayOfWeek, TheDate),
    TheLastOfWeek       = DATEADD(DAY, 6, DATEADD(DAY, 1 - TheDayOfWeek, TheDate)),
    TheWeekOfMonth      = CONVERT(tinyint, DENSE_RANK() OVER
                            (PARTITION BY TheYear, TheMonth ORDER BY TheWeek)),
    TheMonth,
    TheMonthName,
    TheFirstOfMonth,
    TheLastOfMonth      = MAX(TheDate) OVER (PARTITION BY TheYear, TheMonth),
    TheFirstOfNextMonth = DATEADD(MONTH, 1, TheFirstOfMonth),
    TheLastOfNextMonth  = DATEADD(DAY, -1, DATEADD(MONTH, 2, TheFirstOfMonth)),
    TheQuarter,
    TheFirstOfQuarter   = MIN(TheDate) OVER (PARTITION BY TheYear, TheQuarter),
    TheLastOfQuarter    = MAX(TheDate) OVER (PARTITION BY TheYear, TheQuarter),
    TheYear,
    TheISOYear          = TheYear - CASE WHEN TheMonth = 1 AND TheISOWeek > 51 THEN 1
                            WHEN TheMonth = 12 AND TheISOWeek = 1  THEN -1 ELSE 0 END,     
    TheFirstOfYear      = DATEFROMPARTS(TheYear, 1,  1),
    TheLastOfYear,
    IsLeapYear          = CONVERT(bit, CASE WHEN (TheYear % 400 = 0)
                            OR (TheYear % 4 = 0 AND TheYear % 100 <> 0)
                            THEN 1 ELSE 0 END),
    Has53Weeks          = CASE WHEN DATEPART(WEEK,     TheLastOfYear) = 53 THEN 1 ELSE 0 END,
    Has53ISOWeeks       = CASE WHEN DATEPART(ISO_WEEK, TheLastOfYear) = 53 THEN 1 ELSE 0 END,
    MMYYYY              = CONVERT(char(2), CONVERT(char(8), TheDate, 101))
                          + CONVERT(char(4), TheYear),
    Style101            = CONVERT(char(10), TheDate, 101),
    Style103            = CONVERT(char(10), TheDate, 103),
    Style112            = CONVERT(char(8),  TheDate, 112),
    Style120            = CONVERT(char(10), TheDate, 120)
  FROM src
)
SELECT *
  INTO dbo.Calendar
  FROM dim
  ORDER BY TheDate
  OPTION (MAXRECURSION 0);
GO ALTER TABLE dbo.Calendar ALTER COLUMN TheDate date NOT NULL;
GO ALTER TABLE dbo.Calendar ADD CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED(TheDate);
GO

And here is an auxiliary table I use to represent holidays and other business outages, though you could theoretically just add a bit column like IsBusinessOutage to the main table:

CREATE TABLE dbo.OutageDates
(
  TheDate     date NOT NULL,
  Description nvarchar(255) NOT NULL,
  INDEX       CIX_OutageDates CLUSTERED(TheDate),
  CONSTRAINT  FK_OutageDates FOREIGN KEY(TheDate) REFERENCES dbo.Calendar(TheDate)
);

Next, let's manually add 2021 holidays and a couple of other business outages to our OutageDates table (in real life, you would automate this to a large degree, as demonstrated in the previous tip):

-- holidays
INSERT dbo.OutageDates(TheDate, Description)
VALUES
      ('20210101',N'New Year''s Day'),
      ('20210118',N'Martin Luther King Day'),
      ('20210215',N'President''s Day'),
      ('20210531',N'Memorial Day'),
      ('20210704',N'Independence Day'),
      ('20210906',N'Labour Day'),
      ('20211011',N'Indigenous Peoples'' Day'),
      ('20211111',N'Veterans'' Day'),
      ('20211125',N'Thanksgiving Day'),
      ('20211126',N'Black Friday'),
      ('20211225',N'Christmas Day');

-- company closures
INSERT dbo.OutageDates(TheDate, Description)
VALUES
      ('20211217',N'Holiday Party'),
      ('20210222',N'All-hands off-site 1'),
      ('20210223',N'All-hands off-site 2');

With those tables in place, we can answer all kinds of questions about business days.

How many Mondays are in <this date range>?

Let's say we want to find out how many Mondays there are between date 1 and date 2. We can easily get the list using the following query:

DECLARE @RangeStart date = '20210112', @RangeEnd date = '20210330';
SELECT c.TheDate 
  FROM dbo.Calendar AS c
  WHERE c.TheDate >= @RangeStart
    AND c.TheDate <= @RangeEnd
    AND c.TheDayName = N'Monday';

The plan for this query is as follows:

Plan for Mondays query against clustered index

I highlighted several metrics in the operator properties not because they're scary but to illustrate that, if you're going to ask a lot of questions where specific weekdays are specified, it can be useful to create an index like this:

CREATE INDEX IX_TheDayName ON dbo.Calendar(TheDayName, TheDate);

Now the plan for the same query looks like this:

Plan for Mondays query with new non-clustered index

At these sizes the differences may seem inconsequential, but I wanted to point them out all the same, especially if you are later joining against larger transactional tables.

How many Mondays are in <this date range> that are not business outages?

We need to change our logic just a little bit here to make sure we exclude any Mondays from our OutageDates table. You can experiment with NOT EXISTS and LEFT OUTER JOIN, but I've found the simplest plan is with EXCEPT:

DECLARE @RangeStart date = '20210112', @RangeEnd date = '20210330';
SELECT c.TheDate
FROM dbo.Calendar AS c
  WHERE c.TheDate >= @RangeStart
    AND c.TheDate <= @RangeEnd
    AND c.TheDayName = N'Monday' EXCEPT SELECT ow.TheDate
  FROM dbo.OutageDates AS ow
  WHERE ow.TheDate >= @RangeStart
    AND ow.TheDate <= @RangeEnd;

The plan for this query looks like this:

Plan for Mondays query exclusing business outages

How many paydays are in this <this date range>?

A lot of folks are paid twice a month, typically on pre-determined dates like the 1st and 15th of the month. In the days of paper checks that were handed to employees, these might shift to the closest weekday, but with everything electronic and direct deposit these days, that is becoming less common.

DECLARE @RangeStart date = '20210112', @RangeEnd date = '20210330';
SELECT TheDate, COUNT(*) OVER() 
  FROM dbo.Calendar
  WHERE TheDay IN (1,15)
    AND TheDate >= @RangeStart
    AND TheDate <= @RangeEnd;

The plan for this query is actually kind of terrible:

Query plan for paydays

We can make it a little better in terms of overall plan cost, with an index like this:

CREATE INDEX IX_TheDay_TheDate ON dbo.Calendar(TheDay, TheDate);

The plan has the same shape, but the seek on the narrower index is far more efficient:

Query plan for paydays with a supporting index

It's a little more complex if you're paid every two weeks rather than on specific days, but you only need one valid payday to calculate any series:

DECLARE @RangeStart date = '20210112', @RangeEnd date = '20210330';
DECLARE @ReferencePayDay date = '20210114';
SELECT TheDate, COUNT(*) OVER() 
  FROM dbo.Calendar
  WHERE TheDayName = N'Thursday'
    AND DATEDIFF(WEEK, @ReferencePayDay, TheDate) % 2 = 0
    AND TheDate >= @RangeStart
    AND TheDate <= @RangeEnd;

The plan looks remarkably similar to the previous plan, but the index seek has an ugly predicate to calculate the DATEDIFF. Arguably, you could determine all the paydays when creating the calendar table; say, with a column called IsPayDay. Then the calculations are simplified, and you only have to change them if you switch payroll companies to one that uses a different pay period.

How many workdays are in <this date range>?

To find all the business days in the range, we need to find all the rows that are on weekdays and also that don't exist in our OutageDates table. We could use EXCEPT here but not if we also want to return the count:

DECLARE @RangeStart date = '20210112', @RangeEnd date = '20210330';

SELECT COUNT(*) 
  FROM dbo.Calendar AS c
  WHERE IsWeekend = 0
    AND NOT EXISTS 
    (
        SELECT 1 FROM dbo.OutageDates AS od 
        WHERE od.TheDate = c.TheDate
    )
    AND TheDate >= @RangeStart
    AND TheDate <= @RangeEnd;

The plan for this query is similar to some of the other ones above:

Query plan for business days within a range

The missing index feature suggests an index on IsWeekend, but even as a filtered index…

CREATE INDEX IX_IsWeekday ON dbo.Calendar(TheDate, IsWeekend) WHERE IsWeekend = 0;

…it is not as beneficial as you might expect:

Query plan for business days in a range with a filtered index

When will the work on this project be complete?

Quite similar to the "how many business days" question, something that we often want to do is calculate when a task or project will finish, given a reasonable estimate of number of business hours. Let's say we estimate there are 160 more work hours, how do we predict the finish date, assuming an 8 hour work day, no distractions, and taking business outages into account?

DECLARE @HoursLeft int = 160;
SELECT CompletionDate = MAX(TheDate) FROM
(
  SELECT TOP (@HoursLeft / 8 + 1) TheDate
    FROM dbo.Calendar AS c
    WHERE TheDate > GETDATE()
      AND IsWeekend = 0
      AND NOT EXISTS
      (
        SELECT 1
          FROM dbo.OutageDates AS od
          WHERE od.TheDate = c.TheDate
      )
) AS x;

In terms of calendar days, the result shows that it will take about 29 calendar days, but that will be different for you on the day that you run this depending on how many non-work days there are following that day. To be more realistic but potentially less accurate in the best case, we can add a "fudge factor" that will account for the percentage of someone's time spent on distractions, sick time, or getting pulled onto a different task:

DECLARE @HoursLeft int = 160, @FudgeFactor decimal(5,2) = 0.2;
SELECT CompletionDate = MAX(TheDate) FROM
(
  SELECT TOP (CONVERT(int, CEILING(@HoursLeft / 8 * (1 + @FudgeFactor)) + 1)) TheDate
    FROM dbo.Calendar AS c
    WHERE TheDate > GETDATE()
      AND IsWeekend = 0
      AND NOT EXISTS
      (
        SELECT 1
          FROM dbo.OutageDates AS od
          WHERE od.TheDate = c.TheDate
      )
) AS x;

For me, that pushed the date out by 6 calendar days.

What was the last Tuesday in Q1 2021, May 2020, or the year 2016?

The additional columns in the calendar table make questions like this really easy to answer.

SELECT LastTuesdayQ12021 = MAX(TheDate)
  FROM dbo.Calendar
  WHERE TheDayName = N'Tuesday'
    AND TheYear = 2021
    AND TheQuarter = 1; SELECT LastTuesdayMay2019 = MAX(TheDate)
  FROM dbo.Calendar
  WHERE TheDayName = N'Tuesday'
    AND TheYear = 2019
    AND TheMonthName = N'May'; SELECT FirstTuesday2016 = MAX(TheDate)
  FROM dbo.Calendar
  WHERE TheDayName = N'Tuesday'
  AND TheYear = 2016;

These are all clustered index scans with residual I/O warnings and, sometimes, missing index suggestions. Hopefully this isn't the type of question you're asking of a calendar table enough to justify an index, but if you are, they are straightforward enough from the suggestions to implement. You can also be more deliberate in your attempt to make use of an index that already exists, such as changing…

  AND TheYear = 2021
  AND TheQuarter = 1

…to…

  AND TheDate >= '20210101'
  AND TheDate < '20210401'

What is the closest Monday to <this date>? What is the closest weekday?

This is the kind of calculation that could be useful for determining a non-standard pay period as discussed earlier. The goal is to find a specific day in either the past or future 6 days, and order by the one that is closest. One way to express this is by taking the ABS() of the difference between the input date and the possible results:

DECLARE @SourceDate date = '20210505';
SELECT TOP (1) ClosestMonday = TheDate 
  FROM dbo.Calendar
  WHERE TheDayName = N'Monday'
    AND TheDate >= DATEADD(DAY, -6, @date)
    AND TheDate <= DATEADD(DAY,  6, @date)
  ORDER BY ABS(DATEDIFF(DAY, @date, TheDate));

If you want the closest weekday, just change WHERE TheDayName = N'Monday' to WHERE IsWeekend = 0.

When was the last time the last day of the month fell on a Friday?

This is kind of a silly one, but is a good example of how the calendar table can address silly questions without all the thought that would have to go into determining this otherwise:

SELECT LastDayFriday = MAX(TheDate)
  FROM dbo.Calendar
  WHERE TheDayName = N'Friday'
    AND TheDate < GETDATE()
    AND TheLastOfMonth = TheDate;

At the time of writing, this was July 2020, but may have changed by the time you are reading this. Because the next time this happens is April 2021, and after that, December 2021, which you can determine by flipping the query around:

SELECT LastDayFriday = MIN(TheDate)
  FROM dbo.Calendar
  WHERE TheDayName = N'Friday'
    AND TheDate > GETDATE()
    AND TheLastOfMonth = TheDate;

Conclusion

A calendar table can help you tackle a lot of surprisingly complex problems dealing with business days. In the next part of this series, I'll show some ways you can use this approach to produce date ranges, fill gaps in reports, and identify islands.

Next Steps

For more info on calendar tables and general date handling in SQL Server, see these tips and videos:






get scripts

next tip button



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

View all my tips


Article Last Updated: 2021-04-22

Comments For This Article




Friday, April 23, 2021 - 10:10:57 AM - Aaron Bertrand Back To Top (88595)
Dawn, you'd have to balance what you like less: client-specific calendar tables, or client-specific columns. There are pros and cons to having them as separate tables or having them in separate columns. Keeping in mind that you don't really "maintain" a calendar table after it's created, other than to adjust some of the variable things you're probably storing in other tables (like OutageDates). If the goal is to only store the date itself once, then maybe you could just have a simple calendar table with just the date-specific things that don't change per client, and then a BusinessDays table that has a key made up of (ClientID, DateTableKey).

Thursday, April 22, 2021 - 10:49:48 AM - Dawn N Brelsford Back To Top (88587)
This is great, thank you. How would you handle a calendar table for multiple fiscal years. A scenario where multiple clients have different fiscal years and the desire to only maintain one calendar table with the ability to handle varying fiscal years?

Thursday, April 22, 2021 - 8:42:02 AM - Redge Shepherd Back To Top (88585)
I've written so many date-based queries and still managed to learn a few things here. Interesting approaches to solving everyday problems.
Thank you for your in-depth review, insights, and application examples.


download














get free sql tips
agree to terms