Learn more about SQL Server tools

mssqltips logo
 

Tutorials          DBA          Dev          BI          Career          Categories          Webcasts          Scripts          Today's Tip          Join

Tutorials      DBA      Dev      BI      Categories      Webcasts

DBA    Dev    BI    Categories

 

Create an Extended Date Dimension for a SQL Server Data Warehouse


By:   |   Last Updated: 2018-07-31   |   Comments (3)   |   Related Tips: More > Dates

Problem

The Date Dimension is a key dimension in a SQL Server data warehousing as it allows us to analyze data in different aspects of date. Apart from the standard date attributes like year, quarter, month, etc., this article explains how the date dimension can be extended to richer analysis in a SQL Server data warehouse.

Solution

A date dimension is mostly a static dimension which does not require daily update. However, a window function may need a daily update. The solution would be to have a special date dimension and populate at the start of the year and update changing data on a daily basis.

Why a Special Dimension

First question would be, what is the requirement for the special dimension for a date in the data warehouse. Let's look at the options we have in case a date dimension is not present.

What are sales per year?

SELECT YEAR(OrderDate) Year ,SUM(SalesAmount) Amount
FROM FCT_Sales
GROUP BY YEAR(OrderDate) 

What are the sales done on weekends?

SELECT SUM(SalesAmount)
FROM FCT_Sales
WHERE DATEPART(dw,OrderDate) IN (1,7)

In both the scenarios, there will be a performance impact. As you are aware, we are dealing with a large number of records in a data warehouse, the above queries will have performance issues. Also, indexes won’t be a solution as the use of functions will not make the index usable.

Apart from performance issues, there are functional limitations. For example, in case you need to get the sales on a special holiday or for a season, in which there are no built-in functions, you have no choice, but to have a special date dimension.

By looking at functional and performance limitations, it is very obvious that there needs to be a special dimension to store a date which is the date dimension used more commonly in the data warehouse.

Also, there are times where more than one date column is available in the fact table. In that instance, a date dimension will act as a role playing dimension in SQL Server Analysis Services as shown in the below figure.

fact table and dimension table

In the above example, OrderDateKey, DueDateKey and ShipDateKey are linked to the Date Dimension.

Please note that the role playing dimension feature is not available in the Tabular world where you need add multiple instances of the date dimension.

Surrogate Key

Typically, surrogate keys will be an incremental number. However, in case of a date dimension, YYYYMMDD format is used for a surrogate key. This is to facilitate data partitioning in the data warehouse.  Fact tables are normally partition by the date. If mere incremental numbers are used for the date dimension, the fact table will also have the same incremental numbers which will lead to difficulties in partitioning. In case of YYYYMMDD format, it is much easier to include a partitioning function using the details in the surrogate key.

Standard Columns

In a date dimension, it is always better to include all the possible columns leaving the options of deriving attributes at the user level.

Below is an example of the basic columns for a Date Dimension.

CREATE TABLE dbo.Dim_Date (
   DateKey INT NOT NULL PRIMARY KEY,
   [Date] DATE NOT NULL,
   [Day] TINYINT NOT NULL,
   [DaySuffix] CHAR(2) NOT NULL,
   [Weekday] TINYINT NOT NULL,
   [WeekDayName] VARCHAR(10) NOT NULL,
   [WeekDayName_Short] CHAR(3) NOT NULL,
   [WeekDayName_FirstLetter] CHAR(1) NOT NULL,
   [DOWInMonth] TINYINT NOT NULL,
   [DayOfYear] SMALLINT NOT NULL,
   [WeekOfMonth] TINYINT NOT NULL,
   [WeekOfYear] TINYINT NOT NULL,
   [Month] TINYINT NOT NULL,
   [MonthName] VARCHAR(10) NOT NULL,
   [MonthName_Short] CHAR(3) NOT NULL,
   [MonthName_FirstLetter] CHAR(1) NOT NULL,
   [Quarter] TINYINT NOT NULL,
   [QuarterName] VARCHAR(6) NOT NULL,
   [Year] INT NOT NULL,
   [MMYYYY] CHAR(6) NOT NULL,
   [MonthYear] CHAR(7) NOT NULL,
   IsWeekend BIT NOT NULL,
   )

Important to note is the existence of three columns for month name. MonthName is used to store the month name such as January, February, etc. In some reports, you might have experienced that the month name will be shorten such as Jan, Feb, etc. which can be stored in MonthName_Short column. MonthName_FirstLetter column can be used to store J, F, M, etc. for the first character of the month giving more options for users. Similarly, there are three columns for Weekday as well. 

Most of these attributes can be generated by using built-in SQL Server functions such as YEAR, MONTH, DATEPART and DATENAME. In this script, EndDate can be defined.

SET NOCOUNT ON

TRUNCATE TABLE DIM_Date

DECLARE @CurrentDate DATE = '2016-01-01'
DECLARE @EndDate DATE = '2020-12-31'

WHILE @CurrentDate < @EndDate
BEGIN
   INSERT INTO [dbo].[Dim_Date] (
      [DateKey],
      [Date],
      [Day],
      [DaySuffix],
      [Weekday],
      [WeekDayName],
      [WeekDayName_Short],
      [WeekDayName_FirstLetter],
      [DOWInMonth],
      [DayOfYear],
      [WeekOfMonth],
      [WeekOfYear],
      [Month],
      [MonthName],
      [MonthName_Short],
      [MonthName_FirstLetter],
      [Quarter],
      [QuarterName],
      [Year],
      [MMYYYY],
      [MonthYear],
      [IsWeekend],
      [IsHoliday]
      )
   SELECT DateKey = YEAR(@CurrentDate) * 10000 + MONTH(@CurrentDate) * 100 + DAY(@CurrentDate),
      DATE = @CurrentDate,
      Day = DAY(@CurrentDate),
      [DaySuffix] = CASE 
         WHEN DAY(@CurrentDate) = 1
            OR DAY(@CurrentDate) = 21
            OR DAY(@CurrentDate) = 31
            THEN 'st'
         WHEN DAY(@CurrentDate) = 2
            OR DAY(@CurrentDate) = 22
            THEN 'nd'
         WHEN DAY(@CurrentDate) = 3
            OR DAY(@CurrentDate) = 23
            THEN 'rd'
         ELSE 'th'
         END,
      WEEKDAY = DATEPART(dw, @CurrentDate),
      WeekDayName = DATENAME(dw, @CurrentDate),
      WeekDayName_Short = UPPER(LEFT(DATENAME(dw, @CurrentDate), 3)),
      WeekDayName_FirstLetter = LEFT(DATENAME(dw, @CurrentDate), 1),
      [DOWInMonth] = DAY(@CurrentDate),
      [DayOfYear] = DATENAME(dy, @CurrentDate),
      [WeekOfMonth] = DATEPART(WEEK, @CurrentDate) - DATEPART(WEEK, DATEADD(MM, DATEDIFF(MM, 0, @CurrentDate), 0)) + 1,
      [WeekOfYear] = DATEPART(wk, @CurrentDate),
      [Month] = MONTH(@CurrentDate),
      [MonthName] = DATENAME(mm, @CurrentDate),
      [MonthName_Short] = UPPER(LEFT(DATENAME(mm, @CurrentDate), 3)),
      [MonthName_FirstLetter] = LEFT(DATENAME(mm, @CurrentDate), 1),
      [Quarter] = DATEPART(q, @CurrentDate),
      [QuarterName] = CASE 
         WHEN DATENAME(qq, @CurrentDate) = 1
            THEN 'First'
         WHEN DATENAME(qq, @CurrentDate) = 2
            THEN 'second'
         WHEN DATENAME(qq, @CurrentDate) = 3
            THEN 'third'
         WHEN DATENAME(qq, @CurrentDate) = 4
            THEN 'fourth'
         END,
      [Year] = YEAR(@CurrentDate),
      [MMYYYY] = RIGHT('0' + CAST(MONTH(@CurrentDate) AS VARCHAR(2)), 2) + CAST(YEAR(@CurrentDate) AS VARCHAR(4)),
      [MonthYear] = CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + UPPER(LEFT(DATENAME(mm, @CurrentDate), 3)),
      [IsWeekend] = CASE 
         WHEN DATENAME(dw, @CurrentDate) = 'Sunday'
            OR DATENAME(dw, @CurrentDate) = 'Saturday'
            THEN 1
         ELSE 0
         END,
      [IsHoliday] = 0

   SET @CurrentDate = DATEADD(DD, 1, @CurrentDate)
END

Holidays and Special Days

Holidays will be handled by the following columns.

IsHoliday   BIT NOT NULL,
HolidayName VARCHAR(20) NULL,
SpecialDays VARCHAR(20) NULL

As holidays are dependent on the country or region that you are implementing the data warehouse, a customized script is needed for the holidays and special days.

The following script shows how Christmas and Valentine's day are updated in the Dim_date dimension table.

UPDATE Dim_Date
SET [IsHoliday] = 1,
[HolidayName] = 'Christmas'
Where [Month] = 12 AND [DAY] = 25

UPDATE Dim_Date
SET SpecialDays = 'Valentines Day'
Where [Month] = 2 AND [DAY] = 14

First and Last Days

In financial scenarios, last and first dates for different segments such as year, quarter, month and week will be used. The following columns are used to store those dates.

 FirstDateofYear   DATE NULL,
LastDateofYear    DATE NULL,
FirstDateofQuater DATE NULL,
LastDateofQuater  DATE NULL,
FirstDateofMonth  DATE NULL,
LastDateofMonth   DATE NULL,
FirstDateofWeek   DATE NULL,
LastDateofWeek    DATE NULL,

The following script will be used to update those columns by using DATEADD, DATEDIFF and EOMONTH functions.

[FirstDateofYear]   = CAST(CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + '-01-01' AS DATE),
[LastDateofYear]    = CAST(CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + '-12-31' AS DATE),
[FirstDateofQuater] = DATEADD(qq, DATEDIFF(qq, 0, GETDATE()), 0),
[LastDateofQuater]  = DATEADD(dd, - 1, DATEADD(qq, DATEDIFF(qq, 0, GETDATE()) + 1, 0)),
[FirstDateofMonth]  = CAST(CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + '-' + CAST(MONTH(@CurrentDate) AS VARCHAR(2)) + '-01' AS DATE),
[LastDateofMonth]   = EOMONTH(@CurrentDate),
[FirstDateofWeek]   = DATEADD(dd, - (DATEPART(dw, @CurrentDate) - 1), @CurrentDate),
[LastDateofWeek] = DATEADD(dd, 7 - (DATEPART(dw, @CurrentDate)), @CurrentDate)

Multiple Calendars

As you are aware, there can be multiple calendars such as having a specific Financial calendar. Some companies start their Financial year on April 1st while some company’s financial year starts in October. Depending on this, the Financial Account Year, Financial Account Quarter, Financial Account Month will change accordingly.

Window Functions

In a data warehouse, there is analysis for current and previous date segments. For example, it is always better to compare the current month and previous month. In the case of budgeting, comparisons will be done for the current month and the next month. In the case where you have a pivot table and you are using the month column, at the end of every month you need to modify your tables. This can be automated, by maintaining a column with the current month. In this column, the current month will have 0 value, previous month will be -1 and next month will be 1.

The following five columns are used for Year, Quarter, Month, Week and Day.

 CurrentYear   Smallint NULL,
CurrentQuater Smallint NULL,
CurrentMonth  Smallint NULL,
CurrentWeek   Smallint NULL,
CurrentDay    Smallint NULL,

The tinyint data type cannot be used as tinyint does not support negative values. Since these columns are updated at the change of a period, they need to be changed periodically. For example, at the start of every month, the CurrentMonth column should be changed. Every day, the CurrentDay column should be changed. Therefore, these columns are updated daily using a DATEDIFF function as shown below.

--Update current date information
UPDATE Dim_Date
SET
CurrentYear   = DATEDIFF(yy,GETDATE(),date),
CurrentQuater = DATEDIFF(q,GETDATE(),date),
CurrentMonth  = DATEDIFF(m,GETDATE(),date),
CurrentWeek   = DATEDIFF(ww,GETDATE(),date),
CurrentDay    = DATEDIFF(dd,GETDATE(),date)

All scripts

The following script contains all the columns used for Dim_Date dimension table.

CREATE TABLE [dbo].[Dim_Date] (
   [DateKey] [int] NOT NULL,
   [Date] [date] NOT NULL,
   [Day] [tinyint] NOT NULL,
   [DaySuffix] [char](2) NOT NULL,
   [Weekday] [tinyint] NOT NULL,
   [WeekDayName] [varchar](10) NOT NULL,
   [WeekDayName_Short] [char](3) NOT NULL,
   [WeekDayName_FirstLetter] [char](1) NOT NULL,
   [DOWInMonth] [tinyint] NOT NULL,
   [DayOfYear] [smallint] NOT NULL,
   [WeekOfMonth] [tinyint] NOT NULL,
   [WeekOfYear] [tinyint] NOT NULL,
   [Month] [tinyint] NOT NULL,
   [MonthName] [varchar](10) NOT NULL,
   [MonthName_Short] [char](3) NOT NULL,
   [MonthName_FirstLetter] [char](1) NOT NULL,
   [Quarter] [tinyint] NOT NULL,
   [QuarterName] [varchar](6) NOT NULL,
   [Year] [int] NOT NULL,
   [MMYYYY] [char](6) NOT NULL,
   [MonthYear] [char](7) NOT NULL,
   IsWeekend BIT NOT NULL,
   IsHoliday BIT NOT NULL,
   HolidayName VARCHAR(20) NULL,
   SpecialDays VARCHAR(20) NULL,
   [FinancialYear] [int] NULL,
   [FinancialQuater] [int] NULL,
   [FinancialMonth] [int] NULL,
   [FirstDateofYear] DATE NULL,
   [LastDateofYear] DATE NULL,
   [FirstDateofQuater] DATE NULL,
   [LastDateofQuater] DATE NULL,
   [FirstDateofMonth] DATE NULL,
   [LastDateofMonth] DATE NULL,
   [FirstDateofWeek] DATE NULL,
   [LastDateofWeek] DATE NULL,
   CurrentYear SMALLINT NULL,
   CurrentQuater SMALLINT NULL,
   CurrentMonth SMALLINT NULL,
   CurrentWeek SMALLINT NULL,
   CurrentDay SMALLINT NULL,
   PRIMARY KEY CLUSTERED ([DateKey] ASC)
   )

The following script is the entire script which will update the data for dim_date.

SET NOCOUNT ON

TRUNCATE TABLE DIM_Date

DECLARE @CurrentDate DATE = '2016-01-01'
DECLARE @EndDate DATE = '2020-12-31'

WHILE @CurrentDate < @EndDate
BEGIN
   INSERT INTO [dbo].[Dim_Date] (
      [DateKey],
      [Date],
      [Day],
      [DaySuffix],
      [Weekday],
      [WeekDayName],
      [WeekDayName_Short],
      [WeekDayName_FirstLetter],
      [DOWInMonth],
      [DayOfYear],
      [WeekOfMonth],
      [WeekOfYear],
      [Month],
      [MonthName],
      [MonthName_Short],
      [MonthName_FirstLetter],
      [Quarter],
      [QuarterName],
      [Year],
      [MMYYYY],
      [MonthYear],
      [IsWeekend],
      [IsHoliday],
      [FirstDateofYear],
      [LastDateofYear],
      [FirstDateofQuater],
      [LastDateofQuater],
      [FirstDateofMonth],
      [LastDateofMonth],
      [FirstDateofWeek],
      [LastDateofWeek]
      )
   SELECT DateKey = YEAR(@CurrentDate) * 10000 + MONTH(@CurrentDate) * 100 + DAY(@CurrentDate),
      DATE = @CurrentDate,
      Day = DAY(@CurrentDate),
      [DaySuffix] = CASE 
         WHEN DAY(@CurrentDate) = 1
            OR DAY(@CurrentDate) = 21
            OR DAY(@CurrentDate) = 31
            THEN 'st'
         WHEN DAY(@CurrentDate) = 2
            OR DAY(@CurrentDate) = 22
            THEN 'nd'
         WHEN DAY(@CurrentDate) = 3
            OR DAY(@CurrentDate) = 23
            THEN 'rd'
         ELSE 'th'
         END,
      WEEKDAY = DATEPART(dw, @CurrentDate),
      WeekDayName = DATENAME(dw, @CurrentDate),
      WeekDayName_Short = UPPER(LEFT(DATENAME(dw, @CurrentDate), 3)),
      WeekDayName_FirstLetter = LEFT(DATENAME(dw, @CurrentDate), 1),
      [DOWInMonth] = DAY(@CurrentDate),
      [DayOfYear] = DATENAME(dy, @CurrentDate),
      [WeekOfMonth] = DATEPART(WEEK, @CurrentDate) - DATEPART(WEEK, DATEADD(MM, DATEDIFF(MM, 0, @CurrentDate), 0)) + 1,
      [WeekOfYear] = DATEPART(wk, @CurrentDate),
      [Month] = MONTH(@CurrentDate),
      [MonthName] = DATENAME(mm, @CurrentDate),
      [MonthName_Short] = UPPER(LEFT(DATENAME(mm, @CurrentDate), 3)),
      [MonthName_FirstLetter] = LEFT(DATENAME(mm, @CurrentDate), 1),
      [Quarter] = DATEPART(q, @CurrentDate),
      [QuarterName] = CASE 
         WHEN DATENAME(qq, @CurrentDate) = 1
            THEN 'First'
         WHEN DATENAME(qq, @CurrentDate) = 2
            THEN 'second'
         WHEN DATENAME(qq, @CurrentDate) = 3
            THEN 'third'
         WHEN DATENAME(qq, @CurrentDate) = 4
            THEN 'fourth'
         END,
      [Year] = YEAR(@CurrentDate),
      [MMYYYY] = RIGHT('0' + CAST(MONTH(@CurrentDate) AS VARCHAR(2)), 2) + CAST(YEAR(@CurrentDate) AS VARCHAR(4)),
      [MonthYear] = CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + UPPER(LEFT(DATENAME(mm, @CurrentDate), 3)),
      [IsWeekend] = CASE 
         WHEN DATENAME(dw, @CurrentDate) = 'Sunday'
            OR DATENAME(dw, @CurrentDate) = 'Saturday'
            THEN 1
         ELSE 0
         END,
      [IsHoliday] = 0,
      [FirstDateofYear] = CAST(CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + '-01-01' AS DATE),
      [LastDateofYear] = CAST(CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + '-12-31' AS DATE),
      [FirstDateofQuater] = DATEADD(qq, DATEDIFF(qq, 0, GETDATE()), 0),
      [LastDateofQuater] = DATEADD(dd, - 1, DATEADD(qq, DATEDIFF(qq, 0, GETDATE()) + 1, 0)),
      [FirstDateofMonth] = CAST(CAST(YEAR(@CurrentDate) AS VARCHAR(4)) + '-' + CAST(MONTH(@CurrentDate) AS VARCHAR(2)) + '-01' AS DATE),
      [LastDateofMonth] = EOMONTH(@CurrentDate),
      [FirstDateofWeek] = DATEADD(dd, - (DATEPART(dw, @CurrentDate) - 1), @CurrentDate),
      [LastDateofWeek] = DATEADD(dd, 7 - (DATEPART(dw, @CurrentDate)), @CurrentDate)

   SET @CurrentDate = DATEADD(DD, 1, @CurrentDate)
END

--Update Holiday information
UPDATE Dim_Date
SET [IsHoliday] = 1,
   [HolidayName] = 'Christmas'
WHERE [Month] = 12
   AND [DAY] = 25

UPDATE Dim_Date
SET SpecialDays = 'Valentines Day'
WHERE [Month] = 2
   AND [DAY] = 14

--Update current date information
UPDATE Dim_Date
SET CurrentYear = DATEDIFF(yy, GETDATE(), DATE),
   CurrentQuater = DATEDIFF(q, GETDATE(), DATE),
   CurrentMonth = DATEDIFF(m, GETDATE(), DATE),
   CurrentWeek = DATEDIFF(ww, GETDATE(), DATE),
   CurrentDay = DATEDIFF(dd, GETDATE(), DATE)

Next Steps


Last Updated: 2018-07-31


get scripts

next tip button



About the author
MSSQLTips author Dinesh Asanka Dinesh Asanka is a 10 time Data Platform MVP and frequent speaker at local and international conferences with more than 12 years of database experience.

View all my tips




Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name    *Email    Email me updates 


Signup for our newsletter
 I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



    



Thursday, January 17, 2019 - 5:03:11 PM - Maruthi J Guda Back To Top

 what is the best practice to maintain the date dimension and keep it upto date? 

Recently we have added some additional data for next couple of years and deleted some old dates. The cube performence went down for 3-4 days and users were unable to run any query on the cube. 


Tuesday, July 31, 2018 - 2:43:02 PM - Dinesh Asanka Back To Top

 Yes , it should be RIGHT not LEFT.

Thanks for pointing this. 


Tuesday, July 31, 2018 - 2:06:15 PM - sree Back To Top

 

 Hi, 

 I have a doubt. Can you please help me here. I have executed the following line of code with different inputs. When the month value is two digit number like (10,11,12), it's giving wrong output. It seems there is a mistake in code. Can you please help me with the logic here.

[MMYYYY] = LEFT('0' + CAST(MONTH(@CurrentDate) AS VARCHAR(2)), 2) + CAST(YEAR(@CurrentDate) AS VARCHAR(4))

--------------------------------------------------------------------------------------------------------------------
declare @currentdate date = '2016-06-15' Select LEFT('0' + CAST(MONTH(@CurrentDate) AS VARCHAR(2)), 2) + CAST(YEAR(@CurrentDate) AS VARCHAR(4)) as [MMYYYY]

Result:

MMYYYY 062016 declare @currentdate2 date = '2016-12-15' Select LEFT('0' + CAST(MONTH(@CurrentDate2) AS VARCHAR(2)), 2) + CAST(YEAR(@CurrentDate2) AS VARCHAR(4)) as [MMYYYY]

Result:

MMYYYY 012016

 


Learn more about SQL Server tools