mssqltips logo

SQL Server Window Functions Gaps and Islands Problem



By:
Overview

In this last part of the tutorial, we’ll look into one final use case that can be solved with window functions: gaps and islands.

Explanation

The gaps and islands problem is about finding holes in sequences (the gaps) or about finding ranges of consecutive values (the islands). Typical examples are: how long and how frequent has someone logged into a website (island problem)? What were the sick days of a person and did the number of sick days over a period of time surpassed some value so that it is now extended sick leave (also an island problem)? How often did the server go offline (gaps problem)? When did the student not attend the classes (gaps problem)?

Both problems are related to each other, as finding gaps is the inverse of finding islands. Let’s analyze the problem where a server sometimes goes offline.

Gaps

First we’re going to find out on which days the server went offline. Maybe there’s a correlation with scheduled maintenance? The following T-SQL query generates some sample data:

CREATE TABLE dbo.[Server]
    (ID         INT IDENTITY(1,1) NOT NULL
    ,ServerName VARCHAR(50) NOT NULL
    ,DayOnline  DATE NOT NULL);
 
INSERT INTO dbo.[Server]
(
     [ServerName]
    ,DayOnline
)
VALUES
     ('MyServer','2018-05-01')
    ,('MyServer','2018-05-02')
    ,('MyServer','2018-05-03')
    ,('MyServer','2018-05-04')
    ,('MyServer','2018-05-05')
    ,('MyServer','2018-05-07')
    ,('MyServer','2018-05-08')
    ,('MyServer','2018-05-09')
    ,('MyServer','2018-05-15')
    ,('MyServer','2018-05-16')
    ,('MyServer','2018-05-17')
    ,('MyServer','2018-05-19')
    ,('MyServer','2018-05-20');

We want to find out that on the 6th and 18th of May 2018 the server was offline, as well as the whole range of 10 till 14th of May.

For each row, find the current date and the date from the next row. This can be accomplished by using LEAD (part 7):

SELECT
    [current] = [DayOnline]
   ,[next]    = LEAD([DayOnline]) OVER (ORDER BY [DayOnline])
FROM [dbo].[Server];

The result:

using lead to find gaps

We can now identify all gaps by finding all rows where the difference between the current and the next date is bigger than one.

SELECT
     gapStart = DATEADD(DAY,1,[current])
    ,gapEnd   = DATEADD(DAY,-1,[next])
FROM
(
SELECT
    [current] = [DayOnline]
   ,[next]    = LEAD([DayOnline]) OVER (ORDER BY [DayOnline])
FROM [dbo].[Server]
) tmp
WHERE DATEDIFF(DAY,[current],[next]) > 1;

We have now isolated the gaps:

the gaps

Islands

Let’s reverse the problem and try to find the islands: the periods in time where the server was online. You can use this to calculate uptime for example. First we’re going to add a dense rank:

SELECT
     [DayOnline]
    ,DRank = DENSE_RANK() OVER (PARTITION BY [ServerName] ORDER BY [DayOnline])
FROM [dbo].[Server];
dense rank

Now we’re going to subtract this rank from the date. For every consecutive value in a range, the same date will be returned as a result. This will act as some sort of grouping.

SELECT
     [DayOnline]
    , [Group] = DATEADD(DAY
                    ,-1 * DENSE_RANK() OVER (PARTITION BY [ServerName]
                                             ORDER BY [DayOnline])
                    ,[DayOnline])
FROM [dbo].[Server];
grouping the islands

Now we can use that grouping to find the minimum date (the start of the range) and the maximum date (the end of the range).

SELECT
     StartDate  = MIN([tmp].[DayOnline])
    ,EndDate    = MAX([tmp].[DayOnline])
FROM
(
    SELECT
         [DayOnline]
        ,[Group] = DATEADD(DAY
                        ,-1 * DENSE_RANK() OVER (PARTITION BY [ServerName]
                                                 ORDER BY [DayOnline])
                        ,[DayOnline])
    FROM [dbo].[Server]
) tmp
GROUP BY [tmp].[Group];

And we have found our islands:

islands found
Additional Information

Last Update: 5/31/2018




More SQL Server Solutions











Post a comment or let the author know this tip helped.

All comments are reviewed, so stay on subject or we may delete your comment. Note: your email address is not published. Required fields are marked with an asterisk (*).

*Name
*Email
Email me updates

Signup for our newsletter

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.






download





get free sql tips

I agree by submitting my data to receive communications, account updates and/or special offers about SQL Server from MSSQLTips and/or its Sponsors. I have read the privacy statement and understand I may unsubscribe at any time.



Learn more about SQL Server tools