How to Find Top N Largest Time Gaps in Date Ranges in SQL Server with T-SQL Code
By: Eli Leiba | Updated: 2019-09-18 | Comments (3) | Related: More > T-SQL
The requirement is to create a simple SQL Server T-SQL tool that will help the application programmers in analyzing application script performance issues based on their debug log. Each record is time stamped and the purpose of the tool is to display the N largest time gaps in seconds between each consecutive pair of lines in order to find the most time-consuming application code parts.
My solution involves creating a T-SQL script that will get information about the largest time gaps as consecutive line pairs, as recorded in the log file. As mentioned, each record has a time stamp along with some other data.
Here is a sample of the data that shows the time stamp and the process step. Just looking at the data it is hard to tell which segments took the longest.
2019-07-17 10:20:04,step B
2019-07-17 10:22:17,step C
2019-07-17 10:26:19,step D
2019-07-17 10:30:00,step E
2019-07-17 10:32:11,step F
2019-07-17 10:33:19,step G
2019-07-17 10:36:22,step H
2019-07-17 10:40:01,step I
2019-07-17 10:42:22,step J
2019-07-17 10:43:09,step K
2019-07-17 10:44:10,step L
2019-07-17 10:45:00,step M
2019-07-17 10:47:01,step N
2019-07-17 10:48:11,step Q
2019-07-17 10:48:55,step P
2019-07-17 10:50:00,step R
2019-07-17 10:51:09,step S
2019-07-17 10:52:30,step T
2019-07-17 10:55:00,step U
To run the test, I am going to create a table called dbo.DebugApplTable to store the above data. I am going to add an identity column as the first column in the table which will be used for joining the data.
Here is a script for creating the table:
CREATE TABLE dbo.DebugApplTable ( appDebugSequence INT IDENTITY NOT NULL, appDebugDateTime DATETIME, action_description VARCHAR(200) ) GO
Here is a script to load the sample data above. This could be automated using BCP or some other import process or you may already have the data in a SQL Server table.
INSERT INTO DebugApplTable (appDebugDateTime, action_description) VALUES ('2019-07-17 10:18:01','step A'), ('2019-07-17 10:20:04','step B'), ('2019-07-17 10:22:17','step C'), ('2019-07-17 10:26:19','step D'), ('2019-07-17 10:30:00','step E'), ('2019-07-17 10:32:11','step F'), ('2019-07-17 10:33:19','step G'), ('2019-07-17 10:36:22','step H'), ('2019-07-17 10:40:01','step I'), ('2019-07-17 10:42:22','step J'), ('2019-07-17 10:43:09','step K'), ('2019-07-17 10:44:10','step L'), ('2019-07-17 10:45:00','step M'), ('2019-07-17 10:47:01','step N'), ('2019-07-17 10:48:11','step Q'), ('2019-07-17 10:48:55','step P'), ('2019-07-17 10:50:00','step R'), ('2019-07-17 10:51:09','step S'), ('2019-07-17 10:52:30','step T'), ('2019-07-17 10:55:00','step U')
The process queries the dbo.DebugApplTable, by using a self-join technique that forces the row pairs to be consecutive and orders the rows in descending order by the time difference (in seconds) for each two consecutive pair of lines (the DATEDIFF date and time system function is used for this purpose). Only the TOP @N pairs are returned (@N is the integer number that decides how many gaps will be shown in the result).
Here is the T-SQL script that does it. The @N value is set to 3, which will just return the top 3 rows, but this can be changed as needed.
-- ================================================================================= -- Author: Eli Leiba -- Create date: 08-2019 -- Description: This TSQL script finds top N largest time gaps -- ================================================================================== DECLARE @N INT = 3 SELECT TOP (@N) ABS (DATEDIFF (SECOND,b.appDebugDateTime,a.appDebugDateTime)) AS time_sec, a.action_description AS From_Step, b.action_description AS To_Step, a.appDebugDateTime AS From_Time, b.appDebugDateTime AS To_Time FROM dbo.DebugApplTable a, dbo.DebugApplTable b WHERE b.appDebugSequence = a.appDebugSequence + 1 ORDER BY time_sec DESC
Here are the results. We can see step C to step D took 242 seconds, step D to step E took 221 seconds and step H to step I took 219 seconds.
Here is another example using a table from the AdventureWorks2014 database. In this example we are showing the difference in days.
DECLARE @N INT = 15 SELECT TOP (@N) ABS (DATEDIFF (DAY,b.OrderDate,a.OrderDate)) AS time_diff_day, a.SalesOrderID AS From_Step, b.SalesOrderID AS To_Step, a.OrderDate AS From_Time, b.OrderDate AS To_Time FROM [AdventureWorks2014].[Sales].SalesOrderHeader a, [AdventureWorks2014].[Sales].SalesOrderHeader b WHERE b.SalesOrderID = a.SalesOrderID + 1 ORDER BY time_diff_day DESC
This is a quick way to find the length of time between steps, but this approach depends on having a sequentially increasing value from row to row.
If there is a gap between values, this will not find the record.
Let's delete a record (step D) which is part of the first 2 rows in the above result set.
-- delete the step D record which will change the results DELETE FROM DebugApplTable WHERE appDebugSequence = 4
Now let's run the code again.
DECLARE @N INT = 3 SELECT TOP (@N) ABS (DATEDIFF (SECOND,b.appDebugDateTime,a.appDebugDateTime)) AS time_sec, a.action_description AS From_Step, b.action_description AS To_Step, a.appDebugDateTime AS From_Time, b.appDebugDateTime AS To_Time FROM dbo.DebugApplTable a, dbo.DebugApplTable b WHERE b.appDebugSequence = a.appDebugSequence + 1 ORDER BY time_sec DESC
Here are the results. As you can see it does not show the gap from Step C to Step E like it should.
We can get around this by creating our own sequential numbers as follows:
DECLARE @N INT = 3; with cte as ( select *, ROW_NUMBER() OVER (ORDER BY appDebugDateTime) as appDebugSequence_New FROM DebugApplTable ) SELECT TOP (@N) ABS (DATEDIFF (SECOND,b.appDebugDateTime,a.appDebugDateTime)) AS time_sec, a.action_description AS From_Step, b.action_description AS To_Step, a.appDebugDateTime AS From_Time, b.appDebugDateTime AS To_Time FROM cte a, cte b WHERE b.appDebugSequence_New = a.appDebugSequence_New + 1 ORDER BY time_sec DESC
Here are the results. We can see the biggest time gap is now between Step C and Step E which is the result we want to see.
To make this run as fast as possible it is helpful to have an index on the sequential value column that is used to join the tables. Also, using the CTE may not be the best approach for a very large dataset, so you may want to write the data to a temporary table or another permanent table where you can create a sequentially incrementing number for each row in the table and also create an index on that column.
- You can use this process as a simple T-SQL tool that helps identify the top N largest time gaps between two consecutive actions.
- The tables used can be altered to meet your needs.
Last Updated: 2019-09-18
About the author
View all my tips