Searching between two date values in SQL Server

By:   |   Updated: 2023-04-24   |   Comments (3)   |   Related: More > TSQL


Problem

Many SQL developers and users encounter challenges when efficiently searching for data within a specific date range, especially when dealing with partitioned and sharded tables. In this SQL tutorial, we will look at different ways you can write SQL to return data between date ranges and how to improve performance when data is partitioned.

Solution

When working with databases, it is common to encounter scenarios where you must search for data within a specific date range. Fortunately, Microsoft SQL Server provides several methods for performing these types of searches with T-SQL. This SQL tutorial illustrates some of the most common techniques for searching between two date values in SQL, including using the BETWEEN operator, the greater than (>) and less than (<) operators, and the DATEPART() function. It also provides some tips for efficiently searching between dates on partitioned and sharded tables, which can help improve performance.

Searching Between Dates Using SQL Arithmetic Operators

One method for searching between two date values in a SQL statement is to use arithmetic operators (greater than and less than operators). These operators allow specifying the start and end dates of the desired date range. For instance, the following SQL command can be used to retrieve records between 2009-01-01 and 2009-12-31.

Note that the CreationDate column also includes time, so the below query will only get data until 2009-12-31 at midnight. The rest of the data for that last day will not be included.

SELECT * 
FROM [StackOverflow2010].[dbo].[Posts] 
WHERE CreationDate >= '2009-01-01' AND CreationDate <= '2009-12-31';
Using arithmetic operators

Searching Between Dates Using the SQL BETWEEN Operator

Another method for searching between two date values in SQL is to use the BETWEEN operator. The BETWEEN operator filters results within a specified range, including the start and end values. For instance, the following query can be used to find all records between 2009-01-01 and 2009-12-31.

Note that the CreationDate column also includes time, so the below query will only get data until 2009-12-31 at midnight. The rest of the data for that last day will not be included.

SELECT * 
FROM [StackOverflow2010].[dbo].[Posts] 
WHERE CreationDate BETWEEN '2009-01-01' AND '2009-12-31';
Using BETWEEN operator

Searching Between Dates Using Date Functions

There are other methods to search between two date values in SQL. One method involves using the DATEPART function to extract specific date parts such as year, month, or day from a date column. For example, the following query is used to find all records where the year in the DateColumn column is 2009:

SELECT * 
FROM [StackOverflow2010].[dbo].[Posts] 
WHERE DATEPART(year, CreationDate) = 2009;

Similarly, you can use DATEPART to extract and compare other date parts, such as month or day. While this method can be less intuitive than arithmetic operators or the BETWEEN operator, it can be useful in certain scenarios, especially when extracting and comparing specific date parts.

Comparison of Arithmetic Operators vs. BETWEEN vs. Date Functions

In the screenshot below, we can note that the three methods provided return the same record count. To make sure all data for the entire year is selected, we changed the queries a bit as shown below to make sure we include the entirety of the last day of the year.

Note: in the first query statement below, if any CreationDate happens to be exactly 2010-01-01 at midnight, these would be included in the counts. So it is important to understand the data you are working with.

comparing the three methods

Also, the estimated execution plans are very similar; they all use the index on the CreationDate column:

comparing query plans

Note that the date function DATEPART performs an Index Scan operation, while the other two methods perform an Index Seek. This is because, in SQL Server, when functions are used (DATEPART in this case) all data needs to be evaluated for the condition and this does not provide the benefit of using the created indexes.

Searching Between Dates for Large Partitioned Tables

When searching between two date values on a table partitioned using a date value, it is critical to leverage the partitioning scheme. This will minimize the amount of data to be scanned. One way to do this is to use a specific partition by eliminating unnecessary partitions, which involves identifying and scanning only the partitions that contain data within the desired date range. This can result in significant performance gains, especially for large tables.

To perform partition elimination, you can include the partitioning column and the search criteria in the WHERE clause.

For example, suppose you have a table partitioned by date on the CreationDate column. You want to find all records between 2009-01-01 and 2009-12-31. You could use a query like this:

SELECT * 
FROM [StackOverflow2010].[dbo].[Posts] 
WHERE CreationDate >= '2009-01-01' AND CreationDate <= '2009-12-31';

Let's look at a more realistic example. I have been working with a huge partitioned table for years. Most of the time, the date columns are unused as a partition key and instead a date part is used.

For example, while a transaction date/time column could exist in the table, the day of the year (1 to 365) column is used as the partition key. In that case, using the query below to search over the date column may decrease performance, especially when the data size is vast and we do not have the option of creating every needed index.

In this example, we are looking at data between a certain date and time.

SELECT * FROM [StackOverflow2010].[dbo].[Posts]
WHERE CreationDate >= '2009-01-01 05:00:00' AND CreationDate <= '2009-01-05 09:00:00'

SQL Server provides several ways to control which partitions are searched based on data values. In the example below, the CreationDayofYear column is the partition key and these values are 1-365 for each day of the year. In the query we are selecting data from partitions 1,2,3,4, and 5 for the first 5 days of the year and also limiting data for the 1st day and the 5th day.

SELECT * 
FROM [StackOverflow2010].[dbo].[Posts]
WHERE CreationDate >= '2009-01-01 05:00:00'
  AND $PARTITION.pfDayOfTheYear(CreationDayofYear) IN (1);
 
UNION ALL
 
SELECT * 
FROM [StackOverflow2010].[dbo].[Posts]
WHERE $PARTITION.pfDayOfTheYear(CreationDayofYear) IN (2,3,4);
 
UNION ALL
 
SELECT * 
FROM [StackOverflow2010].[dbo].[Posts]
WHERE CreationDate <= '2009-01-05 09:00:00'
  AND $PARTITION.pfDayOfTheYear(CreationDayofYear) IN (5);

In this example, $PARTITION.pfDayOfTheYear returns the partition number based on the value of the CreationDayofYear column. The IN operator specifies the partitions that should be searched. By specifying the partitions to be searched, you can ensure that only the relevant partitions are scanned rather than scanning the entire table. Besides, by using the syntax above, we help the query optimizer ignore the CreationDate column value when we are sure that the whole partition is within the date range specified. Note: on the boundaries partitions (1 and 5), we add the needed filter to include time on the CreationDate column, but this is not needed in the middle for (2,3,4).

On the other hand, if we execute the following SQL query:

SELECT * 
FROM [StackOverflow2010].[dbo].[Posts]
WHERE CreationDate >= '2009-01-01 05:00:00' AND CreationDate <= '2009-01-05 09:00:00'
  AND $PARTITION.pfDayOfTheYear(CreationDayofYear) IN (1,2,3,4,5);

The query optimizer will be unaware of the following facts:

  • In partition 1, the CreationDate <= '2009-01-05 09:00:00' filter is meaningless.
  • In partition 5, the CreationDate >= '2009-01-01 05:00:00' filter is meaningless.
  • In partitions 2, 3 and 4, filtering on the CreationDate column is meaningless.

This approach must increase the performance when working with huge partitioned tables where we do not have the option to create every needed non-clustered index. Note that splitting the query may decrease the performance on small or medium tables (up to a few gigabytes). Also, the performance gain also depends on how partitions are distributed over the data storage.

Comparing query plans

Searching Between Dates in Sharded Tables

When searching between two date values on a sharded database, it is essential to consider the shard key and how queries are built using a routing table. If the shard key is the day or month value, then queries that span multiple shards can be prolonged and resource-intensive. One way to mitigate this issue is to use a routing table, which maps queries to the appropriate shard based on the queried date range. The routing table should be optimized to minimize the number of shards that need to be queried while ensuring that all relevant data is retrieved.

To further optimize searching between two date values on a sharded database, it is essential to ensure that indexes are appropriately set up. In particular, indexes should be partitioned to align with the shard key so that queries routed to a single shard can be executed efficiently. Additionally, query optimization techniques such as parallelism and caching can be used to improve performance further.

Conclusion

In conclusion, searching between two date values in SQL can be performed using various methods, such as arithmetic operators, the BETWEEN operator, and the DATEPART function. To ensure that only relevant data is queried when dealing with partitioned and sharded tables, pruning techniques, such as routing tables and shard filters, should be used to optimize queries. Additionally, it is essential to properly set up indexes and partitions to align with the shard key and optimize query performance. By implementing these techniques, developers can efficiently search between two date values on SQL databases and effectively manage large amounts of data.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Hadi Fadlallah Hadi Fadlallah is an SQL Server professional with more than 10 years of experience. His main expertise is in data integration. He's one of the top ETL and SQL Server Integration Services contributors at Stackoverflow.com. He holds a Ph.D. in data science focusing on context-aware big data quality and two master's degrees in computer science and business computing.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-04-24

Comments For This Article




Sunday, February 4, 2024 - 4:43:45 AM - Raad Back To Top (91919)
Can I select records between outdat(2023-12-28), outtime(7:00:00) and outdate(2023-12-29), outtime(7:00:00)

Tuesday, April 25, 2023 - 4:31:52 PM - Hadi Fadlallah Back To Top (91130)
@Tim it is better to avoid functions since it will not benefit from index seek. You can add the time part to your query as follows: where [datetimecolumn] between '2023-01-01 00:00:00' and '2023-01-06 23:59:59'

Monday, April 24, 2023 - 10:01:34 AM - Tim Claremont Back To Top (91129)
I have been using the following for years. Is this good practice? WHERE CAST(<date field> AS Date) BETWEEN 1/1/2023 and 6/1/2023.

This removes the time component from the field being searched, and always gives me what I need.