SQL Server Table Partitioning and Query Performance

Problem

In this article, I will investigate whether SQL Server table partitioning enhances query performance or not. Additionally, I will explain how index partitioning can reduce query performance in specific scenarios. Therefore, if you are facing a large rowstore table and plan to partition it just to improve query performance, I recommend that you read this article first.

Solution

Data partitioning is a great tool for managing large tables. The most important uses of table and index partitioning for me are switch and truncate partition. In addition, updating the statistics of a partition in a large table is a useful feature.

In this tip, I will explore the effect of table partitioning on query performance for rowstore tables. To learn about the impact of partitioning on a clustered columnstore index, refer to query performance degraded after rebuilding clustered columnstore index. Also, you can read more about data partitioning in Partitioned Tables and Indexes.

The main question that I’m trying to answer is: Does table partitioning improve the performance of a query? To find the answer, I will set up a demo.

Set Up Test Environment

For this demonstration, I will use SQL Server 2022 and create a database called PartitionDB and configure its settings.

If DB_ID (N'PartitionDB ') Is Not Null
 Begin
  Alter Database PartitionDB Set Single_User With Rollback Immediate
  Drop Database PartitionDB 
 End
GO
 
Create Database PartitionDB 
On Primary 
(Name = N'PartitionDB', FileName = N'D:\SQLClass\PartitionDB.mdf', Size = 16384 mb, FileGrowth = 512 mb, MaxSize = Unlimited)
Log On 
(Name = N'PartitionDBLog', FileName = N'D:\SQLClass\PartitionDBLog.ldf', Size = 2048 mb, FileGrowth = 512 mb, MaxSize = Unlimited)
GO

In addition, I set the database recovery model to simple to reduce the number of log records that will be generated during data loading in the sample tables.

Alter Database PartitionDB Set Recovery Simple;
GO

Incremental Statistics is a feature that was introduced in SQL Server 2014. You can learn about it in Statistics. To enable this feature, use the following command.

ALTER DATABASE PartitionDB SET AUTO_CREATE_STATISTICS ON (INCREMENTAL = ON)
GO

After configuring some settings at the database level, I create a non-partitioned table with ten columns and then insert 100 million records into it.

Use PartitionDB 
GO 
 
Drop Table If Exists NonPartitioned
GO
 
Create Table NonPartitioned
(
 Id Int Not Null Identity,
 CreationDate DateTime2 Not Null,
 Value1 Decimal (10, 8) Not Null,
 Value2 Decimal (10, 8) Not Null,
 Value3 Decimal (10, 8) Not Null,
 Value4 Decimal (10, 8) Not Null,
 Value5 Decimal (10, 8) Not Null,
 Value6 Decimal (10, 8) Not Null,
 Value7 Decimal (10, 8) Not Null,
 CStatus Bit Not Null
)
GO
 
Alter Table NonPartitioned Rebuild With (Data_Compression = Page)
GO
 
Set Identity_Insert NonPartitioned On
GO
 
; With CTE_0(C) As (Select 0 Union All Select 0),
  CTE_1(C) AS (Select C1.C From CTE_0 C1 Cross Apply CTE_0 C2),
  CTE_2(C) AS (Select C1.C From CTE_1 C1 Cross Apply CTE_1 C2),
  CTE_3(C) AS (Select C1.C From CTE_2 C1 Cross Apply CTE_2 C2),
  CTE_4(C) AS (Select C1.C From CTE_3 C1 Cross Apply CTE_3 C2),
  CTE_5(C) AS (Select C1.C From CTE_4 C1 Cross Apply CTE_4 C2),
  CTE_6(C) AS (Select ROW_NUMBER () Over (Order By (Select Null)) From CTE_5)
Insert Into NonPartitioned With (Tablock)
(
 Id,
 CreationDate,
 Value1,
 Value2,
 Value3,
 Value4,
 Value5,
 Value6,
 Value7,
 CStatus
)
Select Top 100000000
 C AS Id, 
 DATEADD (SECOND, C, '2019-01-01') AS CreationDate,
 RAND(C) * 20 AS Value1,
 RAND(C) * 30 AS Value2,
 RAND(C) * 40 AS Value3,
 RAND(C) * 50 AS Value4,
 RAND(C) * 60 AS Value5,
 RAND(C) * 70 AS Value6,
 RAND(C) * 80 AS Value7, 
 Cast (RAND(C) * 10 AS Int) % 2 As CStatus
From CTE_6
GO
 
Set Identity_Insert NonPartitioned Off
GO 

A unique clustered index is created on the CreationDate column.

Create Unique Clustered Index IX_CreationDate On NonPartitioned (CreationDate) 
With (Data_Compression = Page) 
GO

To compare the performance of a query on a non-partitioned table with a partitioned table, I load the data into another table and then partition it on a monthly basis.

Drop Table If Exists Partitioned
GO
 
Create Table Partitioned
(
 Id Int Not Null Identity,
 CreationDate DateTime2 Not Null,
 Value1 Decimal (10, 8) Not Null,
 Value2 Decimal (10, 8) Not Null,
 Value3 Decimal (10, 8) Not Null,
 Value4 Decimal (10, 8) Not Null,
 Value5 Decimal (10, 8) Not Null,
 Value6 Decimal (10, 8) Not Null,
 Value7 Decimal (10, 8) Not Null,
 CStatus Bit Not Null
)
GO
 
Alter Table Partitioned Rebuild With (Data_Compression = Page)
GO
 
Set Identity_Insert Partitioned On
GO
 
Insert Into Partitioned With (Tablock)
(
 Id,
 CreationDate,
 Value1,
 Value2,
 Value3,
 Value4,
 Value5,
 Value6,
 Value7,
 CStatus
)
Select * From NonPartitioned
GO
 
Set Identity_Insert Partitioned Off
GO
 
Create Partition Function PF_Date (DateTime2) AS Range Right For Values
(
 '2019-02-01 00:00:00.0000000',
 '2019-03-01 00:00:00.0000000',
 '2019-04-01 00:00:00.0000000',
 '2019-05-01 00:00:00.0000000',
 '2019-06-01 00:00:00.0000000',
 '2019-07-01 00:00:00.0000000',
 '2019-08-01 00:00:00.0000000',
 '2019-09-01 00:00:00.0000000',
 '2019-10-01 00:00:00.0000000',
 '2019-11-01 00:00:00.0000000',
 '2019-12-01 00:00:00.0000000',
 '2020-01-01 00:00:00.0000000',
 '2020-02-01 00:00:00.0000000',
 '2020-03-01 00:00:00.0000000',
 '2020-04-01 00:00:00.0000000',
 '2020-05-01 00:00:00.0000000',
 '2020-06-01 00:00:00.0000000',
 '2020-07-01 00:00:00.0000000',
 '2020-08-01 00:00:00.0000000',
 '2020-09-01 00:00:00.0000000',
 '2020-10-01 00:00:00.0000000',
 '2020-11-01 00:00:00.0000000',
 '2020-12-01 00:00:00.0000000',
 '2021-01-01 00:00:00.0000000',
 '2021-02-01 00:00:00.0000000',
 '2021-03-01 00:00:00.0000000',
 '2021-04-01 00:00:00.0000000',
 '2021-05-01 00:00:00.0000000',
 '2021-06-01 00:00:00.0000000',
 '2021-07-01 00:00:00.0000000',
 '2021-08-01 00:00:00.0000000',
 '2021-09-01 00:00:00.0000000',
 '2021-10-01 00:00:00.0000000',
 '2021-11-01 00:00:00.0000000',
 '2021-12-01 00:00:00.0000000',
 '2022-01-01 00:00:00.0000000',
 '2022-02-01 00:00:00.0000000',
 '2022-03-01 00:00:00.0000000'
)
GO
 
Create Partition Scheme PS_Date AS Partition PF_Date All To ([Primary])
GO
 
Create Unique Clustered Index IX_CreationDate On Partitioned (CreationDate) 
With (Data_Compression = Page)
On PS_Date (CreationDate) 
GO

By using the following simple query, you can view information about partitions, like the number of rows in each partition and the compression status.

Select p.index_id, partition_number, rows, data_compression_desc 
From sys.partitions p
Where object_id = OBJECT_ID (N'Partitioned') 
GO

Results are displayed in the following image.

Partition Info

To enable incremental statistics, we can use this command:

Update Statistics dbo.Partitioned (IX_CreationDate) With Fullscan, INCREMENTAL = ON
GO

To measure the number of logical reads, use this command, and to view the actual execution plan, press Ctrl+M in SSMS.

SET STATISTICS IO ON
GO

This query retrieves data for seven days in August 2021. I will first execute it on the non-partitioned table.

Select * 
From NonPartitioned 
Where CreationDate >= '2021-08-03 00:00:00.0000000' 
And CreationDate < '2021-08-10 00:00:00.0000000'
GO

As shown in the image below, SQL Server performed a clustered index seek operation:

Clustered Index Seek NonPartitioned

The number of logical reads is 4025.

Logical Reads NonPartitioned

I will now execute the query on the partitioned table.

Select * 
From Partitioned 
Where CreationDate >= '2021-08-03 00:00:00.0000000' 
And CreationDate < '2021-08-10 00:00:00.0000000'
GO

SQL Server performed a clustered index seek operation again:

Clustered Index Seek Partitioned

If I hover the mouse over the clustered index seek operator, the tooltip pops up and we can see that partitioned is true and the actual partition count is one. This means that partition elimination occurred, and SQL Server read data from a partition instead of the entire table.

PartitionCount

Look at the number of logical reads in the image below, it is 4025.

Logical Reads Partitioned

Table partitioning did not reduce the number of logical reads, and it did not improve query performance despite partition elimination.

I’m going to walk through a scenario where index partitioning reduces query performance. I start this demo by creating a nonclustered index on the non-partitioned table and an aligned nonclustered index on the partitioned table.

Create Nonclustered Index IX_Value7 On NonPartitioned (Value7)
With (Data_Compression = Page)
GO
 
Create Nonclustered Index IX_Value7 On Partitioned (Value7)
With (Data_Compression = Page)
On PS_Date(CreationDate)
GO

To enable incremental statistics, I use this command:

Update Statistics dbo.Partitioned(IX_Value7) With Fullscan, INCREMENTAL = ON
GO

Let’s run a query that returns 100 rows from the non-partitioned table and orders them by Value7 in descending order.

Select Top 100 * 
From NonPartitioned
Order By Value7 Desc
GO

SQL Server used a nonclustered index scan followed by a key lookup.

Scan+ Keylookup

SQL Server read 419 pages from the buffer pool to display the results.

NonPartitioned Top

In order to compare the performance, I ran the query on the partitioned table.

Select Top 100 * 
From Partitioned
Order By Value7 Desc
GO

The query optimizer ignored the nonclustered index and scanned the whole table

Partitioned Top

The image below shows the number of logical reads.

Logical Read sPartitioned Top

Summary

Before partitioning a rowstore table, you need to specify your purpose for doing so. In my experience with production and test environments, partitioning has not improved query performance on rowstore tables.This article demonstrated that partitioning can reduce query performance in some scenarios.However, partitioning is a great solution for managing large tables.

Next Steps

6 Comments

  1. Nice article. But what I miss is identifying the problem. You mentioned that you want to improve performance. I do not think that’s the problem. What could be a problem with large tables in OLTP?
    – large transaction log when deleting old data
    – slow times when selecting HOT data
    – last page contention
    In OLAP:
    – slow load of new data and slow deletion of old data

    • Thanks for the feedback. Actually, the core point I wanted to make in the article is that data partitioning should not be seen as a performance optimization tool for faster queries, but as a management strategy for handling large tables.

    • There are potential issues with data partitioning.
      In fact, it is not a mechanism intended to improve query performance, and it may even degrade performance for certain types of queries. However, as I stated in the article, data partitioning is an excellent tool for managing large tables, for example when truncating a specific partition.

Leave a Reply

Your email address will not be published. Required fields are marked *