![]() |
|
|
By: Chad Boyd | Read Comments (1) | Print Chad is an Architect, Administrator and Developer with technologies such as SQL Server, .NET, and Windows Server. Related Tips: More |
|
I spend a lot of time interfacing with SQL engineers of all types (developers, dba's, architects, ETL engineers, etc.) and something I hear very frequently is "I want to horizontally partition my data to improve my query response times to customers". This usually makes me follow up the statement(s) with questions of my own such as "what types of queries are you trying to improve response times on by partitioning your data?" or "what systems are you thinking of employing this type of strategy on ?". Too often I get responses that in turn send us down a path of discussing why partitioning data isn't a viable strategy for improving many types of queries/workloads (such as typical OLTP workloads, seeks of data - particularly seeks that don't filter/join on partitioning keys, etc.). Sometimes engineers and others are downright shocked to see some of results of different query types against partitioned data - usually, by the end of a 1/2 hour demo and discussion and walking through some scripts (which I've attached as well), everything makes good sense and there's a good understanding of why things work the way they do. It's frequently a misconception that using the horizontal partitioning features built-in to the engine in SQL 2005 will lead to large benefits for OLTP-like workloads/queries/etc. (i.e. seeks of data, singleton lookups, etc.) when the reality is that horizontal partitioning is really more about improving performance for things like administrative tasks, bulk data loads/switches, and scans of data when the scanning can be limited by the partition ranges.
This post is not about switching data with partitions, rolling window scenarios, administrative tasks with partitioning, bulk loading/moving data with partitioning, etc. - there is plenty of that in other places (but if someone would like to see some examples, by all means comment or contact me via email and I'll be happy to oblige). Instead, for the remainder of this post we are going to walk through query execution differences for a range of access methods (i.e. seeks, scans, singleton lookups, etc.) against a partitioned structure and a non-partitioned structure that are exactly the same with the exception of one having all indexes partitioned and one not being partitioned at all.
The schema for the partitioned table is as follows:
-- Create a sample table that will mimic the main table...
create
table dbo.Fund_Dtl (
Fund_Dtl_ID int identity(1,1) not null,
fillerColumn
char(150) not null,Fund_Summary_ID
int not null,Txn_Dt
datetime not null,Partition_Column char(5) not null default ('xyz')
) on [PartitionPSFD] ([Partition_Column]);
go
-- Cluster...
create
clustered index Fund_Dtl_Fund_Summary_ID_CI on dbo.Fund_Dtl (,Fund_Summary_ID
Partition_Column
)
on [PartitionPSFD] ([Partition_Column]);go
-- PK...
alter
table dbo.Fund_Dtl with check add constraint Fund_Dtl_pk primary key nonclustered(Fund_Dtl_ID,Partition_Column) on [PartitionPSFD] ([Partition_Column]);
go
-- Nonclustered...
create
nonclustered index ix_TxnDt on dbo.Fund_Dtl (Txn_Dt
)
include (,Fund_Summary_IDPartition_Column
)
on [PartitionPSFD] ([Partition_Column]);go
As you can see, we include a clustered index and 2 nonclustered indexes (1 of which is the primary key) - the clustered index includes the partitioning column as the 2nd key column of the index, as does the primary key. Notice however that the 2nd nonclustered index doesn't include the partitioning column as a key column of the index, but instead only as a covered column (as an included column). The script then proceeds to load 602,112 records into the partitioned table, filling the Partition_Column column with 7 distinct values (7 partitions) each with 86,016 records.
Ok, then we create the non-partitioned table as follows:
-- Create another table from the data in partitioned table that isn't partitioned...
select
*into
dbo.Fund_Dtl_NoPartitionfrom
dbo.Fund_Dtl with(nolock)go
-- Index it just like the main table, with no partitioning...
-- Cluster...
create
clustered index c2 on dbo.Fund_Dtl_NoPartition (,Fund_Summary_ID
Partition_Column
);
go
-- PK...
alter
table dbo.Fund_Dtl_NoPartition with check add constraint Fund_Dtl_np_pk primary key nonclustered(Fund_Dtl_ID,Partition_Column);
go
-- Nonclustered...
create
nonclustered index ix_TxnDt_np on dbo.Fund_Dtl_NoPartition (Txn_Dt
)
include (,Fund_Summary_IDPartition_Column
);
go
As you can see, the non-partitioned table is identical to the partitioned table in every way with the exception that it isn't partitioned at all - the columns are the same, the indexes are the same, the data is the same, etc.
Ok, at this point the script continues to run 6 different identical tests against the partitioned and non-partitioned tables, and commenting on the performance differences on my machine - I'd encourage you to download the attached script and run through it on your own system to become familiar with the setup, results, etc. (it is heavily commented). If anyone has additional tests they'd like to try out, or if anyone sees any discrepancies, by all means, let me know. I've included a brief summary of the tests included in the script and the results from my machine below for those who don't have access to anything for using the attached script (or mobile readers :-)). Enjoy!
Chad Boyd ~~~ This posting is provided "AS IS" with no warranties, and confers no rights. Use of any included script samples are subject to the terms specified at http://www.mssqltips.com/disclaimer.asp and http://www.mssqltips.com/copyright.asp.
TESTS:
| Share: | Share | Tweet |
|
![]() |
![]() |
Free SQL Server Learning |
| Monday, June 13, 2011 - 1:48:27 PM - Mauricio |
|
|
Hi Chad. In your article (http://www.mssqltips.com/tip.asp?tip=2273) you say that tehre is a script attached to it. I would like to download this script but had no luck in finding how to do it. thanks in advance if you can help on this, Mauricio |
|
|
privacy | disclaimer | copyright | advertise | about authors | contribute | feedback | giveaways | user groups Some names and products listed are the registered trademarks of their respective owners. Edgewood Solutions LLC | MSSharePointTips.com | MSSQLTips.com |