SQL Server tempdb one or multiple data files
Tempdb plays an important role in SQL Server performance. A tempdb database that resides on a slow set of disks, or one that has been sized incorrectly, can degrade overall query performance. In this tip I will go over some performance-related best practices for tempdb.
Best practice recommends placing tempdb on a fast I/O subsystem and striping it across numerous directly attached disks. Best practice also recommends creating multiple data files to maximize disk bandwidth and to reduce contention in allocation structures. As a general guideline, best practice suggests creating one data file per CPU. Each file should be set to the same size because this allows the proportional fill algorithm to distribute the allocation load uniformly with minimal contention.
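As a minimal sketch of what "multiple data files of the same size" might look like in practice, the statements below resize the primary tempdb data file and add a second one with identical settings. The logical name tempdev is the SQL Server default for the primary tempdb data file; the tempdev1 name, the T:\tempdb\ path, and the sizes are assumptions chosen for illustration only.

```sql
-- Sketch only: the path and sizes are hypothetical; adjust them for your server.
USE master;
GO

-- Set the size of the existing primary data file (default logical name: tempdev).
-- Note: the SIZE given to MODIFY FILE must be larger than the file's current size.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB);
GO

-- Add a second data file with the same size and growth settings,
-- so that proportional fill spreads allocations evenly across both files.
ALTER DATABASE tempdb
ADD FILE (
    NAME = tempdev1,
    FILENAME = 'T:\tempdb\tempdev1.ndf',
    SIZE = 1024MB,
    FILEGROWTH = 256MB
);
GO
```

Keeping SIZE and FILEGROWTH identical across files is what preserves the even distribution described above.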
Contention in tempdb occurs on the PFS, GAM, and SGAM allocation pages when a large number of very small temp tables are created. It is not the goal of this tip to provide a detailed explanation of tempdb contention. The goal of this tip is to illustrate, with an example, that the blind application of best practices can sometimes be counterproductive to performance.
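One common way to check whether this kind of allocation contention is actually happening is to look for page latch waits on tempdb pages. The query below is a sketch, not part of the original test. It relies on the sys.dm_os_waiting_tasks dynamic management view and on tempdb having database_id 2.

```sql
-- Tasks currently waiting on page latches for tempdb pages (database_id = 2).
-- resource_description has the form database_id:file_id:page_id.
-- In each data file, page 1 is a PFS page, page 2 a GAM page, and page 3 an SGAM page,
-- so values such as 2:1:1, 2:1:2, or 2:1:3 point at allocation contention.
SELECT
    session_id,
    wait_type,
    wait_duration_ms,
    resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH[_]%'
  AND resource_description LIKE '2:%';
```

If this query returns no rows under a typical workload, allocation contention is probably not the bottleneck, and adding files may not help.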
The SQL code that I will use in this example comes from my previous tip SQL Server 2008 64bit Query Optimization Trick.
Test 1: Using SQL Server with 1 tempdb data file
I ran the following query 3 times. Keep in mind that its sort operation spills to tempdb:
--- T-SQL script provided by: www.sqlworkshops.com
DECLARE @c1 INT, @c2 INT, @c3 CHAR(2000)
SELECT @c1=c1, @c2=c2, @c3=c3
FROM tab7
WHERE c1 < 100000
ORDER BY c2
--Results:
--CPU time = 1100 ms, elapsed time = 11670 ms.
--CPU time = 2190 ms, elapsed time = 12200 ms.
--CPU time = 1720 ms, elapsed time = 12630 ms.
Average elapsed time: (11670 + 12200 + 12630) / 3 = 12166 ms
Test 2: Using SQL Server with 2 tempdb data files of the same size
I ran the same query 3 times:
--- T-SQL script provided by: www.sqlworkshops.com
DECLARE @c1 INT, @c2 INT, @c3 CHAR(2000)
SELECT @c1=c1, @c2=c2, @c3=c3
FROM tab7
WHERE c1 < 100000
ORDER BY c2
--Results:
--CPU time = 1500 ms, elapsed time = 13740 ms.
--CPU time = 1710 ms, elapsed time = 14940 ms.
--CPU time = 1610 ms, elapsed time = 14340 ms.
Average elapsed time: (13740 + 14940 + 14340) / 3 = 14340 ms
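The two averages can be compared with a little arithmetic. The query below is only a worked check that uses the elapsed times reported in the two tests. The column names are made up for illustration.

```sql
-- Elapsed times (ms) copied from the Test 1 and Test 2 results.
SELECT
    t.avg_one_file_ms,
    t.avg_two_files_ms,
    t.avg_two_files_ms - t.avg_one_file_ms AS difference_ms,
    CAST(100.0 * (t.avg_two_files_ms - t.avg_one_file_ms)
         / t.avg_one_file_ms AS DECIMAL(5, 1)) AS percent_slower
FROM (
    SELECT
        (11670 + 12200 + 12630) / 3 AS avg_one_file_ms,   -- integer division: 12166
        (13740 + 14940 + 14340) / 3 AS avg_two_files_ms   -- 14340
) AS t;
```

The difference works out to 2174 ms, or a little under 18 percent of the single-file average.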
As we can see, on average the query is 2174 ms slower with 2 tempdb data files than with a single one. The reason is that when a sort operation spills to tempdb, its I/O is largely sequential. With two tempdb files, SQL Server produces 2 streams of sequential I/O, which at the disk level is no longer truly sequential unless the SAN is configured to handle 2 or more streams of sequential I/O concurrently.
The above picture depicts the two streams of I/O over the two tempdb files. Process Monitor shows that SQL Server alternates its sequential writes between tempdb.mdf and tempdev1.ndf.
If our I/O subsystem is capable of handling concurrent streams of sequential I/O, then the best practice of one tempdb file per core will help query performance; otherwise, it will be counterproductive.
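Before and after changing the number of files, it can help to confirm the tempdb layout and to see how writes and write stalls are spread across the files. The queries below are a sketch using standard catalog views and the sys.dm_io_virtual_file_stats function. They were not part of the original test, and the counters they return are cumulative since the instance last started.

```sql
-- Data files currently configured for tempdb (size is stored in 8 KB pages).
SELECT name, physical_name, type_desc, size * 8 / 1024 AS size_mb
FROM tempdb.sys.database_files;

-- Cumulative write activity and write stall time per tempdb file.
SELECT
    mf.name,
    vfs.num_of_writes,
    vfs.num_of_bytes_written,
    vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id;
```

If io_stall_write_ms rises noticeably after adding files while the number of bytes written stays about the same, that would be consistent with the I/O subsystem struggling with multiple concurrent sequential streams.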
- See: Concurrency enhancements for the tempdb database http://support.microsoft.com/kb/328551
- Refer to this previous tip: SQL Server 2008 64bit Query Optimization Trick
Article Last Updated: 2010-04-08