Solving SQL Server Database Physical File Fragmentation

By: Daniel Farina


Problem

Database physical file fragmentation isn't often considered as a performance issue. But if the data files are fragmented, the database engine will take longer to retrieve data because of seek overhead and rotational latency on mechanical disks. In addition, "a heavily fragmented file in an NTFS volume may not grow beyond a certain size", so if you have Auto Grow enabled, in some very unfortunate situations the growth could fail with error 665 ("The requested operation could not be completed due to a file system limitation.").

In this tip we will look at the impact of physical file fragmentation and how to solve it with Microsoft's Sysinternals tool Contig.

Solution

Physical file fragmentation matters when talking about system performance. But we as DBAs don't give much importance to this topic and focus all of our energies on the database, which is our job; sometimes, though, we have to see the entire picture. In most cases, physical file fragmentation should be the last thing to check.

Let's see what can lead to file fragmentation.

  1. Many mid-sized organizations, to reduce costs, share the database server with other applications like SharePoint or a web server. In this scenario, the file system could be fragmented regardless of the database, so when the database needs to allocate disk space, the new space could be scattered across the disk.
  2. Repeated backup operations to the same disk could also lead to fragmentation, so you should back up to other disks or devices.

Impact of physical file fragmentation

Clustered indexes try to organize data so that only sequential reads are required. This is great if the .MDF file is contiguous on disk, but not as helpful if the file is spread all over the disk. Remember that SQL Server doesn't know about file allocation; that is an operating system task.

The Sysinternals Contig tool

Contig is a free utility from Microsoft designed to optimize individual files, or to create new files that are contiguous. You can download it for free here: http://technet.microsoft.com/en-us/sysinternals/bb897428

It's very easy to use. With the contig -a option you can analyze the fragmentation of a specified file while the database is online. To defragment the file, simply run contig [FileName].
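For example, the commands look like this (a sketch; the file path is hypothetical and will differ on your server):

REM Analyze fragmentation (safe while the database is online)
contig -a "D:\SQLData\Inter3112.mdf"

REM Defragment the file (take the database offline first, as shown below)
contig "D:\SQLData\Inter3112.mdf"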

The Sysinternals Contig tool

Needless to say, in order to defragment a database file, the database must be OFFLINE.

The test

I have created a physically fragmented database to show you the effects of physical file fragmentation on performance. The database is 3.5 GB in size and, as you can see in the picture below, it is heavily fragmented: there are 2,000 fragments.

Contig: Analyze fragmentation

Now, in order to defragment the data file, we need to take the database OFFLINE, run Contig, and then bring the database back ONLINE.

ALTER DATABASE [Inter3112] SET OFFLINE
GO
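
If there are open connections to the database, the statement above will wait on them. As an aside (not part of the original test), you can force open transactions to roll back so the database goes offline immediately:

ALTER DATABASE [Inter3112] SET OFFLINE WITH ROLLBACK IMMEDIATE -- rolls back open transactions
GO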

Then we can run contig [FileName] as shown below to defragment the file. After the command completes, we can see there are only 4 fragments.

Contig: After defragmentation

ALTER DATABASE [Inter3112] SET ONLINE
GO

Checking Impact of File Fragmentation

The workload for the test was a simple DBCC CHECKTABLE.

DBCC CHECKTABLE ('inter3112.dbo.invoice')
GO
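
As an aside (an assumption about the test methodology, not stated in the original), you may want to flush the buffer cache before each run so that both executions read from disk; only do this on a test server:

CHECKPOINT -- write dirty pages to disk so clean buffers can be dropped
GO
DBCC DROPCLEANBUFFERS -- remove clean pages from the buffer pool
GO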

With the help of the sys.dm_io_virtual_file_stats DMV, which is out of the scope of this tip, we can see the total time, in milliseconds, that users waited for I/O to complete on the file (the io_stall column).

I created a table in tempdb to store the results from sys.dm_io_virtual_file_stats; here is the script.

CREATE TABLE tempdb..file_stats
(
   Row_ID                 INT IDENTITY (1, 1) ,
   sample_ms              INT,
   num_of_reads           BIGINT,
   num_of_bytes_read      BIGINT,
   io_stall_read_ms       BIGINT,
   num_of_writes          BIGINT,
   num_of_bytes_written   BIGINT,
   io_stall_write_ms      BIGINT,
   io_stall               BIGINT,
   size_on_disk_bytes     BIGINT,
   CONSTRAINT Row_ID_PK PRIMARY KEY (Row_ID)
)
INSERT INTO tempdb..file_stats
   SELECT sample_ms,
          num_of_reads,
          num_of_bytes_read,
          io_stall_read_ms,
          num_of_writes,
          num_of_bytes_written,
          io_stall_write_ms,
          io_stall,
          size_on_disk_bytes
     FROM sys.dm_io_virtual_file_stats (DB_ID ('inter3112'), 1)
GO
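
The counters in this DMV are cumulative since the instance started, so it is the difference between consecutive samples that matters. A query like this (a sketch, assuming SQL Server 2012 or later for LAG) computes the io_stall delta per sample:

-- Difference in io_stall between each sample and the previous one
SELECT Row_ID,
       io_stall,
       io_stall - LAG (io_stall) OVER (ORDER BY Row_ID) AS io_stall_delta_ms
  FROM tempdb..file_stats
GO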

Rows 2 and 4 in the image below represent the execution of DBCC CHECKTABLE before and after the file defrag. As you can see, the io_stall value dropped by about 1,100 ms after defragmenting, from 2,415 ms to 1,344 ms.

io_stall results before and after defragmentation

Conclusion

Before you start defragmenting your database files, consider what the best approach for your environment is, and keep in mind that defragmenting is very I/O intensive. Also, this will be easier to do on smaller database files, because less contiguous disk space is required, and harder on larger databases.

Next Steps

  * Download the Sysinternals Contig tool: http://technet.microsoft.com/en-us/sysinternals/bb897428





About the author
Daniel Farina was born in Buenos Aires, Argentina. Self-educated, he has shown a passion for learning since childhood.

This author pledges the content of this article is based on professional experience and not AI generated.




Comments For This Article




Monday, September 19, 2016 - 7:01:08 PM - Daniel Farina

Hi HP!

You should bring the database back online with the following syntax:

ALTER DATABASE [Your DB Name] SET ONLINE

Best Regards!


Monday, September 19, 2016 - 12:45:25 AM - HP

I need your help!

With the tool I get the following result.
What can I do to write to the database again?

Summary:
Number of files processed: 1
Number unsuccessfully procesed: 0
Average fragmentation : 2.07086e+006 frags/file

Actually no writing into the DB is possible.
I want to extend the database to hold more data.

Kind regards
HP

Tuesday, July 22, 2014 - 11:19:00 PM - Daniel Farina

It is a broad question, and the answer depends on many factors:
When he says that the table has the same number of rows over time, he is not giving enough info. The table can be subject to multiple updates, or even inserts and deletions; I mean, you insert 2 rows and delete 2 and still have 1000 rows. In this case it could be a statistics issue (you update a lot of rows, but not enough for SQL Server to auto update statistics) or index fragmentation (from the inserts and deletions).
It could be a staging table that gets truncated and then loaded again with 1000 rows; if the table is a heap and the rest of the database has a lot of activity, you may have fragmentation again.
Or maybe the table is in a read-only filegroup; in this case it can be physical fragmentation or even a problem with the storage system (either the disks or a switch).
I hope I have given you a hand.
Thank you for reading!


Saturday, July 19, 2014 - 2:03:25 PM - Ganesh S K

Sir, one of the technical interviewers asked why the access time for a query increases over time.

Ex: If we have 1000 tuples and initially the access time is (say) 5 ms, after 5 years the same query takes 5 sec while the tuples remain the same.

Why is this?


Tuesday, June 3, 2014 - 2:16:50 PM - Gustavo Ayala

Incredible as it may seem, you don't need to take a database offline to do a defrag.

I've tried it with databases online (even tlogs in heavy use), and it works. At first I thought it was some Mark Russinovich trick, but it isn't.

Contig leverages Windows defrag APIs as old as Windows 2003.


Wednesday, September 18, 2013 - 2:01:24 PM - Daniel Farina

Hi Mark!

When the database is online, the SQL Server instance holds an exclusive lock on the DB files, and that prevents defrag software from processing those files.

I like your script! You made very good use of the Contig tool!

Thanks for reading and commenting!


Wednesday, September 18, 2013 - 10:00:42 AM - Mark Freeman

>Needless to say, in order to defragment a database file, the database must be OFFLINE.

"Must" is a strong word. I don't believe that this is a requirement. However, if the file is being actively used, defragmenting it will have a significant performance impact while the defragmentation is taking place. It will also probably make the defragmentation take longer.

I’ve been surprised to find that many defragmenters don’t make any attempt to lessen the fragmentation of a file if they can’t defragment it completely. Contig is wonderful in these cases.

On one of my development servers, which is relatively tight on disk space and has several very large files that keep growing, I created a simple .cmd file consisting of contig commands to defrag those files in order of importance, and then set up a Task Scheduler job to run that .cmd file late at night.

One enhancement I added was to call Contig first on files that tend to fragment very often, so that they aren't breaking the disk space up into lots of pieces that make it difficult to fully defragment large files. A great example is the folders used by antivirus software to hold definitions and logs. Once those are done, my .cmd file moves on to defrag my .?df files. On my (32-bit) server, an edited-down version of my .cmd file looks like this:

"C:\Program Files\Contig\Contig" -v -s "C:\Windows\System32\drivers\AVG\*.*"
"C:\Program Files\Contig\Contig" -v -s "C:\Documents and Settings\All Users\Application Data\AVG2013\*.dat"
"C:\Program Files\Contig\Contig" -v -s "C:\Documents and Settings\All Users\Application Data\AVG2013\*.log"
"C:\Program Files\Contig\Contig" -v -s "C:\Program Files\Microsoft SQL Server\tempdb*.?df"
"C:\Program Files\Contig\Contig" -v -s "C:\Program Files\Microsoft SQL Server\MyMainDatabase*.?df"
"C:\Program Files\Contig\Contig" -v -s "C:\Program Files\Microsoft SQL Server\*.?df"

The way the -s option works in Contig is that it will look for all matching files in every folder within the one specified, so these commands will defrag the files in every instance (C:\Program Files\Microsoft SQL Server\*\MSSQL\Data). Note that by specifying that path 3 times in the .cmd file, Contig has to search all those folders 3 times and it is not very quick at that. But this does get the most critical files defragmented first.


Friday, September 13, 2013 - 3:02:31 PM - Daniel Farina

Hi Bill, you can use the following Performance Counter:

"\LogicalDisk\Split IO/sec"

This counter shows when a single I/O request had to be split into multiple requests, which is a good indicator of file system fragmentation.
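
You can sample this counter from the command line with the built-in typeperf utility; for example, the following takes 12 samples at 5-second intervals (the instance filter and intervals are just an example):

typeperf "\LogicalDisk(*)\Split IO/Sec" -si 5 -sc 12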

 

 


Friday, September 13, 2013 - 1:35:40 PM - Pavan

Thanks, I learned a new tip...


Friday, September 13, 2013 - 10:24:28 AM - Bill

Were your tests performed on a single disk or a disk array?  In my larger organization, we utilize SAN storage and I am assured by the storage admins that the management software for the SAN takes care of managing fragmentation.  Do you know what metrics I should have them provide to prove that physical fragmentation is not an issue?


Wednesday, August 14, 2013 - 4:00:24 PM - Daniel Farina

Thank you Jeremy for your support! And thanks Christian for reading!

 


Tuesday, August 13, 2013 - 10:09:33 PM - Christian

Hi Daniel, a really clear article about this ghost problem related to physical hard disks.

Thanks for the solution proposed.


Tuesday, August 13, 2013 - 5:23:40 PM - Jeremy Kadlec

Daniel,

Congrats on your first tip!  Great job!

Thank you,
Jeremy Kadlec
Community Co-Leader














