SQL Server Fragmentation: How to detect it (Part 6 of 8)
In our 6th post in the fragmentation series (I now know we are going to have 8 total), we are going to talk about the different ways to determine whether you have fragmentation and what type of fragmentation you have. This analysis will help you determine whether your particular workloads are being impacted by it and whether you need to address it. In the final 2 posts of the series, we'll talk about how to address fragmentation and then provide an all-in-one script you can use to validate everything we've talked about in the series for yourself and get a first-hand look at the code in action.
SQL Server provides 1 primary tool for detecting fragmentation, and that tool differs between SQL 2000 and 2005. For SQL 2000 it's the DBCC SHOWCONTIG statement; for 2005 it's the sys.dm_db_index_physical_stats DMF. The older statement still works in 2005, however it's strongly recommended that you start using the new and improved DMF, which contains algorithmic improvements for more accurately determining fragmentation, takes only IS-level locks (vs. full shared locks in the default scanning mode for DBCC SHOWCONTIG), outputs information on things like forwarded records and ghosted records (for a heap), and includes more detailed filtering capabilities. One fairly significant improvement in the SQL 2005 algorithm is that it accurately reports extent-level fragmentation across multiple files - with the SHOWCONTIG statement in 2000, if an index spans multiple files, the extent fragmentation numbers reported are completely useless.
These tools each report on logical fragmentation, page densities, and extent-level fragmentation. Nothing exists in SQL Server for detecting file-level fragmentation, but you can use any standard OS-level defragmenting utility to check this if necessary (most likely it isn't necessary - and if you do attempt it, be sure that SQL Server is offline).
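For reference, the basic invocations of the two tools look like the following (the database, table, and index names here are placeholders, not objects from this series):

```sql
-- SQL 2000 (still works in 2005): DBCC SHOWCONTIG against a specific
-- table and index (dbo.Orders / IX_Orders_OrderDate are placeholder names)
DBCC SHOWCONTIG ('dbo.Orders', 'IX_Orders_OrderDate') WITH TABLERESULTS;

-- SQL 2005: the sys.dm_db_index_physical_stats DMF; passing NULL for the
-- index/partition parameters returns rows for all indexes/partitions
SELECT *
FROM sys.dm_db_index_physical_stats(
        DB_ID('MyDatabase'),        -- database_id
        OBJECT_ID('dbo.Orders'),    -- object_id
        NULL,                       -- index_id (NULL = all)
        NULL,                       -- partition_number (NULL = all)
        'LIMITED');                 -- scan mode
```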
Each statement includes options for different scanning modes - the 2005 DMF offers 3 modes, while SHOWCONTIG only has 2:
- FAST/LIMITED - This mode (FAST in 2000, LIMITED in 2005) scans only the parent pages of the leaf-level pages for the given index(es) (i.e. the final intermediate level of the index before the leaf). The advantage is that only a fraction of the pages are touched compared to a scan of the full leaf level - the number of pages read is reduced by roughly a factor of the fanout of the index. The downside is that the engine can't report some data, such as page density, record and page counts, ghost/forwarded record counts, record sizes, etc.
- SAMPLED - Only available in the sys.dm_db_index_physical_stats DMF, not with SHOWCONTIG. This mode performs just like the DETAILED mode, only on a subset of the leaf pages instead of all of them. If the index/heap has fewer than 10,000 pages, the DETAILED mode is used; if there are 10,000 or more pages, only a 1% sample of the pages is scanned and inspected. Obviously, the statistics reported (page density, etc.) will only be accurate for the pages scanned, but they should be a pretty good indicator for the structure as a whole.
- DETAILED - This is the non-FAST mode for SHOWCONTIG. In this mode, all pages in all levels of the index/heap are scanned and inspected. It is the most thorough and accurate of all the modes, and also the most intrusive and longest running.
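To make the modes concrete, here's how each is specified in practice (object names are placeholders):

```sql
-- SHOWCONTIG: the default is a full leaf-level scan; WITH FAST
-- requests the parent-page-only scan described above
DBCC SHOWCONTIG ('dbo.Orders') WITH FAST, TABLERESULTS;

-- sys.dm_db_index_physical_stats: the mode is the 5th parameter;
-- valid values are 'LIMITED' (the default), 'SAMPLED', and 'DETAILED'.
-- Note that in LIMITED mode, leaf-page statistics such as
-- avg_page_space_used_in_percent come back NULL.
SELECT index_id, index_level, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'SAMPLED');
```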
Given that the most intrusive type of fragmentation in most scenarios is logical fragmentation, using the FAST/LIMITED modes can be very beneficial for gauging the amount of logical fragmentation on an index without impacting the SQL buffer cache or holding long S/IS locks on the structure. If you have a very large index/heap, the SAMPLED mode will also most likely give you a very good estimate of all the types of fragmentation you may have. To give you an idea of the effect of the FAST/LIMITED options on run-times for a SHOWCONTIG / sys.dm_db_index_physical_stats call, take a gander at the following:
| OPTION | Page Count | Run Time |
| --- | --- | --- |
As you can see, using the FAST/LIMITED options can have quite a drastic impact on the run-times of the statement, so be sure to use the appropriate options for the appropriate data/environment.
So, what are the key metrics to look for in the 2 statements, and how does each correlate to the types of fragmentation and what we already know about fragmentation? Let's hit the key fields from each statement (in the bullets, I'll list the SHOWCONTIG field name first, then the sys.dm_db_index_physical_stats column second):
- "Logical Scan Fragmentation" / "avg_fragmentation_in_percent" - This is the primary indicator for the amount of logical fragmentation you have for the given structure (more accurately, it's the percentage of non-properly order pages, since it reports nothing to do with page density). The lower the number the better - as this number approaches 100% the more pages you have in the given index that are not properly ordered. For heaps, this value is actually the percentage of extent fragmentation and not logical fragmentation.
- "Avg. Page Density" / "avg_page_space_used_in_percent" - This is the primary indicator for how dense the pages in your index are, i.e. on average how full each page in the index is. The higher the number the better speaking in terms of fragmentation and read-performance. As this number approaches 0% the less space on each page in the index that is actually used to store data, and the more pages it takes to store the same amount of data (and hence the more pages that need to be read to get all records, the more memory space that is needed in the buffer pool, etc.)
- "Extent Scan Fragmentation" / "fragment_count" and "avg_fragment_size_in_pages" - These are the primary indicators for extent fragmentation in the given structure. For 2000, it's a measure of out-of-order extents in the leaf-level of an index (not valid for heaps in 2000). For 2000, the lower the number the better. For 2005, a fragment is a group of physically consecutive leaf pages in the same file for a given structure - each structure has at least 1 fragment (best-case scenario) and at most the same number of fragments as there are pages in the structure - so, the lower this number the better. The "avg_fragment_size_in_pages" value is an indication of the average size of each fragment for the given structure, and the higher the value the better (the higher the value, the greater the number of physically consecutive pages on average throughout the structure).
Those are the best indicators from the built-in tools for determining fragmentation, along with a brief description of what to look for with each. I often get asked if there are other ways to check fragmentation, see page linkage information for pages that are out of order, etc. - I do have a custom procedure/method that can be used, which I'll post separately following this post with a brief description.
Ok, that wraps up this post - only 2 more remaining in the series. The next will discuss how to address fragmentation and what the different methods do under the covers, and the final post will provide a full walk-through script demonstrating everything we've talked about first-hand.