Custom SQL Server Index Defrag and Rebuild Procedures
I often get asked for custom procedures to help with index maintenance, including things such as reorganizing, rebuilding, stats updates, etc. There are quite a variety of these out there today (a simple google search will get you a bunch), and all have their pros/cons. The big thing I generally stress to customers when asked about this type of thing is to do what is best for their environment - a custom procedure that works perfectly for me in my environment(s) might be completely wrong for your environment. Of course, there are some general rules of thumb that you can use in most any environment, but you get the idea - there is usually never a solution that is perfect for all environments (though that doesn't mean we can't keep trying to make one :-)).
Some general observations that I've seen in multiple custom procedures in the past include:
- Remember that an INDEX REBUILD (or DBREINDEX) also updates statistics - there is no need at all to update statistics for the index after rebuilding (or before). In reality, you could end up with a worse case than you had - when performing an index rebuild, stats are updated using data gathered from a look at all the data in the index (after all, all the data had to be touched to rebuild anyhow), similar to if you had performed a stats update with a full-scan. If you then update stats on the index after the rebuild and use anything less than a full-scan option, you'll most likely end up with stats that aren't as 'accurate' as the full-scan version.
- Do not simply blindly rebuild/reorganize all indexes for all tables in all databases - I see this far too often. Doing so not only wastes resources and processing time that can/could be used for other things, it also typically fails to give you any insight/feedback on the state of fragmentation in your system, or how fragmentation impacts your workload (if at all).
- Generally you should include some simple logic to weed-out small indexes (say for example, indexes with less than 5,000 or 10,000 pages). In many, many database schemas, there are 100's or 1000's of 'small' tables with multiple indexes - spending time rebuilding these can add up as the number grows. Use that time instead on the 10's or 100's of tables that really make a difference.
- If your custom procedure samples fragmentation of indexes, consider storing that data somewhere for analysis - you've done the bulk of the work by having the engine retrieve the stats, why not store them and report/analyze it so you can gain insight into the fragmentation levels of your data over time?
- Consider placing a time limit on the operation - it won't be an exact limit, but something that stops the execution of the procedure after a certain amount of time has passed and the last full operation completed would suffice in most cases - too many times I've had people call about sluggish/slow responding servers early in the morning when it boiled down to large nightly maintenance operations running long overnight.
- When using methods to determine fragmentation levels of an index to determine if it should be rebuilt or not, I typically see people use the "avg_fragmentation_in_percent" value as the hard guideline - you may also want to consider page density as a contributor, since that is logical fragmentation also (and impacts operations in cache as well as IO). If you're already looking for a level of avg_fragmentation_in_percent, consider using the same value as a guideline for "100 - avg_page_space_used_in_percent" as well.
I'm sure there are more, but that's my starting list anyhow. So, the next question I usually get is "well, what procedure do you use?" - that's a loaded question :-). I do have a generic procedure that I'll typically start with, and if the workload or environment requires it, I will modify it to be effective in that environment. I really can't stress this enough again - it ALWAYS depends on the environment - you should know your environment enough to be able to determine what is best for it, I typically won't be able to tell you what's best without any knowledge, or even a little knowledge. So, to this post I've attached my generic version of a custom procedure that allows you to specify some options, and protects against a few things, like:
- You can specify a fragmentation threshold (%) that will be the low-water mark for considering which indexes to bother with (by using both the avg_fragmentation_in_percent and (100-avg_page_space_used_in_percent) values both)
- You can specify a date/time that the operation should stop if it passes
- Given the date/time specification, indexes are operated on in order of highest fragmentation levels to lowest, so the time spent is hopefully most beneficial
- You're allowed to define online/offline rebuilds, reorganizations, stats updates, partition level operations, etc.
- You can specify a location where you'd like to store the results of the stats check off indexes
- Stats will not be updated if you perform a rebuild
- Only indexes with > 10,000 pages will be touched
It's a fairly simple procedure for general use. If you'd like to get a look at a custom procedure that is a bit more specialized, offers additional options, etc., take a look at Ola Hallengren's version. This version allows you to specify a multitude of options and operations, can iterate over multiple databases, additional logging operations, parameterized values for things like page counts and fragmentation levels, etc. (it's also documented quite well).
Between the 2 solutions, I'm sure you could likely cover a wide variety of scenarios and workloads to find what works best for you. Enjoy!
About the author
View all my tips
Article Last Updated: 2008-03-09