By: Aaron Bertrand | Comments (1) | Related: > SQL Server 2017
Problem
In a previous tip, Rajendra Gupta provided a thorough introduction to a new feature in SQL Server 2017, Resumable Online Index Rebuilds. This feature allows you to pause index rebuilds for various reasons, including bumping up against the edge of your maintenance window, forcing a failover, or installing emergency patches.
You can also pause an online index rebuild and resume with a different
MAXDOP
setting, either to make it go faster, or to
stop it from consuming all of your CPU.
But what really happens while this operation is paused? What happens if you try to rebuild an index that is already being rebuilt and is currently paused? What happens if SQL Server crashes while a resumable index rebuild is processing?
Solution
These are scenarios not typically mentioned in blog posts, but they are easy to test out.
First, let’s take our trusty WideWorldImportersDW database, and blow up the biggest fact table by 100x, so SQL Server has something to chew on:
SELECT mk = [Movement Key] + 0, [Date Key], [Stock Item Key], [Customer Key],
[Supplier Key], [Transaction Type Key], [WWI Stock Item Transaction ID],
[WWI Invoice ID], [WWI Purchase Order ID], [Quantity], [Lineage Key]
INTO Fact.MovementCopy FROM Fact.Movement;
GO SET NOCOUNT ON;
GO INSERT Fact.MovementCopy SELECT * FROM Fact.Movement;
GO 99
The system procedure sp_spaceused
shows that this
table is a 1.33 GB heap. We can add a couple of indexes to it:
CREATE CLUSTERED INDEX WideClustered ON Fact.MovementCopy
( mk, Quantity, [Supplier Key], [Customer Key], [Date Key] DESC )
WITH (ONLINE = ON); CREATE NONCLUSTERED INDEX SkinnyNCI ON Fact.MovementCopy
( [Date Key], Quantity ) INCLUDE ([Lineage Key])
WITH (ONLINE = ON);
And now it is more substantial (we’ve doubled it to about ~2.62 GB). We can see how those are broken down in terms of pages per index as follows:
SELECT index_id, pages = COUNT(*)
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'Fact.MovementCopy'), NULL, NULL, 'LIMITED')
GROUP BY index_id;
Your results may vary slightly, but I get:
Where is data stored during a resumable rebuild?
Let’s start a rebuild of the clustered index, make it resumable, and set the max duration to a minute, because surely the operation won’t finish in that time:
ALTER INDEX WideClustered ON Fact.MovementCopy
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 1 MINUTES);
You can open a new query window during that first minute and run queries against the table, just to be sure that the operation is in fact happening online and that user activity is not disrupted.
After a minute, you’ll get a series of error messages, since the way
MAX_DURATION
works (as well as issuing
PAUSE
from a remote session) is that the session gets
killed. This is not the friendliest experience, I’ll admit, but it works.
The point is, what do you think happens while this operation is paused? Rajendra already explained that you can use this opportunity to back up the log (or issue a checkpoint) and even perform shrink operations, so clearly the log isn’t being used (at least not directly) in order to maintain the work that’s been done so far. This data must be stored elsewhere (and it can therefore affect your database size). But where?
Well, let’s run our query against the page allocations function again:
SELECT index_id, pages = COUNT(*)
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'Fact.MovementCopy'), NULL, NULL, 'LIMITED')
GROUP BY index_id;
This time, I get:
(Yes, there are two index copies being built, since rebuilding the clustered index also rebuilds all the non-clustered indexes.)
From those results, you can see that the rebuild work was paused after just a
little more than 50% of the work was done, and that the data is stored in “hidden”
indexes (they do not appear in sys.indexes
). Note that these hidden
indexes are not captured by the aggregate information returned by sp_spaceused
,
so that might not be the best tool to use if you’re trying to determine where
any extra space is coming from. The point is that this resumable rebuild operation
does take up space in your database (and note that SORT_IN_TEMPDB
is
not a viable option here – I’ll bet you can guess why).
As an aside, you can use this information to determine how close you are to being finished and, if you’ve kept the amount of time each resumable operation has taken, how much longer it will take. You can even use this information over time to build trends and predict durations for future rebuild operations as your database grows.
What happens if I try to rebuild an index that is already paused?
As you might expect, if you try to rebuild an index that is already part of a resumable operation, you will get an error message similar to one you would get from an index that is actively rebuilding. The text is slightly different (I think there is a token issue somewhere in current builds that will hopefully be fixed before RTM):
Cannot perform this operation on 'object' with ID 1090102924 as one or more indexes are currently in resumable index rebuild state. Please refer to sys.index_resumable_operations for more details.
The moral of the story is that if you have any tools, scripts, or 3rd party applications that perform defrag / rebuild operations for you, they’ll need to additionally check sys.index_resumable_operations to make sure that there isn’t an operation currently suspended. They may already have error handling in place, but it may be coded to a different error number or based on parsing different message text. What it may be looking for is:
Could not proceed with index DDL operation on table 'MovementCopy' because it conflicts with another concurrent operation that is already in progress on the object. The concurrent operation could be an online index operation on the same object or another concurrent operation that moves index pages like DBCC SHRINKFILE.
Let’s finish up our paused operation so we can start a new one in the next section:
ALTER INDEX WideClustered ON Fact.MovementCopy RESUME;
What happens if SQL Server crashes during a resumable operation?
I don’t know what I expected to happen here, and I don’t know that it’s reasonable to have expected this to be super graceful, but this is also easy to test. First, make sure you are on an unimportant, non-production instance of SQL Server. I can’t stress this enough.
Next, in one window, start this operation:
ALTER INDEX WideClustered ON Fact.MovementCopy
REBUILD WITH (ONLINE = ON, RESUMABLE = ON);
Now, quickly open a second window, and shut the server down (again, verify you’re not on production!):
SHUTDOWN WITH NOWAIT;
Of course, you can’t check on anything until you restart the server, so go into Control Panel / Services, find your instance, and choose Start. Then check the resumable operations view:
SELECT * FROM sys.index_resumable_operations;
Sure enough, I get a row back, and the operation is resumable, even after a crash. I can finish the operation as follows:
ALTER INDEX WideClustered ON Fact.MovementCopy RESUME;
Next Steps
This new feature in SQL Server 2017 helps out in a variety of important use cases for rebuilding indexes. Go ahead and download a pre-release evaluation, and start getting a taste of the future of index management! In the meantime, some other resources:
- Tip #4987 : SQL Server 2017 Resumable Online Index Rebuilds
- sys.index_resumable_operations
- How Online Index Operations Work
- Other SQL Server 2017 Tips
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips