Impact of Offline SQL Server Index Rebuilds

By:   |   Updated: 2023-08-28   |   Comments (5)   |   Related: More > Indexing


Problem

Do users access your SQL Server database 24 hours a day, 7 days a week? People from different time zones grind day and night to keep the company running. Rebuilding indexes offline blocks others from accessing the underlying table. Understanding what happens when you rebuild an index offline keeps users from experiencing drastic slowdowns. No DBA likes getting emails in the morning saying the application was slow last night.

Solution

This tutorial will explain what happens to SQL queries when you rebuild an index offline. You rebuild indexes to reduce fragmentation; we'll examine the two types. You'll see firsthand what happens when users perform a simple SELECT statement while an offline rebuild occurs. What about nonclustered indexes? Do they affect the availability of your table? Additionally, what's stopping you from rebuilding online? By the end, you'll better understand the impact of offline index rebuilds and how they affect end user query performance.

Defeating Index Fragmentation

I'll assume you know fragmentation isn't your friend. Hundreds of articles and videos go into the nitty gritty details. The most common explanation of fragmentation is when the physical and logical page order is not in sync, known as logical fragmentation. For example, someone hands you a stack of papers to read. The first is labeled page 1 of 100, but the next page says page 40 of 100. That's logical fragmentation.

The second type of fragmentation is called physical or internal. This type happens when pages have excessive empty space, leading to low page density. Both fragmentation types are due to inserts, updates, and deletes. To learn more about fragmentation, read articles written by Paul Randal.

Rebuild or Reorganize SQL Index

There are two options to fix fragmentation. The first and most popular method is to rebuild indexes. When you rebuild, SQL creates a new fresh index. Recall the paper example from above: a rebuild would be like reprinting the document in the correct order and trashing the old ones. Unless you're running an Enterprise edition or an Azure flavor, SQL performs it offline; more on that later. Below is the syntax for rebuilding an index.

-- https://www.mssqltips.com
-- This code rebuilds a specific index.
ALTER INDEX PK_MyFragmentedIndex ON dbo.VeryImportantTable REBUILD;
GO

-- If you want to rebuild all the indexes on a table, try this code.
ALTER INDEX ALL ON dbo.VeryImportantTable REBUILD;
GO

The second method is to reorganize indexes. SQL accepts the challenge of sorting the leaf-level pages in the correct order and reclaiming space. A couple of notes regarding reorganizing: the good news, it's online, meaning users can run queries against the table. The bad news is that it's single-threaded. It's slower than performing a rebuild, depending on the fragmentation level.

-- https://www.mssqltips.com
-- This code reorganizes one specific index.
ALTER INDEX PK_MyFragmentedIndex ON dbo.VeryImportantTable REORGANIZE;
GO

-- This code reorganizes all indexes on a table, prepare to wait.
ALTER INDEX ALL ON dbo.VeryImportantTable Reorganize;
GO

Countless articles recommend when to choose one method over the other. According to my research, Paul Randal coined the popular percentage recommendations. Paul has since said they were numbers he made up. However, they're great as a starting place. If your index is over 30% fragmented, rebuilding is a good idea, while between 5% and 30%, consider a reorganization. Leave it alone if it's under 5% or your table contains under 1,000 pages.

Building a Dataset

Let's build a dataset and fragment the heck out of it. The following T-SQL code creates one table with a clustered index and one nonclustered. Warning: If you run this script in SQL Server Management Studio, the database consumes around 40GB.

-- https://www.mssqltips.com
USE [master];
GO

IF DATABASEPROPERTYEX(   'BadFragmentation',
                         'Version'
                     ) IS NOT NULL
BEGIN
    ALTER DATABASE BadFragmentation SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE BadFragmentation;
END;
GO

CREATE DATABASE BadFragmentation
ON PRIMARY
       (
           NAME = N'BadFragmentation',
           FILENAME = N'C:\code\MSSQLTips\BadFragmentation.mdf',
           SIZE = 20000000KB,
           FILEGROWTH = 1000000KB
       )
LOG ON
    (
        NAME = N'BadFragmentation_log',
        FILENAME = N'C:\code\MSSQLTips\BadFragmentation_log.ldf',
        SIZE = 5000000KB,
        FILEGROWTH = 1000000KB
    );
GO

ALTER DATABASE BadFragmentation SET RECOVERY SIMPLE;
GO

USE BadFragmentation;
GO

DECLARE @UpperBound INT = 10000000;
;WITH cteN (Number)
AS (
   SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
   FROM sys.all_columns AS s1
       CROSS JOIN sys.all_columns AS s2
   )
SELECT [Number]
INTO dbo.Numbers
FROM cteN
WHERE [Number] <= @UpperBound;

CREATE UNIQUE CLUSTERED INDEX CIX_Number -- CREATE UNIQUE INDEX
ON dbo.Numbers ([Number])
WITH (FILLFACTOR = 100);


CREATE TABLE dbo.BadFragmentation
(
    Column1 INT NOT NULL,
    Column2 INT NOT NULL,
    Column3 VARCHAR(1000) NOT NULL,
    Column4 VARCHAR(1000) NOT NULL,
    Column5 VARCHAR(1000) NOT NULL,
    Column6 VARCHAR(1000) NOT NULL,
    Column7 VARCHAR(1000) NOT NULL
        CONSTRAINT PK_BadFragmentation
        PRIMARY KEY CLUSTERED (Column1)
);
GO

CREATE NONCLUSTERED INDEX IX_SecondIndex -- CREATE INDEX
ON dbo.BadFragmentation (
                            Column2,
                            Column4
                        );

INSERT INTO dbo.BadFragmentation
(
    Column1,
    Column2,
    Column3,
    Column4,
    Column5,
    Column6,
    Column7
)
SELECT TOP (10000000)
       n.Number,
       ABS(CHECKSUM(NEWID()) % 100) + 1 AS Column2,
       REPLICATE(   'Does the chicken have large talons?',
                    5
                ) AS Column3,
       REPLICATE(   'It''s just a flesh wound.',
                    5
                ) AS Column4,
       REPLICATE(   'That escalated quickly.',
                    5
                ) AS Column5,
       REPLICATE(   'All right, we''ll call it a draw.',
                    5
                ) AS Column6,
       REPLICATE(   'If I''m not back in five minutes, just wait longer.',
                    5
                ) AS Column7
FROM dbo.Numbers n
WHERE n.Number % 2 = 0;
GO

CHECKPOINT;

INSERT INTO dbo.BadFragmentation
(
    Column1,
    Column2,
    Column3,
    Column4,
    Column5,
    Column6,
    Column7
)
SELECT TOP (10000000)
       n.Number,
       ABS(CHECKSUM(NEWID()) % 100) + 1 AS Column2,
       REPLICATE(   'Does the chicken have large talons?',
                    5
                ) AS Column3,
       REPLICATE(   'It''s just a flesh wound.',
                    5
                ) AS Column4,
       REPLICATE(   'That escalated quickly.',
                    5
                ) AS Column5,
       REPLICATE(   'All right, we''ll call it a draw.',
                    5
                ) AS Column6,
       REPLICATE(   'If I''m not back in five minutes, just wait longer.',
                    5
                ) AS Column7
FROM dbo.Numbers n
WHERE n.Number % 2 <> 0;
GO

CHECKPOINT;

UPDATE dbo.BadFragmentation
SET Column3 = REPLICATE(   'Does the chicken have large talons?',
                           25
                       ),
    Column4 = REPLICATE(   'It''s just a flesh wound.',
                           25
                       ),
    Column5 = REPLICATE(   'That escalated quickly.',
                           25
                       )
WHERE Column1 % 10 = 0;

CHECKPOINT;

Detecting Fragmentation in a SQL Database

Back in the day, I loved defragmenting my hard drive. Watching blocks line up in the correct order is relaxing. Sadly, SQL Server doesn't provide the same user experience. However, to fix fragmentation, we must find it. One method involves the index properties in Management Studio. My favorite technique is with the system function sys.dm_db_index_physical_stats.

-- https://www.mssqltips.com
SELECT i.[name],
       ips.index_type_desc,
       ips.alloc_unit_type_desc,
       ips.index_depth,
       ips.index_level,
       ips.page_count,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(   DB_ID(),
                                       OBJECT_ID('dbo.BadFragmentation'),
                                       NULL,
                                       NULL,
                                       'SAMPLED'
                                   ) ips
    INNER JOIN sys.indexes i
        ON i.index_id = ips.index_id
           AND [ips].[object_id] = [i].[object_id];
GO
Fragmentation results

You can see from the screenshot above the damage our indexes absorbed. According to the internet, we must rebuild.

Rebuilding Offline

What happens when you rebuild an index offline? The short answer is that SQL issues an exclusive SCH-M (Schema Modification) lock on the table. During the rebuild, other queries trying to get a shared or exclusive lock must wait their turn. Waiting isn't something we enjoy. If a query takes over a second, the user clicks the button a dozen times. However, that's a different topic.

I'm performing an offline rebuild on the PK_BadFragmentation index in the code below. After executing the command, I'll start a small workload and run Adam Machanic's sp_whoisactive to see what's happening.

-- https://www.mssqltips.com
ALTER INDEX PK_BadFragmentation
ON dbo.BadFragmentation
REBUILD
WITH (ONLINE = OFF);
GO

-- This is a stored procedure written by Adam Machanic.
EXECUTE sp_whoisactive @output_column_list = '[session_id][sql_text][CPU][block%][status][wait_info]',
                       @find_block_leaders = 1;
sp_whoisactive results

It's easy to see in the results that my index rebuild is blocking everything else trying to touch the table.

How about for a nonclustered index? SQL places the same exclusive lock on a table if you rebuild an index that contains 2 out of 20 columns. Imagine having four or five indexes on a big table. Rebuilding them all can cause a bottleneck worse than road construction at rush hour.

Rebuilding SQL Server Indexes Online

On any Microsoft-supported version of SQL Server, there's an option to rebuild online. There's one caveat: you need to be on the Enterprise edition. Testing with the Developer edition can lead to false positives since it has all the Enterprise features. Microsoft defines an online rebuild as an option that allows concurrent user access to the underlying table or clustered index data and any associated nonclustered indexes during these index operations. Said another way: people can still function without blockage. The code below is rebuilding the index online.

-- https://www.mssqltips.com
ALTER INDEX PK_BadFragmentation
ON dbo.BadFragmentation
REBUILD
WITH (ONLINE = ON);
GO
sp_whoisactive no blocking

Notice what we're missing this time in the sp_WhoIsActive results? That's right, no blocking.

Near the end, SQL acquires a brief SCH-M (Schema Modification) lock when swapping the old and new index. With an online rebuild, there's a slight chance that other users might wait.

Reorganizing SQL Server Indexes

Finally, you have the option to reorganize. The blocking effects of reorganizing are comparable to rebuilding online. However, reorganizing a large, heavily fragmented index takes time.

-- https://www.mssqltips.com
ALTER INDEX PK_BadFragmentation ON dbo.BadFragmentation REORGANIZE;
GO

Reorganizing the index above took 10 minutes, while either of the rebuild options took 30 seconds. Note: When you reorganize, SQL Server takes an intent exclusive lock (IX) lock on the table. Also, SQL doesn't update statistics when reorganizing.

Which One to Choose?

If you can rebuild online or offline, do it online. When you don't have Enterprise or one of the Azure flavors, rely on reorganizing. As you saw in the example above, reorganizing takes longer, but you can stop it, and SQL saves your progress. Ola Hallengren created a fantastic set of maintenance scripts, including one for indexes and statistics. If you mindlessly rebuild every index every night on a large database, try Ola's scripts.

Key Takeaways

  • SQL takes an exclusive lock on a table when rebuilding an index offline. This lock blocks access to the table for everyone else.
  • You can rebuild online with Enterprise edition, Azure SQL DB, or a Managed Instance.
  • If you rebuild the same tiny indexes every night, know you're wasting time on tables under a specific page count.
  • When your database must be accessible 24x7, consider reorganizing. Also, remember to update your statistics.
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jared Westover Jared Westover is a passionate technology specialist at Crowe, helping to build data solutions. For the past two decades, he has focused on SQL Server, Azure, Power BI, and Oracle. Besides writing for MSSQLTips.com, he has published 12 Pluralsight courses about SQL Server and Reporting Services. Jared enjoys reading and spending time with his wife and three sons when he's not trying to make queries go faster.

View all my tips


Article Last Updated: 2023-08-28

Comments For This Article




Monday, August 28, 2023 - 1:37:05 PM - Jared Westover Back To Top (91513)
@Turgut, you are welcome! Thank you for stopping by.

Monday, August 28, 2023 - 1:35:57 PM - Jared Westover Back To Top (91512)
@Alexander
Thank you for reading and commenting on the article! With the advent of SSDs, defragmenting rowstore indexes is less critical. I have an SSD and a HHD, where I perform most tests. The performance difference is night and day. A place where HD performance creeps back in is with Azure SQLDB and Managed Instance unless you spend a lot.

@Jeff Thanks for the link. I added it to my SQL playlist. It was worth clicking on to put a face to a name :). I didn't intend the article to be a you should or shouldn't defrag your indexes. More so, being aware of what happens when you do. It resulted from helping someone rebuild a huge columnstore index on standard edition. However, I see where my comment about the old 5/30 gives that impression.

I've experienced rebuilding/reorganizing directly help with nonclustered index size bloat. SQL Server stops using them at a certain point and reverts to the clustered if one exists. I intend to write an article that specifically looks at this.

Monday, August 28, 2023 - 10:07:15 AM - Jeff Moden Back To Top (91510)
@Jared,
I have to agree with Alexander, here. Rebuilding and, especially, reorganizing indexes is seriously overrated and the old, supposed "Best Practice" methods actually cause more damage than they fix. I personally experienced that little problem due to "morning after" syndrome of massive, server crippling, blocking on 18 Jan 2016 and have been on a bit of a crusade to make people aware of a whole lot of things.

Another thing is, everyone says that fragmentation CAN cause performance issues but no one has proven it with proper tests. I've actually been able to prove that the old 5/30 method was never meant to be a "Best Practice" and is actually a WORST practice.

Here's an introductory 'tube on the subject that I've done for many groups. If you watch it, understand that, despite the title, it's NOT just about Random GUIDs. I'm just using those because they're the "poster child" for fragmentation and I prove that they're not the actual problem but that index maintenance is. In the end (after the Q'n'A), I show the results of inserting 100,000 rows per day into a Random GUID Clustered Index for 58 simulated days with virtually no page splits for the entire 58 days and only going over 1% Logical Fragmentation on the 58th day with NO index maintenance for the entire 58 days.

Here's the link to the "introductory" 'tube on the subject. It's 82 minutes long and you seriously shouldn't miss a minute of it because it explains and destroys the myths and methods we've been using for more than 2 decades. Even MS updated their documentation on the subject (although that documentation still carries forward some myths because they've NOT tested correctly).

https://www.youtube.com/watch?v=rvZwMNJxqVo

And, NO! This purpose of the 'tube is NOT to convince people to use Random GUIDs. It's just explaining that even Random GUIDs got a bad rep because 99.9% of the world adopted some poorly written information as a "Best Practice".

Monday, August 28, 2023 - 9:13:24 AM - Turgut Terlemez Back To Top (91509)
Thanks so much.

Monday, August 28, 2023 - 2:01:15 AM - Alexander Back To Top (91507)
Defragmenting on an SSD does not give a practical benefit, but it reduces the life of the disks.