Alter Table Modify Column ONLINE Issues in SQL Server

By:   |   Comments (1)   |   Related: > Database Design


Problem

We recently performed a DDL operation against a SQL Server table – simply increasing the size of a varchar column – which should have been instantaneous. Instead, we killed it after observing 20 minutes of HARD_SYNC_COMMIT waits and a blocked replication log reader. Could this issue have been avoided? What went wrong?

Solution

Here was the original table (grossly simplified):

CREATE TABLE dbo.Floob
( FloobID int IDENTITY(1,1) PRIMARY KEY,
PostalCode varchar(32)
);

And here was the command to change the PostalCode column from 32 characters to 64:

ALTER TABLE dbo.Floob 
ALTER COLUMN PostalCode varchar(64) NOT NULL
WITH (ONLINE = ON);

If you don't spot the issue right away, don't worry, you're not alone. We missed it too. The problem is that the original table had the column defined as nullable (though NULL was prevented through other logic). However, the DDL command that was generated from source control inadvertently changed PostalCode to NOT NULL, based on a rule (or assumption) rather than the current state of the table.

Also, I want to be clear that this issue wasn't specifically caused by the operation being online – it is just why it was surprising to see even though we explicitly asked for an online operation. You may not be able to test this scenario exactly, since altering a column online is restricted to Enterprise Edition (just like online index operations). If you try the syntax on a lesser edition, you will see this error message:

Msg 1712, Level 16, State 1
Online index operations can only be performed in Enterprise edition of SQL Server.

Changing nullability can be, effectively, a size-of-data operation, whether online or not. On a small table, this is unlikely to be noticed; on a large table, however, it can become a problem. And we can prove this simply by inspecting the transaction log before and after a change, and understanding how those changes cause impact downstream (basically, anything that uses the log reader or redo, like replication and Availability Groups).

To set up a quick test, we can repeatedly run a script with minor changes to detect cases where the amount of work logged varies with the number of rows (which shouldn't happen for an operation that is truly online and indifferent to the number of rows affected):

  • Setting the initial column NULL or NOT NULL
  • Populating the table with 10, 100, 1,000, or 10,000 rows
  • Altering the column explicitly to be NULL or NOT NULL
    • Performing this change with ONLINE = ON or OFF

The skeleton of the test looks like this:

-- create table with null or not null
-- populate with n rows
-- count number of log records in fn_dblog()
-- alter table using null or not null, online or offline
-- find delta in number of log records in fn_dblog()

And here is the actual code (I just re-created the database each time to keep things clean):

USE master;
GO ALTER DATABASE Splunge SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO DROP DATABASE Splunge;
GO CREATE DATABASE Splunge;
GO USE Splunge;
SET NOCOUNT ON;
GO CREATE TABLE dbo.Floob
(
FloobID int IDENTITY(1,1) PRIMARY KEY,
PostalCode varchar(32) NULL -- change to NOT NULL
);
GO DECLARE @n int = 10; -- 100, 1000, 10000 INSERT dbo.Floob(PostalCode) SELECT TOP (@n) 'N0T L33T'
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2;
GO DECLARE @i int = (SELECT COUNT(*) FROM sys.fn_dblog(DB_ID(), NULL)); DECLARE @d datetime2 = sysdatetime(); ALTER TABLE dbo.Floob
ALTER COLUMN PostalCode varchar(64)
NULL -- change to NOT NULL
WITH (ONLINE = ON); -- change to OFF (or comment out) SELECT DurationMilliseconds = DATEDIFF(MILLISECOND, @d, sysdatetime()); SELECT LogRecords = COUNT(*)-@i FROM sys.fn_dblog(DB_ID(), NULL);

I ran each of these test variation 10 times and averaged out the duration (which varied from test to test) and log records produced (which was constant for each variation). Here are the results for the log records:

Log Records produced from ALTER COLUMN

And the duration, which should be no surprise after seeing the log activity:

Duration (milliseconds) of ALTER COLUMN operation

Essentially, changing the column from NULL to NOT NULL invoked a whole bunch of additional log activity, and this increased with the number of rows affected, making it a size-of-data operation, at least to a certain extent. The majority of this log activity fell under LOP_INSERT_ROWS for online operations, and LOP_MODIFY_ROW when running the ALTER offline. This log activity can obviously impact duration and, as suggested before, might affect downstream systems and data movement as well.

You will notice that the online operations required about double the log activity as offline, and about double the duration, as well. This is the price you pay for not blocking – other activities can run against the table while the column is being changed, even if it does become a size-of-data operation.

Summary

Please use caution when planning out schema changes. The original nullability state of a column is something to easily overlook when your DDL commands are meant to look a certain way or conform to some specific standard or pattern. If your intent is to change nullability then, by all means, it is something you have to do. In our case, this was an unintended side effect, and the impact to the log was unexpected because, under most other scenarios, increasing the size of a variable-length column is an online, metadata-only operation whether or not you specify. In a future tip, I will talk about what happens when the column participates in a non-clustered index, and ways ONLINE column changes may still cause locking/blocking.

Next Steps

See these related tips and other resources:



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article




Tuesday, August 25, 2020 - 4:04:23 AM - Thomas Franz Back To Top (86356)
I wonder, WHY it has so a massive effect onto the transaction log.

Yes, it would need to read the whole table and check, if there are NULL values inside this field, but I see no reason, why it would have to rewrite the table or do any other changes.

As far I know, there is a byte for every 8 nullable columns, where the SQL server saves, if the value is really NULL. When I set the column to NOT NULL, it could become obsolete, but it I don't see, why it has to force a rewrite, instead of simply ignoring it (until the next UPDATE of the row or index rebuild / reorg).

Same thing seems to work already for changing NOT NULL -> NULL, in this case it may have to add another nullability byte to the row head, but it skips this step, since there could be no row, that has alread a NULL (so it can assume that it is NOT NULL when the flag is not present).














get free sql tips
agree to terms