Make your SQL Server database changes backward compatible when adding a new column


By: Aaron Bertrand   |   Last Updated: 2010-06-09   |   Comments (4)   |   Related Tips: More > T-SQL

Problem

As multi-tier architectures grow over time, it is often challenging to coordinate changes across the data, logic and presentation tiers. Unless planned and implemented carefully, an act as simple as adding a column to a table can bring every component of your application to a grinding halt. While some of us have comfortable 12-hour maintenance windows every weekend, many of us are bound by much stricter service level agreements. So we must find ways to introduce fixes and new features with zero downtime, and without requiring every single component to be refactored at the same time.

Solution

There are several methods you can use to introduce fixes and new features without disrupting existing applications or data. Personally, I always start at the database layer, and make every effort to ensure that my changes are backward compatible. This way, I can deploy the database changes to each environment in turn (dev, QA, staging, production) independently of changes that will need to take place in the API and other logic layers, as well as the presentation layers. Application development often takes longer than database changes (and depends on them), and several different applications on different development schedules can rely on the same data tier, so it is almost always preferable to push database changes ahead of the curve.

The challenge, of course, is defining and properly implementing what is meant by "backward compatible." Essentially, a change is backward compatible if it can be introduced to the environment without touching any other parts of the application - and the system keeps running as if nothing happened. I will demonstrate by providing a few examples of what I do in my environments in certain scenarios, dealing mainly with simple CRUD operations (create, read, update and delete). Let's pretend we have a very simple table called Users, with very simple CRUD procedures, and we want to add a new column called "Gender" without having to make changes to the application(s) right away. Here are the existing objects (note that error handling is omitted for brevity, and I'm only listing the procedures we'll actually change due to the new column):

    
    CREATE TABLE dbo.Users
    (
        UserID    INT IDENTITY(1,1) PRIMARY KEY,
        UserName  VARCHAR(320) NOT NULL UNIQUE,
        HairColor VARCHAR(32)
    );
    GO
    
    CREATE PROCEDURE dbo.User_GetDetails
        @UserID INT
    AS
    BEGIN
        SELECT
            UserID,
            UserName,
            HairColor
        FROM
            dbo.Users
        WHERE
            UserID = @UserID;
    END
    GO
    
    CREATE PROCEDURE dbo.User_Create
        @UserName    VARCHAR(320),
        @HairColor   VARCHAR(32),
        @UserID      INT OUTPUT
    AS
    BEGIN
        INSERT dbo.Users
        (
            UserName, 
            HairColor
        )
        SELECT
            @UserName,
            @HairColor;
            
        SET @UserID = SCOPE_IDENTITY();
    END
    GO
    
    CREATE PROCEDURE dbo.User_Update
        @UserID      INT,
        @UserName    VARCHAR(320),
        @HairColor   VARCHAR(32)
    AS
    BEGIN
        UPDATE dbo.Users
        SET UserName  = @UserName,
            HairColor = @HairColor
        WHERE
            UserID = @UserID;
    END
    GO
    

When adding a new column, especially one that is required (not NULLable), there are a couple of things to keep in mind:

  • Unless I have a really good reason to do otherwise, I avoid the SSMS table designer, and add the column using ALTER TABLE - which puts the column at the "end" of the table.
  • On larger tables, adding a NOT NULL column with a default value can lock the table for a long time. It may make sense to first add the column as NULL, populate the values in chunks, and then add a NOT NULL constraint after the fact.
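
For that larger-table scenario, the staged approach in the second bullet might look something like this (a sketch only; the 10,000-row batch size is an arbitrary choice that you would tune for your environment, and the constraint names are my own convention):

```sql
-- Step 1: add the column as NULLable - a quick, metadata-only change
ALTER TABLE dbo.Users ADD Gender CHAR(1) NULL;
GO

-- Step 2: backfill existing rows in small batches to keep locks short
WHILE 1 = 1
BEGIN
    UPDATE TOP (10000) dbo.Users
        SET Gender = 'U'
        WHERE Gender IS NULL;

    IF @@ROWCOUNT = 0 BREAK;
END
GO

-- Step 3: once every row has a value, enforce NOT NULL and add the default
ALTER TABLE dbo.Users ALTER COLUMN Gender CHAR(1) NOT NULL;
ALTER TABLE dbo.Users ADD CONSTRAINT DF_Users_Gender DEFAULT 'U' FOR Gender;
GO
```

Each batch commits independently, so concurrent readers and writers are only ever blocked briefly.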

Let's assume that we want to add the gender column, that we want it to be not nullable, and that the default value is 'U' (unknown/unspecified). We can do so like this, assuming the table is not huge:

    
    ALTER TABLE dbo.Users ADD Gender CHAR(1)
        NOT NULL
        CONSTRAINT DF_Users_Gender DEFAULT 'U';
    -- may also want: ALTER TABLE dbo.Users
    --   ADD CONSTRAINT CK_Users_Gender CHECK (Gender IN ('M', 'F', 'U'));
    

Now the table has the new column, every existing user has the value 'U', and the application and CRUD procedures are none the wiser. Since the column has a default value, insert statements do not need to include it; and update and select statements won't break because they simply don't reference it. We can introduce those changes to the stored procedures while still keeping their presence transparent to the application(s).
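
To see that the existing procedures really are unaffected, you could run the unchanged dbo.User_Create from the original script (the sample values here are invented):

```sql
DECLARE @NewUserID INT;

-- the original procedure, which knows nothing about Gender, still works
EXEC dbo.User_Create
    @UserName  = 'test@example.com',
    @HairColor = 'brown',
    @UserID    = @NewUserID OUTPUT;

-- the new row picks up the column default
SELECT UserID, UserName, Gender
    FROM dbo.Users
    WHERE UserID = @NewUserID;   -- Gender = 'U'
```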

First, let's deal with dbo.User_GetDetails. In this case we can simply add Gender to the output list, and - assuming you aren't doing anything like "for each field in rs.fields" - the application will just ignore it until it is ready to use the column in its output:

    
    ALTER PROCEDURE dbo.User_GetDetails
        @UserID INT
    AS
    BEGIN
        SELECT
            UserID,
            UserName,
            HairColor,
            Gender
        FROM
            dbo.Users
        WHERE
            UserID = @UserID;
    END
    GO    
    

For the insert procedure, we can have the parameter default to 'U' so that the application does not have to provide a value right away (but you can test specifying this value from T-SQL or unit tests). Note that, like adding columns to a table, I always add new parameters at the end of the parameter list, since application code (and other stored procedure code) might not name its parameters and may simply pass the values in order.

    
    ALTER PROCEDURE dbo.User_Create
        @UserName   VARCHAR(320),
        @HairColor  VARCHAR(32),
        @UserID     INT OUTPUT,
        @Gender     CHAR(1) = 'U'
    AS
    BEGIN
        INSERT dbo.Users
        (
            UserName, 
            HairColor,
            Gender
        )
        SELECT
            @UserName,
            @HairColor,
            @Gender;
            
        SET @UserID = SCOPE_IDENTITY();
    END
    GO
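
A quick way to convince yourself that positional callers remain safe is to exercise both calling styles (sample values invented):

```sql
DECLARE @ID1 INT, @ID2 INT;

-- old-style positional call: three arguments, @Gender falls back to 'U'
EXEC dbo.User_Create 'old@example.com', 'black', @ID1 OUTPUT;

-- new-style call that supplies the fourth parameter explicitly
EXEC dbo.User_Create 'new@example.com', 'blond', @ID2 OUTPUT, 'F';
```

Because @Gender was appended after the last original parameter, the three-argument call binds exactly as it did before the change.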
    

For the update procedure, we'll take a slightly different approach. Because you might have already updated some of the rows with values other than 'U', you'll only want to update the value if it is specified explicitly. So we allow the parameter to default to NULL, and use COALESCE() to update the column - which means if you updated it to 'M' previously and nothing is specified, it will stay as 'M':

    
    ALTER PROCEDURE dbo.User_Update
        @UserID     INT,
        @UserName   VARCHAR(320),
        @HairColor  VARCHAR(32),
        @Gender     CHAR(1) = NULL
    AS
    BEGIN
        UPDATE dbo.Users
        SET UserName  = @UserName,
            HairColor = @HairColor,
            Gender    = COALESCE(@Gender, Gender)
        WHERE
            UserID = @UserID;
    END
    GO
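
And to illustrate the COALESCE() behavior (again with invented values):

```sql
-- suppose user 1 was previously updated to 'M' by a gender-aware caller
EXEC dbo.User_Update @UserID = 1, @UserName = 'sam@example.com',
     @HairColor = 'brown', @Gender = 'M';

-- an older caller that omits @Gender does not clobber the stored value;
-- COALESCE(NULL, Gender) keeps it at 'M'
EXEC dbo.User_Update @UserID = 1, @UserName = 'sam@example.com',
     @HairColor = 'gray';
```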
    

As you can see, it does not take a lot of effort to introduce database changes without impacting other parts of your application, allowing you to perform testing of fixes and new features without requiring simultaneous application development. Today we dealt with adding a column; in future tips, we will investigate other schema and data changes.

About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a Product Manager at SentryOne, with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and serves as a community moderator for the Database Administrators Stack Exchange.

Wednesday, June 23, 2010 - 11:39:27 PM - aprato Back To Top

For what it's worth, when I started at my current employer I horrifyingly discovered all kinds of ad-hoc and SELECT * SQL, as well as direct table references scattered throughout the DAL layer.   I ended up creating backward compatible views over the tables that required any data changes.   Granted, we don't use Replication so that didn't factor into the equation.

 Aaron: Are you over the Bruins collapse?   It took me a while.... I'm now gearing up for Tyler or Taylor on Friday


Wednesday, June 23, 2010 - 2:34:08 PM - AaronBertrand Back To Top

Thanks Noel, yes you bring up some good points that I hadn't considered in my simplistic approach.  I'll partially use the excuse that we don't use these features (and some we avoid like the plague):

  • triggers (I think there is one across our entire application suite)
  • replication (we roll our own)
  • ad hoc SQL from client apps (strictly a stored procedure interface)
  • SELECT * from any source
  • CDC (we just adopted SQL Server 2008 earlier this year)
  • BCP format files (we use BULK INSERT but are very careful about schema changes there)
     
That said, you are completely right that I should have mentioned them.  I'll take care to be more thorough in the future.

Wednesday, June 23, 2010 - 1:49:10 PM - Noeldr Back To Top

Adding a column with backward compatibility is much more complicated than that.

- You need to verify replication

- You need to verify the queries that join to the table, and check that no other column in the join has the same name (or that the select list uses aliases), or those queries would get broken

- You need to verify that any view using joins to that table follows the above

- You also have to account for DTOs in the client code that use SELECT * and may find mapping issues

- You must check triggers that may be present for auditing purposes

- You have to verify if that table is enabled for CDC

- You must check format files that could be used for BCP or BULK INSERT

As you can see, the list is ***much more*** complex, and I am not sure I have covered all avenues.

The takeaway from this is simple:

"adding a column" is NEVER a simple thing; in fact, it is probably one of the most miscalculated operations in the DBMS world!

Wednesday, June 23, 2010 - 11:58:52 AM - Charles Kincaid Back To Top

I needed to add a column to a table and have it NOT NULL but with a default. If I use the crud that the GUI gives me, it uses a bound default and not a default constraint.  Further, the default has some made-up name.  If I'm going through the system views, or tables, I want to know what is what.

Here is my solution:

Step (1)

ALTER TABLE dbo.MyTable ADD MyColumn tinyint NULL

This adds the column with NULL and no constraint. This makes the ALTER TABLE work.

Step (2)

ALTER TABLE dbo.MyTable

ADD CONSTRAINT DF_MyTable_MyColumn

DEFAULT (0) FOR MyColumn

Now your default is in place and new rows will be OK. Existing rows, however, still contain NULL.

Step (3)

UPDATE dbo.MyTable SET MyColumn=0

Hey we just added the column so all data should be the default, right?

Step (4)

ALTER TABLE dbo.MyTable ALTER COLUMN MyColumn tinyint NOT NULL

Now the NULL flag is fixed.

