Learn more about SQL Server tools



solving sql server problems for millions of dbas and developers since 2006
join MSSQLTips for free SQL Server tips














































Add or drop identity property for an existing SQL Server column

MSSQLTips author Greg Robidoux By:   |   Read Comments (10)   |   Related Tips: More > Identities

Problem
One nice feature of SQL Server that is used quite a bit is the use of identity columns.  This function gives you a simple way of creating a unique value for every row in your table.  Adding a new column and making it an identity column is an easy thing to do as well as dropping an existing column that is an identity column, but how can you modify an existing column to make it an identity column or remove the identity property from an existing column?

Solution
Not sure how much you have researched this one so far, but there is no easy way to do this.  By design there is no simple way to turn on or turn off the identity feature for an existing column.  The only clean way to do this is to create a new column and make it an identity column or create a new table and migrate your data.

Let's take a look at a few examples:

Example 1

Here is a simple table that has two columns and one column is the identity column.

CREATE TABLE [dbo].[Test1](
   
[id] [int] IDENTITY(1,1) NOT NULL,
   
[name] [nchar](10) NULL
)

If we use SQL Server Management Studio to get rid of the identity value on column "id", a new temporary table is created, the data is moved to the temporary table, the old table is dropped and the new table is renamed.  This can be seen in the script below. 

To get this script use Management Studio to make the change and then right click in the designer and select "Generate Change Script".

/* To prevent any potential data loss issues, you should review this 
script in detail before running it outside the context of the database designer.*/
BEGIN TRANSACTION
SET 
QUOTED_IDENTIFIER ON
SET 
ARITHABORT ON
SET 
NUMERIC_ROUNDABORT OFF
SET 
CONCAT_NULL_YIELDS_NULL ON
SET 
ANSI_NULLS ON
SET 
ANSI_PADDING ON
SET 
ANSI_WARNINGS ON
COMMIT
BEGIN TRANSACTION
GO
CREATE TABLE dbo.Tmp_Test1
   
(
   
id INT NOT NULL,
   
name NCHAR(10) NULL
   )  
ON [PRIMARY]
GO
IF EXISTS(SELECT FROM dbo.Test1)
    
EXEC('INSERT INTO dbo.Tmp_Test1 (id, name)
      SELECT id, name FROM dbo.Test1 WITH (HOLDLOCK TABLOCKX)'
)
GO
DROP TABLE dbo.Test1
GO
EXECUTE sp_rename N'dbo.Tmp_Test1'N'Test1''OBJECT' 
GO
COMMIT

Example 2

If we make this example a little more complicated, by having a primary key and creating a second table with a foreign key constraint referencing back to the first table we can see that even more work needs to be done.

CREATE TABLE [dbo].[Test1](
   
[id] [int] IDENTITY(1,1) NOT NULL,
   
[name] [nchar](10) NULL,
 
CONSTRAINT [PK_Test1] PRIMARY KEY CLUSTERED 
(
   
[id] ASC
))
GO

CREATE TABLE [dbo].[Test2](
   
[id] [int] NULL,
   
[name2] [nchar](10) NULL
ON [PRIMARY]
GO

ALTER TABLE [dbo].[Test2] WITH CHECK ADD CONSTRAINT [FK_Test2_Test1] FOREIGN KEY([id])
REFERENCES [dbo].[Test1] ([id])
GO

ALTER TABLE [dbo].[Test2] CHECK CONSTRAINT [FK_Test2_Test1]
GO

If we do the same thing and use Management Studio to get rid of the identity value on column "id" in table "test1" and script out the change, we can see that even more steps need to take place. 

  1. First a temp table "Tmp_Test1" is created with the correct column attributes
  2. The data is moved to "Tmp_Test1" from "Test1"
  3. The FK constraint on "Test2" is dropped
  4. Table "Test1" is dropped
  5. Table "Tmp_Test1" is renamed to "Test1"
  6. The primary key is created on table "Test1"
  7. And lastly the FK constraint is recreated on table "Test2".  That's a lot of steps.
/* To prevent any potential data loss issues, you should review this
 script in detail before running it outside the context of the database designer.*/
BEGIN TRANSACTION
SET 
QUOTED_IDENTIFIER ON
SET 
ARITHABORT ON
SET 
NUMERIC_ROUNDABORT OFF
SET 
CONCAT_NULL_YIELDS_NULL ON
SET 
ANSI_NULLS ON
SET 
ANSI_PADDING ON
SET 
ANSI_WARNINGS ON
COMMIT
BEGIN TRANSACTION
GO
CREATE TABLE dbo.Tmp_Test1
   
(
   
id INT NOT NULL,
   
name NCHAR(10) NULL
   )  
ON [PRIMARY]
GO
IF EXISTS(SELECT FROM dbo.Test1)
    
EXEC('INSERT INTO dbo.Tmp_Test1 (id, name)
      SELECT id, name FROM dbo.Test1 WITH (HOLDLOCK TABLOCKX)'
)
GO
ALTER TABLE dbo.Test2
   
DROP CONSTRAINT FK_Test2_Test1
GO
DROP TABLE dbo.Test1
GO
EXECUTE sp_rename N'dbo.Tmp_Test1'N'Test1''OBJECT' 
GO
ALTER TABLE dbo.Test1 ADD CONSTRAINT
   
PK_Test1 PRIMARY KEY CLUSTERED 
   
(
   
id
   
WITHSTATISTICS_NORECOMPUTE = OFFIGNORE_DUP_KEY = OFFALLOW_ROW_LOCKS = ON
ALLOW_PAGE_LOCKS = ONON [PRIMARY]

GO
COMMIT
BEGIN TRANSACTION
GO
ALTER TABLE dbo.Test2 ADD CONSTRAINT
   
FK_Test2_Test1 FOREIGN KEY
   
(
   
id
   
REFERENCES dbo.Test1
   
(
   
id
   
ON UPDATE  NO ACTION 
    
ON DELETE  NO ACTION 
   
GO
COMMIT

The same holds true if we want to change an existing column and make one of the columns an identity column.  This scenario is probably less likely, but there may be a need. 

Another approach would be to add a new column and make it an identity column or add a new column without the identity property and migrate the data from the old column to the new column.  After that you could drop the old column and then rename the column using the sp_rename stored procedure.  If you have indexes, foreign keys and other constraints on these columns you would still need to drop these prior to making the changes, so this approach is not much faster.

As you can see there really is not any easy way to do this. There are some other approaches that you can find on the internet that modify values in the system tables.  These approaches do work, but if you make a mistake you could totally mess up your data, so make sure you understand what is in store before modifying system tables.

I wish there was a simpler way of turning on and turning off the identity property.  For small tables or for databases that are not that busy this approach works without much issue.  But if you have large tables or your databases are very busy it is kind of hard to drop constraints and tables on the fly like this.  I guess we will have to wait and see if there are any changes in SQL 2008.

Next Steps

  • Now that we have seen there is no easy way to do this, keep this in mind when designing your tables.  Identity columns are great, but if you do need to change one it could be more painful then you think
  • Try to avoid using identity columns as your primary key and for foreign key constraints just for this reason.  I know once you start using them it is difficult to stop using them, but be aware of potential issues you may face down the line.


Last Update: 1/4/2008


About the author
MSSQLTips author Greg Robidoux
Greg Robidoux is the President of Edgewood Solutions and a co-founder of MSSQLTips.com.

View all my tips


print tip Print  
Become a paid author




Recommended For You








Learn more about SQL Server tools
Comments and Feedback:
Friday, January 04, 2008 - 8:15:52 PM - aprato Read The Tip

I think this statement is a good topic for discussion

  • Try to avoid using identity columns as your primary key and for foreign key constraints just for this reason.  I know once you start using them it is difficult to stop using them, but be aware of potential issues you may face down the line.
  •  

    I use identities for PKs and FK's  I would expect that having natural keys would lead to even more headaches if natural keys in the database end up changing, no?  You'd have to promote any changes through all child tables.  I think of a customer table or an employee table.  A company or a person who gets married can change their name which leads to cascading headaches with child tables.  In addition, if I need my inserts to be fast, I'd prefer the table had a clustered identity PK as opposed to a clustered natural key to minimize the overhead of page splits and fragmentation.  Applying a FILL FACTOR could help but then queries have to read more pages than necessary. I'm not sure if I agree with this statement as being a general rule - to me, it seems too broad. 

    I've never had the need to remove an identity but I suspect its removal would be due more to a flaw in the initial physical modeling. 

    As far as modifying system tables, while it can be done in SQL Server 2000 it can't be done in SQL Server 2005.  Well, actually, it can but it's a secret.  I was at Microsoft for some perf testing a year ago and the response to my question on modifying 2005 system tables was something along the lines of "We could tell you, but then we'd have to kill you".   In general, modifying SQL Server system tables is not a good idea because (a) it'll cost you money to fix anything you screw up and (b) whatever you screw up could end up really corrupting your database big time.

    I'd love to hear other opinions


    Friday, September 05, 2008 - 9:39:27 AM - Preethi Read The Tip

    Couple of points here.
    If you have enterprise edition, you have a method to change the identity property without much data movement. (Ofcourse you may have to drop and create the foreign keys of other tables referencing this Identity column.)

    See the following link for further details

    http://sqlserveruniverse.com/content/DEVL0400101022007SwitchingFromNonIDENTITY.aspx

    On question of whether we should have identity column for PK or not, we can argue for both sides.  I have seen a lot of benifits of identity during inserts and updates as it reduces the new pages for inserts and page splits. Also, oftena complex multi column clustered index could be replaced with a PK and it reduces space usage. When the space needed for PK reduces it allows the non clustered indexes to be smaller, and makes the index scans, seeks to be faster.

    I have also seen the downside of it.  one of the major issue isssue is you will not know the value until you insert it. It needs an additional traffic over the network, and complex queries may suffer due to this.

    I have used Identity from my first project with SQL Server 6.0, and I am happy about it.  (Ofcourse I had a headache with Identity on those days, partially as I was new to this)  So far, I have seen identity is a great option which gives the control to the designer.

    Finally,  one standard question:  If you want to insert data, with an explicit identity value, you can do that.   Why on earth you want to change the identity property?  Is it because you of bad design?

      


    Saturday, September 06, 2008 - 8:04:47 PM - aprato Read The Tip

     Here's an example....

    I had a customer accidentally delete some child rows involved in a PK-FK relationship from their database. I had to take an older backup and reconstruct the rows.  To do that, I had to re-insert them using IDENTITY_INSERT ON


    Sunday, September 07, 2008 - 7:12:33 AM - Preethi Read The Tip

    I have a couple of questions:

    • Is your "Child" table work as a parent to another table?
      • If so, how can a person delete the records?
      • If not, why bother about the value of the column. It is inserted automatically, and you don't have to worry anythign about it. You can enter any value.
    • In case if your child record is some other database, or you dont maintain referential integrity, do you allow users to delete rows directly?
    • do the application take care of this validations, violations?
    • Apart from the fact that you need to type additional words to insert the rows, is there any problem with the row?
    Let me give you a scenario, where you need Identity column: If you want to create full text index on a table, you need the table with unique, non nullable, single column index. Identity column is a good candidate to create such an index.


    Sunday, September 07, 2008 - 8:47:33 AM - aprato Read The Tip

     Yes, it was a parent to another table that was part of a DELETE CASCADE FK.

    The delete occurred to a bug in the application code.  I'll just summarize and say that these rows shouldn't have been deleted because a business rule was not coded correctly in the logic tier. I wanted to re-create these rows as they originally appeared rather than re-generate new identities. 

    I use identities as a clustered PK index on my high volume tables so I can avoid the expense that page splitting  causes. I then leverage some of my key non-clustered indexes as covering indexes (and in some cases use included columns) so I can leverage the fact that the engine logically sorts the index data in key order.  This way, if I have a query that requires a sort can use the covering index.

    For instance, my model has a table that stores user scores when he/she takes an exam.  The key reporting query wants the user's name, the exam name, and the score achieved sorted by date.  Since tens of thousands of users are taking exams at any one time, I want my score inserts fast so I use the identity as the clustered index and I have an index on (userid, date_taken) include (exam_name, score).  It works really well and is very fast.

     



    Sunday, September 07, 2008 - 9:07:28 AM - Preethi Read The Tip

    [quote user="aprato"]Since tens of thousands of users are taking exams at any one time, I want my score inserts fast so I use the identity as the clustered index[/quote]

    I totally agree. This is one of the big points of Identity.  As consequent inserts go in the same page. this approach reduces the unnecessary page loads.

    Additionally, any data created by user is subjected to errors.  People may make spelling mistakes and some of them will be discovered much much later.  Keeping them as Primary key duplicates the error into all referenced tables and they may appear before clients, and unless you do some costly cascade  operations, they will not be eliminated. identity column eliminates this risk completely.

     


    Sunday, September 07, 2008 - 9:12:03 AM - aprato Read The Tip

     I'm sorry, I should've added that I needed these to appear as they originally did because we have a trigger action that inserts into an audit table that's used by a separate system that does data warehousing.  This table stores the id of the row and the action (I,U,X- for Insert, Update, Delete).


    Thursday, April 08, 2010 - 1:56:52 PM - peterruselmalinsvig Read The Tip
    It is indeed good to learn and share insights and I hope that would be a common ground for any of us.

    Friday, November 12, 2010 - 1:41:56 AM - neng Read The Tip

    If I want to add a column to a millions records table, and also need to update the value of that new column. That table need to be able to access all the time. What would you recommend? Beside do the alter table and run the update command. Thank you.


    Monday, November 15, 2010 - 5:28:56 PM - Greg Robidoux Read The Tip

    You could do an update in several steps.  Instead of doing one large update you could do the update in increments of 1000 rows or something like that.  You could use the primary key or clustered index to use as the key to determine what group of rows to update.  This way the locking could be done in smaller chunks instead of locking the entire table for the entire update duration.



    Post a Comment or Question

    Keep it clean and stay on the subject or we may delete your comment.
    Your email address is not published. Required fields are marked with an asterisk (*)

    *Name   *Email Notify for updates

    Signup for our newsletter


    Comments
    *Enter Code refresh code


     
    Sponsor Information







    Copyright (c) 2006-2014 Edgewood Solutions, LLC All rights reserved
    privacy | disclaimer | copyright | advertise | about
    authors | contribute | feedback | giveaways | user groups | community | events | first timer?
    Some names and products listed are the registered trademarks of their respective owners.