Gaps in SQL Server Identity Columns
I have a table that makes use of integer based sequential values (identity based or otherwise) and I need to determine what gaps exist within this table. What is the best way to determine these gaps using a set-based approach (i.e. no cursor operations)?
A variety of simple queries can be used with any version of SQL Server to determine the first-level gaps within a given integer-based column in a table, some of which are more scalable than others. My favorite, and probably the most scalable solution I've found is a morph of a solution created by Umachandar Jayachandran.
Let's setup an example with this code and then begin to issues the queries below.
|SELECT a.pkid + 1 as GapValue|
FROM t1 a
WHERE NOT EXISTS(
SELECT * FROM t1 b
WHERE b.pkid = a.pkid + 1 )
AND a.pkid < (select max(pkid) from t1);
The query above will give you the minimum integer value (first-level gap) for every gap in the pkid column from table t1. The query from above will return the following result:
This is a great start, however, there is at least 1 fairly problematic limitation with the above: you will never get a result set that contains all ID's in a multiple-value gap range, you will only get the first ID of a gap range (for example, notice in the data that the first gap starts at 4 as output in the result set, however there also is no 5 or 6 in the data, but the results of the query only provide the first value in a gap range). This can pose a significant problem, for example, if you are trying to insert records into the table containing the gaps based of a query from data in another table. You may still end up getting the data in successfully, but you will not have done so by using up all the gaps, or even the smallest values for the id's in question.
To address this issue, with SQL Server 2000 it would be extremely difficult and would require some iterative techniques. However, with SQL Server 2005 and the advent of Recursion and CTE's (recursive common table expressions), you can easily morph the solution from above to provide you what you're looking for, as follows:
|declare @i int;
SELECT @i = MAX(pkid) FROM t1;
WITH tmp (gapId) AS (
SELECT a.gapId + 1
Not only will this result in a data-set including ALL the id values in all gap ranges, but is also extremely efficient and scalable when compared to other possible solutions using functions and iterative (i.e. cursor based) solutions:
- Be sure to check out the supporting code example to understand the entire process.
- This code is a prime example of new and enhanced approaches to resolving historical SQL Server problems with SQL Server 2005.
- As you begin to work with SQL Server 2005, research the new coding options to expand your knowledge and improve the functionality and performance of your code.
- Check out the following related tips on MSSQLTips.com:
- Special thanks to Chad Boyd from the MSSQLTips.com community for this tip
About the author
View all my tips