Overview
The process of adding and more specifically deleting records from the database over time results in a large number of gaps in tables’ identifier sequence number. It may be desirable to know where the breaks are, perhaps to re-use the numbers following identification of duplicates that
need renumbering or where a bulk insert is required, or perhaps to identify
gaps in a cheque sequence, or analyze potential issues with the database
programs creating gaps invalidly. To do this however you must first identify the gaps.
To identify the gaps I have created the following script, which using the sample temp table created below demonstrates one method for achieving the desired result.
Sample table creation
Create table #SequenceTable (SeqNumber Int ) Declare @Seq int Set @Seq = 10 while @Seq <= 100 Begin Insert into #SequenceTable values(@Seq) Set @Seq = @Seq + 1 End Insert into #SequenceTable values(101) Insert into #SequenceTable values(102) Insert into #SequenceTable values(103) Insert into #SequenceTable values(105) Insert into #SequenceTable values(106) Insert into #SequenceTable values(107) Insert into #SequenceTable values(114) Insert into #SequenceTable values(115) Insert into #SequenceTable values(116) Insert into #SequenceTable values(129) Insert into #SequenceTable values(130) Insert into #SequenceTable values(131)
Gap identification code
Select LastSeqNumber , NextSeqNumber , FirstAvailable = LastSeqNumber + 1 , LastAvailable = NextSeqNumber - 1 , NumbersAvailable = NextSeqNumber - (LastSeqNumber + 1) from ( Select LastSeqNumber = (Select isnull(Max(Seq2.SeqNumber),0) as SeqNumber from #SequenceTable Seq2 where Seq2.SeqNumber < Seq1.SeqNumber) , NextSeqNumber = SeqNumber from #SequenceTable Seq1 ) as A where NextSeqNumber - LastSeqNumber > 1 order by LastSeqNumber
Resulting output
LastSeqNumber | NextSeqNumber | FirstAvailable | LastAvailable | NumbersAvailable |
0 | 10 | 1 | 9 | 9 |
103 | 105 | 104 | 104 | 1 |
107 | 114 | 108 | 113 | 6 |
116 | 129 | 117 | 128 | 12 |
A detailed breakdown
The code requires us to access table #SequenceTable twice.
The first is a straight read of table #SequenceTable (Seq1), setting NextSeqNumber to the number found in each row’s SeqNumber field. NextSeqNumber is the first SeqNumber value following a break.
The second sub query select on table #SequenceTable (Seq2), selects a single value for field LastSeqNumber, being the highest value found that precedes the value loaded into NextSeqNumber. LastSeqNumber is the immediate number found preceding a break.
Finally, wrapping the selection of LastSeqNumber and NextSeqNumber in an outer select statement (A), selects only records where a gap exists (NextSeqNumber – LastSeqNumber > 1). You can change the value here to find only gaps big enough for the number of rows you require.
The code performs a small trick when determining LastSeqNumber to ensure capture of a
break at the start of the number sequence, in this case values 1 to 9. Here it uses isnull to return a zero value instead of null when it tries to find a value less than the smallest found number. Of course, when the smallest number is one, then the difference between zero and one is one, so this will not report a gap. For this to work, the SeqNumber column must not contain numbers less than zero.
As a final nicety, the code simply calculates the First available number and last available number in the missing number range and the number of available slots in the sequence this represents.
Summary
The above code is a simple method for identifying gaps in a number sequence. My own specific use was to find gaps for re-use when renumbering duplicate sequenced records. For this purpose, I was able to change the gap qualifier > 1 on the outer query, to the size of the gap that I required to fill with additional rows. I originally thought a poor conversion from an earlier database created the gaps, however on running the script on several occasions I saw new gaps appearing. It turned out that code was inadvertently corrupting sequence numbers causing gaps, and I arranged a fix.
For your own purpose, it should be easy to insert your own table and column names using change/replace, and make this work over your own tables.