How to find the Gap

Question

Post reply

How to find the Gap

karthik M

One Orange Chip

Points: 29139
More actions
December 31, 2007 at 4:53 am

#62995

Experts,
I have the table like
EmpNo Ename
1 A
2 B
3 C
5 E
6 F
7 G
8 H
9 I
15 ABC
16 BCD
17 CDE
18 DEF
25 YYY
28 WWW
30 QQQ
31 AAAA
32 BBBB
33 CCCC
34 DDDDD
35 EEEEEE
36 FFFF
37 GGGG
38 HHHHH
39 YTYTY
40 UUUUI
Now i want to find the gap in the sequence.
Experts inputs are truly appreciable.
karthik

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Simon Liddle SSCrazy Points: 2580 More actions · Answer 1

Are you after the gaps in the EmpNo column?

If so, without giving you the actual code, one way to do it would be to:

select the values

from your tally/numbers table (or an equivalent)

which are between the min and max values of EmpNo

and are not in the EmpNo column

Whether this is an efficient solution or not I am not sure but it is a solution.

Hope this helps,

Simon

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 2

Are you after the gaps in the EmpNo column?

That's what I want to know...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

karthik M One Orange Chip Points: 29139 More actions · Answer 3

Are you after the gaps in the EmpNo column?

Yes.

karthik

Jeff Moden SSC Guru Points: 1004405 More actions · Answer 4

Now, I'm worried for you... the practice of "finding gaps" in a "sequence" column is usually accompanied by the insane idea that you'd ever reuse missing numbers. The INT data type allows over 2 Billion numbers and, if that's not big enough, try BIGINT.

Still, we all have people to work for and sometimes they can't be convinced of how wrong they may be... so, with that in mind, here's a real test of the metal known as "Finding Gaps in a Sequence". This is the fastest/shortest method for finding gaps I've ever seen and it does not use any Temp Tables or Table Variables. I wish I could find the person who originated this code just to say "thanks".

First, some copious amounts of test data to test the speed and accuracy with...

--===== Setup for speed and to prevent blocking

SET NOCOUNT ON

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

--=============================================================================

-- Create an experimental table to simulate the table being examined

--=============================================================================

--===== If the experimental temp table exists, drop it so we can do repeat runs

-- if we want to "play".

IF OBJECT_ID('TempDB..#yourtable') IS NOT NULL

DROP TABLE #yourtable

--===== Create the experimental temp table and populate with IDs on the fly

-- This always works because SysColumns always has at least 4000 entries

-- even in a new database and 4000*4000 = 16,000,000

SELECT TOP 1000000 ID = IDENTITY(INT, 1, 1)

INTO #yourtable

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

-- --===== Like any good table, our experimental table needs a Primary Key

ALTER TABLE #yourtable

ADD PRIMARY KEY CLUSTERED (ID)

--===== Ok, we have an experimental table with a million IDs from 1 to 1,000,000.

-- Let's remove some of the rows and see if we can find if they're missing

-- This deletes a handful of individual rows and 3 very small ranges

DELETE #yourtable

WHERE ID IN (1,2,3,250,251,2000,4000,4002,4004,900001,900002,900003,999999)

-- This deletes a "monster" range just to see how it's handled.

DELETE #yourtable

WHERE ID BETWEEN 500000 AND 700000

OR (ID BETWEEN 700000 AND 800000

AND ID%2 = 0) --Even numbered items only

... and the code that finds the start and the end of each gap even if the gap is a single row...

--===== This short little ditty is what does the actual work

SELECT GapStart = (SELECT ISNULL(MAX(b.ID),0)+1

FROM #yourtable b

WHERE b.ID < a.ID),

GapEnd = ID - 1

FROM #yourtable a

WHERE a.ID - 1 NOT IN (SELECT ID FROM #yourtable)

AND a.ID - 1 > 0

Do notice that there is a "Triangular Join" involved here... normally, that's a form of "Death by SQL". It's not here because of the constraints implied by the MAX function...

... and if your table isn't properly indexed (index on the ID column, preferably a Clustered one), plan on it taking many minutes to run instead of the normal 3 or 4 seconds... I stopped it after 5 minutes.

I have to say it again... if you have a system that requires reuse of any type of sequence, you have a really bad design. The possible exception might be an ordered parts list where the sequence number is listed on an exploded assembly diagram or similar beast.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

karthik M One Orange Chip Points: 29139 More actions · Answer 5

Thanks Jeff. I didn't check it. I will check it and get back to you.

karthik

DonaldW SSCarpal Tunnel Points: 4525 More actions · Answer 6

Jeff Moden (1/2/2008)
This is the fastest/shortest method for finding gaps I've ever seen and it does not use any Temp Tables or Table Variables. I wish I could find the person who originated this code just to say "thanks".

Looks like something Itzik Ben-Gan wrote in one of his books or in SQL Server Magazine. I know he's covered the topic several times.