How to find the Gap

  • Experts,

    I have the table like

    EmpNo Ename

    1 A

    2 B

    3 C

    5 E

    6 F

    7 G

    8 H

    9 I

    15 ABC

    16 BCD

    17 CDE

    18 DEF

    25 YYY

    28 WWW

    30 QQQ

    31 AAAA

    32 BBBB

    33 CCCC

    34 DDDDD

    35 EEEEEE

    36 FFFF

    37 GGGG

    38 HHHHH

    39 YTYTY

    40 UUUUI

    Now i want to find the gap in the sequence.

    Experts inputs are truly appreciable.

    karthik

  • Are you after the gaps in the EmpNo column?

    If so, without giving you the actual code, one way to do it would be to:

    select the values

    from your tally/numbers table (or an equivalent)

    which are between the min and max values of EmpNo

    and are not in the EmpNo column

    Whether this is an efficient solution or not I am not sure but it is a solution.

    Hope this helps,

    Simon

  • Are you after the gaps in the EmpNo column?

    That's what I want to know...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Are you after the gaps in the EmpNo column?

    Yes.

    karthik

  • Now, I'm worried for you... the practice of "finding gaps" in a "sequence" column is usually accompanied by the insane idea that you'd ever reuse missing numbers. The INT data type allows over 2 Billion numbers and, if that's not big enough, try BIGINT.

    Still, we all have people to work for and sometimes they can't be convinced of how wrong they may be... so, with that in mind, here's a real test of the metal known as "Finding Gaps in a Sequence". This is the fastest/shortest method for finding gaps I've ever seen and it does not use any Temp Tables or Table Variables. I wish I could find the person who originated this code just to say "thanks".

    First, some copious amounts of test data to test the speed and accuracy with...

    --===== Setup for speed and to prevent blocking

    SET NOCOUNT ON

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    --=============================================================================

    -- Create an experimental table to simulate the table being examined

    --=============================================================================

    --===== If the experimental temp table exists, drop it so we can do repeat runs

    -- if we want to "play".

    IF OBJECT_ID('TempDB..#yourtable') IS NOT NULL

    DROP TABLE #yourtable

    --===== Create the experimental temp table and populate with IDs on the fly

    -- This always works because SysColumns always has at least 4000 entries

    -- even in a new database and 4000*4000 = 16,000,000

    SELECT TOP 1000000 ID = IDENTITY(INT, 1, 1)

    INTO #yourtable

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    -- --===== Like any good table, our experimental table needs a Primary Key

    ALTER TABLE #yourtable

    ADD PRIMARY KEY CLUSTERED (ID)

    --===== Ok, we have an experimental table with a million IDs from 1 to 1,000,000.

    -- Let's remove some of the rows and see if we can find if they're missing

    -- This deletes a handful of individual rows and 3 very small ranges

    DELETE #yourtable

    WHERE ID IN (1,2,3,250,251,2000,4000,4002,4004,900001,900002,900003,999999)

    -- This deletes a "monster" range just to see how it's handled.

    DELETE #yourtable

    WHERE ID BETWEEN 500000 AND 700000

    OR (ID BETWEEN 700000 AND 800000

    AND ID%2 = 0) --Even numbered items only

    ... and the code that finds the start and the end of each gap even if the gap is a single row...

    --===== This short little ditty is what does the actual work

    SELECT GapStart = (SELECT ISNULL(MAX(b.ID),0)+1

    FROM #yourtable b

    WHERE b.ID < a.ID),

    GapEnd = ID - 1

    FROM #yourtable a

    WHERE a.ID - 1 NOT IN (SELECT ID FROM #yourtable)

    AND a.ID - 1 > 0

    Do notice that there is a "Triangular Join" involved here... normally, that's a form of "Death by SQL". It's not here because of the constraints implied by the MAX function...

    ... and if your table isn't properly indexed (index on the ID column, preferably a Clustered one), plan on it taking many minutes to run instead of the normal 3 or 4 seconds... I stopped it after 5 minutes.

    I have to say it again... if you have a system that requires reuse of any type of sequence, you have a really bad design. The possible exception might be an ordered parts list where the sequence number is listed on an exploded assembly diagram or similar beast.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff. I didn't check it. I will check it and get back to you.

    karthik

  • Jeff Moden (1/2/2008)


    This is the fastest/shortest method for finding gaps I've ever seen and it does not use any Temp Tables or Table Variables. I wish I could find the person who originated this code just to say "thanks".

    Looks like something Itzik Ben-Gan wrote in one of his books or in SQL Server Magazine. I know he's covered the topic several times.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply