finding break in sequence

  • I need to be able to identify breaks in a sequence so I can evaluate the data more correctly. In the sample I have given I need to be able to identify the break in sequence at 69397576, ideally I would set that as a D. My query also needs to recognize that the 3 sequences following 69397576 are sequential and would belong to that set. so the out come would look like this.

    id file_name page_follow

    693975631555557564_22222221114014810D

    693975641555557564_22222221114014810F

    693975651555557564_22222221114014810F

    693975661555557564_22222221114014810F

    693975671555557564_22222221114014810F

    693975681555557564_22222221114014810F

    693975691555557564_22222221114014810F

    693975761555557564_22222221114014810D

    693975771555557564_22222221114014810F

    693975781555557564_22222221114014810F

    693975791555557564_22222221114014810F

    here is some test data.

    create table test1 (id INT

    , [file_name] VARCHAR(100)

    , page_follow CHAR(1));

    go

    insert test1

    values ('69397563','1555557564_22222221114014810','D')

    ,('69397564','1555557564_22222221114014810','F')

    ,('69397565','1555557564_22222221114014810','F')

    ,('69397566','1555557564_22222221114014810','F')

    ,('69397567','1555557564_22222221114014810','F')

    ,('69397568','1555557564_22222221114014810','F')

    ,('69397569','1555557564_22222221114014810','F')

    ,('69397576','1555557564_22222221114014810','F')

    ,('69397577','1555557564_22222221114014810','F')

    ,('69397578','1555557564_22222221114014810','F')

    ,('69397579','1555557564_22222221114014810','F');

    Any advise would be greatly appreciated. I am looking for a set based solution preferably. Thanks!

  • Great job posting ddl and sample data. You stated the desired results but it is exactly the same as your sample data. What exactly are looking to do here?

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Maybe something like this?

    SELECT fileName

    , RecID

    , page_follow

    , rn

    , PrevID

    , RecID-PrevID AS GapSize

    FROM

    (SELECT [file_name] AS fileName

    , [id] AS RecID

    , page_follow

    , ROW_NUMBER() OVER (ORDER BY [id]) AS rn

    , MAX([id]) OVER (PARTITION BY [file_name]

    ORDER BY [id]

    ROWS BETWEEN 1 PRECEDING

    AND 1 PRECEDING) AS PrevID

    FROM test1) x

    WHERE x.RecID-x.PrevID>1;

    If you do this a lot, you should definitely buy Itzik Ben-Gan's book MS SQL Server 2012 High-Performance T-SQL Using Window Functions. Kathi Kellenberger was right -- worth the read!

  • The difference between the sample and actual was a D for the id 69397576, which marks a break in the sequence.

  • This one is not so bad since you are looking for a break in an integer sequence.

    Just outer join to the next record (id+1) and check for the null.

    This will also call the last record a gap:

    create table #test1 (id INT

    , [file_name] VARCHAR(100)

    , page_follow CHAR(1));

    go

    insert #test1

    values ('69397563','1555557564_22222221114014810','D')

    ,('69397564','1555557564_22222221114014810','F')

    ,('69397565','1555557564_22222221114014810','F')

    ,('69397566','1555557564_22222221114014810','F')

    ,('69397567','1555557564_22222221114014810','F')

    ,('69397568','1555557564_22222221114014810','F')

    ,('69397569','1555557564_22222221114014810','F')

    ,('69397576','1555557564_22222221114014810','F')

    ,('69397577','1555557564_22222221114014810','F')

    ,('69397578','1555557564_22222221114014810','F')

    ,('69397579','1555557564_22222221114014810','F');

    UPDATE t1

    SET page_follow = 'D'

    FROM #test1 t1

    LEFT JOIN #test1 t2

    ON t1.id+1 = t2.id

    WHERE t2.id IS NULL

    SELECT * FROM #test1

    drop table #test1

    replace the update with this to have the last row NOT be a gap:

    UPDATE t1

    SET page_follow = 'D'

    FROM #test1 t1

    LEFT JOIN #test1 t2

    ON t1.id+1 = t2.id

    CROSS APPLY (SELECT MAX(id) id FROM #test1) t3

    WHERE t2.id IS NULL

    AND t1.id<>t3.id

  • Here are a couple of options that you could try.

    SELECT t.id,

    t.[FILE_NAME],

    CASE WHEN NOT EXISTS( SELECT 1 FROM test1 x WHERE t.id = x.id + 1)

    THEN 'D' ELSE 'F' END page_follow

    FROM test1 t

    --Example of query

    --SELECT *

    --FROM test1 t

    --WHERE NOT EXISTS(

    --SELECT 1

    --FROM test1 x

    --WHERE t.id = x.id + 1)

    UPDATE t SET

    page_follow = 'D'

    FROM test1 t

    WHERE NOT EXISTS(

    SELECT 1

    FROM test1 x

    WHERE t.id = x.id + 1)

    SELECT *

    FROM test1

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • That is perfect! Thank you so much for your help.

  • One more question, What if I wanted throw out, or mark the original set 69397563-69397569, which is the D plus all F until the sequence break? Thoughts?

  • That is usually solved like this:

    WITH CTE AS(

    SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) grouper

    FROM #test1

    )

    SELECT MIN(id) min_id, MAX(id) max_id

    FROM CTE

    GROUP BY grouper

    ORDER BY min_id

    I just noticed that you're on SQL 2012 and you might want to check the following article as well. https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I recommend a web search for: Itzik Ben-Gan Gaps Islands. There are some amazingly-efficient solutions for this now.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply