Assign Consecutive Numbers to a block of data

  • I want to assign consecutive numbers to a block of data where block of data is based on days consecutive to each other i.e., one day apart.

    Date format is: YYYY-MM-DD

    Data:

    TestId TestDate

    ----------- -----------------------

    1 2011-07-21 00:00:00.000

    1 2011-07-22 00:00:00.000

    1 2011-07-27 00:00:00.000

    1 2011-07-29 00:00:00.000

    1 2011-07-30 00:00:00.000

    1 2011-07-31 00:00:00.000

    1 2011-08-01 00:00:00.000

    1 2011-08-10 00:00:00.000

    1 2011-08-12 00:00:00.000

    1 2011-08-13 00:00:00.000

    2 2013-01-02 00:00:00.000

    2 2013-01-03 00:00:00.000

    2 2013-01-04 00:00:00.000

    2 2013-08-03 00:00:00.000

    2 2013-08-05 00:00:00.000

    2 2013-09-02 00:00:00.000

    My Attempt: [/b]

    WITH cte AS

    (

    SELECTTestId,

    TestDate,

    ROW_NUMBER() OVER

    (

    PARTITION BYTestId

    ORDER BYTestId, TestDate

    )AS OrderId

    FROMdbo.tblDatesSequenceTest

    )

    SELECT*

    FROMcte

    Create Table with Data to Test: [/b]

    CREATE TABLE dbo.tblDatesSequenceTest ( TestId INT NOT NULL, TestDate DATETIME NOT NULL )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-07-21 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-07-22 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-07-27 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-07-29 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-07-30 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-07-31 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-08-01 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-08-10 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-08-12 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 1, '2011-08-13 00:00:00.000' )

    -- Test 2

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 2, '2013-01-02 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 2, '2013-01-03 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 2, '2013-01-04 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 2, '2013-08-03 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 2, '2013-08-05 00:00:00.000' )

    INSERT INTO dbo.tblDatesSequenceTest VALUES ( 2, '2013-09-02 00:00:00.000' )

    Expected Output:

    TestId TestDate OrderId

    ----------- ----------------------- --------------------

    1 2011-07-21 00:00:00.000 1

    1 2011-07-22 00:00:00.000 1

    1 2011-07-27 00:00:00.000 2

    1 2011-07-29 00:00:00.000 3

    1 2011-07-30 00:00:00.000 3

    1 2011-07-31 00:00:00.000 3

    1 2011-08-01 00:00:00.000 3

    1 2011-08-10 00:00:00.000 4

    1 2011-08-12 00:00:00.000 5

    1 2011-08-13 00:00:00.000 5

    2 2013-01-02 00:00:00.000 6

    2 2013-01-03 00:00:00.000 6

    2 2013-01-04 00:00:00.000 6

    2 2013-08-03 00:00:00.000 7

    2 2013-08-05 00:00:00.000 8

    2 2013-09-02 00:00:00.000 9

    The OrderId is the column I am trying to obtain using my following cte code, but I can't work around it.

    Any help in this will be much appreciated.

  • I managed to calculate difference between second and previous row:

    SELECTTestId,

    TestDate,

    DIFF

    FROM(

    SELECTa.*,

    b.TestDate AS PreviousRecordDate,

    CASE WHEN DATEDIFF(DAY, b.TestDate, a.TestDate) <=1 THEN 0 ELSE 1 END AS DIFF

    FROMdbo.tblDatesSequenceTest AS a

    LEFT OUTER JOINdbo.tblDatesSequenceTest AS b

    ONa.TestId=b.TestIdAND

    a.TestDate=b.TestDate + 1

    ORDER BYa.TestId, a.TestDate

    )AS a

    ORDER BYTestDate

    Output:

    TestId TestDate PreviousRecordDate DIFF

    ----------- ----------------------- ----------------------- -----------

    1 2011-07-21 00:00:00.000 NULL 1

    1 2011-07-22 00:00:00.000 2011-07-21 00:00:00.000 0

    1 2011-07-27 00:00:00.000 NULL 1

    1 2011-07-29 00:00:00.000 NULL 1

    1 2011-07-30 00:00:00.000 2011-07-29 00:00:00.000 0

    1 2011-07-31 00:00:00.000 2011-07-30 00:00:00.000 0

    1 2011-08-01 00:00:00.000 2011-07-31 00:00:00.000 0

    1 2011-08-10 00:00:00.000 NULL 1

    1 2011-08-12 00:00:00.000 NULL 1

    1 2011-08-13 00:00:00.000 2011-08-12 00:00:00.000 0

    2 2013-01-02 00:00:00.000 NULL 1

    2 2013-01-03 00:00:00.000 2013-01-02 00:00:00.000 0

    2 2013-01-04 00:00:00.000 2013-01-03 00:00:00.000 0

    2 2013-08-03 00:00:00.000 NULL 1

    2 2013-08-05 00:00:00.000 NULL 1

    2 2013-09-02 00:00:00.000 NULL 1

    Now thinking about the next step: :ermm:

  • Google: "Gaps and Islands Itzik Ben-gan"

    That should get you started.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You can probably figure it out from this:

    SELECT *, [GroupBy] = TestDate - rn

    FROM (

    SELECT *, [rn] = ROW_NUMBER() OVER(PARTITION BY TestId ORDER BY TestDate)

    FROM tblDatesSequenceTest

    ) d

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Output:

    TestId TestDate rn GroupBy

    ----------- ----------------------- -------------------- -----------------------

    1 2011-07-21 00:00:00.000 1 2011-07-20 00:00:00.000

    1 2011-07-22 00:00:00.000 2 2011-07-20 00:00:00.000

    1 2011-07-27 00:00:00.000 3 2011-07-24 00:00:00.000

    1 2011-07-29 00:00:00.000 4 2011-07-25 00:00:00.000

    1 2011-07-30 00:00:00.000 5 2011-07-25 00:00:00.000

    1 2011-07-31 00:00:00.000 6 2011-07-25 00:00:00.000

    1 2011-08-01 00:00:00.000 7 2011-07-25 00:00:00.000

    1 2011-08-10 00:00:00.000 8 2011-08-02 00:00:00.000

    1 2011-08-12 00:00:00.000 9 2011-08-03 00:00:00.000

    1 2011-08-13 00:00:00.000 10 2011-08-03 00:00:00.000

    2 2013-01-02 00:00:00.000 1 2013-01-01 00:00:00.000

    2 2013-01-03 00:00:00.000 2 2013-01-01 00:00:00.000

    2 2013-01-04 00:00:00.000 3 2013-01-01 00:00:00.000

    2 2013-08-03 00:00:00.000 4 2013-07-30 00:00:00.000

    2 2013-08-05 00:00:00.000 5 2013-07-31 00:00:00.000

    2 2013-09-02 00:00:00.000 6 2013-08-27 00:00:00.000

    Still confused, sorry :doze:

    ChrisM@Work (3/18/2015)


    You can probably figure it out from this:

    SELECT *, [GroupBy] = TestDate - rn

    FROM (

    SELECT *, [rn] = ROW_NUMBER() OVER(PARTITION BY TestId ORDER BY TestDate)

    FROM tblDatesSequenceTest

    ) d

  • How about sequencing the new column [GroupBy] using ROW_NUMBER()?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • superb!!!Thanks a lot sir 🙂 🙂 🙂

    Final code:

    WITH cte AS

    (

    SELECTTestId,

    TestDate,

    --[GroupBy] = TestDate - rn,

    RANK() OVER (

    PARTITION BYTestId

    ORDER BYTestDate - rn

    )AS OrderId

    FROM(

    SELECTTestId,

    TestDate,

    ROW_NUMBER() OVER

    (

    PARTITION BY TestId

    ORDER BY TestDate

    )AS rn

    FROMtblDatesSequenceTest

    ) d

    )

    SELECT*

    FROMcte

    Output:

    TestId TestDate OrderId

    ----------- ----------------------- --------------------

    1 2011-07-21 00:00:00.000 1

    1 2011-07-22 00:00:00.000 1

    1 2011-07-27 00:00:00.000 3

    1 2011-07-29 00:00:00.000 4

    1 2011-07-30 00:00:00.000 4

    1 2011-07-31 00:00:00.000 4

    1 2011-08-01 00:00:00.000 4

    1 2011-08-10 00:00:00.000 8

    1 2011-08-12 00:00:00.000 9

    1 2011-08-13 00:00:00.000 9

    2 2013-01-02 00:00:00.000 1

    2 2013-01-03 00:00:00.000 1

    2 2013-01-04 00:00:00.000 1

    2 2013-08-03 00:00:00.000 4

    2 2013-08-05 00:00:00.000 5

    2 2013-09-02 00:00:00.000 6

    ChrisM@Work (3/18/2015)


    How about sequencing the new column [GroupBy] using ROW_NUMBER()?

  • DaPainKiller (3/18/2015)


    superb!!!Thanks a lot sir 🙂 🙂 🙂

    Final code:

    WITH cte AS

    (

    SELECTTestId,

    TestDate,

    --[GroupBy] = TestDate - rn,

    RANK() OVER (

    PARTITION BYTestId

    ORDER BYTestDate - rn

    )AS OrderId

    FROM(

    SELECTTestId,

    TestDate,

    ROW_NUMBER() OVER

    (

    PARTITION BY TestId

    ORDER BY TestDate

    )AS rn

    FROMtblDatesSequenceTest

    ) d

    )

    SELECT*

    FROMcte

    Output:

    TestId TestDate OrderId

    ----------- ----------------------- --------------------

    1 2011-07-21 00:00:00.000 1

    1 2011-07-22 00:00:00.000 1

    1 2011-07-27 00:00:00.000 3

    1 2011-07-29 00:00:00.000 4

    1 2011-07-30 00:00:00.000 4

    1 2011-07-31 00:00:00.000 4

    1 2011-08-01 00:00:00.000 4

    1 2011-08-10 00:00:00.000 8

    1 2011-08-12 00:00:00.000 9

    1 2011-08-13 00:00:00.000 9

    2 2013-01-02 00:00:00.000 1

    2 2013-01-03 00:00:00.000 1

    2 2013-01-04 00:00:00.000 1

    2 2013-08-03 00:00:00.000 4

    2 2013-08-05 00:00:00.000 5

    2 2013-09-02 00:00:00.000 6

    ChrisM@Work (3/18/2015)


    How about sequencing the new column [GroupBy] using ROW_NUMBER()?

    Teach a man to fish...thanks! You made my day 🙂

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply