No alternative but row-by-row processing?

  • I know how to do this with a loop or cursor, but wondered if there was a brilliant mind out there that could suggest a set-based operation?

    This is some example data:

    I'm working with the first three columns in a CTE and I'm trying to get the result in the green column to return as part of the set.

    Basically, where the code and type are the same on consecutive days, the first day is the first date of those days.

    So, code 0242628 has SICK20 running consecutively from 3rd May to 5th May and so the first day is 3rd May. There is then a break (6th, 7th and 8th May), before another set runs from 9th May to 12th May, so first date is the 9th.

    I cannot figure out a set-based way to do this.

  • mmcardle - Monday, May 21, 2018 9:20 AM

    I know how to do this with a loop or cursor, but wondered if there was a brilliant mind out there that could suggest a set-based operation?

    This is some example data:

    I'm working with the first three columns in a CTE and I'm trying to get the result in the green column to return as part of the set.

    Basically, where the code and type are the same on consecutive days, the first day is the first date of those days.

    So, code 0242628 has SICK20 running consecutively from 3rd May to 5th May and so the first day is 3rd May. There is then a break (6th, 7th and 8th May), before another set runs from 9th May to 12th May, so first date is the 9th.

    I cannot figure out a set-based way to do this.

    You've been here long enough to know that you should present your sample data in a consumable format.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • looks like a gaps and islands problem to me.
    You might find it easier with a Date Tally Table too.

    You could RANK() the data and then take the RANK value away from the date.  All records with the same calculated date are consecutive days.  (NOTE this won't work if you are ignoring weekends/bank holidays) 

    you could LEFT join the data to itself using a DATEADD() offset to link today and yesterday, then RANK the joined dates to find the 1st rank dates.

  • Phil Parkin - Monday, May 21, 2018 9:40 AM

    You've been here long enough to know that you should present your sample data in a consumable format.

    Sorry Phil, I was rushing and the first attempt at copying and pasting from Excel went awry.

    OK, the data can be had from this:

    DECLARE @tabData TABLE
        (
            [Code] varchar(7),
            [Date] date,
            [Type] varchar(20)
        );

    INSERT INTO @tabData([Code], [Date], [Type])
    SELECT '0242589', '2018-04-20', 'HOLHRSAPR'
    UNION
    SELECT '0242597', '2018-04-19', 'HOLHRSAPR'
    UNION
    SELECT '0242628', '2018-04-02', 'HOLHRSAPR'
    UNION
    SELECT '0242628', '2018-04-03', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-04', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-05', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-09', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-10', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-11', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-12', 'SICK20'
    UNION
    SELECT '0242640', '2018-04-03', 'HOLHRSAPR'
    UNION
    SELECT '0242640', '2018-04-09', 'HOLHRSAPR';

    SELECT    [Code], [Date], [Type], '?' AS [First Day]
    FROM    @tabData;

  • Hope this helps:

    DECLARE @tabData TABLE
        (
            [Code] varchar(7),
            [Date] date,
            [Type] varchar(20)
        );

    INSERT INTO @tabData([Code], [Date], [Type])
    SELECT '0242589', '2018-04-20', 'HOLHRSAPR'
    UNION
    SELECT '0242597', '2018-04-19', 'HOLHRSAPR'
    UNION
    SELECT '0242628', '2018-04-02', 'HOLHRSAPR'
    UNION
    SELECT '0242628', '2018-04-03', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-04', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-05', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-09', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-10', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-11', 'SICK20'
    UNION
    SELECT '0242628', '2018-04-12', 'SICK20'
    UNION
    SELECT '0242640', '2018-04-03', 'HOLHRSAPR'
    UNION
    SELECT '0242640', '2018-04-09', 'HOLHRSAPR';

    WITH base AS (
    SELECT
     

    .[Code]
      ,

    .[Date]
      ,

    .[Type]
      , '?' AS [First Day]
      , rn = ROW_NUMBER() OVER (PARTITION BY

    .[Code],

    .[Type] ORDER BY

    .[Date])
    --  , [td2].*
    FROM
      @tabData td
    --  LEFT OUTER JOIN @tabData AS [td2]
    --    ON [td2].[Code] =

    .[Code] AND [td2].[Type] =

    .[Type] AND DATEADD(DAY, 1,[td2].[Date]) =

    .[Date]
    )
    SELECT
      .[Code]
      , .[Date]
      , .[Type]
      , [First Day] = MIN(.[Date]) OVER (PARTITION BY .[Code], .[Type], DATEADD(DAY,-.[rn],.[Date]))
      , .[rn]
      , [grpDate] = DATEADD(DAY,-.[rn],.[Date])
    FROM
      [base] ;
    GO

  • If you look up  dim date table, it can have this field in it or you can create it. This table can be created and used in a number of projects. it can be created in a few minutes and will hold all the dates you will ever need. And will emirate the need for  most formulas, You will just query the table for where the date  matches the date in this table.

  • itmasterw 60042 - Monday, May 21, 2018 11:20 AM

    If you look up  dim date table, it can have this field in it or you can create it. This table can be created and used in a number of projects. it can be created in a few minutes and will hold all the dates you will ever need. And will emirate the need for  most formulas, You will just query the table for where the date  matches the date in this table.

    Care to demonstrate how this would work in this situation?

  • Lynn Pettis - Monday, May 21, 2018 11:42 AM

    itmasterw 60042 - Monday, May 21, 2018 11:20 AM

    If you look up  dim date table, it can have this field in it or you can create it. This table can be created and used in a number of projects. it can be created in a few minutes and will hold all the dates you will ever need. And will emirate the need for  most formulas, You will just query the table for where the date  matches the date in this table.

    Care to demonstrate how this would work in this situation?

    +1, exactly what I was thinking.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Here's an example using just table joins:


    ;WITH TabData (Code, Date, Type)
    AS
    (
        SELECT
            Code, Date, type
        FROM
            @tabData
    )
    SELECT
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type,
        Max(Series_Start.Date) AS Key_Date
    FROM
        TabData Full_Data
    Join
        TabData Series_Start
    LEFT JOIN
        TabData Previous_Date
            ON
                Series_Start.Code = Previous_Date.Code
            AND Series_Start.Type = Previous_Date.Type
            AND Series_Start.Date = DateAdd(Day, 1, Previous_Date.Date)
            ON
                Full_Data.Code = Series_Start.Code
            AND Full_Data.Type = Series_Start.Type
            AND Full_Data.Date >= Series_Start.Date
            AND Previous_Date.Date IS Null
    GROUP BY
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type
    ORDER BY
        Full_Data.Code,
        Full_Data.Date

  • sestell1 - Monday, May 21, 2018 12:35 PM

    Here's an example using just table joins:


    ;WITH TabData (Code, Date, Type)
    AS
    (
        SELECT
            Code, Date, type
        FROM
            @tabData
    )
    SELECT
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type,
        Max(Series_Start.Date) AS Key_Date
    FROM
        TabData Full_Data
    Join
        TabData Series_Start
    LEFT JOIN
        TabData Previous_Date
            ON
                Series_Start.Code = Previous_Date.Code
            AND Series_Start.Type = Previous_Date.Type
            AND Series_Start.Date = Previous_Date.Date + 1
            ON
                Full_Data.Code = Series_Start.Code
            AND Full_Data.Type = Series_Start.Type
            AND Full_Data.Date >= Series_Start.Date
            AND Previous_Date.Date IS Null
    GROUP BY
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type
    ORDER BY
        Full_Data.Code,
        Full_Data.Date

    Your code does not run.

    Msg 206, Level 16, State 2, Line 56

    Operand type clash: date is incompatible with int

  • Well with out putting this together it is hard to fully demonstrate. But unless I am misunderstanding the request, you should be able to do an inner join between his table and the dime data table and return the FirstDayOfMonth. So if his table DateTb, for example then 
    he would just do something like this:

    SELECT FirstDayOfMonth
                                        From DimDate  dt INNER JOIN DateTB db
                                         ON
      dt.date = db.Date

  • itmasterw 60042 - Monday, May 21, 2018 1:37 PM

    Well with out putting this together it is hard to fully demonstrate. But unless I am misunderstanding the request, you should be able to do an inner join between his table and the dime data table and return the FirstDayOfMonth. So if his table DateTb, for example then 
    he would just do something like this:

    SELECT FirstDayOfMonth
                                        From DimDate  dt INNER JOIN DateTB db
                                         ON
      dt.date = db.Date

    But that isn't what the OP is looking for in this case.  Go back at look at the original post.

  • Lynn Pettis - Monday, May 21, 2018 12:37 PM

    sestell1 - Monday, May 21, 2018 12:35 PM

    Here's an example using just table joins:


    ;WITH TabData (Code, Date, Type)
    AS
    (
        SELECT
            Code, Date, type
        FROM
            @tabData
    )
    SELECT
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type,
        Max(Series_Start.Date) AS Key_Date
    FROM
        TabData Full_Data
    Join
        TabData Series_Start
    LEFT JOIN
        TabData Previous_Date
            ON
                Series_Start.Code = Previous_Date.Code
            AND Series_Start.Type = Previous_Date.Type
            AND Series_Start.Date = Previous_Date.Date + 1
            ON
                Full_Data.Code = Series_Start.Code
            AND Full_Data.Type = Series_Start.Type
            AND Full_Data.Date >= Series_Start.Date
            AND Previous_Date.Date IS Null
    GROUP BY
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type
    ORDER BY
        Full_Data.Code,
        Full_Data.Date

    Your code does not run.

    Msg 206, Level 16, State 2, Line 56

    Operand type clash: date is incompatible with int

    Oh sorry, I was lazy and changed the date column data type to DateTime so I could subtract 1 rather than use DateAdd.  :Whistling:
    I've updated the code above to use DateAdd.  

  • The following 2 links are solution to the Packing Intervals Problem by Itzik Ben-Gan.
    2011/04/13 ... http://blogs.solidq.com/en/sqlserver/packing-intervals/
    2015/08/11 ... http://sqlmag.com/sql-server/new-solution-packing-intervals-problem

  • IF OBJECT_ID('tempdb..#tabData') IS NOT NULL DROP TABLE #tabData;

    CREATE TABLE #tabData ([Code] varchar(7),[Date] date,[Type] varchar(20));

    INSERT INTO #tabData ([Code], [Date], [Type])

    VALUES

    ('0242589', '2018-04-20', 'HOLHRSAPR'),

    ('0242597', '2018-04-19', 'HOLHRSAPR'),

    ('0242628', '2018-04-02', 'HOLHRSAPR'),

    ('0242628', '2018-04-03', 'SICK20'),

    ('0242628', '2018-04-04', 'SICK20'),

    ('0242628', '2018-04-05', 'SICK20'),

    ('0242628', '2018-04-09', 'SICK20'),

    ('0242628', '2018-04-10', 'SICK20'),

    ('0242628', '2018-04-11', 'SICK20'),

    ('0242628', '2018-04-12', 'SICK20'),

    ('0242640', '2018-04-03', 'HOLHRSAPR'),

    ('0242640', '2018-04-09', 'HOLHRSAPR');

    CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #tabData ([Code], [Type], [Date]);

    SELECT [Code], [Date], [Type],

    [First Day] = MIN([Date]) OVER(PARTITION BY [Code], [Type], grp)

    FROM (

    SELECT [Code], [Date], [Type],

    grp = DATEADD(DAY,1-ROW_NUMBER() OVER(PARTITION BY [Code], [Type] ORDER BY [Date]),[Date])

    FROM #tabData

    ) d;

    EDIT: Exactly the same method as Lynn ๐Ÿ™‚

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply