No alternative but row-by-row processing?

  • sestell1 - Monday, May 21, 2018 2:57 PM

    Lynn Pettis - Monday, May 21, 2018 12:37 PM

    sestell1 - Monday, May 21, 2018 12:35 PM

    Here's an example using just table joins:


    ;WITH TabData (Code, Date, Type)
    AS
    (
        SELECT
            Code, Date, type
        FROM
            @tabData
    )
    SELECT
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type,
        Max(Series_Start.Date) AS Key_Date
    FROM
        TabData Full_Data
    Join
        TabData Series_Start
    LEFT JOIN
        TabData Previous_Date
            ON
                Series_Start.Code = Previous_Date.Code
            AND Series_Start.Type = Previous_Date.Type
            AND Series_Start.Date = Previous_Date.Date + 1
            ON
                Full_Data.Code = Series_Start.Code
            AND Full_Data.Type = Series_Start.Type
            AND Full_Data.Date >= Series_Start.Date
            AND Previous_Date.Date IS Null
    GROUP BY
        Full_Data.Code,
        Full_Data.Date,
        Full_Data.type
    ORDER BY
        Full_Data.Code,
        Full_Data.Date

    Your code does not run.

    Msg 206, Level 16, State 2, Line 56

    Operand type clash: date is incompatible with int

    Oh sorry, I was lazy and changed the date column data type to DateTime so I could subtract 1 rather than use DateAdd.  :Whistling:
    I've updated the code above to use DateAdd.  

    That is why I have just gotten into the habit of using the DATEADD function, that and it makes it clear what is being done.

  • ChrisM@Work - Tuesday, May 22, 2018 4:12 AM

    IF OBJECT_ID('tempdb..#tabData') IS NOT NULL DROP TABLE #tabData;

    CREATE TABLE #tabData ([Code] varchar(7),[Date] date,[Type] varchar(20));

    INSERT INTO #tabData ([Code], [Date], [Type])

    VALUES

    ('0242589', '2018-04-20', 'HOLHRSAPR'),

    ('0242597', '2018-04-19', 'HOLHRSAPR'),

    ('0242628', '2018-04-02', 'HOLHRSAPR'),

    ('0242628', '2018-04-03', 'SICK20'),

    ('0242628', '2018-04-04', 'SICK20'),

    ('0242628', '2018-04-05', 'SICK20'),

    ('0242628', '2018-04-09', 'SICK20'),

    ('0242628', '2018-04-10', 'SICK20'),

    ('0242628', '2018-04-11', 'SICK20'),

    ('0242628', '2018-04-12', 'SICK20'),

    ('0242640', '2018-04-03', 'HOLHRSAPR'),

    ('0242640', '2018-04-09', 'HOLHRSAPR');

    CREATE UNIQUE CLUSTERED INDEX ucx_Stuff ON #tabData ([Code], [Type], [Date]);

    SELECT [Code], [Date], [Type],

    [First Day] = MIN([Date]) OVER(PARTITION BY [Code], [Type], grp)

    FROM (

    SELECT [Code], [Date], [Type],

    grp = DATEADD(DAY,1-ROW_NUMBER() OVER(PARTITION BY [Code], [Type] ORDER BY [Date]),[Date])

    FROM #tabData

    ) d;

    EDIT: Exactly the same method as Lynn 🙂

    Just a little more compact than mine.  I almost missed the row_number function, needed a second look.

  • Lynn and ChrisM, many thanks for your suggestions. I have incorporated them into my code and much happier now as I'm not a massive fan of row-by-row processing, although it has its place in the universe.

    Thanks again.

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply