Can this be done with analytics?

  • hi, i have a minor dilemma, you've never let me down before. thanks in advance.

    i have this table:

    CREATE TABLE [dbo].[testdata](

    [seq] [int] NULL,

    [section] [varchar](1) NULL,

    [cat] [varchar](1) NULL,

    [seconds] [int] NULL

    ) ON [PRIMARY]

    with this data in it:

    insert into testdata (seq, section, cat, seconds)

    values (1, 'A', '0', 7)

    insert into testdata (seq, section, cat, seconds)

    values (2, 'A', '0', 6)

    insert into testdata (seq, section, cat, seconds)

    values (3, 'A', '1', 8)

    insert into testdata (seq, section, cat, seconds)

    values (4, 'A', '1', 7)

    insert into testdata (seq, section, cat, seconds)

    values (5, 'A', '1', 6)

    insert into testdata (seq, section, cat, seconds)

    values (6, 'A', '1', 6)

    insert into testdata (seq, section, cat, seconds)

    values (7, 'A', '1', 5)

    insert into testdata (seq, section, cat, seconds)

    values (8, 'A', '1', 8)

    insert into testdata (seq, section, cat, seconds)

    values (9, 'A', '0', 8)

    insert into testdata (seq, section, cat, seconds)

    values (10, 'A', '0', 7)

    insert into testdata (seq, section, cat, seconds)

    values (11, 'A', '0', 7)

    insert into testdata (seq, section, cat, seconds)

    values (12, 'A', '1', 5)

    insert into testdata (seq, section, cat, seconds)

    values (13, 'A', '1', 5)

    insert into testdata (seq, section, cat, seconds)

    values (14, 'A', '1', 6)

    insert into testdata (seq, section, cat, seconds)

    values (15, 'A', '0', 8)

    insert into testdata (seq, section, cat, seconds)

    values (16, 'A', '0', 7)

    insert into testdata (seq, section, cat, seconds)

    values (17, 'A', '0', 6)

    insert into testdata (seq, section, cat, seconds)

    values (18, 'A', '0', 6)

    insert into testdata (seq, section, cat, seconds)

    values (19, 'A', '0', 5)

    insert into testdata (seq, section, cat, seconds)

    values (20, 'A', '0', 8)

    insert into testdata (seq, section, cat, seconds)

    values (1, 'B', '0', 5)

    insert into testdata (seq, section, cat, seconds)

    values (2, 'B', '0', 5)

    insert into testdata (seq, section, cat, seconds)

    values (3, 'B', '0', 8)

    insert into testdata (seq, section, cat, seconds)

    values (4, 'B', '0', 5)

    insert into testdata (seq, section, cat, seconds)

    values (5, 'B', '0', 6)

    insert into testdata (seq, section, cat, seconds)

    values (6, 'B', '1', 7)

    insert into testdata (seq, section, cat, seconds)

    values (7, 'B', '1', 7)

    insert into testdata (seq, section, cat, seconds)

    values (8, 'B', '1', 6)

    insert into testdata (seq, section, cat, seconds)

    values (9, 'B', '1', 5)

    insert into testdata (seq, section, cat, seconds)

    values (10, 'B', '1', 7)

    insert into testdata (seq, section, cat, seconds)

    values (11, 'B', '1', 8)

    insert into testdata (seq, section, cat, seconds)

    values (12, 'B', '1', 8)

    insert into testdata (seq, section, cat, seconds)

    values (13, 'B', '1', 5)

    insert into testdata (seq, section, cat, seconds)

    values (14, 'B', '0', 5)

    insert into testdata (seq, section, cat, seconds)

    values (15, 'B', '0', 5)

    insert into testdata (seq, section, cat, seconds)

    values (16, 'B', '0', 6)

    insert into testdata (seq, section, cat, seconds)

    values (17, 'B', '0', 7)

    insert into testdata (seq, section, cat, seconds)

    values (18, 'B', '0', 8)

    insert into testdata (seq, section, cat, seconds)

    values (19, 'B', '0', 7)

    insert into testdata (seq, section, cat, seconds)

    values (20, 'B', '0', 8)

    i would like to output this table:

    CREATE TABLE [dbo].[testoutput](

    [section] [varchar](1) NULL,

    [seconds] [int] NULL

    )

    with this data:

    insert into testoutput (section, seconds)

    values ('A', 40)

    insert into testoutput (section, seconds)

    values ('B', 53)

    the business rule is that in each section (A, B, etc), i would like a sum of ONLY the longest segment in seconds. only consecutive groupings can be considered.

    i've been trying to make this happen using analytics, but I can't get the logic down.

    any help is appreciated!

  • perhaps:

    ;

    WITH IslandMin AS (

    SELECT row_number() OVER (ORDER BY section,seq) AS rn,seq,section

    FROM testdata t1

    WHERE NOT EXISTS (SELECT 1 FROM testdata t2 WHERE t1.seq-1 = t2.seq AND t1.section = t2.section AND t1.cat = t2.cat)

    ), IslandMax AS (

    SELECT row_number() OVER (ORDER BY section,seq) AS rn,seq,section

    FROM testdata t1

    WHERE NOT EXISTS (SELECT 1 FROM testdata t2 WHERE t1.seq+1 = t2.seq AND t1.section = t2.section AND t1.cat = t2.cat)

    ), MakeGroups AS (

    SELECT i1.rn,t1.section,SUM(seconds) AS seconds

    FROM testdata t1

    JOIN IslandMin i1

    ON i1.seq <= t1.seq

    AND i1.section = t1.section

    JOIN IslandMax i2

    ON i2.seq >= t1.seq

    AND i1.rn = i2.rn

    AND i2.section = t1.section

    GROUP BY i1.rn,t1.section

    )

    SELECT section, MAX(seconds)

    FROM MakeGroups

    GROUP by section

    I did not use analytic functions here. I'll give them a shot below.

  • Here it is using lag and lead to find the islands:

    WITH getlaglead AS (

    SELECT

    seq,

    section,

    cat,

    seconds,

    LAG(cat,1,'N/A') OVER (PARTITION BY SECTION ORDER BY seq) AS prevcat,

    LEAD(cat,1,'N/A') OVER (PARTITION BY SECTION ORDER BY seq) AS nextcat

    FROM #testdata

    ), IslandMin AS (

    SELECT row_number() OVER (ORDER BY section,seq) AS rn,seq,section

    FROM getlaglead t1

    WHERE cat <> prevcat

    ), IslandMax AS (

    SELECT row_number() OVER (ORDER BY section,seq) AS rn,seq,section

    FROM getlaglead t1

    WHERE cat <> nextcat

    ),MakeGroups AS (

    SELECT i1.rn,t1.section,SUM(seconds) AS seconds

    FROM #testdata t1

    JOIN IslandMin i1

    ON i1.seq <= t1.seq

    AND i1.section = t1.section

    JOIN IslandMax i2

    ON i2.seq >= t1.seq

    AND i1.rn = i2.rn

    AND i2.section = t1.section

    GROUP BY i1.rn,t1.section

    )

    SELECT section, MAX(seconds)

    FROM MakeGroups

    GROUP by section

    No time right now to do any performance test to see whether using the analytic functions actually make things better.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply