HELP,TSQL, - How do I write the SELECT stmt to condense multiple time spans

  • mw_sql_developer - Wednesday, August 1, 2018 1:00 PM

    bmg002 - Wednesday, August 1, 2018 12:51 PM

    mw_sql_developer - Wednesday, August 1, 2018 11:57 AM

    Done! Finally!
    I tested against single spans as well... Works beautifully


    If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;

    CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );

    INSERT INTO #DLT( emp, St, Et, RSN )
    VALUES
    ('0010970116','20160701',    '20170131',    'Member Eligibility Change'),
    ('0010970115','20160701',    '20170131',    'Member Eligibility Change'),
    ('0010970115','20170201',    '20171130',    'Member Eligibility Change'),
    ('0010970115','20171201',    '20171231',    'Member Eligibility Change'),
    ('0010970115','20180105',    '20180531',    'Member Eligibility Change'),
    ('0010970115','20180601',    '29991231',    'Member Eligibility Change')

    ;
    With Step1 as
    (
    Select emp, St, Et
    ,LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St) AS NextSt
    ,ROW_NUMBER() OVER (PARTITION BY emp ORDER BY St) AS THE_ORDER
    ,DATEDIFF(D, Et , (LEAD(St) OVER(PARTITION BY Emp ORDER BY emp, St)) ) as DIFF -- Gap in days
    FROM
    #DLT
    )
    ,
    Step2 as
    (
    Select *
    ,CASE WHEN DATEADD(D, 1, CAST( ET as DATETIME)) = CAST( NEXTST as DATETIME) THEN 'OK' ELSE NULL END as DEC
    ,LAG(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PreDiff
    ,LEAD(DIFF) OVER(PARTITION BY Emp ORDER BY emp, St) AS PostDiff
    FROM
    Step1
    )
    ,
    Step3 as
    (
            Select
            *
            ,
            CASE WHEN (PreDiff is NULL ) OR (PreDiff <> 1 ) THEN
                'START_OF_SPAN'
            ELSE
                CASE WHEN DEC IS NULL THEN 'END_OF_SPAN' ELSE
                    'YOU ARE INSIDE A SPAN'
                END
            END as FINAL_DECISION

            FROM Step2
    )
    ,
    Step4 as
    (
            Select
            *
            FROM Step3 WHERE FINAL_DECISION = 'START_OF_SPAN'
    )
    ,
    Step5 as
    (
            Select
            *
            FROM Step3 WHERE FINAL_DECISION = 'END_OF_SPAN'
    ),
    Step6 as
    (
            Select A.EMP, A.St,
            CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
            FROM Step4 A
            OUTER APPLY
            (
                Select top 1 * FROM Step5
                WHERE
                emp = A.emp
                AND
                THE_ORDER >= A.THE_ORDER
                ORDER BY THE_ORDER
            )D

    )
    Select * FROM Step6

    Discovered a bug in my version of this - a single row customer doesn't work and my output was in the form "<start date> - <end date>" which is what I thought you wanted based on your original post.
    Looking through your code, there is a little bit of tidying up you could do, no?  Like are you using "PostDiff"?  and you could likely remove steps 4, 5 and 6 if you change your final select to:

      Select A.EMP, A.St,
       CASE WHEN D.Et IS NULL THEN A.ET ELSE D.Et END as ENDDT
       FROM Step3 A
            
       OUTER APPLY
       (
        Select top 1 * FROM Step3 B
        WHERE
        emp = A.emp
        AND
        THE_ORDER >= A.THE_ORDER
                AND .[Final_decision] = 'END_OF_SPAN'
        ORDER BY THE_ORDER
       )D
            WHERE [A].[FINAL_DECISION] = 'START_OF_SPAN'

    But now it is starting to feel like semantics without adding much to this conversation.
    To make my code work with your new data set and give the output you have, CTE D needs to be changed to:

    [D]
    AS ( SELECT
       [C].[emp]
       , [C].[St]
       , [C].[Et]
       , [C].[DIFF]
       , [C].[minStart]
       , CASE
                WHEN ([C].[St] = [C].[minStart] AND DIFF IS NULL) THEN
                    [C].[Et]
        WHEN [C].[St] = [C].[minStart] THEN
         [C].[St]
        WHEN
         (
          [C].[DIFF] = 1
          AND LAG([C].[DIFF]) OVER ( PARTITION BY
                   [C].[emp]
                   ORDER BY
                   [C].[St]
                  ) > 1
         ) THEN
         [C].[St]
        WHEN [C].[DIFF] = 1 THEN
         NULL
        WHEN [C].[DIFF] IS NULL THEN
         [C].[Et]
        ELSE
         [C].[Et]
        END AS [daterange]
      FROM
       [C] )

    and the final select needs to be changed to:

    SELECT
    [E].[emp]
    , CASE
        WHEN nextDate IS NULL
        THEN [e].[St]
        ELSE
         [E].[daterange]
         END AS st
    , CASE WHEN nextDate IS NULL
    THEN [E].daterange
    ELSE
    [E].[nextdate]
    END AS enddt
    FROM
    [E]
    WHERE
    [E].[rn] % 2 != 0;

    Not trying to pick apart your code; just seeing ways to make it a bit shorter.  It just felt to me like the problem should be solvable with fewer CTEs.  Even my version with 3 (poorly named) CTE's feels a bit large for this problem...  Plus i have some unused columns in my CTE's.  In CTE D, I only need to pull in St, and emp plus the calculated stuff.  I don't need to pull in the other columns from C; they are just used for calculations.  So my version could be optimized a bit as well.

    I purposely made it to have many CTEs. I could have done it in a few .. Just for clarity  I like to have many CTEs. BTW - It does handle single spans

    Yep - yours handles single spans, mine did not without the adjustments in my last post.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Here's my take on it:
    If object_id('tempdb..#DLT') IS NOT NULL DROP TABLE #DLT;
    CREATE TABLE #DLT( emp VARCHAR(13), St VARCHAR(8), Et VARCHAR(8), RSN VARCHAR(100) );
    INSERT INTO #DLT( emp, St, Et, RSN )
    VALUES
    ('0010970116','20160701',    '20170131',    'Member Eligibility Change'),
    ('0010970115','20160701',    '20170131',    'Member Eligibility Change'),
    ('0010970115','20170201',    '20171130',    'Member Eligibility Change'),
    ('0010970115','20171201',    '20171231',    'Member Eligibility Change'),
    ('0010970115','20180105',    '20180531',    'Member Eligibility Change'),
    ('0010970115','20180601',    '29991231',    'Member Eligibility Change'),
    ('0010970115','20180701',    '29991231',    'Member Eligibility Change');
    With ChangePoints As
    (
    Select
      emp,
      ChangeDate,
      Depth
    From
    (
      Select
      emp,
       ChangeDate,
      Sum(Stack) Over (Partition By emp Order By ChangeDate) As Depth,
      rsn
      From
      (
      Select emp, Cast(st as date) st, cast(et as date) et, rsn from #DLT
      )FIXDATES
      Cross Apply (Values(st, 1),(et, -1)) As CHNG(ChangeDate, Stack)
    ) FINDOVERLAPS
    Where Depth In (0,1)
    ),
    MergeAdjacent As
    (
    Select
      emp,
      ChangeDate,
      Depth
    From
    (
      Select
       emp,
      NullIf(ChangeDate,
        IIF(Depth = 0,
        DateAdd(Day, -1, Lead(ChangeDate) Over (Partition By emp Order By ChangeDate)),
       DateAdd(Day, 1, Lag(ChangeDate) Over (Partition By emp Order By ChangeDate))
      )) As ChangeDate,
      Depth
      From ChangePoints
    ) COMBINE
    Where ChangeDate Is Not Null
    )
    Select
    emp,
    st,
    et
    From
    (
    Select
      emp,
      ChangeDate As st,
      Lead(ChangeDate) Over (Partition By emp Order By ChangeDate) As et,
      Depth
    From MergeAdjacent
    ) PIVOTBACK
    Where Depth = 1

Viewing 2 posts - 16 through 16 (of 16 total)

You must be logged in to reply to this topic. Login to reply