Date filtering using DATEDIFF

  • I am attempting to create an episode marker when duration between dates is >= 90 days, filtered by EMID.  Below is the query I ran, with actual vs expected output in the attached file.   There are 2 issues: 1- the DATEDIFF function is supposed to restart with each new EMID (which is the point of using partition)- currently this is not the case.  2- The query is not recognizing 90 day periods within the same EMID, only the transition from one EMID to another.  Hopefully the actual vs expected table will clarify these issues.

    SELECT TOP (1000) [EMID],[visit_date_01],
    CASE
    WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (partition by EMID order by EMID)) <= 90 THEN '1'
    WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (PARTITION BY EMID ORDER BY EMID)) BETWEEN 90 AND 179 THEN '2'
    WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (PARTITION BY EMID order by EMID)) > 180 THEN '3'
    END AS EPISODE
    FROM [res_treatment_escalation].[dbo].['c-spine_full_dataset_4#29#20_wi$']
    Attachments:
    You must be logged in to view attached files.
  • based on your expected I would suggest you change the default value from getdate() to visit_date_01

    and add visit_date_01 to the order by (instead of EMID as that is already on the partition)

    and your expected results for " BA10031 2018-08-01" - I think it should be 3, not 2 as you mention

  • Thank you for your reply. I made the recommended changes, and below is the new query- the result is that every row in the Episode column is 1

    SELECT TOP (1000) [EMID],[visit_date_01],
    CASE
    WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,visit_date_01) OVER (partition by EMID order by visit_date_01)) <= 90 THEN '1'
    WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,visit_date_01) OVER (PARTITION BY EMID ORDER BY visit_date_01)) BETWEEN 90 AND 179 THEN '2'
    WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,visit_date_01) OVER (PARTITION BY EMID order by visit_date_01)) > 180 THEN '3'
    END AS EPISODE


    FROM [res_treatment_escalation].[dbo].['c-spine_full_dataset_4#29#20_wi$']
  • add these 2 lines to your code so you can see what you are doing wrong - and then you should know how to fix it

    ,LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01)

    ,DATEDIFF(DAY, visit_date_01, LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01))

  • You have your dates switched in the DATEDIFF function.  It should be DATEDIFF(DAY, LAG(...), visit_date_01).  Or you should be using LEAD() instead of LAG().

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thanks Drew- I changed to LEAD.  Attached is updated actual vs expected output.  Any insight on how to change to code to accommodate the expected output?

    Attachments:
    You must be logged in to view attached files.
  • your expected output does not match the code you supplied - so please be very very clear on exactly what are the rules for what you intend in getting as we can't read minds (at least that far from us!!)

    in any case I think that you will need to do this slightly differently as you have multiple records for same EMID and date which messes up any lag/lead window (at least according to what you posted as desired for the second example)

    so can you pick up the following example and state what you need as output for each line with a clear explanation of why the value should be what you expect

    ;with base as
    (select *
    from (values
    ('BA10010', convert(date, '2014-04-01'))
    ,('BA10010', convert(date, '2015-01-01'))
    ,('BA10010', convert(date, '2015-05-01'))
    ,('BA10010', convert(date, '2016-11-01'))
    ,('BA10010', convert(date, '2016-11-01'))
    ,('BA10010', convert(date, '2016-11-01'))
    ,('BA10010', convert(date, '2017-06-01'))
    ,('BA10010', convert(date, '2017-06-01'))
    ,('BA10010', convert(date, '2017-06-01'))
    ,('BA10010', convert(date, '2017-06-01'))
    ,('BA10010', convert(date, '2017-06-01'))
    ,('BA10010', convert(date, '2017-06-01'))
    ,('BA10010', convert(date, '2017-07-01'))
    ,('BA10010', convert(date, '2017-07-01'))
    ,('BA10010', convert(date, '2017-07-01'))
    ,('BA10010', convert(date, '2017-07-01'))
    ,('BA10010', convert(date, '2017-07-01'))


    ) t(EMID, visit_date_01)
    )
    select b1.EMID
    , b1.visit_date_01
    , days_last_visit
    , previous_visit
    , CASE
    WHEN days_last_visit <= 90 THEN '1'
    WHEN days_last_visit BETWEEN 90 AND 179 THEN '2'
    WHEN days_last_visit > 180 THEN '3'
    END AS EPISODE
    from base b1
    inner join (select EMID
    , visit_date_01
    , datediff(day, LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01), visit_date_01) as days_last_visit
    -- next line only to show what date was found
    , LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01) as previous_visit
    from (select distinct EMID, visit_date_01
    from base
    ) t
    ) dts
    on dts.emid = b1.emid
    and dts.visit_date_01 = b1.visit_date_01
    order by b1.emid, b1.visit_date_01

  • Thank you Frederico- this is probably a very basic question, but how to I include the table name in the query you supplied so it knows where to pull data from?  When I try to add this at the end I get an error "Incorrect syntax near the keyword 'FROM' "

    The table name is

    FROM [res_treatment_escalation].[dbo].['c-spine_full_dataset_4#29#20_wi$']
  • you could answer my question without using your data as I supplied sample data on the CTE (you need to learn about these)

    to use your real table you use the following query replacing the "from base" with "from your tablename"

    select b1.EMID
    , b1.visit_date_01
    , days_last_visit
    , previous_visit
    , CASE
    WHEN days_last_visit <= 90 THEN '1'
    WHEN days_last_visit BETWEEN 90 AND 179 THEN '2'
    WHEN days_last_visit > 180 THEN '3'
    END AS EPISODE
    from base b1
    inner join (select EMID
    , visit_date_01
    , datediff(day, LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01), visit_date_01) as days_last_visit
    -- next line only to show what date was found
    , LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01) as previous_visit
    from (select distinct EMID, visit_date_01
    from base
    ) t
    ) dts
    on dts.emid = b1.emid
    and dts.visit_date_01 = b1.visit_date_01
    order by b1.emid, b1.visit_date_01
  • Thank you so much!  I was able to run the query- however, the output is still not exactly what I am looking for.  I have attached actual vs expected- what I am looking for exactly is for the episode number to begin at 1 for each unique EMID, continue as 1 until a 90 day period between dates is found (or a new EMID occurs), and increment by 1.  Hopefully this is clear, and I really appreciate your help.  I am a total beginner and trouble shooting this on my own takes days.

    Attachments:
    You must be logged in to view attached files.
  • that is clearer but still could you explain why

    01/07/2017 had an increment when difference between previous date is only 30 days

    01/08/2017 did not have an increment when difference between previous date is 31 days

    and if we were to add the following dates what would you expect and why - sorry if I insist but it sill needs clarification

    01/09/2017 (diff 31

    01/10/2017 (diff 30

    01/11/2017 (diff 31

    01/12/2017 (diff 30

    EMIDvisit_date_01days_last_visitprevious_visitEPISODE
    BA1001001/04/2014001/04/20141
    BA1001001/01/201527501/04/20142
    BA1001001/05/201512001/01/20153
    BA1001001/11/201655001/05/20154
    BA1001001/06/201721201/11/20165
    BA1001001/07/20173001/06/20176
    BA1001001/08/20173101/07/20176
  • Hi Frederico- sorry, this was my mistake.  There should be no increment on 1/07/2017 or 1/08/2017 (fixed below).  The same with the other dates you asked about.  An increment only occurs when >=90 days occur between dates within the same EMID

    EMID    visit_date_01  days_last_visit  previous_visit  EPISODE
    BA10010 01/04/2014 0 01/04/2014 1
    BA10010 01/01/2015 275 01/04/2014 2
    BA10010 01/05/2015 120 01/01/2015 3
    BA10010 01/11/2016 550 01/05/2015 4
    BA10010 01/06/2017 212 01/11/2016 5
    BA10010 01/07/2017 30 01/06/2017 5
    BA10010 01/08/2017 31 01/07/2017 5
  • thanks for the clarification.

    this should then do the trick - as before replace the "base" table with your own table

    select b1.EMID
    , b1.visit_date_01
    , dts.episode
    , dts.previous_visit
    , dts.daysdiff
    , dts.rankx
    from base b1
    inner join (select t2.*
    , sum(rankx) over (PARTITION BY EMID order by visit_date_01)
    as episode
    from (select EMID
    , visit_date_01
    , case
    when LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01) = visit_date_01
    then 1
    when datediff(day, LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01), visit_date_01) >= 90
    then 1
    else 0
    end as rankx
    -- next lines only to show what values were considered was found can be removed
    , datediff(day, LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01), visit_date_01) as daysdiff
    , LAG(visit_date_01, 1, visit_date_01) OVER (PARTITION BY EMID order by visit_date_01) as previous_visit
    from (select distinct EMID, visit_date_01
    from base
    ) t
    ) t2
    ) dts
    on dts.emid = b1.emid
    and dts.visit_date_01 = b1.visit_date_01
    order by b1.emid, b1.visit_date_01
  • Thats it!  Amazing work, and thank you so much.  I am forever in your debt.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply