Loop through data to find consecutive results prior to a certain date within a 6 month period

  • VIG (12/30/2011)


    ChrisM@home (12/30/2011)


    Hey VIG, this one's right up your street, been expecting you.

    Our results differ, I get two positive rows, you get one:

    Why ???

    ClaCaseIDNameIDIncidentDateMissed

    37430 5806272009-11-11 Y

    I think it is wrong

    Possibly interpretation:

    .

    .

    .

    (4,580627,'2009-03-27'),

    (9,580627,'2009-04-28'),

    (9,580627,'2009-05-13'),

    (9,580627,'2009-05-27'), --(28584,580627,'2009-08-02') and (37430,580627,'2009-11-11')

    (9,580627,'2009-06-12'),

    (9,580627,'2009-06-27'), --(28584,580627,'2009-08-02') and (37430,580627,'2009-11-11')

    (9,580627,'2009-07-27'), --(28584,580627,'2009-08-02') and (37430,580627,'2009-11-11')

    (4,580627,'2009-08-12'),

    .

    .

    .

    What do you reckon?


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Sorry,Chris

    It was my mistake

    select cc.ClaCaseID,cc.NameID , cc.IncidentDate

    ,case when exists (select PaymentStatus

    from

    (select ap.ReceiverID,ap.PaymentStatus

    ,ROW_NUMBER() over(partition by ap.ReceiverID order by DueDate) rn

    , ROW_NUMBER() over(partition by ap.ReceiverID order by DueDate)

    - DENSE_RANK() over (partition by ap.ReceiverID,PaymentStatus order by DueDate) grp

    from (select ReceiverID,PaymentStatus,MAX(duedate) duedate

    from #AccPayments a

    where a.ReceiverID=cc.NameID

    and DueDate >= DATEADD(m,-6,cc.IncidentDate) and DueDate<=cc.IncidentDate

    group by ReceiverID,PaymentStatus,YEAR(duedate),MONTH(duedate)

    ) ap

    ) x

    group by PaymentStatus,grp having PaymentStatus=9 and MAX(rn)-MIN(rn) +1 =COUNT(*) and COUNT(*)>=3

    ) then 'Y' else 'N' end Missed

    from #ClaCases cc

    order by cc.ClaCaseID,cc.NameID , cc.IncidentDate

  • VIG (12/30/2011)


    ...It was my mistake...

    Not necessarily - it could still be interpretation. The same three missed payments are captured by two different event rows. The correct solution could be one or the other or both. I suspect the real data may not allow this.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (12/30/2011)

    it could still be interpretation. The same three missed payments are captured by two different event rows. The correct solution could be one or the other or both. I suspect the real data may not allow this.

    I don't know

    Let's wait for the author. That's his task. not mine 🙂

  • Please check this Query,

    CREATE TABLE #ClaCases (ClaCaseID INT,ReceiverID int, IncidentDate Date )

    GO

    INSERT INTO #ClaCases VALUES (1234,80451,'2008-11-21')

    INSERT INTO #ClaCases VALUES (2442,84244,'2010-03-23')

    INSERT INTO #ClaCases VALUES (6774,45676,'2008-06-14')

    GO

    CREATE TABLE #AccPayments (PaymentStatus INT,ReceiverID int, DueDate Date )

    GO

    INSERT INTO #AccPayments VALUES (4,80451,'2008-07-01')

    INSERT INTO #AccPayments VALUES (4,80451,'2008-08-01')

    INSERT INTO #AccPayments VALUES (9,80451,'2008-09-01')

    INSERT INTO #AccPayments VALUES (9,80451,'2008-10-01')

    INSERT INTO #AccPayments VALUES (4,80451,'2008-11-01')

    INSERT INTO #AccPayments VALUES (4,80451,'2008-12-01')

    INSERT INTO #AccPayments VALUES (4,84244,'2009-11-01')

    INSERT INTO #AccPayments VALUES (4,84244,'2009-12-01')

    INSERT INTO #AccPayments VALUES (9,84244,'2010-01-01')

    INSERT INTO #AccPayments VALUES (9,84244,'2010-02-01')

    INSERT INTO #AccPayments VALUES (9,84244,'2010-03-01')

    INSERT INTO #AccPayments VALUES (9,84244,'2010-04-01')

    INSERT INTO #AccPayments VALUES (4,84244,'2010-05-01')

    INSERT INTO #AccPayments VALUES (9,84244,'2010-06-01')

    INSERT INTO #AccPayments VALUES (9,45676,'2008-01-01')

    INSERT INTO #AccPayments VALUES (4,45676,'2008-02-01')

    INSERT INTO #AccPayments VALUES (9,45676,'2008-03-01')

    INSERT INTO #AccPayments VALUES (4,45676,'2008-04-01')

    INSERT INTO #AccPayments VALUES (9,45676,'2008-05-01')

    INSERT INTO #AccPayments VALUES (4,45676,'2008-06-01')

    INSERT INTO #AccPayments VALUES (4,45676,'2008-07-01')

    GO

    SELECT ClaCaseID, ReceiverID, IncidentDate

    , CASE WHEN SUM(Missed) = 1 THEN 'Y' ELSE 'N' END AS Missed

    FROM (

    SELECT ClaCaseID

    , ReceiverID AS ReceiverID

    , IncidentDate AS IncidentDate

    , CASE WHEN ((MAX(PaymentStatus) = 9) AND (COUNT(AdsFormula) >= 3)) THEN 1 ELSE 0 END AS Missed

    FROM (

    SELECT ClaCaseID, ReceiverID, IncidentDate, [Serial], PaymentStatus

    , Serial - ROW_NUMBER() OVER (PARTITION BY PaymentStatus ORDER BY Serial) AS AdsFormula

    FROM (

    SELECT CC.ClaCaseID, CC.ReceiverID, CC.IncidentDate

    , ROW_NUMBER() OVER (ORDER BY CC.ReceiverID) AS [Serial]

    , AP.PaymentStatus

    FROM #ClaCases CC INNER JOIN

    #AccPayments AP

    ON CC.ReceiverID = AP.ReceiverID

    WHERE AP.DueDate > DATEADD(MM, -6, CC.IncidentDate)

    ) InnerDT

    ) DT

    GROUP BY ClaCaseID, AdsFormula, ReceiverID, IncidentDate

    )OuterDT

    GROUP BY ClaCaseID, ReceiverID, IncidentDate

  • gpm.alwyn (1/2/2012)


    Please check this Query,

    ...

    Msg 207, Level 16, State 1, Line 17

    Invalid column name 'ReceiverID'.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Thanks everyone for the help thus far.

    I have tested your queries and have picked up a few differences in your result sets.

    @ChrisM@home: Your Query is very close to my desired solution. The only issue is that it selects the max DueDate for a particular month regardless of whether this is AFTER IncidentDate. I should've explicitly stated that if the max DueDate for a particular month falls past IncidentDate then use the max DueDate <= IncidentDate for that month. From what I see if I had informed you of this condition your query will return exactly what I require.

    Also there is something I hadn't considered before. I haven't come across a case like this as of yet.

    There may be multiple instances of 3 consecutive PaymentStatus = 9 separated by a PaymentStatus = 4 within a 6 month period prior to IncidentDate. If this is the case then once again Flag 'Y'.

    Out of the 42 000 instances ChrisM@home and VIG have 220 Cases where their solutions do not match.

    Here is some sample data with these mismatches.

    --DROP TABLE #ClaCases

    DROP TABLE #ClaCases

    CREATE TABLE #ClaCases (ClaCaseID INT,NameID int, IncidentDate Date )

    INSERT INTO #ClaCases VALUES (2879,325309,'2008-06-28')

    INSERT INTO #ClaCases VALUES (3706,227013,'2008-08-02')

    INSERT INTO #ClaCases VALUES (3806,467693,'2008-08-11')

    INSERT INTO #ClaCases VALUES (4346,221694,'2008-08-22')

    INSERT INTO #ClaCases VALUES (4612,221694,'2008-08-29')

    INSERT INTO #ClaCases VALUES (4870,422711,'2008-09-16')

    select * from #ClaCases

    --DROP TABLE #AccPayments

    CREATE TABLE #AccPayments (PaymentStatus INT,ReceiverID int, DueDate Date )

    INSERT INTO #AccPayments VALUES (9,325309,'2008-04-29')

    INSERT INTO #AccPayments VALUES (9,325309,'2008-05-06')

    INSERT INTO #AccPayments VALUES (9,325309,'2008-06-02')

    INSERT INTO #AccPayments VALUES (4,325309,'2008-06-05')

    INSERT INTO #AccPayments VALUES (9,325309,'2008-07-01')

    INSERT INTO #AccPayments VALUES (4,325309,'2008-07-03')

    INSERT INTO #AccPayments VALUES (4,325309,'2008-07-28')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-05-01')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-06-02')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-07-01')

    INSERT INTO #AccPayments VALUES (4,227013,'2008-07-26')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-08-01')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-08-26')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-09-01')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-09-26')

    INSERT INTO #AccPayments VALUES (9,227013,'2008-10-01')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-06-07')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-07-01')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-07-26')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-08-01')

    INSERT INTO #AccPayments VALUES (4,467693,'2008-08-26')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-09-01')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-09-26')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-10-01')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-10-27')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-11-01')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-12-01')

    INSERT INTO #AccPayments VALUES (9,467693,'2008-12-15')

    INSERT INTO #AccPayments VALUES (9,467693,'2009-01-15')

    INSERT INTO #AccPayments VALUES (9,467693,'2009-02-16')

    INSERT INTO #AccPayments VALUES (4,221694,'2008-05-01')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-06-02')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-07-01')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-07-10')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-08-01')

    INSERT INTO #AccPayments VALUES (4,221694,'2008-08-26')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-09-01')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-09-26')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-10-01')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-10-27')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-11-01')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-11-26')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-12-01')

    INSERT INTO #AccPayments VALUES (9,221694,'2008-12-27')

    INSERT INTO #AccPayments VALUES (9,221694,'2009-01-02')

    INSERT INTO #AccPayments VALUES (9,221694,'2009-02-02')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-06-05')

    INSERT INTO #AccPayments VALUES (4,422711,'2008-06-12')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-07-07')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-07-26')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-08-11')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-08-26')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-09-09')

    INSERT INTO #AccPayments VALUES (4,422711,'2008-09-26')

    INSERT INTO #AccPayments VALUES (9,422711,'2008-10-09')

    Select * from #AccPayments

    --Drop Table #Result

    CREATE TABLE #Result (ClaCaseID INT,ReceiverID int, NameID int, IncidentDate Date, Missed_VIG varchar(25), Missed_CHRIS@HOME varchar(25), Desired_Missed varchar(25) )

    INSERT INTO #Result VALUES (2879,325309,325309,'2008-06-28','Y','N','N')

    INSERT INTO #Result VALUES (3706,227013,227013,'2008-08-02','Y','N','N')

    INSERT INTO #Result VALUES (3806,467693,467693,'2008-08-11','Y','N','Y') --MaxDate issue

    INSERT INTO #Result VALUES (4346,221694,221694,'2008-08-22','Y','N','Y') --MaxDate issue

    INSERT INTO #Result VALUES (4612,221694,221694,'2008-08-29','Y','N','N')

    INSERT INTO #Result VALUES (4870,422711,422711,'2008-09-16','Y','N','Y') --MaxDate issue

    Select * from #Result

  • mic.con87 (1/3/2012)


    ...@ChrisM@home: Your Query is very close to my desired solution. The only issue is that it selects the max DueDate for a particular month regardless of whether this is AFTER IncidentDate. I should've explicitly stated that if the max DueDate for a particular month falls past IncidentDate then use the max DueDate <= IncidentDate for that month. From what I see if I had informed you of this condition your query will return exactly what I require....

    Adjusted to suit:

    ;WITH MissedPayments AS (

    SELECT

    ClaCaseID,

    NameID,

    IncidentDate,

    ValidRow,

    Seq,

    GroupRows = COUNT(*)

    FROM (

    SELECT

    ClaCaseID,

    NameID,

    IncidentDate,

    ValidRow,

    Seq = ROW_NUMBER() OVER (PARTITION BY ClaCaseID, NameID ORDER BY DueDate, ValidRow)

    - DENSE_RANK() OVER (PARTITION BY ClaCaseID, NameID, ValidRow ORDER BY DueDate)

    FROM (

    SELECT

    c.ClaCaseID,

    c.NameID,

    c.IncidentDate,

    a.DueDate,

    ValidRow = CASE WHEN a.PaymentStatus = 9 THEN 1 ELSE 0 END,

    MonthPicker = ROW_NUMBER() OVER(PARTITION BY c.ClaCaseID, c.NameID, YEAR(a.DueDate), MONTH(a.DueDate) ORDER BY a.DueDate DESC)

    FROM #AccPayments a

    INNER JOIN #ClaCases c ON c.NameID = a.ReceiverID

    AND a.DueDate BETWEEN DATEADD(month,-6,c.IncidentDate) AND c.IncidentDate

    ) d WHERE MonthPicker = 1

    ) o

    WHERE ValidRow = 1

    GROUP BY ClaCaseID, NameID, IncidentDate, ValidRow, Seq

    HAVING COUNT(*) > 2

    )

    SELECT c.ClaCaseID, c.NameID, c.IncidentDate, Missed = CASE WHEN m.ClaCaseID IS NULL THEN 'N' ELSE 'Y' END

    FROM #ClaCases c

    LEFT JOIN MissedPayments m ON m.ClaCaseID = c.ClaCaseID

    AND m.NameID = c.NameID

    AND m.IncidentDate = c.IncidentDate

    ORDER BY c.ClaCaseID, c.NameID, c.IncidentDate


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Thanks a million!!! I'm busy testing your code and it seems to return the exact result I am seeking and is very efficient!

    Hope this wont be a hassle but is there any way your could please explain the following parts of your query as I'm no SQL expert and I'd like to improve my own skills.

    Seq = ROW_NUMBER() OVER (PARTITION BY ClaCaseID, NameID ORDER BY DueDate, ValidRow)

    - DENSE_RANK() OVER (PARTITION BY ClaCaseID, NameID, ValidRow ORDER BY DueDate)

    I'm not sure why you perform this subtraction? Can you please explain how the ROW_NUMBER() and DENSE_RANK() work in this case.

    ALSO

    HAVING COUNT(*) > 2

    I haven't used HAVING before. Is there a reason this couldn't be used in the WHERE Clause?

    Thanks:-)

  • Thanks for the feedback - I won't hold my breath while you're still testing 😉

    DENSE_RANK() and ROW_NUMBER() are used together like this to resolve subsets of the same partition, where there is a sequence - in this case date - and the subsets appear as islands and gaps in the sequence. This small modification to the code will show you (see also DENSE_RANK() in BOL, you're already familiar with ROW_NUMBER())

    SELECT

    ClaCaseID,

    NameID,

    IncidentDate,

    ValidRow,

    Seq1 = ROW_NUMBER() OVER (PARTITION BY ClaCaseID, NameID ORDER BY DueDate, ValidRow),

    Seq2 = DENSE_RANK() OVER (PARTITION BY ClaCaseID, NameID, ValidRow ORDER BY DueDate)

    FROM (

    SELECT

    c.ClaCaseID,

    c.NameID,

    c.IncidentDate,

    a.DueDate,

    ValidRow = CASE WHEN a.PaymentStatus = 9 THEN 1 ELSE 0 END,

    MonthPicker = ROW_NUMBER() OVER(PARTITION BY c.ClaCaseID, c.NameID, YEAR(a.DueDate), MONTH(a.DueDate) ORDER BY a.DueDate DESC)

    FROM #AccPayments a

    INNER JOIN #ClaCases c ON c.NameID = a.ReceiverID

    AND a.DueDate BETWEEN DATEADD(month,-6,c.IncidentDate) AND c.IncidentDate

    ) d WHERE MonthPicker = 1

    The HAVING clause is a special filter which operates on the results of GROUP BY.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Thanks Chris for your solution, today I learned what DENSE_RANK() is! Quite a nifty function. 🙂

  • Better late than never 🙂

    select cc.ClaCaseID,cc.NameID , cc.IncidentDate

    ,case when exists (select PaymentStatus

    from

    (select ap.ReceiverID,ap.PaymentStatus

    ,ROW_NUMBER() over(partition by cc.ClaCaseID,ap.ReceiverID order by DueDate) rn

    , ROW_NUMBER() over(partition by cc.ClaCaseID,ap.ReceiverID order by DueDate )

    - DENSE_RANK() over (partition by cc.ClaCaseID,ap.ReceiverID,PaymentStatus order by DueDate ) grp

    from (select ReceiverID

    ,PaymentStatus

    ,duedate

    ,row_number() over (PARTITION by ReceiverID,YEAR(DueDate),MONTH(DueDate) order by duedate desc) n

    from #AccPayments a

    where a.ReceiverID=cc.NameID and DueDate >= DATEADD(m,-6,cc.IncidentDate) and DueDate<=cc.IncidentDate

    ) ap

    where n=1

    ) x

    group by PaymentStatus,grp having PaymentStatus=9 and MAX(rn)-MIN(rn) +1 =COUNT(*) and COUNT(*)>=3

    ) then 'Y' else 'N' end Missed

    from #ClaCases cc

    order by cc.ClaCaseID,cc.NameID , cc.IncidentDate

  • @ChrisM@Home Thanks for the explanation and for your query, it really was enlightening:-D

    @VIG Better late than never indeed! I'm busy testing ChrisM@Home's code so I can provide feedback at a later stage. Thanks very much for all your effort 🙂

Viewing 13 posts - 31 through 42 (of 42 total)

You must be logged in to reply to this topic. Login to reply