March 27, 2019 at 2:41 pm
I have been trying a few various methods to see if I can get this to work, but I haven't had any luck.
Here is what I am trying to accomplish.
Every day, there are cases that get closed. We are wanting to track cases that have been 're-opened' after having been already closed once, but there is nothing in the information provided that tells us this is a re-opened case. The only way to do this is to check to see if the Case ID and the Report Date and see if the there is a duplicate Case Id that exists and was closed prior to this report date. To complicate matters, here is some additional info:
1) A common situation is that a case is closed, re-opened and then closed again within the same day(sometimes multiple times). This should count as a re-open, each time it is done after the first instance, even if it's the same day ( I assume we would group by case ID?)
2) I run a 5 Day reporting window, so a case should NOT count as a re-open if for instance on 3/20/2019 the case was closed for the first time, and then re-opened at some point and closed again 3/26/2019 until 3/26/2019. On 3/20, 3/21, 3/22, and 3/25(report days skips weekends and holidays, this is already built in, do not need anything fo that) it should NOT be marked as a re-open because the case still only has one instance on or before the report date we are looking at. On 3/26 it would be marked as a re-open because it would then have been closed for a second time on or before the report date.
Here are some queries:
CREATE TABLE ResolvedCases(
Case_ID varchar(20),
Case_Closed_On datetime,
Report_Date date,
Is_ReOpened_Case VarChar(3) NULL
)
INSERT INTO ResolvedCases VALUES('US1236', '2019-02-16 12:30:45', '2/16/2019')
INSERT INTO ResolvedCases VALUES('US1238', '2019-02-28 15:30:45', '2/28/2019')
INSERT INTO ResolvedCases VALUES('US1234', '2019-03-19 12:30:45', '3/19/2019')
INSERT INTO ResolvedCases VALUES('US1234', '2019-03-19 15:30:45', '3/19/2019')
INSERT INTO ResolvedCases VALUES('US1235', '2019-03-20 9:30:45', '3/20/2019')
INSERT INTO ResolvedCases VALUES('US1235', '2019-03-23 12:40:45', '3/23/2019')
INSERT INTO ResolvedCases VALUES('US1236', '2019-03-20 12:30:45', '3/24/2019')
INSERT INTO ResolvedCases VALUES('US1237', '2019-03-25 12:30:45', '3/25/2019')
Expected Results(Only showing the cases with Report_Date between 3/20 and 3/26):
Case_ID Case_Closed_On Report_Date Is_ReOpened_Case
US1234 2019-03-19 12:30:45 3/19/2019 No (There is a duplicate case Id on 3/19 but it didn't happen until 3:30 PM---at 12:30PM this hadn't occurred yet so it was not a re-open at that time)
US1234 2019-03-19 15:30:45 3/19/2019 Yes
US1235 2019-03-20 9:30:45 3/20/2019 No (There is a duplicate case Id on 3/23 but on 3/20 this hadn't occurred yet so it was no a re-open on that date)
US1235 2019-03-23 12:40:45 3/23/2019 Yes
US1236 2019-03-20 12:30:45 3/24/2019 Yes (Because of the case closed on 2/16/2019 even though it doesn't show in this query)
US1237 2019-03-25 12:30:45 3/25/2019 No
Any help would be appreciated with this...
I have something that shows the count of the case ID which shows me all the duplicates for a given date range and have them grouped by Case_ID but I am not sure how to just mark each individual row as a re-open or not based on the requirements above...
March 27, 2019 at 4:08 pm
How far between open/close would count as a "reopen"? You can just use LAG() with a PARTITION to look backwards at the previous records.
March 27, 2019 at 4:11 pm
pietlinden - Wednesday, March 27, 2019 4:08 PMHow far between open/close would count as a "reopen"? You can just use LAG() with a PARTITION to look backwards at the previous records.
As far back as the records go(all the way back to 2017). If at any point the case_ID was already closed, regardless of the time difference, it counts as a re-open.
March 27, 2019 at 4:24 pm
Then use LAG() and maybe ROW_NUMBER().
Any record with a non-null LAG() value will be a reopen.
March 27, 2019 at 4:50 pm
pietlinden - Wednesday, March 27, 2019 4:24 PMThen use LAG() and maybe ROW_NUMBER().
Any record with a non-null LAG() value will be a reopen.
It looks like I would need to make sure the items are sorted from oldest to newest before calling it because it looks like it only searches rows prior to that
March 27, 2019 at 5:21 pm
You can sort inside the partition. (Using AdventureWorks2008R2)
SELECT soh.CustomerID
, soh.OrderDate
, LAG(soh.OrderDate,1) OVER (PARTITION BY soh.CustomerID ORDER BY soh.OrderDate) AS PrevSale
FROM Sales.SalesOrderHeader soh
ORDER BY soh.CustomerID
, soh.OrderDate;
March 27, 2019 at 6:24 pm
pietlinden - Wednesday, March 27, 2019 5:21 PMYou can sort inside the partition. (Using AdventureWorks2008R2)
SELECT soh.CustomerID
, soh.OrderDate
, LAG(soh.OrderDate,1) OVER (PARTITION BY soh.CustomerID ORDER BY soh.OrderDate) AS PrevSale
FROM Sales.SalesOrderHeader soh
ORDER BY soh.CustomerID
, soh.OrderDate;
This works, but is there a way to limit the date range to return but still look through the entire DB? It seems if I put a WHERE soh.SalesOrderDate BETWEEN xxx AND xxx that it will only return items within that date range as prev sales instead of all items in the Sales.SalesOrderHeader table...
March 27, 2019 at 6:31 pm
Do you mean like this?
SELECT *
FROM (
SELECT soh.CustomerID
, soh.OrderDate
, LAG(soh.OrderDate,1) OVER (PARTITION BY soh.CustomerID ORDER BY soh.OrderDate) AS PrevSale
FROM Sales.SalesOrderHeader soh
) x
WHERE x.PrevSale <= '2007-01-01';
The WHERE x.PrevSale… clause is being applied after the LAG etc is calculated, because that's being calculated in the *inner* query. Make sense?
March 27, 2019 at 8:23 pm
pietlinden - Wednesday, March 27, 2019 6:31 PMDo you mean like this?
SELECT *
FROM (
SELECT soh.CustomerID
, soh.OrderDate
, LAG(soh.OrderDate,1) OVER (PARTITION BY soh.CustomerID ORDER BY soh.OrderDate) AS PrevSale
FROM Sales.SalesOrderHeader soh
) x
WHERE x.PrevSale <= '2007-01-01';The WHERE x.PrevSale… clause is being applied after the LAG etc is calculated, because that's being calculated in the *inner* query. Make sense?
Brilliant! Thank you! Works perfectly.
March 27, 2019 at 10:53 pm
matter2003 - Wednesday, March 27, 2019 8:23 PMpietlinden - Wednesday, March 27, 2019 6:31 PMDo you mean like this?
SELECT *
FROM (
SELECT soh.CustomerID
, soh.OrderDate
, LAG(soh.OrderDate,1) OVER (PARTITION BY soh.CustomerID ORDER BY soh.OrderDate) AS PrevSale
FROM Sales.SalesOrderHeader soh
) x
WHERE x.PrevSale <= '2007-01-01';The WHERE x.PrevSale… clause is being applied after the LAG etc is calculated, because that's being calculated in the *inner* query. Make sense?
Brilliant! Thank you! Works perfectly.
Hmm...noticing one weird thing...it is adding a duplicate row at times where the Closed Date is the same as the ReOpened date. This shouldn't be happening, it should only be including re-opened dates that are before the closed date...any ideas what might be happening here?
March 28, 2019 at 11:01 am
select Case_ID
,Case_Closed_On
,Report_Date
,case
when 0 < (select count(*)
from ResolvedCases t1
where t1.Case_ID = t.Case_ID
and t1.Case_Closed_On < t.Case_Closed_On) then 'Yes'
else 'No'
end as Re_Open
from ResolvedCases t
where t.Report_Date between '2019-03-19' and '2019-03-26'
order by t.Case_ID, t.Case_Closed_On;
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply