Counting distinct periods of consecutive days

Question

harsha.majety

SSCommitted

Points: 1537
May 1, 2012 at 12:55 am

#255603

Comments posted to this topic are about the item Counting distinct periods of consecutive days

casey-172678 SSC Journeyman Points: 90 · Answer 1

It looks to me like your example counts the last day in a consecutive-day grouping as a visit. Here are some wrinkles:

First, your data are pure dates. I see this sort of problem more often with input data that are date/time values. Second, you are counting the visit on the last day. That's all well and good if you're selecting from an unfiltered data set. But what if you need to use this technique against a date range, say patient visits this month? How would you set it up so that the results from querying each month in the year logically match the results for the whole year?

jcrawf02 SSC-Insane Points: 24198 · Answer 2

The report logic might have a slight flaw. I get why you'd do this for ER visits that run over midnight, but what if the patient comes in on Friday at noon, goes home, then gets sick again or is hit by a bus and shows up Saturday afternoon? That would be counted as one visit, which only makes sense if you're treating these as root-cause issues and want to know that the 2nd visit was avoidable because the 1st didn't treat everything, and that is very subjective.

Just something to think about. There's probably not a 100% correct answer to this one without clinical review.

---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop

stewsterl 80804 (10/16/2009): "I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

John Rees SSChasing Mays Points: 621 · Answer 3

casey-172678 (10/4/2012)
It looks to me like your example counts the last day in a consecutive-day grouping as a visit. Here are some wrinkles:
First, your data are pure dates. I see this sort of problem more often with input data that are date/time values. Second, you are counting the visit on the last day. That's all well and good if you're selecting from an unfiltered data set. But what if you need to use this technique against a date range, say patient visits this month? How would you set it up so that the results from querying each month in the year logically match the results for the whole year?

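I think the script actually counts the days that are not preceded by another visit day, i.e. the first day of each run of consecutive days, so it should work correctly if you want to select ranges of days for reports. A visit that spans a month boundary is counted only in the month where it starts.

So if you want the number of visits that started in Jan 2012, you would add: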

where tbl1.er_date between '1/1/2012' and '1/31/2012'
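To make that concrete, here is a minimal sketch of the "no preceding day" idea with the date filter applied. The original script isn't quoted in this thread, so the table variable, the patient_id column and the sample rows are my own assumptions; only tbl1 and er_date come from the filter above. I've also written the date literals in the yyyymmdd form so they don't depend on the session's date format.

```sql
-- Hypothetical sample data; @Visits and patient_id are assumed names.
DECLARE @Visits TABLE (patient_id int NOT NULL, er_date date NOT NULL);

INSERT INTO @Visits (patient_id, er_date) VALUES
    (1, '20111231'), (1, '20120101'), (1, '20120102'),  -- run that starts in December
    (1, '20120115'),                                    -- single-day visit
    (1, '20120131'), (1, '20120201'),                   -- run that starts in January
    (2, '20120110');

-- A row starts a visit when the same patient has no row for the previous day.
SELECT tbl1.patient_id, COUNT(*) AS visits_started
FROM @Visits AS tbl1
WHERE NOT EXISTS (SELECT 1
                  FROM @Visits AS tbl2
                  WHERE tbl2.patient_id = tbl1.patient_id
                    AND tbl2.er_date = DATEADD(day, -1, tbl1.er_date))
  AND tbl1.er_date BETWEEN '20120101' AND '20120131'
GROUP BY tbl1.patient_id
ORDER BY tbl1.patient_id;
```

With this sample data, patient 1 gets 2 visits for January (starting on the 15th and the 31st) and patient 2 gets 1. The run that began on Dec 31 is counted in December only, which is why the monthly counts add up to the yearly count.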

Gatekeeper SSCommitted Points: 1633 · Answer 4

jcrawf02 (10/4/2012)
The report logic might have a slight flaw. I get why you'd do this for ER visits that run over midnight, but what if the patient comes in on Friday at noon, goes home, then gets sick again or is hit by a bus and shows up Saturday afternoon? That would be counted as one visit, which only makes sense if you're treating these as root-cause issues and want to know that the 2nd visit was avoidable because the 1st didn't treat everything, and that is very subjective.
Just something to think about. There's probably not a 100% correct answer to this one without clinical review.

Actually, we do that quite often when handling claims. We call them interim claims and allow a gap of up to three days between claims when combining them. It's far more common for someone to come back because of further complications than for something unrelated.
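As a rough illustration of that kind of combining (not our production logic), the sketch below merges a claim into the previous one when it starts within three days of the previous discharge. Measuring the gap from discharge to the next admit is an assumption, as are the sample rows; the column names mirror the claims example further down. It needs SQL Server 2012 or later for LAG and the windowed running SUM.

```sql
-- Hypothetical sample data; column names mirror the #Claims example in this post.
DECLARE @Claims TABLE (PCN varchar(10), Admit date, Discharge date);

INSERT INTO @Claims (PCN, Admit, Discharge) VALUES
    ('000000000A', '2012-01-01', '2012-01-03'),
    ('000000000A', '2012-01-05', '2012-01-09'),  -- 2 days after previous discharge: combined
    ('000000000A', '2012-01-14', '2012-01-19'),  -- 5 days after previous discharge: new episode
    ('000000000B', '2012-02-05', '2012-02-09');

WITH Flagged AS
(
    SELECT PCN, Admit, Discharge,
           -- 1 when the claim starts a new episode, 0 when it continues the previous one.
           -- The first claim per patient has no previous discharge (NULL), so it falls to ELSE 1.
           CASE WHEN DATEDIFF(day, LAG(Discharge) OVER (PARTITION BY PCN ORDER BY Admit), Admit) <= 3
                THEN 0 ELSE 1 END AS IsStart
    FROM @Claims
),
Numbered AS
(
    SELECT PCN, Admit, Discharge,
           -- running total of start flags gives each episode its own number per patient
           SUM(IsStart) OVER (PARTITION BY PCN ORDER BY Admit ROWS UNBOUNDED PRECEDING) AS EpisodeNo
    FROM Flagged
)
SELECT PCN, EpisodeNo, MIN(Admit) AS EpisodeStart, MAX(Discharge) AS EpisodeEnd, COUNT(*) AS ClaimCount
FROM Numbered
GROUP BY PCN, EpisodeNo
ORDER BY PCN, EpisodeNo;
```

For this sample, patient A ends up with two episodes (Jan 1 to Jan 9 with two claims, then Jan 14 to Jan 19) and patient B with one. Note that the sketch compares each claim only with the immediately preceding one, so overlapping claims where an earlier claim has a later discharge would need extra handling.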

Harsha,

I've found that when you're given one report like that, you'll soon be asked to do it another way. For a DOJ project, I used this so it could be cut many ways.

IF OBJECT_ID('tempdb..#Claims') IS NOT NULL
    DROP TABLE #Claims

IF OBJECT_ID('tempdb..#ClaimSpans') IS NOT NULL
    DROP TABLE #ClaimSpans

CREATE TABLE #Claims (ClaimID int IDENTITY(1,1) PRIMARY KEY, PCN varchar(10), Admit date, Discharge date)

-- sample claims for four patients (PCN = patient control number)
INSERT INTO #Claims (PCN, Admit, Discharge)
VALUES
      ('000000000A', '2012-01-01', '2012-01-03')
    , ('000000000A', '2012-01-05', '2012-01-09')
    , ('000000000A', '2012-01-10', '2012-01-11')
    , ('000000000A', '2012-01-12', '2012-01-12')
    , ('000000000A', '2012-01-14', '2012-01-19')
    , ('000000000A', '2012-01-20', '2012-01-31')
    , ('000000000B', '2012-01-05', '2012-01-09')
    , ('000000000B', '2012-01-10', '2012-01-11')
    , ('000000000B', '2012-02-05', '2012-02-09')
    , ('000000000B', '2012-02-10', '2012-02-11')
    , ('000000000B', '2012-02-12', '2012-02-14')
    , ('000000000B', '2012-02-15', '2012-02-16')
    , ('000000000B', '2012-02-27', '2012-02-28')
    , ('000000000B', '2012-03-09', '2012-03-11')
    , ('000000000C', '2012-03-05', '2012-03-09')
    , ('000000000D', '2012-04-10', '2012-04-11')

-- number each patient's distinct admit dates in date order
;WITH Buckets (RowNum, PCN, Yr, Mnth, Admit) as
(
    SELECT
        ROW_NUMBER() OVER (PARTITION BY PCN ORDER BY PCN, Admit) as RowNum,
        PCN,
        YEAR(Admit) as Yr,
        MONTH(Admit) as Mnth,
        Admit
    FROM #Claims
    GROUP BY PCN, YEAR(Admit), MONTH(Admit), Admit
)
-- pair each claim with the same patient's previous claim; Days is the gap between the two admit dates
SELECT z.PCN, z.Yr, z.Mnth, z.Admit, y.Admit as PrevAdmit, DATEDIFF(d, y.Admit, z.Admit) as Days, z.RowNum, y.RowNum as PrevRowNum
INTO #ClaimSpans
FROM Buckets as z
LEFT JOIN Buckets as y
    ON z.RowNum - 1 = y.RowNum AND z.PCN = y.PCN

SELECT * FROM #ClaimSpans

-- find claims where the patient came back 2 to 4 days after the previous admit
SELECT * FROM #ClaimSpans WHERE Days BETWEEN 2 AND 4

-- count how often a patient's consecutive claims move from one day-span bucket to another (or stay in the same bucket)
;WITH Buckets (PCN, Bucket, Days, OldRowNum, RowNum)
as
(
    SELECT
        PCN,
        CASE
            WHEN Days BETWEEN 0 AND 1 THEN '0 to 1'
            WHEN Days BETWEEN 2 AND 4 THEN '2 to 4'
            WHEN Days BETWEEN 5 AND 7 THEN '5 to 7' -- was '4 to 7', which overlapped the '2 to 4' bucket
            WHEN Days BETWEEN 8 AND 10 THEN '8 to 10'
            WHEN Days BETWEEN 11 AND 14 THEN '11 to 14'
            ELSE '15 to 20' END as Bucket,
        Days,
        RowNum as OldRowNum,
        ROW_NUMBER() OVER (PARTITION BY PCN ORDER BY PCN, RowNum) as RowNum -- recalc RowNum
    FROM #ClaimSpans
    WHERE RowNum > 1 -- first claim won't join to a previous claim
        AND Days <= 20 -- claims over 20 days won't count
)
SELECT
    r.FromBucket,
    r.ToBucket,
    COUNT(*) as Cnt
FROM
(
    SELECT
        a.Bucket as FromBucket,
        b.Bucket as ToBucket
    FROM Buckets as a
    JOIN Buckets as b
        -- OldRowNum makes sure that there wasn't a break in the sequence, like when the 20+ days were pulled
        ON a.PCN = b.PCN AND a.RowNum = b.RowNum - 1 AND a.OldRowNum = b.OldRowNum - 1
) as r
GROUP BY FromBucket, ToBucket

-- clean up
IF OBJECT_ID('tempdb..#Claims') IS NOT NULL
    DROP TABLE #Claims

IF OBJECT_ID('tempdb..#ClaimSpans') IS NOT NULL
    DROP TABLE #ClaimSpans

/* Anything is possible but is it worth it? */

a.myasnikov SSC Enthusiast Points: 147 · Answer 5

Here's a good reference article.

Group Islands of Contiguous Dates (SQL Spackle)

http://www.sqlservercentral.com/articles/T-SQL/71550/
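For anyone who wants the general idea without leaving the thread, here is a minimal sketch of a common gaps-and-islands approach: subtract a per-patient row number from each date, and every run of consecutive dates collapses to the same anchor date. I'm not claiming this matches the article line for line; the table variable, column names and sample rows are assumptions for illustration.

```sql
-- Hypothetical sample data; @Visits, patient_id and er_date are assumed names.
DECLARE @Visits TABLE (patient_id int NOT NULL, er_date date NOT NULL);

INSERT INTO @Visits (patient_id, er_date) VALUES
    (1, '2012-01-01'), (1, '2012-01-02'), (1, '2012-01-03'),
    (1, '2012-01-10'),
    (2, '2012-01-05'), (2, '2012-01-06');

WITH Grouped AS
(
    SELECT patient_id, er_date,
           -- consecutive dates minus consecutive row numbers yield the same anchor date per island
           DATEADD(day, -ROW_NUMBER() OVER (PARTITION BY patient_id ORDER BY er_date), er_date) AS grp
    FROM (SELECT DISTINCT patient_id, er_date FROM @Visits) AS d  -- one row per patient per day
)
SELECT patient_id, MIN(er_date) AS first_day, MAX(er_date) AS last_day, COUNT(*) AS days_in_visit
FROM Grouped
GROUP BY patient_id, grp
ORDER BY patient_id, first_day;
```

With this sample, patient 1 has two islands (Jan 1 to Jan 3, and Jan 10 on its own) and patient 2 has one (Jan 5 to Jan 6). If the input were date/time values rather than pure dates, casting to date inside the DISTINCT subquery would be one way to get to a single row per patient per day first.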

akljfhnlaflkj SSC Guru Points: 76202 · Answer 6

May 9, 2016 at 1:11 pm

#1877568

Thanks for the script.