Hello, wondering if somebody might have the most efficient way of creating a query of Daily Counts based on Start/Stop dates. Thanks for any help.
create table #CustDetail
(ID varchar(6)
,Startdate date
,EndDate date
)
Insert into #CustDetail
values
('154784', '2019-12-20', '2020-02-20')
,('541785', '2020-01-02', '2020-03-05')
,('658746', '2020-01-28', '2020-01-31')
Looking for
Dec 19 = 11
Jan 20 = 63
Feb 20 = 49
Mar 20 = 5
Since you're counting across months the quick way to generate the daily rows would be to use a tally function. This uses dbo.fnTally which see this link for details. The counts don't quite match exactly to what you've listed but maybe it's still a good approach. Something like this
drop table if exists #CustDetail;
go
create table #CustDetail
(ID varchar(6)
,Startdate date
,EndDate date)
Insert into #CustDetail values
('154784', '2019-12-20', '2020-02-20')
,('541785', '2020-01-02', '2020-03-05')
,('658746', '2020-01-28', '2020-01-31');
select dt_part.dt_yr, dt_part.dt_mo, count(*) mo_count
from #CustDetail cd
cross apply dbo.fnTally(0, datediff(day, cd.Startdate, cd.EndDate)) fn
cross apply (select dateadd(day, fn.n, cd.Startdate) dt) dt_range
cross apply (select datepart(year, dt_range.dt) dt_yr,
datepart(month, dt_range.dt) dt_mo) dt_part
group by dt_part.dt_yr, dt_part.dt_mo
order by dt_part.dt_yr, dt_part.dt_mo;
Output
dt_yrdt_momo_count
20191212
2020165
2020249
202035
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
November 16, 2020 at 7:50 am
Using Steve's good test data, here's a slightly different method, which also provides the MMM DD format requested in the original post. It's also the very first time that I found EOMONTH() to be useful for much of anything.
SELECT MMM_YY = RIGHT(CONVERT(CHAR(9),eom.MonthDate,6),6)
,Days = COUNT(eom.MonthDate)
FROM #CustDetail dtl
CROSS APPLY dbo.fnTally(0,DATEDIFF(dd,dtl.StartDate,dtl.EndDate)) daycount
CROSS APPLY (SELECT EOMONTH(DATEADD(dd,daycount.N,dtl.StartDate))) eom (MonthDate)
GROUP BY eom.MonthDate
ORDER BY eom.MonthDate
;
Here's the output... my numbers agree with Steve's...
In case you're wondering, you can get the fnTally function at the similarly named link in my signature line below.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 16, 2020 at 9:23 am
WITH cteDates(this, limit)
AS (
SELECT MIN(Startdate),
MAX(EndDate)
FROM #CustDetail
UNION ALL
SELECT DATEADD(DAY, 1, this),
limit
FROM cteDates
WHERE this < limit
)
SELECT SUBSTRING(CONVERT(CHAR(9), DATEADD(MONTH, DATEDIFF(MONTH, 0, this), 0), 6), 4, 6),
COUNT(*)
FROM #CustDetail AS cd
INNER JOIN cteDates AS d ON d.this BETWEEN cd.Startdate AND cd.EndDate
GROUP BY DATEDIFF(MONTH, 0, this)
ORDER BY DATEDIFF(MONTH, 0, this)
OPTION (MAXRECURSION 0);
N 56°04'39.16"
E 12°55'05.25"
November 16, 2020 at 1:18 pm
I'm actually quite a bit surprised at you using an incremental recursive CTE for this, Peter. Your code runs in < 1ms like the other two do but your code consumes 493 logical reads instead of just 1 like the other two do.
For anyone else reading this, please see the following article for why incremental recursive CTEs aren't good for counting (producing a sequence of values).
https://www.sqlservercentral.com/articles/hidden-rbar-counting-with-recursive-ctes
--Jeff Moden
Change is inevitable... Change for the better is not.
November 16, 2020 at 1:57 pm
My goal was to demonstrate you don't need two cross apply.
SELECT SUBSTRING(CONVERT(CHAR(9), DATEADD(MONTH, DATEDIFF(MONTH, 0, t.thedate), 0), 6), 4, 6),
COUNT(*)
FROM #CustDetail AS cd
INNER JOIN dbo.TallyDates AS t ON t.thedate BETWEEN cd.Startdate AND cd.EndDate
GROUP BY DATEDIFF(MONTH, 0, t.thedate)
ORDER BY DATEDIFF(MONTH, 0, t.thedate);
N 56°04'39.16"
E 12°55'05.25"
November 16, 2020 at 3:24 pm
Since you only want monthly totals, I don't see any reason for the overhead to create a row for every single day; instead, you really only need a row for each month. Then again, SQL has fooled me before and is so optimized for joins that maybe the multi-row method actually performs as well or slightly better.
I don't have the tally function so I used an in-line tally table instead.
The assumption seemed to be that dates would be fairly close. I didn't want to rely on that, but to allow for 10 years if needed. Since 100 months doesn't quite meet that, I allowed for 1000 months just in case.
I haven't done a performance check -- to see if anything can be tweaked -- but since these are all just math calcs, presumably they will be very fast anyway.
;WITH
cte_tally10 AS (
SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS month#s(month#)
),
cte_tally1000 AS (
SELECT 0 AS month# UNION ALL
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS month#
FROM cte_tally10 t1 CROSS JOIN cte_tally10 t2 CROSS JOIN cte_tally10 t3
)
SELECT
DATEADD(MONTH, start_date_month# + t.month#, 0) AS month,
SUM(DATEDIFF(DAY, curr_month_startdate, curr_month_enddate)) AS days_diff
FROM #CustDetail
CROSS APPLY (
SELECT DATEDIFF(MONTH, 0, Startdate) AS start_date_month#,
DATEDIFF(MONTH, 0, Enddate) AS end_date_month#
) AS calcs1
INNER JOIN cte_tally1000 t ON t.month# BETWEEN 0 AND end_date_month# - start_date_month#
CROSS APPLY (
SELECT
CASE WHEN t.month# = 0 THEN Startdate ELSE DATEADD(MONTH, start_date_month# + t.month#, 0) END AS curr_month_startdate,
CASE WHEN start_date_month# + t.month# = end_date_month#
THEN DATEADD(DAY, 1, Enddate)
ELSE DATEADD(MONTH, start_date_month# + t.month# + 1, 0) END AS curr_month_enddate
) AS calcs2
GROUP BY
DATEADD(MONTH, start_date_month# + t.month#, 0)
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 16, 2020 at 4:30 pm
My goal was to demonstrate you don't need two cross apply.
SELECT SUBSTRING(CONVERT(CHAR(9), DATEADD(MONTH, DATEDIFF(MONTH, 0, t.thedate), 0), 6), 4, 6),
COUNT(*)
FROM #CustDetail AS cd
INNER JOIN dbo.TallyDates AS t ON t.thedate BETWEEN cd.Startdate AND cd.EndDate
GROUP BY DATEDIFF(MONTH, 0, t.thedate)
ORDER BY DATEDIFF(MONTH, 0, t.thedate);
That's cool. Can you post your dbo.TallyDates table schema? Thanks, Peter.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 16, 2020 at 7:19 pm
Wouldn't it be easier to add an ordinal number column to your calendar table? 36525 rows would give you an entire century of dates. Now the number of days between two given dates is a simple integer subtraction.
Please post DDL and follow ANSI/ISO standards when asking for help.
November 16, 2020 at 7:30 pm
great responses and is what I need. Really appreciate it. Some is over my head at the moment, but working as expected. If I had another request to count the patients by day. From my example above, a count of each day starting on 2019-12-20:
count of 1 each day from '2019-12-20' to '2020-01-01'
a count of 2 each day from '2020-01-02' to '2020-01-27'
a count of 3 each day from '2020-01-28' to '2020-01-30'
a count of 2 each day from '2020-02-01 to '2020-02-20'
same concept or would there be another route you'd take?
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply