Daily Counts by Month based on Start/End date

  • Hello, wondering if somebody might have the most efficient way of creating a query of Daily Counts based on Start/Stop dates. Thanks for any help.

    create table #CustDetail

    (ID varchar(6)

    ,Startdate date

    ,EndDate date

    )

    Insert into #CustDetail

    values

    ('154784', '2019-12-20', '2020-02-20')

    ,('541785', '2020-01-02', '2020-03-05')

    ,('658746', '2020-01-28', '2020-01-31')

     

    Looking for

    Dec 19 = 11

    Jan 20 = 63

    Feb 20 = 49

    Mar 20 = 5

     

     

  • Since you're counting across months the quick way to generate the daily rows would be to use a tally function.  This uses dbo.fnTally which see this link for details.  The counts don't quite match exactly to what you've listed but maybe it's still a good approach.  Something like this

    drop table if exists #CustDetail;
    go
    create table #CustDetail
    (ID varchar(6)
    ,Startdate date
    ,EndDate date)

    Insert into #CustDetail values
    ('154784', '2019-12-20', '2020-02-20')
    ,('541785', '2020-01-02', '2020-03-05')
    ,('658746', '2020-01-28', '2020-01-31');

    select dt_part.dt_yr, dt_part.dt_mo, count(*) mo_count
    from #CustDetail cd
    cross apply dbo.fnTally(0, datediff(day, cd.Startdate, cd.EndDate)) fn
    cross apply (select dateadd(day, fn.n, cd.Startdate) dt) dt_range
    cross apply (select datepart(year, dt_range.dt) dt_yr,
    datepart(month, dt_range.dt) dt_mo) dt_part
    group by dt_part.dt_yr, dt_part.dt_mo
    order by dt_part.dt_yr, dt_part.dt_mo;

    Output

    dt_yrdt_momo_count
    20191212
    2020165
    2020249
    202035

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Using Steve's good test data, here's a slightly different method, which also provides the MMM DD format requested in the original post.  It's also the very first time that I found EOMONTH() to be useful for much of anything.

     SELECT  MMM_YY = RIGHT(CONVERT(CHAR(9),eom.MonthDate,6),6)
    ,Days = COUNT(eom.MonthDate)
    FROM #CustDetail dtl
    CROSS APPLY dbo.fnTally(0,DATEDIFF(dd,dtl.StartDate,dtl.EndDate)) daycount
    CROSS APPLY (SELECT EOMONTH(DATEADD(dd,daycount.N,dtl.StartDate))) eom (MonthDate)
    GROUP BY eom.MonthDate
    ORDER BY eom.MonthDate
    ;

    Here's the output... my numbers agree with Steve's...

    In case you're wondering, you can get the fnTally function at the similarly named link  in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • WITH cteDates(this, limit)
    AS (
    SELECT MIN(Startdate),
    MAX(EndDate)
    FROM #CustDetail

    UNION ALL

    SELECT DATEADD(DAY, 1, this),
    limit
    FROM cteDates
    WHERE this < limit
    )
    SELECT SUBSTRING(CONVERT(CHAR(9), DATEADD(MONTH, DATEDIFF(MONTH, 0, this), 0), 6), 4, 6),
    COUNT(*)
    FROM #CustDetail AS cd
    INNER JOIN cteDates AS d ON d.this BETWEEN cd.Startdate AND cd.EndDate
    GROUP BY DATEDIFF(MONTH, 0, this)
    ORDER BY DATEDIFF(MONTH, 0, this)
    OPTION (MAXRECURSION 0);

    N 56°04'39.16"
    E 12°55'05.25"

  • I'm actually quite a bit surprised at you using an incremental recursive CTE for this, Peter.  Your code runs in < 1ms like the other two do but your code consumes 493 logical reads instead of just 1 like the other two do.

    For anyone else reading this, please see the following article for why incremental recursive CTEs aren't good for counting (producing a sequence of values).

    https://www.sqlservercentral.com/articles/hidden-rbar-counting-with-recursive-ctes

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My goal was to demonstrate you don't need two cross apply.

     

    SELECT          SUBSTRING(CONVERT(CHAR(9), DATEADD(MONTH, DATEDIFF(MONTH, 0, t.thedate), 0), 6), 4, 6),
    COUNT(*)
    FROM #CustDetail AS cd
    INNER JOIN dbo.TallyDates AS t ON t.thedate BETWEEN cd.Startdate AND cd.EndDate
    GROUP BY DATEDIFF(MONTH, 0, t.thedate)
    ORDER BY DATEDIFF(MONTH, 0, t.thedate);

    • This reply was modified 4 years ago by  SwePeso.

    N 56°04'39.16"
    E 12°55'05.25"

  • Since you only want monthly totals, I don't see any reason for the overhead to create a row for every single day; instead, you really only need a row for each month.  Then again, SQL has fooled me before and is so optimized for joins that maybe the multi-row method actually performs as well or slightly better.

    I don't have the tally function so I used an in-line tally table instead.

    The assumption seemed to be that dates would be fairly close.  I didn't want to rely on that, but to allow for 10 years if needed.  Since 100 months doesn't quite meet that, I allowed for 1000 months just in case.

    I haven't done a performance check -- to see if anything can be tweaked -- but since these are all just math calcs, presumably they will be very fast anyway.

    ;WITH
    cte_tally10 AS (
    SELECT * FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS month#s(month#)
    ),
    cte_tally1000 AS (
    SELECT 0 AS month# UNION ALL
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS month#
    FROM cte_tally10 t1 CROSS JOIN cte_tally10 t2 CROSS JOIN cte_tally10 t3
    )
    SELECT
    DATEADD(MONTH, start_date_month# + t.month#, 0) AS month,
    SUM(DATEDIFF(DAY, curr_month_startdate, curr_month_enddate)) AS days_diff
    FROM #CustDetail
    CROSS APPLY (
    SELECT DATEDIFF(MONTH, 0, Startdate) AS start_date_month#,
    DATEDIFF(MONTH, 0, Enddate) AS end_date_month#
    ) AS calcs1
    INNER JOIN cte_tally1000 t ON t.month# BETWEEN 0 AND end_date_month# - start_date_month#
    CROSS APPLY (
    SELECT
    CASE WHEN t.month# = 0 THEN Startdate ELSE DATEADD(MONTH, start_date_month# + t.month#, 0) END AS curr_month_startdate,
    CASE WHEN start_date_month# + t.month# = end_date_month#
    THEN DATEADD(DAY, 1, Enddate)
    ELSE DATEADD(MONTH, start_date_month# + t.month# + 1, 0) END AS curr_month_enddate
    ) AS calcs2
    GROUP BY
    DATEADD(MONTH, start_date_month# + t.month#, 0)

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • SwePeso wrote:

    My goal was to demonstrate you don't need two cross apply.

    SELECT          SUBSTRING(CONVERT(CHAR(9), DATEADD(MONTH, DATEDIFF(MONTH, 0, t.thedate), 0), 6), 4, 6),
    COUNT(*)
    FROM #CustDetail AS cd
    INNER JOIN dbo.TallyDates AS t ON t.thedate BETWEEN cd.Startdate AND cd.EndDate
    GROUP BY DATEDIFF(MONTH, 0, t.thedate)
    ORDER BY DATEDIFF(MONTH, 0, t.thedate);

    That's cool.  Can you post your dbo.TallyDates table schema?  Thanks, Peter.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wouldn't it be easier to add an ordinal number column to your calendar table? 36525 rows would give you an entire century of dates. Now the number of days between two given dates is a simple integer subtraction.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • great responses and is what I need.  Really appreciate it. Some is over my head at the moment, but working as expected. If I had another request to count the patients by day. From my example above,  a count of each day starting on 2019-12-20:

    count of 1 each day from '2019-12-20'   to '2020-01-01'

    a count of 2 each day from '2020-01-02' to '2020-01-27'

    a count of 3 each day from '2020-01-28' to '2020-01-30'

    a count of 2 each day from '2020-02-01 to '2020-02-20'

     

    same concept or would there be another route you'd take?

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply