Missing Months in a GROUP BY statement

  • I am trying to get a count by product, month, year even if there are is no record for that particular month.

    Current outcome:

    Product Month Year Count

    XYZ January 2014 20

    XYZ February 2014 14

    XYZ April 2014 34

    ...

    Desired outcome:

    Product Month Year Count

    XYZ January 2014 20

    XYZ February 2014 14

    XYZ March 2014 0

    XYZ April 2014 34

    ...

    The join statement is simple:

    Select Product, Month, Year, Count(*) As Count

    From dbo.Products

    Group By Product, Month, Year

    I have also tried the following code and left joining it with my main query but the product is left out as is seen:

    DECLARE @Start DATETIME, @End DATETIME;

    SELECT @StartDate = '20140101', @EndDate = '20141231';

    WITH dt(dt) AS

    (

    SELECT DATEADD(MONTH, n, DATEADD(MONTH, DATEDIFF(MONTH, 0, @Start), 0))

    FROM ( SELECT TOP (DATEDIFF(MONTH, @Start, @End) + 1)

    n = ROW_NUMBER() OVER (ORDER BY [object_id]) - 1

    FROM sys.all_objects ORDER BY [object_id] ) AS n

    )

    2nd attempt:

    Product Month Year Count

    XYZ January 2014 20

    XYZ February 2014 14

    NULL March 2014 0

    XYZ April 2014 34

    ...

    What I want is this (as is shown above). Is this possible?

    Desired outcome:

    Product Month Year Count

    XYZ January 2014 20

    XYZ February 2014 14

    XYZ March 2014 0

    XYZ April 2014 34

    ...

  • With DDL and some consumable sample data, I'm sure you'd get a working solution pretty quickly.

    Barring that, the right track is to use a calendar table:

    http://www.sqlservercentral.com/blogs/dwainsql/2014/03/30/calendar-tables-in-t-sql/


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • To do what you want, you need a Calendar table with all the Months/Years you want, and then outer join that to your Invoices table.

    Something like...

    SELECT c.Year, c.MonthNumber, SUM(i.InvoiceAmount) AS TotalSales

    FROM Calendar c LEFT JOIN Invoice i ON (c.Yr=i.SaleYr AND c.Mon=i.SaleMonth)

    GROUP BY c.Year, c.MonthNumber

  • I do have a calendar table:

    DECLARE @Start DATETIME, @End DATETIME;

    SELECT @StartDate = '20140101', @EndDate = '20141231';

    WITH dt(dt) AS

    (

    SELECT DATEADD(MONTH, n, DATEADD(MONTH, DATEDIFF(MONTH, 0, @Start), 0))

    FROM ( SELECT TOP (DATEDIFF(MONTH, @Start, @End) + 1)

    n = ROW_NUMBER() OVER (ORDER BY [object_id]) - 1

    FROM sys.all_objects ORDER BY [object_id] ) AS n

    )

    However the LEFT OUTER JOIN doesn't work in my case because there is a product in the query.

    This query would work because there is no product involved.

    SELECT c.Year, c.MonthNumber, SUM(P.InvoiceAmount) AS TotalSales

    FROM Calendar c LEFT JOIN dbo.Product P ON (c.Yr=P.SaleYr AND c.Mon=P.SaleMonth)

    GROUP BY c.Year, c.MonthNumber

    However this query wouldn't because I would get a NULL for each month that is absent in the products table for that product:

    SELECT P.Product, c.Year, c.MonthNumber, SUM(P.InvoiceAmount) AS TotalSales

    FROM Calendar c LEFT JOIN dbo.Product P ON (c.Yr=P.SaleYr AND c.Mon=P.SaleMonth)

    GROUP BY P.Product, c.Year, c.MonthNumber

    Hence these results:

    Product Month Year Count

    XYZ January 2014 20

    XYZ February 2014 14

    NULL March 2014 0

    XYZ April 2014 34

    ...

    The link also takes into consideration only the dates of in the GROUP BY clause and not any additional fields.

  • Not to repeat myself or nuthin'

    dwain.c (1/20/2015)


    With DDL and some consumable sample data, I'm sure you'd get a working solution pretty quickly.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I think I've found the solution but unfortunately it involves a CROSS JOIN. Let's say we have a Products table:

    dbo.Products

    Product

    Year

    Month

    Count

    and a Calendar table that was created on the fly (details given in my original post):

    dbo.Date

    Year

    Month

    Here is the query that I built and so far it's working. It appears that the solution is to isolate the distinct products and then cross join them to the date/calendar table after which you do a left join on the entire product table in order to get the aggregrate sums that you need.

    Select DP.Product, D.Year, D.Month, SUM(P.[Count]) AS COUNT

    From (Select Distinct Product

    From dbo.Product) DP

    CROSS JOIN dbo.Date D

    LEFT OUTER JOIN dbo.Product P ON (DP.Product = P.Product AND D.[Year] = P.[Year] AND D.[Month] = P.[Month])

    GROUP BY DP.Product, D.Year, D.Month

    Any constructive criticism here? I'm not a fan of the fact that I have to use a CROSS JOIN but I see no other way. What do you think?

  • dwain.c (1/20/2015)


    Not to repeat myself or nuthin'

    dwain.c (1/20/2015)


    With DDL and some consumable sample data, I'm sure you'd get a working solution pretty quickly.

    I suggest you read this:

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    I am not a fan of CROSS JOINs either.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Since you're new, we'll try to be gentle... no guarantees though. Help yourself by reading Jeff's article "Forum Etiquette: How to post data/code on a forum to get the best help"

    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    We're not trying to be difficult, it's just that we can't really fix what we can't see. If you follow Jeff's instructions, you'll find that you're much more likely to get a working, tested solution. It's not that folks don't want to help, it's just that you haven't provided enough information for anyone to recreate your problem, so they can't help.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply