Count Group By and CASE

  • Today I learned that Group and Count do not pay attention to "THEN" in the case statement. I thought it would count what I set the value to with "THEN". Woops. I want my output counts to look like this:

    Commercial: 4

    Medicare: 1

    Medicaid: 1

    But as you can see with this sample below I get two rows for "Commercial" since there is technically a count of 4 with Plan1 & Plan11.

    I found something close to what I was doing but could not get it to work with my data.

    Here is sample data:

    IF OBJECT_ID('tempdb..#lab') IS NOT NULL DROP TABLE #lab
    SELECT * INTO #lab FROM (VALUES
    ( 'Plan1' ),
    ( 'Plan2' ),
    ( 'Plan3' ),
    ( 'Plan11' ),
    ( 'Plan11' ),
    ( 'Plan11' )) d

    ( Plans )

    SELECT

    case
    when x.Plans = 'Plan1' then 'Commercial'
    when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan3' then 'Medicaid'
    end as Plans
    ,count(*) [Plan Totals]
    FROM #lab x
    GROUP BY x.Plans

    -Mark
    MSSQL 2019 Standard, Azure Hosted. Techie/Sysadmin by trade; Three years as a "DBA" now.

  • The reason you have an issue is because you are grouping on x.Plans - and not the result of the case expression.  Change it to:

    SELECT
    case
    when x.Plans = 'Plan1' then 'Commercial'
    when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan3' then 'Medicaid'
    end as Plans
    ,count(*) [Plan Totals]
    FROM #lab x
    GROUP BY
    case
    when x.Plans = 'Plan1' then 'Commercial'
    when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan3' then 'Medicaid'
    end

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks, Jeffery. Learned something new, again. I've applied this to my actual query and it works, as I had hoped now.

    -Mark
    MSSQL 2019 Standard, Azure Hosted. Techie/Sysadmin by trade; Three years as a "DBA" now.

  • Disregard. I found my error. 🙂

    • This reply was modified 3 years, 6 months ago by  usererror.

    -Mark
    MSSQL 2019 Standard, Azure Hosted. Techie/Sysadmin by trade; Three years as a "DBA" now.

  • I was looking at the result as well and it looks good.

    But I did notice that the order in the group by clause (case statement) has to be in exactly the same order. You can change the order but however you change it, the select order must be exactly the same as the group by order.

    Here I just swap the middle two rows in the group by clause:

    SELECT
    case
    when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan3' then 'Medicaid'
    when x.Plans = 'Plan1' then 'Commercial'
    end as Plans
    ,count(*) [Plan Totals]
    FROM #lab x
    GROUP BY
    case
    when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan3' then 'Medicaid'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan1' then 'Commercial'
    end

    I get the following error:

    Msg 8120, Level 16, State 1, Line 28
    Column '#lab.Plans' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
    Msg 8120, Level 16, State 1, Line 29
    Column '#lab.Plans' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
    Msg 8120, Level 16, State 1, Line 30
    Column '#lab.Plans' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
    Msg 8120, Level 16, State 1, Line 31
    Column '#lab.Plans' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    Why is that?

  • Because of the way CASE expressions work - a case expression will stop when it hits a true condition, and changing the order of the WHEN portion can change the results of the expression.

    To make it easier - we can modify the query and use one of several techniques to DRY (don't repeat yourself) the code.  We can use a CTE or derived table or CROSS APPLY.  Here is an example using a CTE:

      WITH results
    AS (
    SELECT case when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan3' then 'Medicaid'
    when x.Plans = 'Plan1' then 'Commercial'
    end as Plans
    FROM #lab x
    )
    Select Plans
    , [Plan Totals] = count(*)
    From results
    GROUP BY
    Plans;

     

    • This reply was modified 3 years, 6 months ago by  Jeffrey Williams. Reason: Moved the count(*) to the outer query where it should be

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks for your help on this. I too realized my two CASE statements had to be exactly identical, and they were not.

    @jeff - I thought about the CTE method, but I've only used written one CTE successfully on my own to this point a few months ago...I'll keep monkeying around with the example you sent.

    -Mark
    MSSQL 2019 Standard, Azure Hosted. Techie/Sysadmin by trade; Three years as a "DBA" now.

  • Right.

    But we are changing the order of the select (or the Group By) not the when clause. And if you do - you get an error that says that #lab.Plans doesn't exist in the select list.

    But it is in the list, just in a different order in the order by clause.

  • tshad wrote:

    Right.

    But we are changing the order of the select (or the Group By) not the when clause. And if you do - you get an error that says that #lab.Plans doesn't exist in the select list.

    But it is in the list, just in a different order in the order by clause.

    You changed the CASE expression - swapping the WHEN statements, which make them different expressions and SQL Server generates the errors because the CASE expression in the SELECT statement doesn't match the CASE expression in the GROUP BY, and therefore the results could be different.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • usererror wrote:

    @Jeff - I thought about the CTE method, but I've only used written one CTE successfully on my own to this point a few months ago...I'll keep monkeying around with the example you sent.

    The form of CTE that I showed is the simplest - it is really just an inline-view.  It is the same usage as a derived table - and for me, it is easier to read and understand than using derived tables.

    Here is the same query using a derived table:

    Select Plans
    , [Plan Totals] = count(*)
    From (
    SELECT case when x.Plans = 'Plan11' then 'Commercial'
    when x.Plans = 'Plan2' then 'Medicare'
    when x.Plans = 'Plan3' then 'Medicaid'
    when x.Plans = 'Plan1' then 'Commercial'
    end as Plans
    ) As t
    GROUP BY
    Plans;

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I see.

    So you are saying that it is just comparing the text of the case statement in the select clause with the text of the case statement in the group by clause. They have to be exactly the same. Has nothing to do with the results.

    Thanks.

  • >> Today I learned that GROUP BY and COUNT() do not pay attention to "THEN" in the CASE statement [sic]. I thought it would count what I set the value to with "THEN". <<

    The first problem is that you think case is a statement; no, it's an expression. Expressions return a single scalar value and have nothing to do with flow of control or anything else. Here's a quick rewrite of your code:

    INSERT INTO Foobar (plan_code)

    VALUES

    ('Plan1'),

    ('Plan2'),

    ('Plan3'),

    ('Plan11');

    SELECT

    CASE Foobar.plan_code

    WHEN = 'Plan1' THEN 'Commercial'

    WHEN = 'Plan11' THEN 'Commercial'

    WHEN = 'Plan2' THEN 'Medicare'

    WHEN = 'Plan3' THEN 'Medicaid'

    ELSE NULL END AS plan_category,

    COUNT (*) AS plan_counts

    FROM Foobar AS X

    GROUP BY X.plan_category;

    I you did your grouping on the wrong column; you wanted counts by category and not by the code.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • Just to pile on, COUNT(expr) only counts non-null values, while Count(*)   counts rows.    See below.

    declare @table table (RowID int, RowData varchar(20))
    insert into @table
    values (1,'Hi Mom'), (2, NULL)

    select * from @table
    select count(*) as [Count(*)], count(RowData) as [Count(RowDate)], sum(1) as [Sum(1)]
    from @table

     

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply