Group by issue

  • Hi All,

    It's been a while not being here, and my SQL has gotten a little rusty in days of old. Now back in the thick of it so apologies if I at first have a few questions.

    So I have a query where I need to find out an incident that has happened many times when it should only happen once.

    I had the original query where it was done for the last 180 days by person ref and ref number. The id is unique for each time they submitted something to the reference number, this should only have one ID but has happened multiple times.

    I need to look at how many of these happen each month to see if we have a major problem or not. So the first query identified a count of the id based on person ref and ref number for the last 180 days where the count is more than 1. So far I have adapted the query to this:

    select count(distinct a.id) as 'Total',

    month(a.create_date) as 'Month',

    year(a.CREATE_DATE) as 'Year'

    --a.person_ref

    --b.ref_number

    from tableA a

    inner join tableB b

    on a.id = b.id

    where

    year(a.CREATE_DATE) > '2013'

    group by a.person_ref, b.ref_number,

    year(a.CREATE_DATE), month(a.create_date)

    having count(distinct a.id) > 1

    order by year(a.create_date),count(*) desc

    I have tried with rollup and tried to imbed another query in this. Both of which gave me gobbledegook (technical term).

    Ideally what I want is per month and year a total of a count of the id's where they are more than one.

    Thanks for any assistance, and if I didn't give enough info let me know.

  • Actually, that query appears to fit your stated understanding quite well. However, saying that what you got out of it was "gobbledegook" doesn't actually help us determine what went wrong. We have no sample data and no CREATE statements for the tables in question, so I don't see any obvious answer, nor any way to derive one. Please help us help you. While you might not be able to provide sample data, knowing the table structure by you scripting out the tables involved might give us some clue.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Hi Steve,

    Dang they keep calling me an expert don't tell me I did something right? 😉

    When I added with rollup to the end of the group by piece, it listed some columns as having null values, some had a month in the month column, some had a year in, some had figures. Very odd as my immediate thought was rollup but it didn't work quite well enough.

    Sample data from the query looks like this:

    TotalMonth Year

    1212015

    1112015

    912015

    812015

    812015

    812015

    812015

    712015

    712015

    712015

    712015

    712015

    I am getting into this lovely stuff again, when you say table structure what do you need? Column types or how the tables fit together for example?

  • Take the following OUT of your GROUP BY clause:

    a.person_ref, b.ref_number,

    That should solve it. Dang, I didn't notice this earlier...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • karen.blake (6/10/2015)


    When I added with rollup to the end of the group by piece, it listed some columns as having null values, some had a month in the month column, some had a year in, some had figures. Very odd as my immediate thought was rollup but it didn't work quite well enough.

    NULL values when you use ROLLUP can represent one of two things:

    1) your data has NULL values and the group is on this NULL value.

    2) your data is being aggregated for that particular column in that particular row.

    So if you are calculating a ROLLUP on year, month, and day, your ROLLUP data will include

    1) One row for each unique year, month, and day.

    2) One row for each unique year and month. The day column will be NULL in this case.

    3) One row for each unique year. Both the month and day columns will be NULL in this case.

    4) One row for the grand total. The year, month, and day columns will all be NULL in this case.

    CUBE behaves similarly, but it uses all combinations of columns instead of the nested sets like ROLLUP.

    You can use the GROUPING(column_name) function to determine if the current item represents a NULL value in the data or an aggregate on that column and change what you output when it is an aggregate.

    https://technet.microsoft.com/en-us/library/ms178544(v=sql.110).aspx

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • karen.blake (6/10/2015)


    I am getting into this lovely stuff again, when you say table structure what do you need? Column types or how the tables fit together for example?

    Hi and welcome back.

    The most helpful method to most of the denizens of this site can be found at the first article under "Helpful Articles" in my signature line below. If you ever have a performance issue, see the link below that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sgmunson (6/10/2015)


    Take the following OUT of your GROUP BY clause:

    a.person_ref, b.ref_number,

    That should solve it. Dang, I didn't notice this earlier...

    Apologies I was running the query per month just to get the data out that was needed and shelved this for the min.

    I did try this (should have said) but because we are looking for multiples again the same person ref and ref number this ended up doing a count of everything in the tables as opposed to specifically identifying the multiple entries where this issue occurred. The problem occurs with multiple incident id's against the same person_ref and ref_number. There should only be 1.

    Drew, thanks I will take a read of that. I did try cube as well, it's bugging me as I know it has to be something simple!

    Jeff, thank you I am going to take the time to read through all the stuff :)otherwise what I am posting will be gobbledegook!

    Thank you all for your help

  • Then just add HAVING COUNT(*) > 1 just after your GROUP BY and you will see only those months that have more than one record contribute....

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply