Union query displays multiple records after Group By

  • I have a query as below:

    SELECT district_nm, urban_appr, urban_in, NULL, NULL, NULL, NULL

    FROM dbo.YourTable WHERE scheme_type = 'U'

    UNION

    SELECT district_nm, NULL, NULL, rural_appr, rural_in, NULL, NULL

    FROM dbo.YourTable WHERE scheme_type = 'R'

    UNION

    SELECT district_nm, NULL, NULL, NULL, NULL, ground_appr, ground_in

    FROM dbo.YourTable WHERE scheme_type = 'E'

    GROUP BY district_nm;

    Suppose, any district has records for all categories, i.e., U, R and E, than the above query displays multiple records for a district for each category. I want all records in a single row for a particular district.

    "Here is a test to find out whether your mission in life is complete. If you're alive, it isn't. "

    Richard Bach

  • please prvode some set up scripts, sample data that demonstrates your problem and the results that you require.....am sure you will then get a tried and tested answer.

    regards

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Data is not possible for me, but I am attaching output that I am getting when below given query is used:

    SELECT district_nm,Count(urban_appr) as urban_appr, Count(urban_inprogress) as urban_in, NULL, NULL, NULL, NULL

    FROM dbo.SchemeReports WHERE scheme_type = 'Urban'

    GROUP BY district_nm

    UNION

    SELECT district_nm,NULL, NULL, Count(rural_appr) as rural_appr, Count(rural_inprogress) as rural_in, NULL, NULL

    FROM dbo.SchemeReports WHERE scheme_type = 'Rural'

    GROUP BY district_nm

    UNION

    SELECT district_nm,NULL, NULL, NULL, NULL, Count(esthal_appr) as esthal_appr, Count(esthal_inprogress) as ground_in

    FROM dbo.SchemeReports WHERE scheme_type = 'Esthal'

    GROUP BY district_nm

    ORDER BY district_nm;

    "Here is a test to find out whether your mission in life is complete. If you're alive, it isn't. "

    Richard Bach

  • @RDBMS (11/5/2011)


    Data is not possible for me, but I am attaching output that I am getting when below given query is used:

    it doesn't have to be real data...just some sample data...you can make this up so that it demonstrates your real data.

    surely you can post the table definition?....rename the columns if needed to maintain confidentiality

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I'm not quite certain what you want to do; GROUP BY implies aggregation, but you have no aggregate functions (MIN, MAX, COUNT, etc.) present. I'm also not sure if UNION is what you're looking for; UNION implicitly performs aggregation and removes duplicates. If that's not the behavior you want (or need), then make sure to use UNION ALL; this won't remove duplicates and will perform faster. Without more details on your data, it's hard to give you an exact solution. However, here's one possibility:

    SELECT district_nm,

    MAX(urban_appr) AS urban_appr,

    MAX(urban_in) as urban_in,

    MAX(rural_appr) as rural_appr,

    MAX(rural_in) as rural_in,

    MAX(ground_appr) as ground_appr,

    MAX(ground_in) as ground_in

    FROM

    (

    SELECT district_nm, urban_appr, urban_in, 0 AS rural_appr, 0 AS rural_in, 0 AS ground_appr, 0 AS ground_in

    FROM dbo.YourTable WHERE scheme_type = 'U'

    UNION ALL

    SELECT district_nm, 0 AS urban_appr, 0 AS urban_in, rural_appr, rural_in, 0 AS ground_appr, 0 AS ground_in

    FROM dbo.YourTable WHERE scheme_type = 'R'

    UNION ALL

    SELECT district_nm, 0 AS urban_appr, 0 AS urban_in, 0 AS rural_appr, 0 AS rural_in, ground_appr, ground_in

    FROM dbo.YourTable WHERE scheme_type = 'E'

    ) AS A

    GROUP BY A.district_nm;

    A few things to note in this example:

    • I'm assuming that only one row will be returned for each scheme_type; using MAX() as the aggregate function would provide incorrect results if this isn't really the case.
    • The UNIONed SELECTS are used as a derived table for the GROUP BY to operate on
    • I've replaced the NULL values with 0 in each SELECT; performing aggregate functions on columns that include NULLs usually isn't a good idea, as the NULL values will be excluded. In this particular case that's probably OK, but personally, I like to ensure that the aggregate function is working on a result set that contains only known values (i.e. no NULLs)
    • Note the aliases being used for the columns set to 0 (previously NULL) in each SELECT statement. This ensures that each column has a name in the resulting derived table, so that you can refer to them in the MAX() functions
    • The use of UNION ALL in order to avoid unneeded aggregation within the derived table

    Again, this makes a lot of presuppositions about your actual data and the business rules you're trying to implement, but hopefully it will get you started in the right direction. If you need further assistance, please do post example data and schema definitions as well.

  • Drop the UNIONs and read the source table only once:

    SELECT d.district_nm,

    urban_appr = SUM(COUNT_urban_appr),

    urban_in = SUM(COUNT_urban_inprogress),

    rural_appr = SUM(COUNT_rural_appr),

    rural_in = SUM(COUNT_rural_inprogress),

    esthal_appr = SUM(COUNT_esthal_appr),

    ground_in = SUM(COUNT_esthal_inprogress)

    FROM (

    SELECT

    district_nm,

    scheme_type,

    COUNT_urban_appr= SUM(CASE WHEN urban_appr IS NULL THEN 0 ELSE 1 END),

    COUNT_urban_inprogress= SUM(CASE WHEN urban_inprogress IS NULL THEN 0 ELSE 1 END),

    COUNT_rural_appr= SUM(CASE WHEN rural_appr IS NULL THEN 0 ELSE 1 END),

    COUNT_rural_inprogress= SUM(CASE WHEN rural_inprogress IS NULL THEN 0 ELSE 1 END),

    COUNT_esthal_appr= SUM(CASE WHEN esthal_appr IS NULL THEN 0 ELSE 1 END),

    COUNT_esthal_inprogress = SUM(CASE WHEN esthal_inprogress IS NULL THEN 0 ELSE 1 END)

    FROM dbo.SchemeReports

    WHERE scheme_type IN ('Urban','Rural','Esthal')

    GROUP BY district_nm, scheme_type

    ) d

    GROUP BY district_nm


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • @chrism-2 definitely, that makes more sense. Although, I'm curious about the execution plans; IIRC, for UNION structures like this, I think I've seen the optimizer choose to do a single read of the source data, rearranging the predicates of each SELECT into a single predicate along the lines of scheme_type = 'U' OR scheme_type = 'R' OR scheme_type = 'E' (where each SELECT is being run against the same source table, of course). However, I'm not sure if it would be able to do that in all situations or not. I'm not in front of SSMS right now or I'd test it.

    That said, even if the two execution plans were to be the same, I don't think it hurts to give the optimizer a shove in the right direction, so I suspect your approach is the safest one.

  • I guess you missed my second post above where I added complete code with COUNT and image as attachment.

    "Here is a test to find out whether your mission in life is complete. If you're alive, it isn't. "

    Richard Bach

  • @RDBMS I sure did, sorry! My bad!

  • @RDBMS

    have you now got a solution?

    regards

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I am struggling to create a sample data for you.

    "Here is a test to find out whether your mission in life is complete. If you're alive, it isn't. "

    Richard Bach

  • ...ok, so assume you havent got a solution yet?

    can you post the table "create" script.....maybe we can generate some sample data for you.

    In SSMS...right click on your table > Script table as > Create to

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Have you tested the solution I posted yet? There's a reason for choosing SUM() over COUNT(*) which I'm expecting you will have to use.

    Oh, and it's all an Illusion...


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply