September 8, 2013 at 9:44 pm
What would be the best way to write a sql query where the result set contains a group by where one column is grouped by one set of columns and another column is grouped by another set of columns and still have the two different group by appear on the same row.
Lets say there is a table such as follows:
IF OBJECT_ID('tempdb..#a') IS NOT NULL DROP TABLE #a
CREATE TABLE #a (Grp_A CHAR(2), Grp_B CHAR(2), Grp_C CHAR(2), Val_A TINYINT, Val_B TINYINT)
Sample Data:
INSERT INTO #a SELECT 'A1', 'B1', 'C1', 1, 1
INSERT INTO #a SELECT 'A1', 'B1', 'C2', 1, 1
INSERT INTO #a SELECT 'A1', 'B2', 'C3', 1, 1
INSERT INTO #a SELECT 'A2', 'B2', 'C4', 1, 1
The desired result set for the above would be:
SELECT A1.Grp_A, A1.Grp_B, A1.Grp_C, SUM(Val_A) AS GrpBy_ABC, A2.GrpBy_AB
FROM #a A1
INNER JOIN (
SELECT A.Grp_A, A.Grp_B, SUM(Val_B) AS GrpBy_AB FROM #a A GROUP BY A.Grp_A, A.Grp_B
) AS A2
ON (A1.Grp_A = A2.Grp_A AND A1.Grp_B = A2.Grp_B)
GROUP BY A1.Grp_A, A1.Grp_B, A1.Grp_C, A2.GrpBy_AB
While the above query gives me the desired result, it basically involves two scans on the base table which gives me the following concern. For example, if the base table contains a few million rows and if there are around 6 Group by columns, it can be a pretty expensive query cos there would be two huge scans and a join involving 6 columns.
Using grouping sets helps me to do one scan on the base table and get the aggregates.
SELECT GROUPING_ID(A1.Grp_A, A1.Grp_B, A1.Grp_C) AS GroupId,
A1.Grp_A, A1.Grp_B, A1.Grp_C,
SUM(Val_A) AS GrpBy_ABC, SUM(Val_B) AS GrpBy_AB
FROM #a A1
GROUP BY GROUPING SETS(
(A1.Grp_A, A1.Grp_B, A1.Grp_C),
(A1.Grp_A, A1.Grp_B)
)
But the output is not in the desired format since the second GROUPING sET information is not returned on the same row but on a new row. And to get the output into desired format, I will have to do a self join the intermediate output obtained by GROUPING SETS which in essence results in two more 2 huge scans and ends up being worse than the above query using self join on the base table.
So what can be the best way to approach this query? This would be part of a Load query, so while there is no requirement to have the query run in a few seconds, I would still like to do it in the best way possible.
P.S. Do other DBMS or SQL dialects allow for such a query to be written? Basically something like this;
SELECT A.Grp_A, A.Grp_B, A.Grp_C,
SUM(Val_A GROUP BY A.Grp_A, A.Grp_B, A.Grp_C) AS GrpBy_ABC,
SUM(Val_B GROUP BY A.Grp_A, A.Grp_B) AS GrpBy_AB
FROM #a A
In essence, there is only one scan and there is no need for a self join.
September 9, 2013 at 4:37 am
Probably not the best way, but seems to work
SELECT DISTINCT
Grp_A, Grp_B, Grp_C,
SUM(Val_A) OVER(PARTITION BY Grp_A, Grp_B, Grp_C) AS GrpBy_ABC,
SUM(Val_B) OVER(PARTITION BY Grp_A, Grp_B) AS GrpBy_AB
FROM #a
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 9, 2013 at 6:46 am
Mark-101232 (9/9/2013)
Probably not the best way, but seems to work
SELECT DISTINCT
Grp_A, Grp_B, Grp_C,
SUM(Val_A) OVER(PARTITION BY Grp_A, Grp_B, Grp_C) AS GrpBy_ABC,
SUM(Val_B) OVER(PARTITION BY Grp_A, Grp_B) AS GrpBy_AB
FROM #a
So does this:
SELECT
d.Grp_A, d.Grp_B, d.Grp_C,
d.SUM_Val_A,
SUM_Val_B = SUM(d.PartSUM_Val_B) OVER (PARTITION BY d.Grp_A, d.Grp_B)
FROM (
SELECT
A1.Grp_A, A1.Grp_B, A1.Grp_C,
SUM(Val_A) AS SUM_Val_A,
SUM(Val_B) AS PartSUM_Val_B
FROM #a A1
GROUP BY A1.Grp_A, A1.Grp_B, A1.Grp_C
) d
both queries read the base table once then use table spools for the dirty work.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 9, 2013 at 7:50 pm
Mark and Chris...thank you for your input. I havent yet got a chance to test them; but they do look promising.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply