Avoid grouping twice on a query

Question

Post reply

Avoid grouping twice on a query

Nelson Casanova

SSCommitted

Points: 1570
More actions
February 25, 2016 at 5:46 am

#308252

I got a problem doing a multiple join to retrieve some data...
Tables look like:
Employees
EmpID Name
1 ANDREW
2 STEPHEN
3 DAVID
Companies
ComID Name
1 Company1
2 Company2
Payments
PayID Date
1 2016-02-25
UserPayments
RecID PayID ComID EmpID WeekNo Amt1 Amt2
1 1 1 1 1 20.5 20.5
2 1 1 1 2 20.5 20.5
3 1 1 1 3 20.5 20.5
4 1 1 1 4 20.5 20.5
5 1 2 2 1 20.5 20.5
6 1 2 2 2 20.5 20.5
7 1 2 2 3 20.5 20.5
8 1 2 2 4 20.5 20.5
9 1 2 3 1 20.5 20.5
10 1 2 3 2 20.5 20.5
11 1 2 3 3 20.5 20.5
12 1 2 3 4 20.5 20.5
Result
PayID EmpName UserName Amt1Total Amt2Total
1 Company1 ANDREW 82 82
1 Company2 STEPHEN 82 82
1 Company2 DAVID 82 82
So my problem is that my query need to filter twice and group within a subquery and then use DISTINCT not to show every record x4...
I thought I would be an easy task but so far I've tried several hours and still not seeing an alternative query for doing this...
Here's what I got now.
SELECT DISTINCT
p.PayID,
e.Name,
c.Name,
xx.a,
xx.b
FROM
UserPayments p
LEFT JOIN
Employees e
ON p.EmpID = e.EmpID
LEFT JOIN
Companies c
ON p.ComID = c.ComID
LEFT JOIN
(SELECT
p.EmpID,
SUM(p.Amt1) a,
SUM(p.Amt2) b
FROM
UserPayments p
WHERE p.PayID = 1
GROUP BY p.EmpID
) xx
ON p.EmpID = xx.EmpID
WHERE p.PayID = 1
Whis query works but its just the nature of using DISTINCT and having the aggregate SUMs on a subquery join... I don't know... Is it the most effective way of doing this?
Regards
My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

tindog SSCarpal Tunnel Points: 4892 More actions · Answer 1

After adding your sample data into some temporary tables, this returned the same results as above:

SELECTp.PayID,

e.Name,

c.Name,

SUM(p.Amt1) SumAmt1,

SUM(p.Amt2) SumAmt2

FROM#UserPayments p

INNER JOIN#Employees e ON p.EmpID = e.EmpID

INNER JOIN#Companies c ON p.ComID = c.CID

WHEREp.PayID = 1

GROUP BYp.PayID,

e.Name,

c.Name

It doesn't look like you need a subquery.

Note: you may need to change the 'joins' as I've used INNER above - is LEFT necessary?

Nelson Casanova SSCommitted Points: 1570 More actions · Answer 2

tindog (2/25/2016)
After adding your sample data into some temporary tables, this returned the same results as above:
SELECTp.PayID,
e.Name,
c.Name,
SUM(p.Amt1) SumAmt1,
SUM(p.Amt2) SumAmt2
FROM#UserPayments p
INNER JOIN#Employees e ON p.EmpID = e.EmpID
INNER JOIN#Companies c ON p.ComID = c.CID
WHEREp.PayID = 1
GROUP BYp.PayID,
e.Name,
c.Name
It doesn't look like you need a subquery.
Note: you may need to change the 'joins' as I've used INNER above - is LEFT necessary?

Hey there, thanks for your reply...

I did that originally, but having so many fields on the GROUP BY clause makes it look even dodgier....

Look (with the original table and field names (same thing though):

SELECT

tcs.CRSEntryNo,

tp.PersonNo,

tp.PenNo,

tp.LName,

tp.FName,

te.EmployerName,

tcs.DocType,

SUM(tcs.EmployeeValue) AS Employee,

SUM(tcs.EmployerValue) AS Employer

FROM

TblCrsSub tcs

INNER JOIN

TblPersonal tp ON tcs.PenNo = tp.PenNo

INNER JOIN

TblEmployer te ON tcs.Ref = te.EmployerCode

WHERE

tcs.CRSEntryNo = 4045

GROUP BY

tcs.CRSEntryNo,

tp.PersonNo,

tp.PenNo,

tp.LName,

tp.FName,

te.EmployerName,

tcs.DocType

But whats even worst, if I do something like:

SELECT

CRSENtryNo,

SUM(EmployeeValue) AS Tee,

SUM(EmployerValue) AS Ter

FROM TblCrsSub

WHERE CRSEntryNo = 4045

GROUP BY CRSEntryNo

I get different values for both aggregated functions (?) I'm currently going through the records to see why the difference is happening

My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

tindog SSCarpal Tunnel Points: 4892 More actions · Answer 3

What do you mean by 'looks dodgier'? Does it not return the correct data?

What's the issue with the first query in your recent post? Also, the second query can't be right, it's selecting everything and grouping by nothing.

Nelson Casanova SSCommitted Points: 1570 More actions · Answer 4

tindog (2/25/2016)
What do you mean by 'looks dodgier'? Does it not return the correct data?
What's the issue with the first query in your recent post? Also, the second query can't be right, it's selecting everything and grouping by nothing.

Its nothing, just the way it looks I think, GROUP BY everything, you know... Maybe i'm just biased since thats the way queries look like when you use SSMS query builder...

And yes, they do produce different total amounts, I fixed the second query now...

My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

Dodgier according to who?

If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.

WITH cteCrsSub AS(

SELECT tcs.CRSEntryNo,

tcs.PenNo,

tcs.Ref,

SUM(tcs.EmployeeValue) AS Employee,

SUM(tcs.EmployerValue) AS Employer

FROM TblCrsSub tcs

WHERE

tcs.CRSEntryNo = 4045

GROUP BY tcs.CRSEntryNo,

tcs.PenNo,

tcs.Ref

)

SELECT

tcs.CRSEntryNo,

tp.PersonNo,

tp.PenNo,

tp.LName,

tp.FName,

te.EmployerName,

tcs.DocType,

tcs.Employee,

tcs.Employer

FROM cteCrsSub tcs

JOIN TblPersonal tp ON tcs.PenNo = tp.PenNo

JOIN TblEmployer te ON tcs.Ref = te.EmployerCode;

I used a CTE because I find derived tables dodgy. 😀

Your initial problem was that you were including the table twice, once aggregated and once as it is.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

tindog SSCarpal Tunnel Points: 4892 More actions · Answer 6

sys.user (2/25/2016)
tindog (2/25/2016)
What do you mean by 'looks dodgier'? Does it not return the correct data?
What's the issue with the first query in your recent post? Also, the second query can't be right, it's selecting everything and grouping by nothing.
Its nothing, just the way it looks I think, GROUP BY everything, you know... Maybe i'm just biased since thats the way queries look like when you use SSMS query builder...
And yes, they do produce different total amounts, I fixed the second query now...

Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.

Can you provide sample data for those tables - they look to be different to the tables in your original post.

Nelson Casanova SSCommitted Points: 1570 More actions · Answer 7

Luis Cazares (2/25/2016)
Dodgier according to who?
If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.
I used a CTE because I find derived tables dodgy. 😀
Your initial problem was that you were including the table twice, once aggregated and once as it is.

tindog (2/25/2016)
Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.
Can you provide sample data for those tables - they look to be different to the tables in your original post.

I'm fully aware of how esoterical it sounds when I evaluate a query by the way it looks, but you know it was one of those moments... If its right or normal in this case to group by a bunch of fields why wouldnt it be better to just stick to use DISTINCT on all the values but the total and get the total from an aggregate subquery?

My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]

tindog SSCarpal Tunnel Points: 4892 More actions · Answer 8

sys.user (2/25/2016)
Luis Cazares (2/25/2016)
Dodgier according to who?
If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.
I used a CTE because I find derived tables dodgy. 😀
Your initial problem was that you were including the table twice, once aggregated and once as it is.
tindog (2/25/2016)
Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.
Can you provide sample data for those tables - they look to be different to the tables in your original post.
I'm fully aware of how esoterical it sounds when I evaluate a query by the way it looks, but you know it was one of those moments... If its right or normal in this case to group by a bunch of fields why wouldnt it be better to just stick to use DISTINCT on all the values but the total and get the total from an aggregate subquery?

Because if you query the table twice unnecessarily, performance will decrease.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 9

sys.user (2/25/2016)
Luis Cazares (2/25/2016)
Dodgier according to who?
If your problem is that you have too many columns in the GROUP BY clause, you could use a different approach.
I used a CTE because I find derived tables dodgy. 😀
Your initial problem was that you were including the table twice, once aggregated and once as it is.
tindog (2/25/2016)
Is making a query look pretty really worth your time and possible performance sacrifices? As long as it's readable and performs well, you're all set.
Can you provide sample data for those tables - they look to be different to the tables in your original post.
I'm fully aware of how esoterical it sounds when I evaluate a query by the way it looks, but you know it was one of those moments... If its right or normal in this case to group by a bunch of fields why wouldnt it be better to just stick to use DISTINCT on all the values but the total and get the total from an aggregate subquery?

Because DISTINCT is actually a common indicator of a problem, people use it to mask problems caused by incorrect joins. You were reading the table twice and then discarded the unnecessary duplicates caused by the additional table read.

As I showed, the subquery wasn't the problem and you could use it without problems, I just prefer to use it in the form of a CTE which leaves the main query "clean".

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Nelson Casanova SSCommitted Points: 1570 More actions · Answer 10

Thats alright then... Thanks both for your help.

My stack: VS2015 ~ C# ~ MSSQL 2014[/url] ~ DevCraft ~ ReSharper ~ DataGrip ~ Linqer[/url]