Group By Help

JayWinter

Right there with Babe

Points: 769
July 30, 2012 at 4:52 pm

#392856

I have a script that uses a GROUP BY clause and returns a SUM much greater than expected. Below are the results with and without GROUP BY, followed by the script with GROUP BY and then the script without it. The value returned by SUM(RETAIL_SALES.[Sales Units LW]) as [Sales Units LW] is 18426. The correct value should be 6142. I appreciate your help.

Result with GROUP BY:

STYLE    COLOR  SEASON  YR    MO  WK  Sales Units LW
HK87202  FWHG   2012    2012  7   4         18426.00

Result without GROUP BY:

STYLE    COLOR  SEASON  YR    MO  WK  Sales Units LW
HK87202  FWHG   2012    2012  7   4              366
HK87202  FWHG   2012    2012  7   4              796
HK87202  FWHG   2012    2012  7   4             1189
HK87202  FWHG   2012    2012  7   4             1814
HK87202  FWHG   2012    2012  7   4             1977
TOTAL                                           6142

SELECT DISTINCT
ITEMMAST.STYLE as STYLE
,ITEMMAST.COLOR as COLOR
,Max(ITEMMAST.SEASON) as SEASON
,RETAIL_SALES.YR
,RETAIL_SALES.MO
,RETAIL_SALES.WK
,SUM(RETAIL_SALES.[Sales Units LW]) as [Sales Units LW]
FROM Evy_RH_Objects.dbo.RETAIL_SALES RETAIL_SALES
LEFT OUTER JOIN RH2007_EvyLive.dbo.ITEMMAST ITEMMAST on ITEMMAST.CUSTNO='WALM01' and (ITEMMAST.SKU=RETAIL_SALES.SKU or ITEMMAST.ITEMUPC=RETAIL_SALES.SKU)
WHERE
RETAIL_SALES.CUST_NO='WALM01'
and RETAIL_SALES.WK=4
and ITEMMAST.STYLE='HK87202'
GROUP BY ITEMMAST.STYLE, ITEMMAST.COLOR, RETAIL_SALES.YR, RETAIL_SALES.MO, RETAIL_SALES.WK
=================================================================================
SELECT DISTINCT
ITEMMAST.STYLE as STYLE
,ITEMMAST.COLOR as COLOR
,ITEMMAST.SEASON as SEASON
,RETAIL_SALES.YR
,RETAIL_SALES.MO
,RETAIL_SALES.WK
,RETAIL_SALES.[Sales Units LW] as [Sales Units LW]
FROM Evy_RH_Objects.dbo.RETAIL_SALES RETAIL_SALES
LEFT OUTER JOIN RH2007_EvyLive.dbo.ITEMMAST ITEMMAST on ITEMMAST.CUSTNO='WALM01' and (ITEMMAST.SKU=RETAIL_SALES.SKU or ITEMMAST.ITEMUPC=RETAIL_SALES.SKU)
WHERE
RETAIL_SALES.CUST_NO='WALM01'
and RETAIL_SALES.WK=4
and ITEMMAST.STYLE='HK87202'

Jeff Moden · SSC Guru · Points: 1004414 · Answer 1

You may have a many-to-many join going on. You should probably also have things like "ITEMMAST.CUSTNO='WALM01'" in a WHERE clause instead of an ON clause, especially when outer joins are involved.
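
One rough way to test for a many-to-many join is to count how many item rows each sales SKU matches under the same join condition. The sketch below is only an illustration: the table variables and their rows are hypothetical stand-ins for RETAIL_SALES and ITEMMAST, and it assumes one sales row per SKU so that the count reads directly as a multiplier.

```sql
-- Hypothetical sample data; table variables stand in for the real tables.
DECLARE @ITEMMAST TABLE (CUSTNO varchar(10), SKU varchar(20), ITEMUPC varchar(20), STYLE varchar(10));
DECLARE @RETAIL_SALES TABLE (CUST_NO varchar(10), SKU varchar(20), WK int, [Sales Units LW] int);

INSERT INTO @ITEMMAST VALUES
    ('WALM01', 'A1', 'U1', 'HK87202'),
    ('WALM01', 'A2', 'A1', 'HK87202'),  -- ITEMUPC equals another row's SKU
    ('WALM01', 'B1', 'U3', 'HK87202');

INSERT INTO @RETAIL_SALES VALUES
    ('WALM01', 'A1', 4, 366),
    ('WALM01', 'B1', 4, 796);

-- Count the ITEMMAST rows matched by each sales SKU.
-- Any SKU listed here has its sales multiplied by the join.
SELECT rs.SKU, COUNT(*) AS ItemMatches
FROM @RETAIL_SALES rs
JOIN @ITEMMAST im
  ON im.CUSTNO = 'WALM01'
 AND (im.SKU = rs.SKU OR im.ITEMUPC = rs.SKU)
WHERE rs.CUST_NO = 'WALM01'
GROUP BY rs.SKU
HAVING COUNT(*) > 1;
-- With the sample rows above this returns one row: SKU 'A1', ItemMatches 2.
```

Against the real tables, the same SELECT (with the real table names in place of the table variables) would list the SKUs whose sales are being counted more than once.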

--Jeff Moden

Phil Parkin · SSC Guru · Points: 246988 · Answer 2

Following on from Jeff's comment - if you remove the GROUP BY, you should be able to check whether more rows are being returned than you expect/want.

JayWinter · Right there with Babe · Points: 769 · Answer 3

I did remove the GROUP BY and achieved the correct result. My original post shows a 2nd script without GROUP BY.

drew.allen · SSC Guru · Points: 76997 · Answer 4

JayWinter (7/31/2012)
I did remove the GROUP BY and achieved the correct result. My original post shows a 2nd script without GROUP BY.

Your DISTINCT clause is hiding the problem. DISTINCT is processed after the GROUP BY, so any duplicates will be included in your totals for the GROUP BY, but will be excluded in your QA query.
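
A small self-contained sketch of that effect follows. The table variables and rows are hypothetical, chosen so that each sales row matches three identical item rows. The posted figures have the same shape, since 18426 is exactly 3 x 6142, which would be consistent with each sales row matching three ITEMMAST rows, although the thread does not confirm that.

```sql
-- Hypothetical sample data: the same SKU is listed three times in the item table.
DECLARE @Item TABLE (SKU varchar(20), STYLE varchar(10), COLOR varchar(10));
DECLARE @Sales TABLE (SKU varchar(20), Units int);

INSERT INTO @Item VALUES
    ('S1', 'HK87202', 'FWHG'),
    ('S1', 'HK87202', 'FWHG'),
    ('S1', 'HK87202', 'FWHG');
INSERT INTO @Sales VALUES ('S1', 366), ('S1', 796);

-- The join produces 6 rows: 2 sales rows x 3 matching item rows.
SELECT COUNT(*) AS JoinedRows
FROM @Sales s
JOIN @Item i ON i.SKU = s.SKU;

-- DISTINCT collapses the identical output rows, so only 2 rows appear
-- (366 and 796, which add up to 1162) and the duplication is hidden.
SELECT DISTINCT i.STYLE, i.COLOR, s.Units
FROM @Sales s
JOIN @Item i ON i.SKU = s.SKU;

-- SUM runs over all 6 joined rows before DISTINCT is applied,
-- so the total is 3 x 1162 = 3486.
SELECT DISTINCT i.STYLE, i.COLOR, SUM(s.Units) AS Units
FROM @Sales s
JOIN @Item i ON i.SKU = s.SKU
GROUP BY i.STYLE, i.COLOR;
```

Dropping DISTINCT from the detail query is therefore a more honest check than the original QA query, because it shows every row that SUM actually sees.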

DISTINCT is also superfluous in conjunction with a GROUP BY anyhow. The results of a simple GROUP BY statement are necessarily distinct. (That may not be the case if you have multiple grouping sets.)
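
If the extra rows do come from the item side of the join, one possible approach is to reduce the item table to one row per match key before joining, and then use GROUP BY alone. This is only a sketch under stated assumptions: the table variables, the rows, and the ItemKeys and MatchKey names are hypothetical, and it assumes each SKU or UPC value maps to a single STYLE and COLOR. If a key maps to more than one style or color, the join would still multiply rows.

```sql
-- Hypothetical sample data with a duplicate listing and a UPC that collides with a SKU.
DECLARE @Item TABLE (SKU varchar(20), ITEMUPC varchar(20), STYLE varchar(10), COLOR varchar(10));
DECLARE @Sales TABLE (SKU varchar(20), Units int);

INSERT INTO @Item VALUES
    ('S1', 'U1', 'HK87202', 'FWHG'),
    ('S1', 'U1', 'HK87202', 'FWHG'),   -- duplicate listing
    ('S2', 'S1', 'HK87202', 'FWHG');   -- ITEMUPC equals another row's SKU
INSERT INTO @Sales VALUES ('S1', 366), ('S1', 796);

-- UNION (not UNION ALL) removes duplicates, leaving one row
-- per distinct key/style/color combination.
WITH ItemKeys AS (
    SELECT SKU AS MatchKey, STYLE, COLOR FROM @Item
    UNION
    SELECT ITEMUPC, STYLE, COLOR FROM @Item
)
SELECT k.STYLE, k.COLOR, SUM(s.Units) AS Units   -- no DISTINCT needed
FROM @Sales s
JOIN ItemKeys k ON k.MatchKey = s.SKU
GROUP BY k.STYLE, k.COLOR;
-- With the sample rows above: HK87202, FWHG, 1162 (366 + 796).
```

Each sales row now matches exactly one ItemKeys row, so the SUM counts every sale once.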

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA