aggregation, group by clause, still learning how to produce results - please help #2

Question

Post reply

aggregation, group by clause, still learning how to produce results - please help #2

xenophilia

SSCommitted

Points: 1891
More actions
January 23, 2012 at 1:24 pm

#267018

I am trying to show the total number of invoices per customer, and the total purchase amounts by customer.
---my query
Select c.cus_code, count(i.inv_NUMBER) as "number of Invoices", round(SUM(line_units * line_price), 2)as "Invoice Total"
from line l
join invoice i
on l.inv_number = i.inv_number
join customer c
on i.cus_code = c.cus_code
group by c.CUS_CODE;
---my results, the numbers in the second column are wrong.
cus_codenumber of InvoicesInvoice Total
100116444
100123153.85
100146422.77
10015234.97
10018170.44
---what results should look like
cus_codenumber of InvoicesInvoice Total
100113444
100121153.85
100142422.77
10015134.97
10018170.44
---invoice table you see has 3 invoices for 10011, 1 invoice for 10012, etc
CUS_CODEINV_number
100111002
100111004
100111008
100121003
100141001
100141006
100151007
100181005
---DDL to recreate environment (creates 3 tables)
CREATE TABLE CUSTOMER (
CUS_CODE int,
CUS_LNAME varchar(15),
CUS_FNAME varchar(15),
CUS_INITIAL varchar(1),
CUS_AREACODE varchar(3),
CUS_PHONE varchar(8),
CUS_BALANCE float(8)
);
INSERT INTO CUSTOMER VALUES('10010','Ramas','Alfred','A','615','844-2573','0');
INSERT INTO CUSTOMER VALUES('10011','Dunne','Leona','K','713','894-1238','0');
INSERT INTO CUSTOMER VALUES('10012','Smith','Kathy','W','615','894-2285','345.859985351562');
INSERT INTO CUSTOMER VALUES('10013','Olowski','Paul','F','615','894-2180','536.75');
INSERT INTO CUSTOMER VALUES('10014','Orlando','Myron','','615','222-1672','0');
INSERT INTO CUSTOMER VALUES('10015','O''Brian','Amy','B','713','442-3381','0');
INSERT INTO CUSTOMER VALUES('10016','Brown','James','G','615','297-1228','221.190002441406');
INSERT INTO CUSTOMER VALUES('10017','Williams','George','','615','290-2556','768.929992675781');
INSERT INTO CUSTOMER VALUES('10018','Farriss','Anne','G','713','382-7185','216.550003051758');
INSERT INTO CUSTOMER VALUES('10019','Smith','Olette','K','615','297-3809','0');
/* -- */
CREATE TABLE INVOICE (
INV_NUMBER int,
CUS_CODE int,
INV_DATE datetime
);
INSERT INTO INVOICE VALUES('1001','10014','1/16/2008');
INSERT INTO INVOICE VALUES('1002','10011','1/16/2008');
INSERT INTO INVOICE VALUES('1003','10012','1/16/2008');
INSERT INTO INVOICE VALUES('1004','10011','1/17/2008');
INSERT INTO INVOICE VALUES('1005','10018','1/17/2008');
INSERT INTO INVOICE VALUES('1006','10014','1/17/2008');
INSERT INTO INVOICE VALUES('1007','10015','1/17/2008');
INSERT INTO INVOICE VALUES('1008','10011','1/17/2008');
/* -- */
CREATE TABLE LINE (
INV_NUMBER int,
LINE_NUMBER int,
P_CODE varchar(10),
LINE_UNITS float(8),
LINE_PRICE float(8)
);
INSERT INTO LINE VALUES('1001','1','13-Q2/P2','1','14.9899997711182');
INSERT INTO LINE VALUES('1001','2','23109-HB','1','9.94999980926514');
INSERT INTO LINE VALUES('1002','1','54778-2T','2','4.98999977111816');
INSERT INTO LINE VALUES('1003','1','2238/QPD','1','38.9500007629395');
INSERT INTO LINE VALUES('1003','2','1546-QQ2','1','39.9500007629395');
INSERT INTO LINE VALUES('1003','3','13-Q2/P2','5','14.9899997711182');
INSERT INTO LINE VALUES('1004','1','54778-2T','3','4.98999977111816');
INSERT INTO LINE VALUES('1004','2','23109-HB','2','9.94999980926514');
INSERT INTO LINE VALUES('1005','1','PVC23DRT','12','5.86999988555908');
INSERT INTO LINE VALUES('1006','1','SM-18277','3','6.98999977111816');
INSERT INTO LINE VALUES('1006','2','2232/QTY','1','109.919998168945');
INSERT INTO LINE VALUES('1006','3','23109-HB','1','9.94999980926514');
INSERT INTO LINE VALUES('1006','4','89-WRE-Q','1','256.989990234375');
INSERT INTO LINE VALUES('1007','1','13-Q2/P2','2','14.9899997711182');
INSERT INTO LINE VALUES('1007','2','54778-2T','1','4.98999977111816');
INSERT INTO LINE VALUES('1008','1','PVC23DRT','5','5.86999988555908');
INSERT INTO LINE VALUES('1008','2','WR3/TT3','3','119.949996948242');
INSERT INTO LINE VALUES('1008','3','23109-HB','1','9.94999980926514');
/* -- */
What do I need to do to query so that the number of invoice is accurate?
-

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Jason Selburg SSC-Insane Points: 24641 More actions · Answer 1

count(distinct i.inv_NUMBER)

______________________________________________________________________

Personal Motto: Why push the envelope when you can just open it?

If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

Jason L. Selburg

xenophilia SSCommitted Points: 1891 More actions · Answer 2

because that's what I do when I am learning:-P. Thank you for that. I would not have thought of it and it helps my learning. Sooooo appreciate it.

-

Possinator Hall of Fame Points: 3422 More actions · Answer 3

select

c.cus_code

,[number of Invoices]=COUNT(distinct i.INV_NUMBER)

,[Invoice Total]=round(SUM(l.LINE_PRICE * l.LINE_UNITS),2)

from

#CUSTOMER c

left outer join #INVOICE i on i.CUS_CODE = c.CUS_CODE

left outer join #LINE l on l.INV_NUMBER = i.INV_NUMBER

group by

c.cus_code

having

COUNT(distinct i.INV_NUMBER) > 0

[font="Courier New"]Looking for a Deadlock Victim Support Group..[/font]

xenophilia SSCommitted Points: 1891 More actions · Answer 4

Burninator, why are you going all sophisticated on me?

-

Paul White SSC Guru Points: 150467 More actions · Answer 5

Please take a look at the differences between this code and your original - there are several learning points in there...

CREATE TABLE dbo.Customer

(

Code integer NOT NULL PRIMARY KEY,

LastName varchar(15) NOT NULL,

FirstName varchar(15) NOT NULL,

Initial char(1) NOT NULL,

AreaCode varchar(3) NOT NULL,

Phone varchar(8) NOT NULL,

Balance smallmoney NOT NULL

);

GO

INSERT dbo.Customer

(Code, LastName, FirstName, Initial, AreaCode, Phone, Balance)

VALUES

(10010,'Ramas','Alfred','A','615','844-2573',$0.00),

(10011,'Dunne','Leona','K','713','894-1238',$0.00),

(10012,'Smith','Kathy','W','615','894-2285',$345.86),

(10013,'Olowski','Paul','F','615','894-2180',$536.75),

(10014,'Orlando','Myron','','615','222-1672',$0.00),

(10015,'O''Brian','Amy','B','713','442-3381',$0.00),

(10016,'Brown','James','G','615','297-1228',$221.19),

(10017,'Williams','George','','615','290-2556',$768.93),

(10018,'Farriss','Anne','G','713','382-7185',$216.55),

(10019,'Smith','Olette','K','615','297-3809',$0.00);

CREATE TABLE dbo.Invoice

(

InvoiceNumber integer NOT NULL PRIMARY KEY,

CustomerCode integer NOT NULL REFERENCES dbo.Customer,

InvoiceDate date NOT NULL

);

GO

INSERT dbo.Invoice

(InvoiceNumber, CustomerCode, InvoiceDate)

VALUES

(1001,10014,'2008-01-16'),

(1002,10011,'2008-01-16'),

(1003,10012,'2008-01-16'),

(1004,10011,'2008-01-17'),

(1005,10018,'2008-01-17'),

(1006,10014,'2008-01-17'),

(1007,10015,'2008-01-17'),

(1008,10011,'2008-01-17');

CREATE TABLE dbo.Line

(

InvoiceNumber integer NOT NULL REFERENCES dbo.Invoice,

LineNumber int NOT NULL,

Pcode varchar(10) NOT NULL,

Units tinyint NOT NULL,

LinePrice smallmoney NOT NULL,

PRIMARY KEY (InvoiceNumber, LineNumber)

);

GO

INSERT dbo.Line

(InvoiceNumber, LineNumber, Pcode, Units, LinePrice)

VALUES

(1001,1,'13-Q2/P2',1,14.99),

(1001,2,'23109-HB',1,9.95),

(1002,1,'54778-2T',2,4.99),

(1003,1,'2238/QPD',1,38.95),

(1003,2,'1546-QQ2',1,39.95),

(1003,3,'13-Q2/P2',5,14.99),

(1004,1,'54778-2T',3,4.99),

(1004,2,'23109-HB',2,9.95),

(1005,1,'PVC23DRT',12,5.87),

(1006,1,'SM-18277',3,6.99),

(1006,2,'2232/QTY',1,109.92),

(1006,3,'23109-HB',1,9.95),

(1006,4,'89-WRE-Q',1,256.99),

(1007,1,'13-Q2/P2',2,14.99),

(1007,2,'54778-2T',1,4.99),

(1008,1,'PVC23DRT',5,5.87),

(1008,2,'WR3/TT3',3,119.95),

(1008,3,'23109-HB',1,9.95);

-- Per customer totals

SELECT

InvoiceCount = COUNT_BIG(DISTINCT inv.InvoiceNumber),

SalesTotal = SUM(line.Units * line.LinePrice)

FROM dbo.Customer AS cus

JOIN dbo.Invoice AS inv ON

inv.CustomerCode = cus.Code

JOIN dbo.Line AS line ON

line.InvoiceNumber = inv.InvoiceNumber

GROUP BY

cus.Code;

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

xenophilia SSCommitted Points: 1891 More actions · Answer 6

Paul, I feel the love.:hehe: I now understand that a CTE is the tool for this job.

Thanks for the Anatomy and Physiology of CTE based queries. I studied your example and see how the aliases for the columns are built and how they are aggregated against in the outer query.

Not the main point but I wondered why you used Count_Big for a column whose value, invoice_number is int.

per msdn.microsoft.com "COUNT_BIG always returns a bigint data type value. COUNT always returns an int data type value"

-

xenophilia SSCommitted Points: 1891 More actions · Answer 7

Actually, the above CTE solution is the tool for the job for my other posting:

http://www.sqlservercentral.com/Forums/Topic1240528-392-1.aspx#bm1240564

For this scenario, using a CTE seems out of place.

-

Paul White SSC Guru Points: 150467 More actions · Answer 8

aitchkcandoo (1/23/2012)
Not the main point but I wondered why you used Count_Big for a column whose value, invoice_number is int. per msdn.microsoft.com "COUNT_BIG always returns a bigint data type value. COUNT always returns an int data type value"

Internally, the count function always returns a BIGINT (regardless of the type of the column of course - we are just counting the rows). When you use COUNT, the internal result is cast to an integer; with COUNT_BIG that cast isn't necessary. It's a very small thing, and just a habit I have developed to use COUNT_BIG by default. You should use whichever (COUNT or COUNT_BIG) makes sense in each case - if you want an integer count, use COUNT.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

XploreBI SSChasing Mays Points: 601 More actions · Answer 9

This reminded me of a post some time ago by Marc Friedman where he talks about distinct aggregations and table spools. I took Paul’s solution and applied it (or at least tried) to Marc’s suggestion. The query plans are different and might be worth a look. Like you, aitchkcandoo, I'm also still learning.

Reference Marc Friedman - http://blogs.msdn.com/b/sqlqueryprocessing/archive/2008/09/22/distinct-aggregation-considered-harmful.aspx

SET STATISTICS PROFILE ON

SET STATISTICS TIME ON

;WITH CustInvoice AS (

SELECT Code,

LastName,

FirstName,

Initial,

AreaCode,

Phone,

Balance,

inv.InvoiceNumber as InvoiceNumber,

CustomerCode,

InvoiceDate,

LineNumber,

Units,

LinePrice

FROM dbo.Customer AS cus

JOIN dbo.Invoice AS inv

ON inv.CustomerCode = cus.Code

JOIN dbo.Line AS line

ON line.InvoiceNumber = inv.InvoiceNumber

)

, PartialSums AS (

SELECT CustomerCode,

InvoiceNumber,

COUNT(*) Cust,

SUM(Units * LinePrice) AS InvSum

FROM CustInvoice

GROUP BY CustomerCode, InvoiceNumber

)

SELECT

CustomerCode,

SUM(1) AS InvoiceCount,

SUM(InvSum) AS SalesTotal

FROM PartialSums

GROUP BY CustomerCode

Paul White SSC Guru Points: 150467 More actions · Answer 10

XploreBI (1/29/2012)
This reminded me of a post some time ago by Marc Friedman where he talks about distinct aggregations and table spools. I took Paul’s solution and applied it (or at least tried) to Marc’s suggestion. The query plans are different and might be worth a look. Like you, aitchkcandoo, I'm also still learning.

You might also enjoy: http://sqlblog.com/blogs/paul_white/archive/2011/12/04/is-distinct-aggregation-still-considered-harmful.aspx

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

xenophilia SSCommitted Points: 1891 More actions · Answer 11

xenophilia

SSCommitted

Points: 1891

January 30, 2012 at 1:14 am

#1439858

Read it. Thanks.

-