August 8, 2013 at 3:58 am
Hi all,
Looking for assistance in a SQL query to use in a report.
Finance people need to show “price variance” to our group owners and are currently spending a lot of time in Excel. I’m thinking/hoping I can get the data via a single SQL query.
I can get the data out of only three tables -Customers, Invoices and Invoiceitems - but it’s the grouping and summing I’m not sure how to do.
The output would be:
Customer Number. From “Customers”
Customer Name. From “Customers” and will join to “Invoices”
Part Number. From “Invoiceditems” - which can join to "Invoices" - as is all of the data below…
“Last year’s Price.” The sum of the qty * price within a date range
“This year’s price” The sum of the qty * price within a different date range
Qty shipped last year. Total qty of the part shipped grouped by customer within a date range
Qty shipped this year. Total qty of the part shipped grouped by customer within a date range
“A value for last year.” Last year’s price / by total qty within a date range
“A Value for this year.” This year’s price / by total qty within a date range
Example Output.
Last year customer “A” bought part “ABC” 100 times on ten different orders and paid two different prices.
This year customer “A” bought part “ABC” 75 times on 20 different orders and paid four different prices.
I need it for ALL customers though. A nine column output for maybe a thousand customers and around 500 separate part numbers.
Now, as I think about this and I have already discussed with colleagues, I see that the questions(s) are, er, not the ones necessarily I would want to ask. However, he who pays the piper, etc.
Apologies in advance if I've put this in the wrong place and/or gone into too much detail.
August 8, 2013 at 4:58 am
This should work.
Note that the code is untested since you did not provide actual table definitions and test data, but the principle should work.
;with
thisYear as (
-- Get summarized values for this year
select
c.CustomerID,
c.CustomerName,
ii.PartNumber,
sum(ii.qty*ii.price) as SumPrice,
sum(ii.qty) as SumQty,
sum(ii.qty*ii.price) / sum(ii.qty) as SumValue
from Customers c
join Invoices i on i.CustomerID = c.CustomerID
join InvoicedItems ii on ii.InvoiceID = i.InvoiceID
where i.InvoiceDate between @StartDate and @EndDate
group by CustomerID, CustomerName, PartNumber
)
, lastYear as (
-- Get summarized values for last year
select
c.CustomerID,
c.CustomerName,
ii.PartNumber,
sum(ii.qty*ii.price) as SumPrice,
sum(ii.qty) as SumQty,
sum(ii.qty*ii.price) / sum(ii.qty) as SumValue
from Customers c
join Invoices i on i.CustomerID = c.CustomerID
join InvoicedItems ii on ii.InvoiceID = i.InvoiceID
where i.InvoiceDate between dateadd(year, -1, @StartDate) and dateadd(year, -1, @EndDate)
group by CustomerID, CustomerName, PartNumber
)
-- Join this year and last year to present corresponding values on the same line
select
coalesce(ty.CustomerID, ly.CustomerID) as CustomerID,
coalesce(ty.CustomerName, ly.CustomerName) as CustomerName,
coalesce(ty.PartNumber, ly.PartNumber) as PartNumber,
ly.SumPrice as LastYearPrice,
ty.SumPrice as ThisYearPrice,
ly.SumQty as LastYearQty,
ty.SumQty as ThisYearQty,
ly.SumValue as LastYearValue,
ty.SumValue as ThisYearValue
from thisYear ty
full join lastYear ly
on ly.CustomerID = ty.CustomerID and ly.PartNumber = ty.PartNumber
order by 1,2,3
August 8, 2013 at 6:10 am
Wow, Stefan, that was quick!
Huge thanks for taking the time.
I'll be looking at it over the next few days and will definitely report back.
Thanks again. 🙂
August 8, 2013 at 4:14 pm
Here's my version of it. I delayed the lookup of CustomerName to avoid having to GROUP on it -- GROUP BY is an especially expensive operation on varchar columns.
I also get both years in a single SELECT to (try to) avoid double-reading of the tables.
Finally, I added some code to get the distinct counts you mentioned in your original q.
SELECT
derived.customerId AS Customer_Id,
c.name AS Customer_Name,
derived.Last_Yr_Price,
derived.Last_Yr_Qty_Shipped,
derived.Last_Yr_Price / derived.Last_Yr_Qty_Shipped AS Last_Yr_Value,
derived.Last_Yr_Times_Bought,
derived.Last_Yr_Diff_Prices_Paid,
derived.This_Yr_Price,
derived.This_Yr_Qty_Shipped,
derived.This_Yr_Price / derived.This_Yr_Qty_Shipped AS This_Yr_Value,
derived.This_Yr_Times_Bought,
derived.This_Yr_Diff_Prices_Paid
FROM (
SELECT
i.customerId,
ii.partNumber,
SUM(CASE WHEN i.invoiceDate < @StartDate THEN ii.qty * ii.price ELSE 0 END) AS Last_Yr_Price,
SUM(CASE WHEN i.invoiceDate < @StartDate THEN ii.qty ELSE 0 END) AS Last_Yr_Qty_Shipped,
COUNT(DISTINCT CASE WHEN i.invoiceDate < @StartDate THEN i.invoiceId END) AS Last_Yr_Times_Bought,
COUNT(DISTINCT CASE WHEN i.invoiceDate < @StartDate THEN ii.price END) AS Last_Yr_Diff_Prices_Paid,
SUM(CASE WHEN i.invoiceDate >= @StartDate THEN ii.qty * ii.price ELSE 0 END) AS This_Yr_Price,
SUM(CASE WHEN i.invoiceDate >= @StartDate THEN ii.qty ELSE 0 END) AS This_Yr_Qty_Shipped,
COUNT(DISTINCT CASE WHEN i.invoiceDate >= @StartDate THEN i.invoiceId END) AS This_Yr_Times_Bought,
COUNT(DISTINCT CASE WHEN i.invoiceDate >= @StartDate THEN ii.price END) AS This_Yr_Diff_Prices_Paid
FROM dbo.Invoices i
INNER JOIN dbo.InvoicedItems ii ON
ii.invoiceId = i.invoiceId
WHERE
i.invoiceDate >= DATEADD(YEAR, -1, @StartDate) AND
i.invoiceDate < DATEADD(DAY, 1, @EndDate)
GROUP BY
i.customerId,
ii.partNumber
) AS derived
INNER JOIN dbo.Customers c ON
c.customerId = derived.customerId
ORDER BY
derived.customerId
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 9, 2013 at 1:33 am
I delayed the lookup of CustomerName to avoid having to GROUP on it -- GROUP BY is an especially expensive operation on varchar columns.
OK, I suppose this could be a valid point. I dont think it makes much difference in this case though.
I also get both years in a single SELECT to (try to) avoid double-reading of the tables.
Unfortunately your code will read all data for all invoices from 1 year before @startdate to @enddate. This could be a big problem if you for example are looking at a single month.
Assuming appropriate indexes my code will only look at invoices for the interesting time period this year and previous year.
This could be a big performance win.
Your code will also not work correctly when calculating values for the previous year.
It is not enough to say WHEN i.invoiceDate < @StartDate.
You should instead say WHEN i.invoiceDate < dateadd(year, -1, @EndDate)
Your code will also produce a lot of lines with 0 values for products that where bought in the interval between the period this year, and the period previous year.
In summary, I think my version will work a lot better in practice 🙂
August 9, 2013 at 2:05 am
Read the tables only once. Shamelessly nicked from both Scott's and Stefan's code:
WITH Preagg AS (
SELECT
x.Period,
c.CustomerID,
c.CustomerName,
ii.PartNumber,
SumPrice= SUM(ii.qty*ii.price),
SumQty= SUM(ii.qty),
SumValue= SUM(ii.qty*ii.price) / SUM(ii.qty)
FROM Customers c
INNER JOIN Invoices i ON i.CustomerID = c.CustomerID
INNER JOIN InvoicedItems ii ON ii.InvoiceID = i.InvoiceID
CROSS APPLY (
SELECT Period = CASE
WHEN i.InvoiceDate BETWEEN @StartDate AND @EndDate THEN 'ThisYear'
WHEN i.InvoiceDate BETWEEN
dateadd(year, -1, @StartDate) AND dateadd(year, -1, @EndDate) THEN 'LastYear'
ELSE NULL END
) x
WHERE x.Period IS NOT NULL
GROUP BY x.Period, CustomerID, CustomerName, PartNumber
)
SELECT CustomerID, CustomerName, PartNumber,
LastYearPrice= SUM(CASE WHEN Period = 'LastYear' THEN SumPrice ELSE 0 END),
ThisYearPrice= SUM(CASE WHEN Period = 'ThisYear' THEN SumPrice ELSE 0 END),
LastYearQty= SUM(CASE WHEN Period = 'LastYear' THEN SumQty ELSE 0 END),
ThisYearQty= SUM(CASE WHEN Period = 'ThisYear' THEN SumQty ELSE 0 END),
LastYearValue= SUM(CASE WHEN Period = 'LastYear' THEN SumValue ELSE 0 END),
ThisYearValue= SUM(CASE WHEN Period = 'ThisYear' THEN SumValue ELSE 0 END)
FROM Preagg
GROUP BY CustomerID, CustomerName, PartNumber
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 9, 2013 at 5:08 am
Guys,
Thanks for the continuing input. Really interesting.
Stefan,
Still at the testing stage, but using your script I now have a working solution.
The "odd" column names are determined by our group owners.
And my specific dates are for the testing phase. Once I move it into a report writer I'll add date parameters.
-----------------------------------------
;with
thisYear as (
-- Get summarized values for this year
select
customers.id,
customers.name,
invoiceitems.partid,
sum(invoiceitems.quantity * invoiceitems.currencyunitprice) as SumPrice,
sum(invoiceitems.quantity) as SumQty,
sum(invoiceitems.quantity * invoiceitems.currencyunitprice) / sum(invoiceitems.quantity) as SumValue
from customers
LEFT OUTER JOIN invoices ON
customers.id = invoices.traderid
LEFT OUTER JOIN invoiceitems ON
invoices.id = invoiceitems.invoiceid
WHERE invoices.tradertype = 'C'
AND invoiceitems.partid NOT LIKE 'C%'
AND invoices.id LIKE 'SI%'
AND invoices.createddate BETWEEN '01 August 2011' AND '31 July 2012'
Group By
customers.id,customers.name, invoiceitems.partid
)
, lastYear as (
-- Get summarized values for last year
select
customers.id,
customers.name,
invoiceitems.partid,
sum(invoiceitems.quantity * invoiceitems.currencyunitprice) as SumPrice,
sum(invoiceitems.quantity) as SumQty,
sum(invoiceitems.quantity * invoiceitems.currencyunitprice) / sum(invoiceitems.quantity) as SumValue
from customers
LEFT OUTER JOIN invoices ON
customers.id = invoices.traderid
LEFT OUTER JOIN invoiceitems ON
invoices.id = invoiceitems.invoiceid
WHERE invoices.tradertype = 'C'
AND invoiceitems.partid NOT LIKE 'C%'
AND invoices.id LIKE 'SI%'
AND invoices.createddate BETWEEN '01 August 2012' AND '31 July 2013'
Group By
customers.id,customers.name, invoiceitems.partid
)
-- Join this year and last year to present corresponding values on the same line
select
coalesce(ty.id, ly.id) as CustomerID,
coalesce(ty.name, ly.name) as CustomerName,
coalesce(ty.partid, ly.partid) as PartNumber,
ly.SumPrice as PYTD_Ext_Price,
ty.SumPrice as CYTD_Ext_Price,
ly.SumQty as PYTD_Qty,
ty.SumQty as CYTD_Qty,
ly.SumValue as ASP_PYTD,
ty.SumValue as ASP_CYTD
from thisYear ty
full join lastYear ly
on ly.id = ty.id and ly.partid = ty.partid
order by 1,2,3
August 9, 2013 at 8:33 am
Stefan_G (8/9/2013)
I also get both years in a single SELECT to (try to) avoid double-reading of the tables.
Unfortunately your code will read all data for all invoices from 1 year before @startdate to @enddate. This could be a big problem if you for example are looking at a single month.
Assuming appropriate indexes my code will only look at invoices for the interesting time period this year and previous year.
It is not enough to say WHEN i.invoiceDate < @StartDate.
Easy enough to change the WHERE on the date to an OR with the two specific date ranges.
If the @StartDate is the start of the current period -- which seemed to be what your code assumed -- why isn't that check enough to determine if the date is for the prior year or the current year??
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
August 9, 2013 at 10:22 am
Easy enough to change the WHERE on the date to an OR with the two specific date ranges.
If the @StartDate is the start of the current period -- which seemed to be what your code assumed -- why isn't that check enough to determine if the date is for the prior year or the current year??
Well, your check WOULD have been enough if you had another WHERE condition so you only aggregated data from the interesting periods - not data between the period last year and this year. But the way your code was written, any data before the starting period this year would have been aggregated in the value for last year.
After thinking about this a little more, I now actually think that your solution with a modified WHERE condition would be very good.
(i.InvoiceDate >= @StartDate and i.InvoiceDate < @EndDate)
or
(i.InvoiceDate >= dateadd(year, -1, @StartDate) and i.InvoiceDate < dateadd(year, -1, @EndDate))
Assuming there is an index on InvoiceDate, SQL server will translate a WHERE condition like this to two index seeks which is exactly what I was looking for with my solution.
August 9, 2013 at 3:16 pm
Stefan_G (8/9/2013)
Easy enough to change the WHERE on the date to an OR with the two specific date ranges.
If the @StartDate is the start of the current period -- which seemed to be what your code assumed -- why isn't that check enough to determine if the date is for the prior year or the current year??
Well, your check WOULD have been enough if you had another WHERE condition so you only aggregated data from the interesting periods - not data between the period last year and this year. But the way your code was written, any data before the starting period this year would have been aggregated in the value for last year.
After thinking about this a little more, I now actually think that your solution with a modified WHERE condition would be very good.
(i.InvoiceDate >= @StartDate and i.InvoiceDate < @EndDate)
or
(i.InvoiceDate >= dateadd(year, -1, @StartDate) and i.InvoiceDate < dateadd(year, -1, @EndDate))
Assuming there is an index on InvoiceDate, SQL server will translate a WHERE condition like this to two index seeks which is exactly what I was looking for with my solution.
I wouldn't expect the Invoices table to be clustered on InvoiceDate; if it's not, SQL may well can the table anyway unless there's a covering index.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply