June 5, 2008 at 4:26 pm
Hi gang,
I have a solution to this but it seems that there must be a better way. The problem is pretty routine, I would imagine...
I have a DW table that is flattened out nicely. It contains order information, one row per order/item pair. The table also contains promo_code and promo_discount_amt columns. Let's say the data looks something like this:
ID ORDER_KEY ITEM_ID PROMO_DISC_CODE PROMO_DISC_AMT
1 4 100 ABC 25.00
2 4 101 ABC 25.00
3 4 102 ABC 25.00
4 5 200 XYZ 100.00
5 5 201 XYZ 100.00
The promo info applies to the order not the line item. What I'd like to do is get the total promo discount amount (which should be 125.00).
The way I solved this is using a CTE like this:
WITH Promos (OrderKey, DiscAmt) AS (
SELECT DISTINCT [order_key], promo_disc_amt FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL
)
SELECT SUM(DiscAmt) FROM Promos
It just seems that there must be a simple way to aggregate this without resorting to temp table... am I missing something or is this the SOP for doing this sort of thing?
Cheers,
Chris
June 5, 2008 at 5:03 pm
Wow, total flub on my part! :blush:
Thanks for the catch KB.
Edit: Removed my first short sighted response.
June 5, 2008 at 5:37 pm
SUM(DISTINCT PROMO_DISC_AMT) is not the right solution if PROMO_DISC_CODE can be repeated in the data (which I guess must be a possibility).
If your data looks like :
INSERT INTO @FACTORDER
SELECT 4, 100, 'ABC', 25.00 UNION ALL
SELECT 4, 101, 'ABC', 25.00 UNION ALL
SELECT 4, 102, 'ABC', 25.00 UNION ALL
SELECT 5, 200, 'XYZ', 100.00 UNION ALL
SELECT 5, 201, 'XYZ', 100.00 UNION ALL
SELECT 6, 300, 'ABC', 25.00 UNION ALL
SELECT 6, 301, 'ABC', 25.00
SELECT SUM(DISTINCT PROMO_DISC_AMT) FROM @FACTORDER
WHERE PROMO_DISC_CODE IS NOT NULL
This will still give you 125 instead of 150 which I believe is the required solution here.
June 5, 2008 at 8:12 pm
You could also do this with the windowed version of an aggregate.
SELECT [order_key], promo_disc_amt,
Sum(promo_disc_amt) over (partition by NULL) totalPromo
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL
group by orderkey
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 5, 2008 at 10:54 pm
Hi all,
Thanks for the responses.
Matt, that looks like what I was looking for! What is "partition" in this case? Or is it a keyword used with OVER?
Cheers,
Chris
June 6, 2008 at 9:32 am
When you're "windowing" aggregate functions - PARTITION BY is concaptually the same as GROUP BY. It defines the window you're operating on.
An example might work better:
Select customerID,
invoiceID,
invoiceamt,
sum(invoiceAmt) over (PARTITION BY CUSTOMERID) customerTotal
from
customerInvoices
The above example would allow you to have the customer's total of all invoices on each line (so you could things like "percentage of group total" calcs without a separate sub-query and without a group by. Or rather an EXPLICIT separate sub-query (if you look at the exec plan - there's still some extra steps essentially boiling down to a worktable being created with the summary info, etc...).
As of right now - these tend to work well on SMALL datasets. They don't do so well on large sets, where the derived tables tend to do better.
The interesting behavior here is that the windowed aggregates operate on the recordset that exists after the GROUP BY, WHERE and JOIN have been applied, so in your case it's working on the distinct values, counting each just once....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 6, 2008 at 11:11 am
Hi Matt,
Maybe I am doing something wrong, but I am getting the total of all promo_disc_amt for every line in the criteria, including the dupes.
If I do:
SELECT DISTINCT order_key, promo_disc_amt
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
I get 32 rows, each with a distinct order_key and 25.00 as the value, save one which is 50. So the total discount should be 825.
When I run:
SELECT DISTINCT [order_key], Sum(promo_disc_amt) over (partition by NULL) totalPromo
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
I get 32 rows, each with a distinct order_key and 1825.00 as the value, which is the same value as:
SELECT SUM(promo_disc_amt)
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
What am I missing?
Thanks for the help!
Chris
June 6, 2008 at 12:18 pm
Raconteur (6/6/2008)
Hi Matt,Maybe I am doing something wrong, but I am getting the total of all promo_disc_amt for every line in the criteria, including the dupes.
If I do:
SELECT DISTINCT order_key, promo_disc_amt
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
I get 32 rows, each with a distinct order_key and 25.00 as the value, save one which is 50. So the total discount should be 825.
When I run:
SELECT DISTINCT [order_key], Sum(promo_disc_amt) over (partition by NULL) totalPromo
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
I get 32 rows, each with a distinct order_key and 1825.00 as the value, which is the same value as:
SELECT SUM(promo_disc_amt)
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
What am I missing?
Thanks for the help!
Chris
Use the group by instead of the distinct - you should get different results. The group by throws out the duplicates FIRST, then the windowed function fires. DISTINCT happens after the windowed function fires, so it still evaluates the dupes. One of the interesting things aobut windowing - you have to be really careful about the "sequence of events".
I think you're looking for (this looks strange but should work):
SELECT [order_key], Sum(promo_disc_amt) over (partition by NULL) totalPromo
FROM FactOrder
WHERE [promo_disc_code] IS NOT NULL AND [date_purchased_key] = 860
group by orderkey,promo_disc_amt
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 6, 2008 at 1:29 pm
Mr. Matt... I bow to your expertise and the depth of your knowledge!
Works perfectly.
Thank you!!
June 6, 2008 at 8:48 pm
Thanks! Been dabbling with the windowed stuff for a little bit - trying to figure them out.
And - I do like the handle - story teller is always an interesting choice for this board...:)
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 6, 2008 at 9:52 pm
Cheers!
It's more of a job-title than anything... I am actually a writer/director. Software consulting is just my hobby. 🙂
C
June 7, 2008 at 12:13 pm
Nice one! At the risk of stating the obvious - I wasn't nearly that inspired when I picked mine....:D Good to see someone who was!
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
June 9, 2008 at 7:09 am
It is nice to see examples of various features (like OVER (PARTITION BY NULL)), but wouldn't a simple MIN() with group by do the same? I may have missed something in the examples, but as I understand it, this should also give correct result - that is, if what is required is a sum of all discounts (for a given condition):
SELECT SUM(Q.disc) as disc_total
FROM
(select order_key, min(promo_disc_amt) as disc
from FactOrder
where promo_disc_code IS NOT NULL AND date_purchased_key = 860
group by order_key) as Q
If the discount applies to entire order, not to particular orderlines, there should be no problems with it (although I didn't have a chance to test it, since you didn't post table structure and data sample).
June 9, 2008 at 9:17 am
Hi Vladan,
That looks like it will work as well, but, and I know I wasn't very clear in my original post, I was trying to avoid the temp table/subquery, which is what I was doing in the CTE.
Cheers,
Chris
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply