June 1, 2006 at 4:47 pm
I am trying to make a query off of our sales data that alerts managers if the customer is purchasing more or less then the usual. Problem is I can not think of how to do it. I want to look back a specific amount of orders (Top 5) for only ones that match historical buys. Currently I do not match any because the top5 is getting in the way. I really do not want to use a temp table. Can you think of any other way?
sample:
Select Sales_Order_Id, Customer, item, Invoiced_Qty, Avg_Qty
From Tbl_Sales
Inner Join
(Select Ledger.Customer, Ledger.Item, Avg(Sold_Qty) as Avg_Qty
From
(Select Top 5 Customer, Item, Sold_Qty, Posted_Date From Ledger Order BY Posted_date DESC)Last_5_Orders)Posted_Summary
On Posted_Summary.Customer = Sales.Customer AND Posted_summary.Item = Sales.Item
June 1, 2006 at 5:24 pm
So to clarify, you want to get all those customers whose latest order is for a larger quantity than the average for that customer's last five paid orders? Should we assume that the latest order is not necessarily paid? You realise that there is a good chance that any given purchase will be above average (50% I intuit - don't ask for a proof)? Also, is "Posted_Summary.Customer = Sales.Customer AND Posted_summary.Item = Sales.Item" really enough to link the records? is there not an order ID or at the very least a datetime (I wouldn't be very happy using a date either though)?
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 1, 2006 at 9:16 pm
These are unposted sales orders. This is an alert to notify of Quantity out of tolerance. Cost has no berring. Basically this is to help in forecasting (short term) production wise. Customer and item are enough to bind on. The query is set up right I just need to know how to do the join.
June 1, 2006 at 10:35 pm
I would create a view that records customer ID, item ID and the average cost of their last 5 orders (Cust_ID, ItemID, AvgLast5Orders) using similar syntax to what you have in your join, then look up the view in your statement... means you don't have a horrible looking and hard to interpret statement and will probably be more maintainable.
June 2, 2006 at 7:56 am
I'd go with Brent's suggestion of breaking the TOP 5 bit out into a separate view. Depending on other factors, possibly even an indexed view.
Incidentally, I also noticed in this bit: Select Ledger.Customer, Ledger.Item, Avg(Sold_Qty) as Avg_Qty, you were using Ledger as the table. You should be using Last_5_Orders, as that's the alias you gave the TOP 5 query. Not sure if that was just an error when simplifying the query for the post or not.
June 2, 2006 at 8:28 am
I understand what you guys are saying but when I do the top 5 thing I only get 5 records. I need the top 5 per customer/item combination. Am I not seeing something obvious?
June 2, 2006 at 8:42 am
Yes, your TOP 5 limits records to the top 5, not the top 5 per customer/item. You could do a correlated TOP 5 subquery but I wouldn't recommend it. There are other ways...
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 2, 2006 at 8:43 am
What about a UDF. I think I may go that way
June 2, 2006 at 8:55 am
Can you not use a GROUP BY to list all the customer/item combinations and a user-defined function to get the average of the latest five? Something like this:
SELECT Customer, Item, fn_AveLast5OrderQty(Sold_Qty)
FROM Ledger
GROUP BY Customer, Item
Of course, the hard part is writing the function! If you post DDL and sample data I'm sure somebody will be able to help you.
John
June 2, 2006 at 9:00 am
Writing a function to do that is straightforward - but if it's running a TOP 5 against the table for every input row, the performance could be dire! The tricky bit is to do the query in a nice bit of SQL.
JM - I think the params to the function would be customer and item, not Sold_Qty.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 2, 2006 at 9:08 am
stax68, the performance somewhat depends on where this will be used. Since it sounds like a flag during the order process, or when a manager is reviewing the order, it would likely only be evaluated for less than 50 rows. The performance difference isn't that great, and the UDF code will be a LOT easier to maintain.
The subquery would be a lot easier if it was orders within the last month, say, instead of just the last five.
June 2, 2006 at 9:14 am
Yes with small recordsets the impact of inefficient code can be negligable.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 2, 2006 at 9:35 am
Here's one way of doing it in a single SQL statement, but it's fairly messy and essentially the same as the UDF method. Might give the optimiser some scope for improvement over UDFs though. Given that there's obviously a WHERE clause in the original code, Marshall's point is a good one, that it's probably irrelevant in this case.
Only one other point - you results are indeterministic, since you don't have a unique key to order on in your TOP 5. Again, probably more of a theoetical point, since if you have a time portion on your dates there's less scope for ties, and the query plan (or other unofficial regularities) may dictate a certain order anyway. And if no-one is going to view the same data twice, it doesn't really matter much I suppose...
adventureworks
soh.SalesPersonID, soh.CustomerID, v.AvgSubTot, soh.SubTotal
Sales.SalesOrderHeader soh
select
Sales.SalesOrderHeader soh2
SalesOrderID in
(select top 5 SalesOrderID
)
by soh2.SalesPersonID, soh2.CustomerID
V
V.SalesPersonID = soh.SalesPersonID
V.CustomerID = soh.CustomerID
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 2, 2006 at 9:36 am
"JM - I think the params to the function would be customer and item, not Sold_Qty"
Tim
I haven't got a great deal of experience with functions, but if we make an analogy with the AVG function, AVG(Sold_Qty) will give you an average of all sale quantities for a given customer and item. My function would be the same, but would return the average for the five most recent sales. Perhaps I would need to put Posted_Date as a second parameter? Maybe I'll think this through over the weekend...
John
June 2, 2006 at 9:48 am
5 most recent sales to whom, of what?
The (scalar) UDF in a SQL statement acts like a correlated subquery. Just as the subquery is joined on Customer and Item, the function needs those values to fetch the right data.
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply