December 7, 2012 at 9:28 am
This seems as though it would be terribly simple...but I am stuck.
I need to distribute as "average sales" value among a group that is categorized by employee size, industry and a category.
A sample of the data:
Category Avg_Sales Emp_Size Industry SalesRank Emp_Rank DesiredOutcome
01-Widgets $576 1 - 4 Accommodations 3 1 $576
01-Widgets $832 5 - 9 Accommodations 1 2 $832
01-Widgets $672 10 - 19 Accommodations 2 3 $832
01-Widgets $451 20 - 49 Accommodations 5 4 $832
01-Widgets$459 50 - 99 Accommodations 4 5 $832
What I need to do is distribute the Avg_Sales value according to the Emp_Rank BUT ensure that the greaterEmp_Rank (i.e. larger company)
never has a smaller Avg_Sales value than a smaller Emp_Rank within the same Category and Industry.
I started this by adding the SalesRank and Emp_Rank columns thinking I could complete this with some case statements and some simple math.....I've hit a wall and any and all help is appreciated.
Thank you in Advance.
Lonnie M
December 7, 2012 at 9:31 am
I apologize for the format in which this question posted. Not only did the spacing between columns not follow when I "posted" my color for the desired column looks like html. Not sure how to correct this.
December 7, 2012 at 10:49 am
I wouldn't worry about the format right away. I need a better explanation of what you actually mean by "distribute". It's often best when posting problems to provide a clear, but detailed description of what data you have, what you want to do with it, AND a good picture of the expected results for a given sample set of data, with specific examples for things that might be exceptions to a given rule. Most importantly, ALL the "rules" that you need to operate with need to be specified.
That said, you appear to want to allocate sales volume across various product and industry categories, but exactly how isn't clear, other than that you need to have your listed rule apply to that allocation. Am I on the right track, or off on some unknown page?
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
December 7, 2012 at 11:35 am
I think you are on the mark with what I am after. The "formatting" thing was supposed to be the clear example you speak of.
The "Desired Column" is the far right column.
I need to distribute the "maximum" average sales amount among the Categories/Industry and Employee sizes. The variable here is employee size. As you can see all employee size's have an "Avg_Sales" value unique to them, I have to distribute these values so that the larger employee size never has a lesser Avg_Sales amount than the previous employee size. (Employee size 5-9 can never have a lesser average sales amount than the employee size 1-4).
Hope this helps.
December 7, 2012 at 12:04 pm
And now you've added "maximum average sales amount" to the mix, which you have not defined. My problem is that I don't really know what the average sales numbers you have actually represent, nor do I know how or why they get applied to the various categories, employee-sizes, and industries. This is, no doubt, clear as a bell for you, but as I can't know what the numbers are supposed to mean, you have to explain that in order to have a shot at me knowing what, exactly, you're trying to do.
I know it can be both painful and difficult to have to explain every detail, but without being able to have a clear, specified, defined meaning for the words average, maximum, and distributed, as they apply to your particular situation, the rest might as well be gibberish. One persons maximum might be across a particular set of data that they haven't specifically mentioned, or another's might be across a specfic category of records, and be something that only someone working for the company they work for would know. All too often, the requests for help here contain very vague statements that even if computers could speak English, they might still easily be thrown for the proverbial loop...
Does this help clarify the level of detail needed? Sometimes, two people just clash on communication style...
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
December 7, 2012 at 12:21 pm
Thank you again for your response. I'm painfully aware how difficult it can be to communicate, but I appreciate your effort!
I believe a clearly illustrated matrix/spreadsheet would explain it best but I cannot provide this as this software seems to remove all white spaces.
1) Each row of data contains an industry/category/employee size and an "actual" average sales amount.
2) I need to distribute a (what I referred to as "maximum") sales amount to each row of data.
3) The amount I need to distribute may be the row's actual average sales amount or the average sales amount from the smaller employee size (whichever is greater)
4) What cannot happen is that a larger employee size (such as 50-99 employees) would have a smaller average sales amount than a smaller employee size (such as 1-4 employees). If that were to happen (which it does in reality) the larger employee size would assume the average sales amount of the smaller employee size.
5) Example (I'll use .... as placeholders for space to help with readability)
Industry.............Category.....Employees.....Avg_Sales....."DESIRED OUTPUT"
Accomodations....Widgets........1-4..............100..................100
Accomodations....Widgets........5-9..............75...................100
Accomodations....Widgets........10-19...........175.................175
Accomodations....Widgets........20-49...........225.................225
Accomodations....Widgets........50-99...........200.................225
December 7, 2012 at 12:54 pm
based on the first post, here's the table, but i think that's already been created by an existing select statement;
we need data like this to generate a working query to generate the desired output.
With SomeSampleData (Category,Avg_Sales,Emp_Size,Industry,SalesRank,Emp_Rank )
AS
(
SELECT '01-Widgets',' $576',' 1 - 4',' Accommodations',' 3',' 1' UNION ALL
SELECT '01-Widgets',' $832',' 5 - 9',' Accommodations',' 1',' 2' UNION ALL
SELECT '01-Widgets',' $672',' 10 - 19',' Accommodations',' 2',' 3' UNION ALL
SELECT '01-Widgets',' $451',' 20 - 49',' Accommodations',' 5',' 4' UNION ALL
SELECT '01-Widgets',' $459',' 50 - 99',' Accommodations',' 4',' 5'
)
select * from SomeSampleData
Lowell
December 7, 2012 at 2:01 pm
Thank you for taking the time to put into T-SQL. I should have done this to begin with. Hopefully with this and the above post (2 prior) the situation will be understandable.
December 7, 2012 at 4:15 pm
lmeinke (12/7/2012)
Thank you for taking the time to put into T-SQL. I should have done this to begin with. Hopefully with this and the above post (2 prior) the situation will be understandable.
This requires ordered data. The only thing I can see that relects the correct order is the combination of Industry, Category, and Employees with the caveat that Employees must be converted to something sortable. For most code, that's going to require a very expensive triangular join to even come close to doing this in a set-based fashion (and, no... Recursive CTEs are not set-based and are frequently slower than a cursor in situations such as this).
My question to you is, can the table be modified to accept one or two persisted calculated columns to make this query lightning quick? If not, then I need to know how many rows the table will have in it so I can test an alternate method for performance.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 10, 2012 at 7:20 am
Jeff,
Thank you for your response. Yes, the table can be modified at this point. I started down that road by adding numeric columns representing employee size and avg_sales but am simply at a loss how to "use" them.
December 13, 2012 at 12:27 pm
Jeff,
If you have any more insight into this issue I would sure appreciate it.
Thank you,
Lonnie
December 13, 2012 at 3:26 pm
With some thought going into how to add some dummy records to the scenario, you might be able to do something like this, and avoid the triangular join, as you appear to already have the employee size ranked.
;WITH SampleData (Category,Avg_Sales,Emp_Size,Industry,SalesRank,Emp_Rank) AS (
SELECT '01-Widgets', 0, '0-0', 'Accommodations',' 0', 0 UNION ALL
SELECT '01-Widgets', 100, '1-4', 'Accommodations',' 3', 1 UNION ALL
SELECT '01-Widgets', 75, '5-9', 'Accommodations',' 1', 2 UNION ALL
SELECT '01-Widgets', 175, '10-19', 'Accommodations',' 2', 3 UNION ALL
SELECT '01-Widgets', 225, '20-49', 'Accommodations',' 5', 4 UNION ALL
SELECT '01-Widgets', 200, '50-99', 'Accommodations',' 4', 5
)
SELECT SD.Industry, SD.Category, SD.Emp_Size,
CASE
WHEN SD0.Avg_Sales > SD.Avg_Sales THEN SD0.Avg_Sales
ELSE SD.Avg_Sales
END AS Avg_Sales
FROM SampleData AS SD
INNER JOIN SampleData AS SD0
ON SD.Industry = SD0.Industry
AND SD.Category = SD0.Category
AND SD.Emp_Rank = SD0.Emp_Rank + 1
WHERE SD.Emp_Rank > 0
Let me know if you think this might work, or if you don't see the concept behind the "zero dummy record".
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
December 13, 2012 at 5:23 pm
I'm not sure I really get what you're after but perhaps a Quirky Update will do the job:
CREATE TABLE #SalesByEmployeeCategory
(Category VARCHAR(100)
,Avg_Sales MONEY
,Emp_Size VARCHAR(100)
,Industry VARCHAR(100)
,DesiredOutput MONEY
,EmpRank AS (CAST(LEFT(Emp_Size, CHARINDEX(' ', Emp_Size)) AS INT)) PERSISTED
PRIMARY KEY CLUSTERED (Industry, Category, EmpRank))
INSERT INTO #SalesByEmployeeCategory
(Category, Avg_Sales, Emp_Size, Industry)
SELECT '01-Widgets',' $100','1 - 4','Accommodations' UNION ALL
SELECT '01-Widgets',' $75','5 - 9','Accommodations' UNION ALL
SELECT '01-Widgets',' $175','10 - 19','Accommodations' UNION ALL
SELECT '01-Widgets',' $225','20 - 49','Accommodations' UNION ALL
SELECT '01-Widgets',' $200','50 - 99','Accommodations' UNION ALL
SELECT '02-Furniture',' $100','1 - 4','Manufacturing' UNION ALL
SELECT '02-Furniture',' $75','5 - 9','Manufacturing' UNION ALL
SELECT '02-Furniture',' $175','10 - 19','Manufacturing' UNION ALL
SELECT '02-Furniture',' $225','20 - 49','Manufacturing' UNION ALL
SELECT '02-Furniture',' $200','50 - 99','Manufacturing'
DECLARE @IndCat VARCHAR(200) = ''
,@LastSales MONEY = $0
UPDATE #SalesByEmployeeCategory WITH(TABLOCKX)
SET DesiredOutput=CASE WHEN @LastSales > Avg_Sales AND @IndCat = Industry+Category
THEN @LastSales ELSE Avg_Sales END
,@LastSales=CASE WHEN @IndCat <> Industry+Category THEN 0
WHEN @LastSales > Avg_Sales THEN @LastSales ELSE Avg_Sales END
,@IndCat=Industry+Category
OPTION (MAXDOP 1)
SELECT *
FROM #SalesByEmployeeCategory
DROP TABLE #SalesByEmployeeCategory
If this seems to work for your case and you're not familiar with the QU, I suggest you read about the various rules (which I think I applied right) in this article by Jeff:
http://www.sqlservercentral.com/articles/T-SQL/68467/
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
December 14, 2012 at 3:19 pm
Thank you for the code and suggestion. I did run the code and it works for all but the very first entry in an industry/category group. I will look into the code a bit more but wanted to thank you.
Lonnie M
December 14, 2012 at 3:26 pm
lmeinke (12/14/2012)
Thank you for the code and suggestion. I did run the code and it works for all but the very first entry in an industry/category group. I will look into the code a bit more but wanted to thank you.Lonnie M
If you were replying to me, it's because you'd have to have a "dummy" record for each Industry and Category combination, with a 0 value for Emp_Rank and a 0 value for the Avg_Sales. You could create this set of records and then UNION ALL it with the rest before applying my query methodology. Creating it would be a simple GROUP BY with static values for Avg_Sales and Emp_Rank.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
Viewing 15 posts - 1 through 15 (of 40 total)
You must be logged in to reply to this topic. Login to reply