January 16, 2009 at 2:09 pm
I've solved the problem for folks many times so I'm not asking for a solution here. I'm getting ready to rewrite one of my articles to include some additional information. One of the things I'm writing about is ranking that can't really be done using simple windowing functions like ROW_NUMBER(), RANK(), or DENSE_RANK() and, of course, I'm writing on how to do it without a Cursor, While Loop, or Triangular Join.
So... I got out an example and started to write about it and it suddenly dawned on me... I don't know why people actually need to do such a thing! I've never asked anyone why they need to do such a thing and, like I said, I don't need to know how to solve it... I need to know why people need to do it. The classic posts look something like this...
Here's my data...
CREATE TABLE #MyHead
(
RowNum INT IDENTITY(1,1),
SomeDate DATETIME,
SomeType VARCHAR(10)
)
INSERT INTO #MyHead
(SomeDate, SomeType)
SELECT '20090116 01:00','AA' UNION ALL
SELECT '20090116 01:10','AA' UNION ALL
SELECT '20090116 01:20','AA' UNION ALL
SELECT '20090116 01:30','AA' UNION ALL
SELECT '20090116 01:00','BB' UNION ALL
SELECT '20090116 01:10','BB' UNION ALL
SELECT '20090116 01:20','BB' UNION ALL
SELECT '20090116 01:30','BB' UNION ALL
SELECT '20090116 01:40','BB' UNION ALL
SELECT '20090116 01:50','BB' UNION ALL
SELECT '20090116 01:40','AA' UNION ALL
SELECT '20090116 01:50','AA' UNION ALL
SELECT '20090116 02:00','BB' UNION ALL
SELECT '20090116 02:10','BB'
I need an output that looks like this...
RowNum SomeDate SomeType SomeSequence
----------- ----------------------- ---------- ------------
1 2009-01-16 01:00:00.000 AA 1
2 2009-01-16 01:10:00.000 AA 2
3 2009-01-16 01:20:00.000 AA 3
4 2009-01-16 01:30:00.000 AA 4
5 2009-01-16 01:00:00.000 BB 1
6 2009-01-16 01:10:00.000 BB 2
7 2009-01-16 01:20:00.000 BB 3
8 2009-01-16 01:30:00.000 BB 4
9 2009-01-16 01:40:00.000 BB 5
10 2009-01-16 01:50:00.000 BB 6
11 2009-01-16 01:40:00.000 AA 1
12 2009-01-16 01:50:00.000 AA 2
13 2009-01-16 02:00:00.000 BB 1
14 2009-01-16 02:10:00.000 BB 2
Notice that the SomeSequence value starts over everytime SomeType changes even though they all occur on the same day and have some shared times.
Why on Earth do people need to do such a thing? [font="Arial Black"]What are the business reasons for doing such a thing?[/font]
Thanks for the help or maybe even just some ideas, folks.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 2:18 pm
I've needed (perhaps that's too strong of a word, perhaps I just couldn't come up with a better alternative) to do just this type of silliness while extracting data from my nicely normalized database to put into a flat file full of repeating groups to send to another organization.
Although I probably could have done it with a numbers table, I ended up doing it this way. Oh yeah doing it in SQL 2000 so no windowing functions...
-Luke.
January 16, 2009 at 2:29 pm
If you were a marketing person for CDW - knowing what the top five selling products are within each sales department would be something important. The last x times Mrs Jones/every person in the ICU got their medicine, etc....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
January 16, 2009 at 2:36 pm
clearly any data without an arbitrary number next to it is worthless! everyone knows that! how else do you know who's number 1, or to give a goals to numbers 2,3 and 4!
Lowell
January 16, 2009 at 3:05 pm
Luke L (1/16/2009)
...to put into a flat file full of repeating groups to send to another organization.
Heh.. that's what you did... my question would be, why did they want it that way?
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 3:09 pm
Matt Miller (1/16/2009)
If you were a marketing person for CDW - knowing what the top five selling products are within each sales department would be something important. The last x times Mrs Jones/every person in the ICU got their medicine, etc....
Thanks, Matt... I gotta think about those answers. I guess I don't see how either of those business reasons would be resolved by the problem these folks have presented in the past. Like Lowell said, it all seems arbitrary.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 3:10 pm
The excuse I usually here is this:
It's easier to create and transfer one file instead of multiple files.
January 16, 2009 at 3:11 pm
Lowell (1/16/2009)
clearly any data without an arbitrary number next to it is worthless! everyone knows that! how else do you know who's number 1, or to give a goals to numbers 2,3 and 4!
BWAA-HAA!!! Absolutely spot on and inline with my thinking on this problem! That's why I asked the question. Maybe I'm just getting too old and can't think right anymore, but I see absolutely no practical use for this.:P
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 3:14 pm
Lynn Pettis (1/16/2009)
The excuse I usually here is this:It's easier to create and transfer one file instead of multiple files.
Now THAT I get! It's a bit strange in how they don't actually make all of SomeType contiguous especially within the same day, but that's a possibility! I've just gotta make a better reason out of that because it's an odd one.
Thanks, Lynn.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 3:16 pm
Matt Miller (1/16/2009)
If you were a marketing person for CDW - knowing what the top five selling products are within each sales department would be something important. The last x times Mrs Jones/every person in the ICU got their medicine, etc....
SQL 2005 or later, I'd use Cross Apply and Rank/DenseRank for these. Not a UDF, just a correlated derived table in the From clause. That's my first answer, but I'd have to performance-test it before I'd take it live.
I agree with Jeff that quite often, we get asked for "how do I...?", answer the question, ask "why?" and get a reply that outlines the actual problem, which is usually better solved by some other means.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 16, 2009 at 3:24 pm
Answer: Because pigs can fly!
* Noel
January 16, 2009 at 3:34 pm
noeld (1/16/2009)
Answer: Because pigs can fly!
LOL! Oh Lordy, that's what I'm starting to think about the reasons behind this problem. ๐
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 6:11 pm
I came across the 'top 5' requirement. I needed to identify rows that have some values not smaller than 5 biggest values in the table. I guess it's a pretty valid reason ๐
Piotr
...and your only reply is slร inte mhath
January 16, 2009 at 7:18 pm
Piotr Rodak (1/16/2009)
I came across the 'top 5' requirement. I needed to identify rows that have some values not smaller than 5 biggest values in the table. I guess it's a pretty valid reason ๐Piotr
And the data was actually formatted in a fashion similar to what I posted and the "sequence" restarted just because the "type" changed?
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 7:37 pm
I had requirements for a datamart application where we had to produce many reports with rankings within various hierarchies. Example: product sales ranked within departments, product categories, regions, stores, etc. by day, week, month, quarter, year, or arbitrary periods of time. An even more interesting challenge was that some reports required multiple ranks on the same line: by unit sales, by dollars sales, by gross margin, by gross margin percentage, and by a weighted rank combining the those last four metrics.
After youโve done a few hundred variations of these in SQL 2000 without those fancy ROW_NUMBER(), RANK(), or DENSE_RANK() functions you will indeed have a reason to gripe, but until then I spit on your puny and pathetic single ranking queries. ๐
As for the business reasons, rankings are apparently vital for the management of retail inventories and no self respecting product buyer or inventory analyst can leave home without one.
Viewing 15 posts - 1 through 15 (of 94 total)
You must be logged in to reply to this topic. Login to reply