Query to Assign a Sequence Number

Question

Query to Assign a Sequence Number

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply

texpic SSCertifiable Points: 5887 More actions · Answer 1

I've been away from the computer most of the weekend (a first for me). Lots of interesting solutions. The simplicity of the code stuns me. Thanks!

simplicity may be the wrong word, elegance

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 2

Thanks for the feedback, Texpic.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

JonFox SSCrazy Points: 2157 More actions · Answer 3

Thanks for the encouragement Jeff! Frankly, this is why I've been trying to spend time on the forums lately...it seems almost every time I try to help someone, I learn something cool myself. 😀 The "cascading list sequencing" trick is something I hadn't encountered before, and it's great. I'm definitely adding that and ColdCoffee's "MAX() OVER" trick to my toolbox.

I've been playing with both solutions this morning trying to figure out which has the better performance. It actually looks like ColdCoffee's MAX() OVER version is more expensive than using the extra ROW_NUMBER(), at least from an I/O perspective; the execution plan shows a couple of extra Lazy Spools being aggregated and joined back to the final result set (I assume the result of the partitioned MAX() ) and this results in a significant amount of Worktable logical reads. I wonder if there's a way to get rid of that overhead, or if partitioning the MAX() function is inherently more expensive?

I have no idea at this point which is more effective from a CPU time perspective, though...I can't determine a clear winner with the small amount of test data.

Anyway, very cool stuff all the way around!

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 4

JonFox (11/21/2011)
Thanks for the encouragement Jeff! Frankly, [font="Arial Black"]this is why I've been trying to spend time on the forums lately...it seems almost every time I try to help someone, I learn something cool myself[/font]. 😀 The "cascading list sequencing" trick is something I hadn't encountered before, and it's great. I'm definitely adding that and ColdCoffee's "MAX() OVER" trick to my toolbox.
I've been playing with both solutions this morning trying to figure out which has the better performance. It actually looks like ColdCoffee's MAX() OVER version is more expensive than using the extra ROW_NUMBER(), at least from an I/O perspective; the execution plan shows a couple of extra Lazy Spools being aggregated and joined back to the final result set (I assume the result of the partitioned MAX() ) and this results in a significant amount of Worktable logical reads. I wonder if there's a way to get rid of that overhead, or if partitioning the MAX() function is inherently more expensive?
I have no idea at this point which is more effective from a CPU time perspective, though...I can't determine a clear winner with the small amount of test data.
Anyway, very cool stuff all the way around!

The BOLD highlight in the above was an incredible and totally unexpected discovery for me when I first started posting. You just can't find a source of more real-world problems to solve and practice on and then learn from others than on these and similar forums.

And you definitely are made of the right stuff... look at the analysis you did above and realize that it's going to get a lot of people who read this interested in different ways of performance analysis and you'll never know just how much you've actually helped. Perhaps years down the road, you'll run into someone who says "Jon Fox, eh? Man, because of you and an old post about numbering repeating groups of yours and how you did an analysis, I was able to...". Well done, Jon.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

homebrew01 SSC Guru Points: 55548 More actions · Answer 5

Jeff Moden (11/19/2011)
First, nicely done there Cold Coffee. I came up with an almost identical solution and saw yours when I was getting ready to post it. Your solution uses MAX() OVER (brilliant idea, BTW) which I believe is going to be more effecient than the method I'm using to calculate a pseudo-group number because I use an extra ROW_NUMBER() OVER to do it. That's provided that the RecordSeq column always reflects the correct order by Account and Date. If it doesn't, then there will be a problem. Replace the 8/6 date for the second row of "Jim" with 9/6 and see what I mean.

Sorry to revive an old thread, but wanted to say "Thanks"

I was doing something similar, with crazy loop processing running for hours and hours, and your code ran in about 15 seconds !