November 20, 2011 at 5:03 pm
I've been away from the computer most of the weekend (a first for me). Lots of interesting solutions. The simplicity of the code stuns me. Thanks!
simplicity may be the wrong word, elegance
November 21, 2011 at 4:55 am
Thanks for the feedback, Texpic.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 21, 2011 at 7:47 am
Thanks for the encouragement Jeff! Frankly, this is why I've been trying to spend time on the forums lately...it seems almost every time I try to help someone, I learn something cool myself. 😀 The "cascading list sequencing" trick is something I hadn't encountered before, and it's great. I'm definitely adding that and ColdCoffee's "MAX() OVER" trick to my toolbox.
I've been playing with both solutions this morning trying to figure out which has the better performance. It actually looks like ColdCoffee's MAX() OVER version is more expensive than using the extra ROW_NUMBER(), at least from an I/O perspective; the execution plan shows a couple of extra Lazy Spools being aggregated and joined back to the final result set (I assume the result of the partitioned MAX() ) and this results in a significant amount of Worktable logical reads. I wonder if there's a way to get rid of that overhead, or if partitioning the MAX() function is inherently more expensive?
I have no idea at this point which is more effective from a CPU time perspective, though...I can't determine a clear winner with the small amount of test data.
Anyway, very cool stuff all the way around!
November 24, 2011 at 8:33 am
JonFox (11/21/2011)
Thanks for the encouragement Jeff! Frankly, [font="Arial Black"]this is why I've been trying to spend time on the forums lately...it seems almost every time I try to help someone, I learn something cool myself[/font]. 😀 The "cascading list sequencing" trick is something I hadn't encountered before, and it's great. I'm definitely adding that and ColdCoffee's "MAX() OVER" trick to my toolbox.I've been playing with both solutions this morning trying to figure out which has the better performance. It actually looks like ColdCoffee's MAX() OVER version is more expensive than using the extra ROW_NUMBER(), at least from an I/O perspective; the execution plan shows a couple of extra Lazy Spools being aggregated and joined back to the final result set (I assume the result of the partitioned MAX() ) and this results in a significant amount of Worktable logical reads. I wonder if there's a way to get rid of that overhead, or if partitioning the MAX() function is inherently more expensive?
I have no idea at this point which is more effective from a CPU time perspective, though...I can't determine a clear winner with the small amount of test data.
Anyway, very cool stuff all the way around!
The BOLD highlight in the above was an incredible and totally unexpected discovery for me when I first started posting. You just can't find a source of more real-world problems to solve and practice on and then learn from others than on these and similar forums.
And you definitely are made of the right stuff... look at the analysis you did above and realize that it's going to get a lot of people who read this interested in different ways of performance analysis and you'll never know just how much you've actually helped. Perhaps years down the road, you'll run into someone who says "Jon Fox, eh? Man, because of you and an old post about numbering repeating groups of yours and how you did an analysis, I was able to...". Well done, Jon.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 25, 2015 at 1:40 pm
Jeff Moden (11/19/2011)
First, nicely done there Cold Coffee. I came up with an almost identical solution and saw yours when I was getting ready to post it. Your solution uses MAX() OVER (brilliant idea, BTW) which I believe is going to be more effecient than the method I'm using to calculate a pseudo-group number because I use an extra ROW_NUMBER() OVER to do it. That's provided that the RecordSeq column always reflects the correct order by Account and Date. If it doesn't, then there will be a problem. Replace the 8/6 date for the second row of "Jim" with 9/6 and see what I mean.
Sorry to revive an old thread, but wanted to say "Thanks"
I was doing something similar, with crazy loop processing running for hours and hours, and your code ran in about 15 seconds !
Viewing 5 posts - 16 through 19 (of 19 total)
You must be logged in to reply to this topic. Login to reply