Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables

Question

Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables

Viewing 15 posts - 16 through 30 (of 40 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 1

David Jackson (1/27/2009)
Lynn Pettis (1/27/2009)
Smendle (1/27/2009)
That can happen on multi-processor systems that SQL is allowed to use.
I am sure this is in response to the following:
I received a private message with a question regarding my article that I can't answer, so I thought I'd put it here with the hopes that some of the gurus (Jeff, Gail, Grant, Gus, et al) out there may be able to help out.
Why is the CPU time greater than the elapsed time
CPU time = 5063 ms, elapsed time = 3309 ms.
Thinking about it, that actually makes sense and isn't something I'd have come up with on my own, thanks. Do any of the other gurus out there also agree?
I concur. What happens if you add MAXDOP = 1 to that particular query?
FWIW, in a previous life MAXDOP was used as execution plans differed wildly on (single CPU, pre-hyper thread) Test kit than when they ran on the multi-CPU Live kit!
Nice article btw 😉
Dave Jackson

First, to remember my manners, thank you for the complement on the article.

I'd have to go back into my article to figure out which one it was, and then try it again at home with MAXDOP set to 1. The other thing I'll need is to find the time to do it. Maybe I'll get a few minutes some time this week, but right now I'm a bit busy with another article as well as other life activities that are clamoring for my time.

Grant Fritchey SSC Guru Points: 398898 More actions · Answer 2

Uh... Wow... Over-achieve much?

I feel a bit stupid to be commenting on this. I don't have much experience with partitioning beyond reading up on the theory and a few discussions. If I can make some spare time, I'd like to see what the execution plans looked like between your final solution and Jeff's original, purely academic.

Nice job.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

Grant Fritchey (1/27/2009)
Uh... Wow... Over-achieve much?
I feel a bit stupid to be commenting on this. I don't have much experience with partitioning beyond reading up on the theory and a few discussions. If I can make some spare time, I'd like to see what the execution plans looked like between your final solution and Jeff's original, purely academic.
Nice job.

Thank you, this makes me feel really good. I'll try to get the execution plans for you and post them here. It may a few days, as I do have several things going on right now.

Grant Fritchey SSC Guru Points: 398898 More actions · Answer 4

Lynn Pettis (1/27/2009)
Thank you, this makes me feel really good. I'll try to get the execution plans for and post them here. It may a few days, as I do have several things going on right now.

No rush. If I stumble across a spare hour or two I'll do it myself. I'm just curious.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 5

Actually, Grant, I'm curious also, and with your experience with working with execution plans, your feedback on that side would be quite interesting.

YeshuaAgapao Hall of Fame Points: 3495 More actions · Answer 6

I don't get this. Both of the solutions create a non-partitioned instance of the partitioned table. One form is a temp table, and the other form is a materialized view (by the clustered index). This seems to me that it defeats the purpose of partitioning the table in the first place.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 7

YeshuaAgapao (1/27/2009)
I don't get this. Both of the solutions create a non-partitioned instance of the partitioned table. One form is a temp table, and the other form is a materialized view (by the clustered index). This seems to me that it defeats the purpose of partitioning the table in the first place.

The purpose was to find a method for updating a partitioned table with running balances. The highly performant method discussed in Jeff Moden's article doesn't work on partitioned tables. So, how do you accomplish this task if you need to do it? The article shows you two methods for accomplishing this task.

Behind all this, is a principle, how do you accomplish any task involving partitioned tables that fails, but works fine when done with a non-partitioned (or standard) table. The article wasn't discussing the pro's and con's of storing running totals in the database vs computing them on the client-side, or the pro's and con's of partitioned tables.

I hope this explanation helps you with understanding the idea behind the article.

Jeff Moden SSC Guru Points: 1004392 More actions · Answer 8

I guess I should first say thank you for the wonderful acknowledgements. Thanks, Lynn.

It's also a real pleasure to read an article with a logical progression and a whole lot of "Alice's Restaurant" code examples to play with especially when they're that readable. Very nicely done, Mr. Pettis. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

And I must thank you in return, Jeff. If it hadn't been for you, and some of the other gurus on SSC, I never would have written this article. Your article was the basis of it, and your tutelage the catalyst.

I also appreciate your having taken the time to read my initial draft of the article. Your comments and suggestions, I feel, helped the article as well.

So, I have to give you a well done as well.

Adam Haines SSC-Insane Points: 23217 More actions · Answer 10

Lynn Pettis (1/27/2009)
Smendle (1/27/2009)
That can happen on multi-processor systems that SQL is allowed to use.
I am sure this is in response to the following:
I received a private message with a question regarding my article that I can't answer, so I thought I'd put it here with the hopes that some of the gurus (Jeff, Gail, Grant, Gus, et al) out there may be able to help out.
Why is the CPU time greater than the elapsed time
CPU time = 5063 ms, elapsed time = 3309 ms.
Thinking about it, that actually makes sense and isn't something I'd have come up with on my own, thanks. Do any of the other gurus out there also agree?

The answer to this question is parallelism. The cpu time is greater because the query is spread across multiple processors, which allows you to do a lot more cpu processing in a shorter period of time. This is how CPU time becomes greater than elapsed time. If you set max dop to 1, you will not see the same behavior.

My blog: http://jahaines.blogspot.com

Adam Haines SSC-Insane Points: 23217 More actions · Answer 11

Forgot to comment on the article :hehe:.

I thought the article was very informative, easy to read/understand, and offered great alternative methods to solving the "running total" enigma.

Nicely done.

My blog: http://jahaines.blogspot.com

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 12

Adam Haines (1/27/2009)
Forgot to comment on the article :hehe:.
I thought the article was very informative, easy to read/understand, and offered great alternative methods to solving the "running total" enigma.
Nicely done.

Thank you, I appreciate your kind words regarding the article.

YeshuaAgapao Hall of Fame Points: 3495 More actions · Answer 13

Makes sense now. I would still suggest that if the running totals requirement was the dominant form of querying on the table, I would rather just un-partition the whole table rather than create a non-partitioned duplicate of it (temporarily with the temp table or worse, permanently with the non-paritioned indexed view). Otherwise I would really try not to fufill any kind of real-time reporting requirement on it, only batch pre-generation of the aggregates.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 14

YeshuaAgapao (1/28/2009)
Makes sense now. I would still suggest that if the running totals requirement was the dominant form of querying on the table, I would rather just un-partition the whole table rather than create a non-partitioned duplicate of it (temporarily with the temp table or worse, permanently with the non-paritioned indexed view). Otherwise I would really try not to fulfill any kind of realtime reporting requirement on it, only batch pre-generation of the aggregates.

I can't agree with un-partitioning the table. There may be other reasons that the table is partitioned. One reason, perhaps it isn't one million rows of data over ten years but 1 billion rows of data over ten years. With that, the majority of the queries against the table are limited to the current year and maybe the previous year with less than five percent of all queries access data older than that, but the data for all ten years needs to be available based on corporate policies.

Please remember, the actual purpose of this article wasn't just to show you how to compute running totals on partitioned tables. The purpose was to demonstrate a methodology that could possibly be used to address problems with updates to partitioned tables. Can it be used in all cases, maybe not, but it is a tool that is available.

Two other items. If you were to use the interim table method, you actually would only create a table with the necessary columns needed to complete the update. If the source table has 25 columns and is 2500 bytes wide, but you only need 6 columns and 75 bytes, that is all you would use. Same idea with the indexed view. You would only build the view using the columns you needed to perform the update.

YeshuaAgapao Hall of Fame Points: 3495 More actions · Answer 15

Yeah if you could restrict the rows (and columns if savings is alot) it would be more reasonable. None of your code had WHERE clauses so I assumed that you wanted to crunch the whole table. If there was a WHERE clause, does Jeff's version break even if the qualifying rows don't span partitions?