January 20, 2010 at 10:19 pm
Comments posted to this topic are about the item The power of batching Transactions
January 20, 2010 at 10:25 pm
I think the reason is that batched transactions reduce the cost of acquiring and releasing locks compared to one transaction per statement...
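That lock-and-commit overhead can be illustrated with a minimal sketch; the table name and row count here are hypothetical, not from the article:

```sql
-- Autocommit: every INSERT acquires its locks, writes its log
-- records, and commits on its own.
DECLARE @i INT = 1;
WHILE @i <= 10000
BEGIN
    INSERT INTO dbo.TestTable (Col1) VALUES (@i);  -- one commit each
    SET @i = @i + 1;
END;

-- Batched: one explicit transaction amortizes the lock management
-- and log-flush cost across all 10,000 rows.
BEGIN TRANSACTION;
DECLARE @j INT = 1;
WHILE @j <= 10000
BEGIN
    INSERT INTO dbo.TestTable (Col1) VALUES (@j);
    SET @j = @j + 1;
END;
COMMIT TRANSACTION;
```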
January 21, 2010 at 7:22 am
This article leaves me with more questions than answers.
First off, there are no indexes on the table in the example. How would adding indexes and inserting non-sequential data affect the batch performance?
Second, where does this performance gain come from? Is it just from the reduced locking? Could it also be related to row size vs page size?
Is there a way, short of trial and error, to come up with at least a good starting point for an efficient batch size given a particular table schema?
January 21, 2010 at 10:01 am
Also, this only deals with 1 table. What if you are inserting 1 record each into 10 tables (instead of 10 records into 1 table)?
I also guess that the 80-record performance boost has to do with the number of records in a page (although at 8 KB per page, that would mean a single record with an int column is ~100 bytes, right?)
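Rather than guessing, you can measure the actual record size and rows-per-page with the physical stats DMF; the table name here is illustrative:

```sql
-- avg_record_size_in_bytes and page_count show how many rows
-- actually fit on an 8 KB page for this table.
SELECT index_id,
       page_count,
       record_count,
       avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.TestTable'), NULL, NULL, 'DETAILED');
```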
January 21, 2010 at 6:35 pm
Nicely written article by the author... can't take anything away from Manor for that.
I thoroughly understand the intent of the article... it's a method for making necessary RBAR faster. The problem is, it's still RBAR and there are a whole lot of folks that don't understand when RBAR is actually necessary. For example, it is patently NOT necessary to use any form of RBAR to generate random numbers... it can easily be done in a much higher-performance set-based fashion. All you have to do is spend about the same amount of time on one of the web search engines as it takes to read this article and you'll find many ways to accomplish the task of generating random numbers in a set-based fashion.
In situations where RBAR actually is necessary, you'll find that the necessary WHILE loop is NOT a performance problem because it's simply a vehicle for controlling multiple set based operations.
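A common example of that pattern is a batched delete, where the WHILE loop merely repeats a set-based statement; the table, column, and batch size below are illustrative:

```sql
-- Each pass deletes a whole set of rows; the loop only controls
-- repetition, so it adds no per-row overhead.
DECLARE @BatchSize INT = 5000;
WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.BigTable
    WHERE CreatedDate < '2009-01-01';

    IF @@ROWCOUNT < @BatchSize BREAK;  -- last batch done
END;
```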
To wit, instead of folks spending time writing articles on how to make RBAR faster, I'd like to see them write articles on how to avoid RBAR in the first place. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 21, 2010 at 6:53 pm
Ah.... forgot to mention that if you absolutely must write RBAR (not likely if you learn to write nice simple set based code), you should also use SET NOCOUNT ON to improve the performance by not having to generate (250,000 in this case) "(1 row(s) affected)" messages. In the 2nd example given, SET NOCOUNT ON improved the performance from almost 18 seconds to just over 13 seconds on my box.
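The fix is a one-line addition at the top of the batch; the table and column names are hypothetical, and the row count is taken from the article's example:

```sql
SET NOCOUNT ON;  -- suppress the "(1 row(s) affected)" message per statement

BEGIN TRANSACTION;
DECLARE @i INT = 1;
WHILE @i <= 250000
BEGIN
    INSERT INTO dbo.TestTable (RandomValue) VALUES (RAND());
    SET @i = @i + 1;
END;
COMMIT TRANSACTION;
```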
Of course, that's still a RBAR solution. Generating the same random numbers using one of the multiple methods to generate set-based random numbers dropped the duration to less than 1.2 seconds to do the same thing. It's also easier on the I/O system...
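For reference, one widely known set-based approach (not necessarily the exact method used here) cross joins a system catalog view to manufacture the rows and uses NEWID() for per-row randomness; the target table and the 0-99 range are illustrative:

```sql
-- Generate 250,000 random integers in a single set-based INSERT.
-- ABS(CHECKSUM(NEWID())) yields a different value for every row,
-- unlike RAND(), which is evaluated once per statement.
INSERT INTO dbo.TestTable (RandomValue)
SELECT TOP (250000)
       ABS(CHECKSUM(NEWID())) % 100
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2;
```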
The RBAR method generated more than 254,000 reads and an internal rowcount of 500,001 rows where the set based method generated less than 1000 reads and an internal rowcount of only 250,000 which is just what the good doctor ordered. (There is a slightly more complex method that will generate, get this, almost 0 reads... you've just got to look for it).
Heh... why don't I post that solution? I don't want to deprive you of having the fun of finding it and testing it on your own. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.