April 26, 2013 at 2:31 pm
We have a situation where the business requires some ETL to be done throughout the day on a production database. One of the procs updates an XML column for a set of rows. The developer has decided to change this proc to use a WHILE loop and update row by row, the idea being to relieve page contention by taking only row locks. Is this a good idea when the speed of the proc is not a concern? Also, this is a WHILE loop with a counter, not a cursor... so should this maybe be a cursor instead of a WHILE loop?
Jared
CE - Microsoft
April 26, 2013 at 3:45 pm
SQLKnowItAll (4/26/2013)
We have a situation where the business requires some ETL to be done throughout the day on a production database. One of the procs updates an XML column for a set of rows. The developer has decided to change this proc to use a WHILE loop and update row by row, the idea being to relieve page contention by taking only row locks. Is this a good idea when the speed of the proc is not a concern? Also, this is a WHILE loop with a counter, not a cursor... so should this maybe be a cursor instead of a WHILE loop?
The WHILE loop with a counter can be useful to stop the sproc from going over the top consuming CPU and memory. An alternative is a WAITFOR DELAY of, say, one second between batches, depending on how the load is defined (see the sketch below).
I used a similar method with very small batches to update 12 million rows in a heavily used table well over a decade ago (yep, I know... and yes, the DB could have been considered a VLDB back then), and the business didn't even notice it. I still use this method successfully, as it can be a "fire & forget" approach, whereas "do everything in one update" can cause major issues.
Regarding "cursor vs. loop": a cursor can consume memory and cause locking, whereas the loop can be more efficient. Try both and see. If it works and no users are complaining, why worry?
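Something like this is what I have in mind; the table dbo.Stage, its Processed flag, and the batch size are all made-up placeholders, not anyone's actual proc:

-- Hypothetical throttled batch loop; names and sizes are placeholders.
DECLARE @BatchSize INT = 500;   -- keep batches small to stay below lock escalation
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    UPDATE TOP (@BatchSize) s
    SET s.Processed = 1
    FROM dbo.Stage AS s
    WHERE s.Processed = 0;

    SET @Rows = @@ROWCOUNT;

    -- Throttle so the OLTP workload can breathe between batches.
    WAITFOR DELAY '00:00:01';
END;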
April 26, 2013 at 3:52 pm
Jared, something else you might think of using is an in-database Service Broker queue, which allows you to queue up asynchronous updates like this and gives you a built-in wrapper for retry controls, poison messages, and the like. It also gives you a bit more adjustability in how heavy-handed the simultaneous updates are, by setting the number of activations the queue can have.
However, yes, I've certainly done things like this for mid-day ETL into an OLTP system during prime usage hours. Cursor vs. counter will depend heavily on what the actual process is doing at the nitty-gritty level. Heck, I've even done iterative data extractions when I couldn't get at the data quickly enough without scanning if I tried more than two pieces of data simultaneously. It's certainly a solid approach as long as you're okay with the asynchronous aspects of the update.
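A rough sketch of the broker plumbing, with all object names invented and the actual update reduced to a comment:

-- Hypothetical Service Broker setup for queuing the XML updates.
CREATE MESSAGE TYPE XmlUpdateMsg VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT XmlUpdateContract (XmlUpdateMsg SENT BY INITIATOR);
CREATE QUEUE dbo.XmlUpdateQueue;
CREATE SERVICE XmlUpdateService ON QUEUE dbo.XmlUpdateQueue (XmlUpdateContract);
GO

-- Activated proc: drains one message at a time and applies the update.
CREATE PROCEDURE dbo.ProcessXmlUpdates
AS
BEGIN
    DECLARE @handle UNIQUEIDENTIFIER, @payload XML;

    WHILE 1 = 1
    BEGIN
        WAITFOR (
            RECEIVE TOP (1)
                @handle = conversation_handle,
                @payload = CAST(message_body AS XML)
            FROM dbo.XmlUpdateQueue
        ), TIMEOUT 1000;

        IF @@ROWCOUNT = 0 BREAK;

        -- Apply the single-row XML update here using @payload.

        END CONVERSATION @handle;
    END;
END;
GO

-- MAX_QUEUE_READERS is the "number of activations" knob mentioned above.
ALTER QUEUE dbo.XmlUpdateQueue
WITH ACTIVATION (
    STATUS = ON,
    PROCEDURE_NAME = dbo.ProcessXmlUpdates,
    MAX_QUEUE_READERS = 2,
    EXECUTE AS OWNER
);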
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
April 26, 2013 at 3:59 pm
Thanks for the replies. We will test both, of course. I was against the idea at first because of my feelings on changing a perfectly good set-based operation to RBAR. Of course, the logic makes sense that a row-by-row update will only take row locks and be less likely to escalate.
This step simply uses the modify() method to update an XML column. Not sure if that makes a difference.
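For reference, the update is along these lines (table, column, key, and XPath all changed to placeholders):

-- Hypothetical single-row XML DML update; all names are placeholders.
DECLARE @DocId INT = 42;

UPDATE dbo.Documents
SET DocXml.modify('replace value of (/doc/status/text())[1] with "processed"')
WHERE DocId = @DocId;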
Jared
CE - Microsoft
April 26, 2013 at 4:17 pm
SQLKnowItAll (4/26/2013)
Thanks for the replies. We will test both, of course. I was against the idea at first because of my feelings on changing a perfectly good set-based operation to RBAR. Of course, the logic makes sense that a row-by-row update will only take row locks and be less likely to escalate. This step simply uses the modify() method to update an XML column. Not sure if that makes a difference.
I'd still also try to do it in sets, just less data per 'salvo'. Get it down to, say, 2 rows per batch, then increase the batch size by powers of 2 until you find a comfortable medium. Try to work within your clustered index for the range.
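For example, something like this, assuming the clustered index is on an integer key (names are hypothetical):

-- Hypothetical ranged batching keyed on the clustered index.
DECLARE @BatchSize INT = 2;   -- start at 2, then double until you find the sweet spot
DECLARE @FromId INT, @MaxId INT;

SELECT @FromId = MIN(DocId), @MaxId = MAX(DocId) FROM dbo.Documents;

WHILE @FromId <= @MaxId
BEGIN
    UPDATE dbo.Documents
    SET DocXml.modify('replace value of (/doc/status/text())[1] with "processed"')
    WHERE DocId >= @FromId AND DocId < @FromId + @BatchSize;  -- range seek on the clustered index

    SET @FromId = @FromId + @BatchSize;
END;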
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
April 26, 2013 at 4:31 pm
Evil Kraig F (4/26/2013)
SQLKnowItAll (4/26/2013)
Thanks for the replies. We will test both, of course. I was against the idea at first because of my feelings on changing a perfectly good set-based operation to RBAR. Of course, the logic makes sense that a row-by-row update will only take row locks and be less likely to escalate. This step simply uses the modify() method to update an XML column. Not sure if that makes a difference.
I'd still also try to do it in sets, just less data per 'salvo'. Get it down to, say, 2 rows per batch, then increase the batch size by powers of 2 until you find a comfortable medium. Try to work within your clustered index for the range.
I don't think that is possible. The way they are doing this, it could be 1 row or 700 rows. The join will take a page lock, if not a table-level lock. This is a highly transactional database. Not sure why they are even doing this ETL on it, but I am not assigned to that team... just doing code review this week 🙂
Jared
CE - Microsoft