May 15, 2013 at 4:37 pm
Hello - I recently added a new column to a db table with 6.5 million rows. Now I need to populate this column with data from another data source by joining on a shared column between this table and the other source.
Since this table has 6.5 million rows, I'm wondering if there's a risk of table/row locking/blocking with this update statement?
How would you normally handle this type of update statement? Are there any special precautions you would take, and does the large number of rows impact how you would implement this update statement?
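For illustration, the single-statement version of that update might look something like this (all table and column names here are hypothetical):
UPDATE t
SET t.NewCol = s.SourceValue
FROM dbo.BigTable AS t            -- the 6.5 million row table (hypothetical name)
INNER JOIN src.SourceTable AS s   -- the other data source (hypothetical name)
    ON s.JoinKey = t.JoinKey
Run as a single statement, that is one transaction against all 6.5 million rows.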
May 15, 2013 at 5:50 pm
You should do the updates in chunks/ranges of 10 to 50 thousand rows.
This way you minimize locking and write reasonably sized transactions to the transaction log.
First do this in dev and test. Try to traverse the primary key to define the ranges.
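A minimal sketch of that pattern, assuming an integer primary key called Id and reusing the hypothetical names from the question above:
DECLARE @BatchSize int = 50000
DECLARE @MinId int, @MaxId int, @FromId int

-- establish the key range to traverse
SELECT @MinId = MIN(Id), @MaxId = MAX(Id) FROM dbo.BigTable
SET @FromId = @MinId

WHILE @FromId <= @MaxId
BEGIN
    UPDATE t
    SET t.NewCol = s.SourceValue
    FROM dbo.BigTable AS t
    INNER JOIN src.SourceTable AS s
        ON s.JoinKey = t.JoinKey
    WHERE t.Id >= @FromId
      AND t.Id < @FromId + @BatchSize   -- one keyed range per pass

    SET @FromId = @FromId + @BatchSize  -- move on to the next range
END
Each pass touches at most one chunk, so locks are short-lived and each transaction stays small in the log.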
May 16, 2013 at 9:18 am
What's the best implementation for doing this in chunks? Can you please provide an example?
Also, why would I want to do this in 10k-50k chunks as opposed to 2k-5k chunks?
May 16, 2013 at 11:47 am
The chunk size is just what I have grown comfortable with through experience.
Traverse the primary key if possible (or another indexed column, with a smaller chunk size) in a WHILE loop.
That is the best I can do without knowing your schema.
May 16, 2013 at 12:16 pm
A very basic model to help you visualize the process:
SET ROWCOUNT 50000  -- cap each UPDATE at 50,000 rows
WHILE 1 = 1
BEGIN
    UPDATE dbo.myTable
    SET MyField = MyValue
    WHERE SomeCriteria = MyCondition
      AND MyField <> MyValue  -- prevents re-updating the same rows over and over again

    IF @@ROWCOUNT = 0
        BREAK
END
SET ROWCOUNT 0  -- reset so later statements are not capped
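One caveat worth noting: Microsoft has deprecated SET ROWCOUNT for INSERT/UPDATE/DELETE statements, so on newer versions the same loop is usually written with TOP instead, e.g.:
WHILE 1 = 1
BEGIN
    UPDATE TOP (50000) dbo.myTable
    SET MyField = MyValue
    WHERE SomeCriteria = MyCondition
      AND MyField <> MyValue  -- skip rows already updated

    IF @@ROWCOUNT = 0
        BREAK
END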
Lowell