Best way to update large number of rows?

  • I have to update one column in each of 40 million rows in a vendor-supplied table.

    The table has no PK and no foreign keys defined.

    It does have a non-unique index on two numeric columns.

    The vendor supplied a cursor-based script to update all the rows in one pass, but I don't believe this to be a good approach.

    I want to minimize the overall duration of the update process and I believe breaking this into multiple update operations will be faster overall.

    Does anyone have a tested method they have used to break a large update like this into smaller chunks? Or a different approach that worked?

    TIA!

    dj

  • You could use TOP and a while loop to control the number of rows updated. No matter what you should be either taking transaction log backups frequently or ensure you have enough disk space for your transaction log drive.


    * Noel

  • i didn't test such algorithm for a 40 million record, but i think that using cursors is the worst solution for such a problem.

    you have to talk a lot about slow performance with locking issues.

    So i suggest using a general update statement with dividing the process, Actually VBulliten uses this technique in their forums when modifying a lot of rows.

    You can make a script to update the records maybe 10 thousands per batch.

    But anyway, i'm wondering why such a large table does not have a primary key with a clustered index?

    I think you have to think about adding a new PK field and creating an index for it, and then applying the updates.

  • If you have the option (access). It can speed up the process to set the recovery type to Simple.

    Then script a while loop that updates the top 10k? 100k? (more) rows at one time.

    d

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply