Updating millions of records

  • Just an idea, I'm updating a table with over 240 million records on a daily basis, Generally about a million or so records per night, but may go upto 20 million per night.

    Now after some testing and re-testing, this is what is currently working for me.

    I grab from my staging table and insert the data into a temp table. I do my correction of data in this temp table and using an indicator I first check if I need to insert the data or set it to be updated.

    Some records might be 100% the same in my DW and in the staging, and then i delete them out of the staging. (I know I can leave it alone as well, but none the less)

    When I update / load (still need to move it to merge in 2008R2), I check to see how many records I need to insert. If there is over 5million inserts, I disable the indexes on the DW table, insert and rebuild them again.

    I've struggled with this table for the past year or so and have settled on this solution, seeing that I can insert/update 10million in about 50 mins, and a quad core 2.33mhz server with 8gig ram.

  • Does the free space on hard disk matter(Considering recovery model Simple).

    We have ~20GB free space on the drive where database files are stored.

  • Hello,

    maybe you have a problem blocking this volume of data. You can face this kind of massive updates using a cursor and commiting rows regularly, following this schema:

    DECLARE CUR CURSOR FOR SELECT... FOR UPDATE OF address

    DECLARE @i INTEGER

    SET @i = 0

    OPEN CUR

    FETCH CUR INTO...

    BEGIN TRANSACTION

    WHILE @@Fetch_Status = 0

    BEGIN

    update table_name

    set address = address1 + ' ' + address2 + ' ' + address3 WHERE CURRENT OF CUR

    SET @i = @i + 1

    IF @i >= 10000

    BEGIN

    COMMIT TRANSACTION

    BEGIN TRANSACTION

    SET @i = 0

    END

    FETCH CUR INTO...

    END

    COMMIT TRANSACTION

    Another option is to reduce the amount of updated rows adding a WHERE clause like this:

    update table_name

    set address = address1 + ' ' + address2 + ' ' + address3

    WHERE ISNULL(address, 'cats') <> ISNULL(address1 + ' ' + address2 + ' ' + address3, 'cats')

    Regards,

    Francesc

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply