Synchronize 2 tables

  • Hello

    I looking for a way to synchronize 2 tables using SQL and not using external tools (like Red Gate). I have a SQL script that re-creates a lookup table and it is typically about 3 million rows.

    What I've tried so far is first create a temp lookup table and then use the MERGE statement (SQL Server 2008 only) to synchronize with the actual table. I let it run for about 8 hours and I finally killed it because it was taking too long.

    The trick is the actual lookup table is constantly being used so I can't just delete and re-create.

    Any ideas or suggestions?

    Thanks!

  • does your lookup table have anything to help identify changed rows, like a UpdatedDate or RowVersion or anything ? any chance you could use replication instead?

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Previously I've done this in a three step approach in the following order DELETE, UPDATE and INSERT using a temporary/staging table.

    /*Tableb is staging table, Tablea is live */

    DELETE FROM tableA

    FROM TABLEA a

    LEFT OUTER JOIN Tableb b ON a.ID = b.ID

    WHERE b.ID IS NULL

    INSERT INTO @TableAChecksum (ID, Checksum)

    SELECT ID, BINARY_CHECKSUM(*) FROM TABLEA

    INSERT INTO @TableBChecksum (ID, Checksum)

    SELECT ID, BINARY_CHECKSUM(*) FROM TABLEB

    UPDATE a

    SET a.Col1 = B.Col1, a.Col2 = b.Col2

    FROM TABLEA a

    INNER JOIN @TableAChecksum a1 ON a.ID = a1.ID

    INNER JOIN @TableBChecksum b ON a1.ID = b.ID

    WHERE a1.Checksum <> b.Checksum

    INSERT INTO TABLEA

    SELECT * FROM TABLEB b

    LEFT OUTER JOIN TABLEA a ON a.ID = b.ID

    WHERE a.ID IS NULL

  • cgreathouse (9/24/2010)


    Hello

    I looking for a way to synchronize 2 tables using SQL and not using external tools (like Red Gate). I have a SQL script that re-creates a lookup table and it is typically about 3 million rows.

    What I've tried so far is first create a temp lookup table and then use the MERGE statement (SQL Server 2008 only) to synchronize with the actual table. I let it run for about 8 hours and I finally killed it because it was taking too long.

    The trick is the actual lookup table is constantly being used so I can't just delete and re-create.

    Any ideas or suggestions?

    Thanks!

    Yes... it's an old gaming trick. When you can't move, change color.

    Build a copy of the table. Once it's complete, do a cascaded rename. Drop OLD table, rename current as old, rename new as current.

    You can also do similar with two tables and a synonym... while one is in use, build the other. When the build is complete, just alter the synonym.

    Total "down time" is typically less than 32ms for the cascaded rename and virtually doesn't exist during a synonym repoint.

    You may want to put an exclusive lock on the table in the code to ensure that no one is using it for the instant that the change is being made just to ensure that no one is using data that may have changed.

    Another way is to do a segmented update where a WHILE loop does several smaller merges which is also very effective.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply