Best way to merge two large client databases

  • titsiros (1/26/2016)


    I use a similar strategy but instead of having a nested select, I do an INNER join between the two databases

    insert into destinationDB.destinationTable (ID, s...)

    select destinationParentTable.ID, s...

    from sourceTable s

    inner join sourceParentTable sP on sP.ID=n.ID

    INNER join destinationParentTable dP on dP.NAME=sP.NAME

    left join destinationTable d on s.NAME=d.NAME

    where d.ID is null

    My only concern with this approach is that i cant find a way to log errors and commit the successes..any ideas?

    If you were to use a SSIS t-sql task, you can paste your code in there. SSIS has logging abilities. SSIS can also define a transaction so every things succeeds together. Are the two databases on the same instance or machine?

    ----------------------------------------------------

  • The databases are on different machines but for the purpose of the merge we could move them to the same one. This would also make sense in terms of performance.

    Unfortunately not using SSIS. I know i could use any programming language like C++ to iterate through the select statement results and generate an insert for each record in order to capture individual errors but worried this will impact performance...some tables have 80 million records.

    I was hoping to use insert/select statements which is neater and faster

  • titsiros (1/26/2016)


    The databases are on different machines but for the purpose of the merge we could move them to the same one. This would also make sense in terms of performance.

    Unfortunately not using SSIS. I know i could use any programming language like C++ to iterate through the select statement results and generate an insert for each record in order to capture individual errors but worried this will impact performance...some tables have 80 million records.

    I was hoping to use insert/select statements which is neater and faster

    First create the target tables with nothing but clustered and primary keys; no foreign keys or non-clustered indexes. Do your insert / selects into this. Then go back and create indexes and foreign keys.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • titsiros (1/26/2016)


    The databases are on different machines but for the purpose of the merge we could move them to the same one. This would also make sense in terms of performance.

    Unfortunately not using SSIS. I know i could use any programming language like C++ to iterate through the select statement results and generate an insert for each record in order to capture individual errors but worried this will impact performance...some tables have 80 million records.

    I was hoping to use insert/select statements which is neater and faster

    Iterating is iterating no matter the language. I would stay with Set based SQL. My suggestion here, you determine if feasible, is to take the smaller of the sets and copy that table entirely to the database with the bigger set. This is ideal, but in any case you want to copy the other machine's table over into a temp table on the machine you are working on. Then you can even put a clustered index on that new temp table on the join column (especially since you mention a non small amount of rows). Working this way you avoid the linked server (if that is what you are using) from doing the joins across the network in your insert/update/delete operations.

    I would explain to someone how SSIS would help here with logging and rollbacks as well to try to make that a possibility at least going forward.

    ----------------------------------------------------

Viewing 4 posts - 46 through 48 (of 48 total)

You must be logged in to reply to this topic. Login to reply