January 31, 2007 at 7:03 am
Hi all,
Just wondered if anyone has attempted this in some shape or form. Basically I just want to merge the data from database 2 (source) into database 1 (target), where the target database already contains data that must remain referentially intact. Database2 is basically a duplicate of Database1's structure; only the data is different.
The problem is where Table1 has a foreign key reference into Table2 (in both databases). It looks something like the following (this is just to demonstrate the relationships and is far from what we have in the tables):
Database 1
Table1
ID Name FKID
----------------------------------------------
1 T.Jones 22
2 J.Doh 25
3 D.Howser 26
4 J.Smith 22
Table2
ID Favourite_Show
-----------------------------------------------
22 Chris Moyles
25 Jo Wiley
26 Scott Mills
Database 2
Table1
ID Name FKID
----------------------------------------------
1 J.R.Hartley 11
2 T.Mallet 11
3 J.R.Ewing 13
7 F.Le Grand 12
Table2
ID Favourite_Show
-----------------------------------------------
11 Ken Bruce
12 Terry Wogan
13 Steve Wright
The ID column in Table1 is an identity; the ID column in Table2 is not.
There are a lot of tables that have references to Table1, basically with the same relationship as Table1 has to Table2 (1-to-1). There are around 50 tables with this relationship.
Some tables can have over 500,000 records in them. The database size is around 1GB, although I don't need to copy across some of the data.
I thought it would be simple enough to create a DTS package to do this, performing the following main actions:
1. Copy Table1 data into a temp table (offsetting the ID on insert) using an Execute SQL Task.
2. Offset the IDs in Database1 (target) by a specified increment using an Execute SQL Task.
3. Copy the data across from Database2 (source) using the DTS object transfer - which would effectively leave that dataset unchanged (all IDs would remain the same).
The problem is that step 2 seems to hang, or takes over 30 minutes to execute.
Anyone got any thoughts or suggestions on this?
Andez
January 31, 2007 at 7:05 am
Forgot to mention that I copied the SQL from step 2 into Query Analyzer, where it took 20 minutes to execute.
January 31, 2007 at 7:36 am
Without seeing the code it's hard to know what is causing the delays. If there's lots of data, especially if it's not indexed, it could be slow.
I'm not sure I'd use a specified increment. You could run into issues, and if you don't have a unique key, you could get duplicates. Instead I'd use some algorithm to find the new IDs that are free, or find a free range and insert them all at once.
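Something like this, perhaps - deriving the offset from the target's current maximum ID so the incoming rows are guaranteed to land in free space. Just a sketch, untested, using the example table names and assuming cross-database access:

-- Derive the offset from the target's highest existing ID
-- rather than hard-coding an increment
DECLARE @intOffset INT
SELECT @intOffset = ISNULL(MAX(ID), 0) FROM Database1.dbo.Table2

-- Source rows now land above every existing target ID
INSERT INTO Database1.dbo.Table2 (ID, Favourite_Show)
SELECT ID + @intOffset, Favourite_Show
FROM Database2.dbo.Table2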
January 31, 2007 at 7:53 am
Also forgot to mention that we would remove ALL FK references before executing the SQL to do the update. The FK constraints would then be recreated at the end of the merge process.
The SQL to offset the table and its references was basically the following (for the above example):
DECLARE @intIDIncrement INT
DECLARE @chrIDIncrement VARCHAR(10)
SET @intIDIncrement = 100
-- EXEC() can only concatenate string variables and literals,
-- so the increment needs casting to a string up front
SET @chrIDIncrement = CAST(@intIDIncrement AS VARCHAR(10))
-- Offset the primary table by rebuilding it under a temporary name
CREATE TABLE dbo.Tmp_Table2 (
ID INT IDENTITY( 1, 1 ),
Favourite_Show VARCHAR( 50 ) ) ON [PRIMARY]
SET IDENTITY_INSERT dbo.Tmp_Table2 ON
IF EXISTS(SELECT * FROM dbo.Table2)
EXEC('INSERT INTO dbo.Tmp_Table2 (ID, Favourite_Show)
SELECT ID + ' + @chrIDIncrement + ', Favourite_Show FROM dbo.Table2 WITH (TABLOCKX)')
SET IDENTITY_INSERT dbo.Tmp_Table2 OFF
DROP TABLE dbo.Table2
EXECUTE sp_rename N'dbo.Tmp_Table2', N'Table2', 'OBJECT'
-- Update any other table references
UPDATE Table1
SET FKID = FKID + @intIDIncrement
UPDATE Table3
SET FKID = FKID + @intIDIncrement
... And so on ...
UPDATE Table50
SET FKID = FKID + @intIDIncrement
All in all there are a total of 5,110,000 records in all of the tables that require updating using this approach.
All the tables have been indexed to make the system as efficient as possible.
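One thing that might help with updates of that size on SQL 2000 is batching them with SET ROWCOUNT, so each transaction stays small and the log doesn't balloon. A sketch only - the batch size of 10,000 is arbitrary, and it assumes the offset is larger than the current maximum ID so the WHERE clause can tell updated rows from untouched ones:

DECLARE @intIDIncrement INT
SET @intIDIncrement = 100

-- Limit each UPDATE pass to 10,000 rows
SET ROWCOUNT 10000
WHILE 1 = 1
BEGIN
    UPDATE Table1
    SET FKID = FKID + @intIDIncrement
    WHERE FKID < @intIDIncrement -- only rows not yet offset
    IF @@ROWCOUNT = 0 BREAK
END
SET ROWCOUNT 0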
February 1, 2007 at 1:43 am
I'm struggling with the bit where Scott Mills is in the Favourite_Show table, but anyway...
Try copying D2.T2 into D1, then doing a lookup on D1.T2 against D2.T2 to see what the new ID is, storing that in a temp table against the old ID.
Then update D2.T1, setting the FKID to the new ID as stored in the temp table. Then copy D2.T1 into D1 and it will be pointing at the correct data.
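In T-SQL the lookup might look something like this - just a sketch, untested, and it assumes the Favourite_Show values are distinct enough to match the old and new rows on:

-- Map each old Database2 ID to the ID it received in Database1
CREATE TABLE #IDMap ( OldID INT, NewID INT )

INSERT INTO #IDMap ( OldID, NewID )
SELECT s.ID, t.ID
FROM Database2.dbo.Table2 s
JOIN Database1.dbo.Table2 t ON t.Favourite_Show = s.Favourite_Show

-- Copy the child rows across, translating the FK through the map
-- (Table1.ID is an identity, so it is left to generate itself)
INSERT INTO Database1.dbo.Table1 ( Name, FKID )
SELECT s.Name, m.NewID
FROM Database2.dbo.Table1 s
JOIN #IDMap m ON m.OldID = s.FKID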
February 1, 2007 at 6:38 am
I haven't tested this, but does this make sense?
1. Do a SELECT with a LEFT JOIN from db1.Table1 to db1.Table2 (selecting all fields except the identity field) INTO a new table (tablex).
2. Do the same as above for db2 (into tabley).
3. Do a SELECT * with a UNION on both tables (tablex and tabley) INTO a new table.
4. Update the new table with an identity field, OR add the identity field in step 3 as a new field.
Does my logic work?
February 1, 2007 at 6:55 am
Thanks for the contributions...
Off the top of my head it sounds like it would work, although due to the massive number of tables referenced by the main table (about 50 in all) it could be a tad long-winded to write the SQL code for.
I have been working on a little application for a while now that will generate some of the necessary scripts for things I need to do during this upgrade process (hopefully someday I can rewrite and merge it into a single app that will make millions!! - not). It generates the create and drop FK constraint scripts for the tables I tell it to.
At the start of the process, it drops the constraints, then updates the necessary tables by updating the PK with a hard-coded offset in the SQL (again using my app, I can search through all tables that reference the PK column in the main table, either directly or indirectly, with a nifty search). The app then generates the necessary code for me to perform this update.
At the end of the process, the FK constraints are recreated with the generated script from my application. All works well. I decided to perform the update on the database with the least amount of data in it. There are two main tables that reference lots of other tables (60 in total). It takes around 20 minutes to execute the steps in a DTS package.
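For anyone without such an app, the drop scripts can be generated from the catalog. On SQL 2000 something along these lines should produce the DROP statements for every FK pointing at the main table (a sketch using the old sysforeignkeys catalog table; the generated output still has to be run separately):

-- Generate DROP CONSTRAINT statements for every FK referencing dbo.Table1
SELECT 'ALTER TABLE [' + OBJECT_NAME(fk.fkeyid) + '] DROP CONSTRAINT ['
       + OBJECT_NAME(fk.constid) + ']'
FROM sysforeignkeys fk
WHERE fk.rkeyid = OBJECT_ID('dbo.Table1')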
Once the database has been prepared like this, it is a simple case of copying the table data across (using the Copy SQL Server Objects Task in DTS). All this is fine on SQL 2000, but I haven't done much in 2005 as yet!
It probably seems like a funny way to do it, but it feels logical and reasonable to my way of thinking.
Andez