August 30, 2010 at 10:14 am
Hello,
I have a core business application which utilizes SQL 2008. We currently have 3 distinct production SQL databases which are identical in table structure and relationships, but which have their own unique relational data. I need to merge these 3 production databases into one SQL 2008 data warehouse database. I'd like to be able to customize the standard SQL replication within SQL 2008 so that each table's primary key data remains unique by adding a 1-column database ID field to each table and each primary key.
For example, the primary key for our employee table is simply an internal sequence number. These sequence numbers would certainly overlap from one database to another. I need to be able to build data warehouse primary keys for this table constructed from a database ID value (example: "a") and the sequence number (example: 1234567) yielding a data warehouse primary key of "a1234567". The same sequence number from database "b" would have a value of "b1234567" and would remain unique.
Is there an easy way to update the SQL sync procedures to add this source database ID to each table and primary key? I would like to minimize customization and use as much standard SQL replication as possible. I'd also prefer to develop in Transact SQL if possible.
Thanks.
Larry
September 1, 2010 at 11:27 am
I would highly recommend moving away from trying to modify the SQL sync procedures in Replication to load your data into a data warehouse. Instead I would create a SSIS etl package to load your data warehouse. That way you can easily customize the fields and it will be easier to maintain.
If you have to use replication, you will have to modify the sp_MSdel, sp_MSins, and sp_MSupd stored procedures for each table that has a competing identity/id. That requires significant customization of the replication jobs and if anyone drops the replication and recreates it, all of those modifications will be lost.
September 1, 2010 at 11:53 am
Larry Russell (8/30/2010)
Hello,I have a core business application which utilizes SQL 2008. We currently have 3 distinct production SQL databases which are identical in table structure and relationships, but which have their own unique relational data. I need to merge these 3 production databases into one SQL 2008 data warehouse database. I'd like to be able to customize the standard SQL replication within SQL 2008 so that each table's primary key data remains unique by adding a 1-column database ID field to each table and each primary key.
For example, the primary key for our employee table is simply an internal sequence number. These sequence numbers would certainly overlap from one database to another. I need to be able to build data warehouse primary keys for this table constructed from a database ID value (example: "a") and the sequence number (example: 1234567) yielding a data warehouse primary key of "a1234567". The same sequence number from database "b" would have a value of "b1234567" and would remain unique.
Is there an easy way to update the SQL sync procedures to add this source database ID to each table and primary key? I would like to minimize customization and use as much standard SQL replication as possible. I'd also prefer to develop in Transact SQL if possible.
Thanks.
Larry
Adding the column for the foreign key is pretty easy, you just add the column FormerDB (or whatever you want to call it, but try to avoid reserved words), drop your old Primary Key and then put a new Primary Key on the columns FormerDB and your TableID (sequence number).
ALTER TABLE TableName ADD CONSTRAINT PK_TableName PRIMARY KEY CLUSTERED (FormerDB, TableID)
However, if you do this, and you use an IDENTITY field to create new IDs, you'll end up with unique IDs from that point whether there's a FormerDB column associated with it or not. You'll get
A 12345
B 12346
C 12347
You won't get
A 12345
B 12345
C 12345
As for transferring the data, you can certainly do it all with T-SQL, just write the INSERT INTO statements for everything you need to transfer, adding your constant where needed.
INSERT INTO AlreadyCreatedTable (FormerDB, SeqNum, SomeInfo, SomeMoreInfo)
SELECT 'A', SeqNum, SomeInfo, SomeMoreInfo
FROM OldTable
'A' is your constant and the rest are your columns.
--------------------------------------
When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
--------------------------------------
It’s unpleasantly like being drunk.
What’s so unpleasant about being drunk?
You ask a glass of water. -- Douglas Adams
September 2, 2010 at 7:20 am
Thank you both for your replies.
I currently sync our old SQL 2000 data warehouse to our UNIX ISAM files using ODBC data transfers, DTS packages and stored procedures. Part of that old process currently adds the database ID to all primary keys and tables, just as you suggested. It sounds like it will be best to maintain this type of consolidated master table update process rather than attempt to modify/configure the SQL 2008 replication process to handle this function.
Thanks again for your input.
Larry
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply