January 28, 2004 at 7:29 pm
Hi All
Hoping someone can come up with a good way of re-working a data import process we currently run.
The source data resides in a Progress database running on UNIX. This database is managed by an external company: we cannot change the database structure, nor can we add or change any of the processes running on the UNIX server. We also have no way of determining which records have changed in the Progress database, or when.
The destination is a SQL Server 2000 database on Win2K. We have total control over this server and can create/delete/add as necessary.
Currently we need to copy data from about 50 tables from the Progress database to the SQL Server database. This could increase in the future.
On a nightly basis we truncate the tables in a staging SQL database. Use DTS to open an ODBC connection to the Progress and copy the data across using a standard DataPump task. Then we have a "Copy Objects.." task that copies the data to the production SQL database. This entire process takes over 5 hours .
What we would like to implement is an incremental update of the production database. This would be beneficial for 3 reasons,
1) The time taken to copy the data to the production server would be reduced
2) The database would still be available while data is being copied
3) We would be able to provide some external consumers with updated data instead of a full list.
Hope someone can give some help/guidance here.
--------------------
Colt 45 - the original point and click interface
January 29, 2004 at 8:24 am
Phill,
I have THE SAME Problem than you but with another proprietary legacy db.
1. I do The Datapump step and the ODBC Driver speed is KEY
2. I have 2 sets of staging Tables When I finish the First Step I Run queries to Determine The Deleted, Updated and Inserted Data from the Comparison of the 2 sets of tables!
3 . Run three stored Procedures using a linked Server to modify the Production data.
4. at the end I truncate The oldest set of staging tables and DTS will know where to import next Time!
HTH
* Noel
January 29, 2004 at 9:21 am
I do the same sort of thing with back end produced text files, each file containing multiple tables.
1. Get each import table's data into #Temp tables in the fastest way (OLEDB, ODBC, BULK INSERT....)
2. Make Key index(s) on #Temp data (Target already has them)
3 For each table....
3a. Delete from Target where not in Source
3b. Update Target where Keys exsit in both, but data is different, by
Performing SELECT Keys INTO #Temp from a Source - Target * UNION ... HAVING COUNT(Key) <> 2
Delete Target, Insert Source for above Keys
3c. Insert to Target where Source Key not in Target
Using combination of properly indexed #Temp tables of the source data, and 3 step processing on them seems to be, for me and my data, the fastest and least intrusive on Target. I've done quite a bit of testing on this, because I get, in some cases, hundreds of different sources at the same time.
In any case, continue what you are doing in testing the various ways the proccessing can be done. It may payoff greatly in performance gains.
Once you understand the BITs, all the pieces come together
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply