April 27, 2009 at 11:04 pm
Hi der,
i have an SSIS package running to import data from a mbd file into a SQL table.
1) is der any way to update existing records if needed and insert new records into the SQL table
for example: say first time i run the package all the data from mdb is put into my table. later few rows in the mdb wil be updated (say first 5 rows) and few extra rows added to the mdb. now how wil i update only those rows and insert only those new rows into my SQL table
2)How wil i know which of the rows wer updated and which wer the news rows inserted into the SQL table
this is much useful wen the number of rows exceed 2 lacs...
kindly help, its urgent.
April 28, 2009 at 6:47 am
You have a couple of options:
1. Use a staging table in SQL Server and after loading the staging use T-SQL to do an Update Where Exists and an Insert Where Not Exists (Or Left Join). This will most likely be the best performing solution.
2. Use a Lookup Component in SSIS and send the "Error Rows", those without a match to an OLE DB Destination for insert and the matched rows to an OLE DB Command to do an update.
As far as knowing which are updates, you could add an update_date column to the source table and only populate in your update statement.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 3, 2009 at 11:32 pm
hi,
could you please elaborate on 1st point above.
if possible with an example as im not getting wat is- and how to use- staging table.
thank you.
May 4, 2009 at 4:14 am
See this following link that explains the warehouse refresh methods...
http://www.sqlis.com/post/Get-all-from-Table-A-that-isnt-in-Table-B.aspx
--Ramesh
May 4, 2009 at 7:27 am
A staging table is basically a copy of your source table except in SQL Server. So if I am migrating a persons table (last_name, first_name, birth_date, national_id_number) I would create a staging.persons table in SQL Server which I would truncate and re-load each time my package runs. Here is what the package steps would be:
1. Execute SQL Task - Truncate staging.persons
2. Data Flow taks - Load staging.persons from source.persons table
3. Execute SQL Task - Update destination.persons
Update destination.persons
Set columns = persons_staging.columns
From
destination.persons DP Jon
staging.persons SP On
DP.key_columns = SP.key_columns
4. Execute SQL Task - Insert new rows into destination.persons
Insert Into destination.persons
(
columns
)
Select
SP.columns
From
staging.persons SP LEFT JOIN
destination.persons DP On
DP.key_columns = SP.key_columns
Where
DP.column Is Null
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 4, 2009 at 2:04 pm
There is also the SCD (Slowly Chaning Dimension) control that you can use to achieve this. This control tends to perform poorly (like other solutions) if you need to compare every column in the table looking for changes. There are ways around this, like adding a checksum to your table. But, since you are only processing several lac rows, any solution posted so far should work just fine.
If you are going to need to perform this type of operation for many tables it would be a good idea to get familar with delta processing or slowly changing dimension processing.
For what it is worth, I tend to use the Lookup control in conjuction with a checksum to do my loading.
May 29, 2009 at 6:20 am
SQL Server 2008 Merge syntax could be a better solution in this case.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply