need help regarding SSIS

  • Hi der,

    i have an SSIS package running to import data from a mbd file into a SQL table.

    1) is der any way to update existing records if needed and insert new records into the SQL table

    for example: say first time i run the package all the data from mdb is put into my table. later few rows in the mdb wil be updated (say first 5 rows) and few extra rows added to the mdb. now how wil i update only those rows and insert only those new rows into my SQL table

    2)How wil i know which of the rows wer updated and which wer the news rows inserted into the SQL table

    this is much useful wen the number of rows exceed 2 lacs...

    kindly help, its urgent.

  • You have a couple of options:

    1. Use a staging table in SQL Server and after loading the staging use T-SQL to do an Update Where Exists and an Insert Where Not Exists (Or Left Join). This will most likely be the best performing solution.

    2. Use a Lookup Component in SSIS and send the "Error Rows", those without a match to an OLE DB Destination for insert and the matched rows to an OLE DB Command to do an update.

    As far as knowing which are updates, you could add an update_date column to the source table and only populate in your update statement.

  • hi,

    could you please elaborate on 1st point above.

    if possible with an example as im not getting wat is- and how to use- staging table.

    thank you.

  • See this following link that explains the warehouse refresh methods...

    http://www.sqlis.com/post/Get-all-from-Table-A-that-isnt-in-Table-B.aspx

    --Ramesh


  • A staging table is basically a copy of your source table except in SQL Server. So if I am migrating a persons table (last_name, first_name, birth_date, national_id_number) I would create a staging.persons table in SQL Server which I would truncate and re-load each time my package runs. Here is what the package steps would be:

    1. Execute SQL Task - Truncate staging.persons

    2. Data Flow taks - Load staging.persons from source.persons table

    3. Execute SQL Task - Update destination.persons

    Update destination.persons

    Set columns = persons_staging.columns

    From

    destination.persons DP Jon

    staging.persons SP On

    DP.key_columns = SP.key_columns

    4. Execute SQL Task - Insert new rows into destination.persons

    Insert Into destination.persons

    (

    columns

    )

    Select

    SP.columns

    From

    staging.persons SP LEFT JOIN

    destination.persons DP On

    DP.key_columns = SP.key_columns

    Where

    DP.column Is Null

  • There is also the SCD (Slowly Chaning Dimension) control that you can use to achieve this. This control tends to perform poorly (like other solutions) if you need to compare every column in the table looking for changes. There are ways around this, like adding a checksum to your table. But, since you are only processing several lac rows, any solution posted so far should work just fine.

    If you are going to need to perform this type of operation for many tables it would be a good idea to get familar with delta processing or slowly changing dimension processing.

    For what it is worth, I tend to use the Lookup control in conjuction with a checksum to do my loading.

  • SQL Server 2008 Merge syntax could be a better solution in this case.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply