Conditional Split

  • I have job in SSIS where in it INSERTS Employees into EMp table. Its a recurring job every night.

    Is it possible to use conditional split task in SSIS to check for existing records and only insert only new records and also update old records if there are any changes in old records

  • Hi Mike,

    Yes, it is possible to design a package which will satisfy your requirements. You will have to use a dataflow task,in that use a look up componennt to compare the incoming rows with the ones existing in the table. The lookup will send the rows that are not in the target table down the error pipe. Connect the error pipe to the destination. The rows will not be in error will be sent to the regular pipe. Connect this pipe to an OLEDB command task , where you can write an UPDATE on the target.

    Also , what you can do is.. use a OLEDB source in the dataflow task, fetch the rows by doing a query with LEFT OUTER JOIN between source table and target table ,include a filter condition to allow only those rows which are not present in the target(by WHERE target table.col is null) and insert these rows in the target table.

    For update , fetch rows using an INNER join between the source and the target table. Hope that helps

  • Trying to compare each value in every record to that currently in the table is agonizing for the cpu and extremely low performing. Instead try to identify something in the source that identifies records that are new or have changed such as a 'Last Modified' date. Other approaches can be used such as reading from an Audit table in the source system which identifies changed records (if such a table exists) or a modified indicator in the table.

    If none of these options are available in the source system and cannot be imposed I would suggest attempting to persuade the owners of the source system to allow you to add at least one column to the table. This new column will have a default of NULL and when the nightly processing occurs be updated to the date/time that the record was imported. From there on your nightly process can easily identify the rows which are new or have been modified simply by checking for NULL values in the new column.

    The latter option here is not a bad solution but one would hope that the source system already has some method of tracking when records are changed or new...

    Hope this helps!


    Cheers,

    Ben Sullins
    bensullins.com
    Beer is my primary key...

  • what component i need to use to look up in destination table.

  • I agree with Ben here, you should probably try to add columns that will track the changes in that row.

    You can find the look up component in a Data flow task, in a data flow task , in the tools menu , under transformations.

  • Trying to compare each value in every record to that currently in the table is agonizing for the cpu and extremely low performing.

    While I agree there can be something to this, for processes that run in the middle of the night, does it really matter that something has top notch performance? I use the slowly changing dimension using the LEFT JOIN technique for the inserts, but letting the updates be done through the OLE DB Command so that only records that need updating get updated. I can't just use a last modified because the records that get rolled up to one occur across multiple tables, and I want the audit trigger in the table I'm inserting into to trip only if necessary.

    I don't recommend that as a solution in a high performance environment, but it doesn't sound like this is. I once turned an import from a VB program into a DTS package because the VB program wasn't sufficiently reliable. In the process I reduced the time needed from 4 minutes to just 30 seconds. I tempered my enthusiasm, however, when I realized that at 4 in the morning, it didn't matter in and of itself if that data was available at 4:01 or 4:04.

  • Mike: Take a look as this article from Any Warren. It's a fairly decent step-by-step to a method for how to do exeactly this:

    Anatomy of an Incremental Load[/url]

    And yes - it's designed with large data sets in mind.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply