How to dedupe rows in source before sending to destination in SSIS process

  • I have data came from different sources. If I have a situation that duplicate key rows coming from source, how can I handle it through SSIS?.I wanted to add one row in my destination table and send other in the output(bad data) table at the same time I wanted to log the type of error ( duplicate key error) in a different table.

    I wanted to dedupe the source before sending it to the destinatin.

    Please give me some suggestion. Any help will be appreciated.

    You all have a wonderful weekend.

  • Hi Sabina,

    One of the coolest free features that I came across was actually a removeduplicates transformation that ships with the Samples of SSIS.

    You will need to read up on this, so in your BOL, type the following "Remove Duplicates Component Sample". Its really not that difficult to add this to your SSIS assembly.

    The only requirement that you have is that you need to define the key row.

    Also note that it will output a single row to your unique values, and all other rows to your duplicates row. It all depends on what you want to do from here (In my case, all duplicates are suspect. Took a very long time to get the source owners to agree that in such a case, not one of the 2 rows are okey to insert into the environment)

    Trust this helps

    ~PD

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply