How to delete a record if it is already exists?

  • I have a package where it gets data from stage( raw data ) and loads to destination ( which has all key columns)

    2) The key columns in destination have lookups to reference tables.

    so , my destination have all key columns.

    I need to check the records in destination and delete that record if exits. How do i do that checking each column?

  • The standard way of doing this is to load your data into a staging area and then to do a MERGE (which is potentially less expensive than DELETE/INSERT) into the target table.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Since Stage table is raw data and destination is key columns we cannot merge Stage and destination.

    my package is something like below

    Stage(Column1 , column2)

    |

    |

    Lookup1(get Column1key1)

    |

    |

    Lookup2(get Column2key2)

    |

    |

    Destination ( Column1key1 , Column2key2 , loaddate)

    Can we still use merge and if so , how?

  • komal145 (6/15/2016)


    Since Stage table is raw data and destination is key columns we cannot merge Stage and destination.

    my package is something like below

    Stage(Column1 , column2)

    |

    |

    Lookup1(get Column1key1)

    |

    |

    Lookup2(get Column2key2)

    |

    |

    Destination ( Column1key1 , Column2key2 , loaddate)

    Can we still use merge and if so , how?

    Create a new Destination table, scratch.Tablename (or whatever) which has an almost identical structure to your target table.

    Near the beginning of your ETL process, add a step to truncate this table.

    Replace your Destination above with scratch.Tablename.

    Add an ExecuteSQL task to merge from scratch.Tablename to your target table.

    This, of course, assumes you have a proper PK on which to match.

    A slight refinement of the above, which is probably worthwhile if most of your rows are INSERTs, is to add a lookup in your data flow to check whether a destination row already exists. If not, proceed with a direct insert. If it does, send it to scratch.Tablename and continue as described above.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • komal145 (6/15/2016)


    Since Stage table is raw data and destination is key columns we cannot merge Stage and destination.

    my package is something like below

    Stage(Column1 , column2)

    |

    |

    Lookup1(get Column1key1)

    |

    |

    Lookup2(get Column2key2)

    |

    |

    Destination ( Column1key1 , Column2key2 , loaddate)

    Can we still use merge and if so , how?

    why not?

    merge into dest_table dst

    using (select substring(rd.record, 10,20) as somevalue

    , l1.key1

    , l2.key2

    from staging_raw_data rd

    left outer join lookup1 l1

    on l1.keyfield = substring(rd.record,1,4)

    left outer join lookup2 l2

    on l2.keyfield = substring(rd.record,1,4)

    ) src

    on dst.key1 = src.key1

    when matched

    then update

    set dst.somevalue = src.somevalue

    , dst.key2 = src.key2

    when not matched

    then insert (somevalue, key1, key2)

    values src.somevalue

    , src.key1

    , src.key2

    ;

    if staging_raw_data is not a sql table already, load it into a staging db located on the same instance as the destination table. Avoid using linked servers if possible for this.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply