How prevent to load the dublicate row in OLEBD destination

  • Hi All,

    i load the data from Staging database to Poststaging but when run package two time at that time i got same rows two time or if i run package three time i got same data three time in poststaging table.

    i took OLEDB Souce which extract the data from staging

    i took OLEDB destination whichload the data to Poststaging

    my question is how i prevent to load dublicate row in to OLEDB destination.

    in my many source column contain NULL value and there is no any row which uniquly identify.

    my question is i wan to prevent to the same rows in destination.

    i dont in this scenario i have to create the store procedure or i have to use the look transpomation.

    if i have create the store procedure then where i have to create the SP.

    My source table is like

    state_code State_Name State_Code State_code_Alpha Boat_Rat_Territory

    NULL IL 098 908 0978

    786 NJ 909 889 0989

    878 MA 090 097 0989

    in this scenario i can not create the primary key in destination table.

    what is other option.

    if i have to create the SP then can you please help me to create the SP

    I really appreciate if i get any Help from you

    Thank you

    Please help me in this scenario.

  • nikishh9 (5/19/2011)


    Hi All,

    i load the data from Staging database to Poststaging but when run package two time at that time i got same rows two time or if i run package three time i got same data three time in poststaging table.

    i took OLEDB Souce which extract the data from staging

    i took OLEDB destination whichload the data to Poststaging

    my question is how i prevent to load dublicate row in to OLEDB destination.

    in my many source column contain NULL value and there is no any row which uniquly identify.

    my question is i wan to prevent to the same rows in destination.

    i dont in this scenario i have to create the store procedure or i have to use the look transpomation.

    if i have create the store procedure then where i have to create the SP.

    My source table is like

    state_code State_Name State_Code State_code_Alpha Boat_Rat_Territory

    NULL IL 098 908 0978

    786 NJ 909 889 0989

    878 MA 090 097 0989

    in this scenario i can not create the primary key in destination table.

    what is other option.

    if i have to create the SP then can you please help me to create the SP

    I really appreciate if i get any Help from you

    Thank you

    Please help me in this scenario.

    If there is nothing which uniquely identifies your rows, how do you define a duplicate?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • ya thats true.

    but when i run the package it appending the rows.

    how i avoid this one

    same rows appending

    so i dont know i have to create the SP or identity column

  • What do you want to happen?

    1) Do you want to delete all the rows and re-load them? If yes, just truncate the destination table before doing the load.

    2) Or not load them because they are already there? If you want this, you have to find a way of matching between your source data and destination data. But without being able to uniquely identify rows, it is impossible to do a 1:1 match.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • ya thats true.

    in my requirement they dont want to truncate the destination table.

    there is no any uniquely identify the rows.

    so i dont thinng i can do it.

    is there any way i can compare the destination data with source data so if the data is already loaded then it will not load again.

  • Well, I am also working on a scenario very similar to yours where we dont have any primary key defined for our datawarehouse table but we have to incremental load the table for any changes. What I was doing is to apply order by clause to include all column name then compare each row (column wise) with your destination table.

    Though this approach is not good performance wise, but you dont have any other way to do the same without any primary key.

    __________________________________________
    ---------------------------------------------------
    Save our mother Earth. Go Green !!!

  • Anjan Wahwar (5/20/2011)


    --

    Though this approach is not good performance wise, but you dont have any other way to do the same without any primary key.

    One possibility I thought of was to use a binary checksum across all the columns and then match on that. Not sure what the possibility of incorrect matches would be.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Thanks Anjan. May it will work.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply