Most efficient way to insert new and update changed.

  • I have an ssis job that does a comparison of a staging table and a production table. It inserts any new rows into the production table and updates any rows that have been changed since the last run. Right now I am using Fingerprint Assignment in ssis to identify the new and missing. However, we are not happy with the performance of this. These tables can contain up to several million rows in some cases. Is there a more efficient way in TS-SQL or ssis to accomplish this?

  • You could use CROSS APPLY with HASHBYTES to do a comparison of the rows. As long as you have some sort of unique id that would help you identify where to make the update. We do something like this with our data warehouse in which we track changes to an employee's record. Their employee id does not change, so we can use that.

    Basically you use CROSS APPLY to generate an XML string of the data, and then convert it to a HASHBYTE for comparison.

    SELECT

    employee_id,

    fname,

    lname,

    ....,

    HASHBYTES('SHA1',h.x) as hashValue

    FROM

    stgTable

    CROSS APPLY(

    SELECT

    fname,

    lname,

    ...

    FOR XML RAW) as h(x)

    This will give you a varbinary representation of the data. Do the same for your production table, and then compare the 2 joining on employee_id and checking where stgTable.hashValue != prodTable.hashValue.

    Those instances where the hashValues are not the same are the records you need to update, and if you don't find the employee_id in prodTable, then those are the ones you insert.

    This may be an overly simple example, but you should be able to apply it to your situation. We use this method to manage changes of over 50,000 active employees daily. The comparison, update and insert is complete in less than 60 seconds.

  • andy-973092 (5/6/2011)


    Right now I am using Fingerprint Assignment in ssis to identify the new and missing. However, we are not happy with the performance of this.

    Is this a third party tool or a technique I'm just not familiar with the name of?


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Is your staging table flushed and reloaded every time with only changed and new rows?

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply