November 23, 2007 at 11:38 am
I could use some advice on possible approaches for a scenario, that loads a text file extract into a large staging table, then would update one or more target tables within the same database. Extract will contain changed records for existing customers and new records for new customers. This would be an OLAP or decision support, reporting only db.
I need to replace any existing rows with corresponding rows from the staging table (using some type of key like customer number,etc), but only if any of the column values have changed as compared to whats in the table row. Also, add any new rows that exist in the staging table, but not in the target table(s).
Thanks for any info and/or examples.
November 23, 2007 at 11:43 am
Do you have any triggers on the target tables? It's important...
--Jeff Moden
Change is inevitable... Change for the better is not.
November 23, 2007 at 12:13 pm
Hi Jeff,
This is a largely 'from scratch effort', so currently no triggers in place.
November 23, 2007 at 5:14 pm
Randy,
There's just a huge number of options because you don't have any triggers. For example, IF you're staging table contained ALL the customers (including ones that are no longer your customers but once were), even if nothing in the row changed AND you had a single target table, it would be much faster to do what I call a "Snap Rename". That's where you temporarily drop or disable FK constraints, rename the current target table to "old", rename the staging table to "current", and reinstantiate the FK's (if you have any). Whole thing takes about 65 milliseconds depending on the number of FK's. Of course, that means it would be a "static" table so far as your apps went... any changes you make to the table would be lost.
Another method is, don't try to search and update changed rows... takes too long to compare column by column. Instead, add a CHECKSUM to both tables... if the CHECKSUM is different, delete the old row and insert the replacement. Then just insert new rows using either an outer join comparison of the PK's or a WHERE NOT IN (just a tiny bit faster than the exclusive outer join). Again, disadvantage is that you cannot make any mods to the data on the target table... changes would be lost in the future.
Shifting gears a bit and also a bit out of my league because I haven't played with it yet... SSIS. Rumor has it that it's pretty good for things like this. Dunno for sure and dunno what the performance and/or blocking impacts are.
Last, but not least, is the good old fashioned, hard-coded "upsert" or "merge"... my favorite way is to have an extra column on the staging table where I compare the rows/columns and I mark this "Action" column with "D", "U", or "I" (hmmm... Freudian slip there, I guess... 😛 ) for "Delete", "Update", or "Insert". Then, to absolutely prevent any blocking of the process and to ensure the quickest "git er' done", I lock the target table and do all the Deletes at once and unlock the table. Then, I'll wait 5 or 10 seconds to allow other folks in'n'out, lock the target table, quickly do the inserts, and unlock the table. Same thing, one more time, for the Updates.
Actually... I never delete... I archive... but same idea.
Hope that's what you were looking for.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 25, 2007 at 8:55 pm
Checksum has a possibility of producing the same result from different inputs ....
http://msdn2.microsoft.com/en-US/library/ms189788.aspx
MSDN recommends using HashBytes (although still not guaranteed), but to be entirely accurate you would need to add some other processing. Checksum is a nice fast way of getting >99% of the differences though - when I ran into this problem I ended up doing a rowcount as an additional verification (which was sufficient in that case).
November 26, 2007 at 7:38 am
Jeff,
Thanks for the detailed reply..this is a big help!!
Randy
November 26, 2007 at 9:18 am
Sure thing and thanks for the feedback, Randy... if you have any other questions, please don't hesitate to ask.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply