Suggestions on data compare using SSIS?

  • Hi,

    Perhaps someone can throw some suggestions my way on how to perform a data compare on very large tables. The tables are in different datawarehouses, in different locations and performance is a big issue. I need to compare 4 rows in one table against 4 rows in another (there is a common key). The data is: key, date, time, value.

    I've looked into binary_checksum, checksum, hashbytes, EXCEPT, MERGE. I'm not 100 sure which would offer better performance, and accurancy seems to be an issue with the checksums.

    Currently, with smaller tables I am using EXCEPT, it's performance is acceptable for small tables in the region of 2 million rows. But when I reach the larger tables, upwards of 900 millions rows, I've not even tried any comparisons for fear of causing a nuclear 'meltdown' ;-).

    Any ideas on how to handle a compare of this magnitude would be great?

    Thanks!

  • Just as an update, I ended up going with EXCEPT using a subset of data each day for the comparison. It's performance is better than any of the other options I tried out!

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply