October 12, 2010 at 3:21 am
Hello All,
I need to move data from a source SQL table to a destination SQL table. Since the source has around half a million records, I have set the "Fast Load" option on the OLE DB destination task.
One constraint is that the destination table has a unique constraint defined on one of its columns, so an error is raised whenever a duplicate value arrives in that column.
I am handling these error records by catching them in an error table (the red error output from the OLE DB destination task is linked to another OLE DB destination task pointing to the 'Error table').
The problem I am facing with the above setup is: whenever an error record is encountered, instead of moving that single record to the error table, the whole batch (around 6000 records) gets redirected to the error table. :crazy:
I believe this is a problem with the Fast Load option. Is there a workaround to reject only the duplicate records? How can this issue be addressed?
Any suggestions are of great help.
Thanks in advance,
Chetan:rolleyes:
October 12, 2010 at 3:34 am
You can clean up your source first before you try to load your destination table. (if possible)
Are those real duplicates (meaning every column is exactly the same), or are just the business keys duplicated?
For real duplicates, just write a DELETE query to remove them (you can find sample queries all over the web).
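Something along these lines, with dbo.SourceTable and the column names as placeholders for your own schema (just a sketch, not tested against your data):

;WITH numbered AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Col1, Col2, Col3
                              ORDER BY (SELECT NULL)) AS rn
    FROM dbo.SourceTable
)
DELETE FROM numbered
WHERE rn > 1;   -- keeps one copy of each fully identical row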
For BK duplicates, you can use the ROW_NUMBER function combined with a PARTITION BY clause.
This gives every row a row number based on the partitions. Map the partitions to your business key (meaning, if a BK occurs only once, its partition has a single record with a row number of 1; if a BK occurs 5 times, its partition has 5 records with row numbers from 1 to 5). Then select only the records with a row number of 1.
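A rough sketch of that pattern, assuming a business key column called BusinessKey and dbo.SourceTable as the source (all names are placeholders); you could use a query like this as the source of your data flow:

SELECT Col1, Col2, BusinessKey
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY s.BusinessKey
                              ORDER BY s.BusinessKey) AS rn
    FROM dbo.SourceTable AS s
) AS x
WHERE x.rn = 1;   -- one row per business key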
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
October 12, 2010 at 9:25 am
I have BK duplicates. Will try implementing it. Thanks:-)
October 12, 2010 at 9:33 am
That is why we have surrogate keys...:-)
Raunak J
October 12, 2010 at 9:45 am
One thing is, if I don't set the Fast Load option, then the package rejects exactly the duplicate records ;-)
But it takes ten times longer to execute than with the Fast Load option. I am looking to see whether there are any alternative options/properties in the SSIS tasks.
If not, I will have to go with the options you have stated.
October 12, 2010 at 9:52 am
chetan.v (10/12/2010)
.... I believe this is a problem with the Fast Load option.
No, SSIS is trying to commit a batch of records at a time. If any row fails, the whole batch fails. This is caused by the database engine, not SSIS.
Is there a workaround to reject only the duplicate records? How can this issue be addressed?
Setting the Maximum Insert Commit Size to 1 should do what you're asking for. It will slow things down a bit, but since it commits one row at a time, only the duplicate rows should error out.
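If you want to see the engine-side behaviour for yourself, here is a throwaway illustration (table name and values are made up): a single multi-row INSERT is atomic, while separate single-row statements, which is roughly what a commit size of 1 approximates, only fail on the duplicate.

CREATE TABLE #Dest (Id INT NOT NULL UNIQUE);
INSERT INTO #Dest (Id) VALUES (1);

-- one statement, three rows: the duplicate makes the whole statement fail,
-- so neither 2 nor 3 is inserted
INSERT INTO #Dest (Id) VALUES (2), (1), (3);

-- one row at a time: only the duplicate errors out, 2 and 3 go in
INSERT INTO #Dest (Id) VALUES (2);
INSERT INTO #Dest (Id) VALUES (1);   -- unique constraint violation
INSERT INTO #Dest (Id) VALUES (3);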
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help