March 23, 2012 at 10:45 am
In SSIS, I need to remove duplicate records coming from a flat file. What I am doing now is sending the flat file output to a Sort transformation and sorting the records ascending by license id. But I suspect that this will affect performance. So I am planning another approach: loading the flat file output into a temp table and then using a SELECT DISTINCT query to remove the duplicates.
Could anyone suggest which method will perform better?
March 23, 2012 at 11:28 am
It depends™. :-P
On one hand, the SSIS Sort is a blocking transformation: it has to read every input row before it can pass any rows downstream, so it will stall your data flow.
On the other hand you'll have the cost of both writing and reading from the staging table plus the sort in SQL.
I think you'll just have to run some tests. My guess would be that for smaller data sets the SSIS transform will be faster, and for very large data sets (millions of rows or more) it will be faster to load to a staging table. The SSIS transform will certainly be simpler, if that's a consideration.
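For anyone who wants to see the staging table route in miniature, here is a rough sketch. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, and the table name `staging_license`, the column names, and the sample rows are all made up for illustration. Note that SELECT DISTINCT only removes rows where every selected column matches, so it is not quite the same as removing rows that merely share a license id.

```python
import sqlite3

# Hypothetical rows parsed from the flat file: (license_id, holder_name).
rows = [
    (101, "Acme"),
    (102, "Globex"),
    (101, "Acme"),      # exact duplicate of the first row
    (103, "Initech"),
    (102, "Globex"),    # exact duplicate of the second row
]


def dedupe_via_staging(rows):
    """Load rows into a staging table, then read them back with SELECT DISTINCT."""
    # In-memory SQLite database, standing in for the real SQL Server database.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE staging_license (license_id INTEGER, holder_name TEXT)"
        )
        # Cost 1: write every incoming row to the staging table.
        conn.executemany("INSERT INTO staging_license VALUES (?, ?)", rows)
        # Cost 2: read the rows back, letting the database remove duplicates.
        cur = conn.execute(
            "SELECT DISTINCT license_id, holder_name "
            "FROM staging_license ORDER BY license_id"
        )
        return cur.fetchall()
    finally:
        conn.close()


unique_rows = dedupe_via_staging(rows)
print(unique_rows)
```

In a real package the insert would be the data flow destination and the SELECT DISTINCT would feed the next step, so timing both halves against your actual row counts is what tells you which approach wins.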