May 8, 2015 at 11:31 am
Hi Everyone,
What is the best way to remove duplicates from a flat file before loading it into a database table?
Currently I am using the Aggregate transformation. Is there any other transformation for removing duplicates?
Kindly suggest.
Thanks,
Kannan
May 8, 2015 at 11:36 am
One option is to import the file into a staging table, then append only the unique values to the destination table...
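For illustration only, here is a minimal C# sketch of that staging-table pattern, assuming a simple comma-delimited Name,Email file and hypothetical dbo.StagingContacts / dbo.Contacts tables (in a real SSIS package the bulk load would normally be a Flat File Source into the staging table rather than hand-rolled code):

using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class StagingLoad
{
    static void Main()
    {
        // Shape of the flat file rows; column names are assumptions for the example.
        var rows = new DataTable();
        rows.Columns.Add("Name", typeof(string));
        rows.Columns.Add("Email", typeof(string));

        // Read every line of the flat file, duplicates and all.
        foreach (var line in File.ReadLines(@"C:\data\contacts.csv"))
        {
            var parts = line.Split(',');
            rows.Rows.Add(parts[0], parts[1]);
        }

        using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=true"))
        {
            conn.Open();

            // Bulk-load the raw rows into the staging table.
            using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.StagingContacts" })
            {
                bulk.WriteToServer(rows);
            }

            // Append only the distinct rows to the destination table.
            new SqlCommand(
                "INSERT INTO dbo.Contacts (Name, Email) " +
                "SELECT DISTINCT Name, Email FROM dbo.StagingContacts;",
                conn).ExecuteNonQuery();
        }
    }
}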
May 8, 2015 at 1:31 pm
That is fine, but how do I remove the duplicates in the flat file itself?
May 8, 2015 at 2:20 pm
Removing duplicates will mean rewriting the file, so whether you put it in SQL Server and sort + group, or do it in memory via aggregates or script tasks, it still needs to be written to disk again.
I'd probably lean towards a script task to pull it into a DataTable, where you can use LINQ to sort and group by, and output the file again.
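Just to illustrate, a minimal sketch of that idea (paths here are hypothetical; in an actual script task they would come from package variables, and two lines only count as duplicates if they match exactly):

using System.IO;
using System.Linq;

class DedupeFlatFile
{
    static void Main()
    {
        string inputPath  = @"C:\data\contacts.csv";
        string outputPath = @"C:\data\contacts_dedup.csv";

        // Keep the first occurrence of each distinct line, then rewrite the
        // file; as noted above, the result still has to hit disk again.
        var uniqueLines = File.ReadLines(inputPath)
                              .Distinct()
                              .ToList();

        File.WriteAllLines(outputPath, uniqueLines);
    }
}

If duplicates are defined by a key column rather than the whole line, a GroupBy on that column with .Select(g => g.First()) would do the job instead of Distinct().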
Lowell
May 8, 2015 at 6:49 pm
Kannan Vignesh (5/8/2015)
That is fine, but how do I remove the duplicates in the flat file itself?
The absolute best way is to have the people who create the flat file do their jobs better and prevent the duplicates from being written to the flat file in the first place.
Heh... well, you did ask for the BEST way. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.