May 5, 2009 at 7:55 am
I have two large files: one with 100 million records and the other with 7 million. The Merge Join is taking too long to complete. Is there a faster alternative for joining these files?
May 5, 2009 at 8:57 am
I think I found a way to get the job done without the Merge Join taking so much time. I added another data flow that loads the contents of the larger file into a Cache Transform. Once that data flow completes, I execute my original data flow and use a Lookup Transform to pull the data I need from the cache, via the Cache Connection Manager that was populated in the first data flow.
May 5, 2009 at 9:37 am
I would load the data from the files into tables, then use an Execute SQL Task to create some indexes on those tables, and then join them using the appropriate JOIN depending on your requirements. Then run a SQL task to remove the tables.
This should run a lot faster than doing the processing entirely in SSIS.
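Something along these lines, assuming the files have already been bulk-loaded into two staging tables (the table, column, and index names here are just placeholders):

-- Index the join key on both staging tables (loaded by the SSIS data flow or BULK INSERT).
CREATE INDEX IX_StageLarge_Key ON dbo.StageLarge (JoinKey);
CREATE INDEX IX_StageSmall_Key ON dbo.StageSmall (JoinKey);

-- Join the two staged sets; swap in INNER/LEFT JOIN as your requirements dictate.
SELECT  l.JoinKey,
        l.LargeFileColumn,
        s.SmallFileColumn
INTO    dbo.JoinedResult
FROM    dbo.StageLarge AS l
INNER JOIN dbo.StageSmall AS s
        ON s.JoinKey = l.JoinKey;

-- Clean up when the result is no longer needed.
DROP TABLE dbo.StageLarge;
DROP TABLE dbo.StageSmall;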
May 5, 2009 at 10:06 am
Will give it a try...Thanks
May 5, 2009 at 10:06 am
steveb (5/5/2009)
I would load the data from the files into tables, then use an Execute SQL Task to create some indexes on those tables, and then join them using the appropriate JOIN depending on your requirements. Then run a SQL task to remove the tables. This should run a lot faster than doing the processing entirely in SSIS.
That's what I would do too, except for the remove-tables bit. I'd drop the indexes and then truncate, or truncate and then drop (I'm not sure which is faster). Then the tables are empty and ready for the next load, if you are doing this regularly, of course.
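If you reuse the staging tables on each run, the cleanup step from the sketch above might look something like this instead (same placeholder names):

-- Keep the staging tables for the next load: drop the indexes, then empty the tables.
DROP INDEX IX_StageLarge_Key ON dbo.StageLarge;
DROP INDEX IX_StageSmall_Key ON dbo.StageSmall;

TRUNCATE TABLE dbo.StageLarge;
TRUNCATE TABLE dbo.StageSmall;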
Phil
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel