August 24, 2012 at 2:07 am
Hi Friends,
I am developing a project in SSIS 2008, SQL 2008 server....i deal with 4 to 6 million records.....i have a table which is used only in the Look Up task. it has 5 million records...So when ever my data flow is executing some cache process is going on which takes 3 to 4 minutes to get the total count of records...3 to 4 minutes is not a problem, but i have 15 data flows, in where this table is frequently used. so everytime for each data flow, this table is doing some cache process...(I have no idea about what the cache process is) ....how to avoid this cache process for each data flows...
I mean, is there any other way, that we can use this table in look up, with caching the records for one time and using it any where ....?
If i am not clear with my question, please let me know...
I need your suggestions...
Thanks,
Charmer
August 24, 2012 at 6:39 am
Phil Parkin (8/24/2012)
Consider changing the cache mode. See here.
i read the article Phil,
Could you please give me your idea for my condition?
My situation is , Look up or reference table has 5 million rows and no indexes and i am not going to upsert this reference table at any time...
So which cache would be good?
If i understood correctly from the article, partial cache is good to my situation. is that correct?
Thanks,
Charmer
August 24, 2012 at 6:56 am
Charmer (8/24/2012)
Phil Parkin (8/24/2012)
Consider changing the cache mode. See here.i read the article Phil,
Could you please give me your idea for my condition?
My situation is , Look up or reference table has 5 million rows and no indexes and i am not going to upsert this reference table at any time...
So which cache would be good?
If i understood correctly from the article, partial cache is good to my situation. is that correct?
It's hard to tell without knowing your data.
Partial cache is good if you know that certain rows are going to be looked up many times. But if the lookup is completely random and spread across all the rows, full cache mode is probably the best (though you should keep an eye on RAM while it is doing the caching - it's very easy to run out of memory with this component).
One tip is to load only the columns that you need into the cache, rather than the whole table, via a query - that will potentially speed things up.
I would try executing the package in all three modes and comparing run times.
But before you do that, put an index on the lookup table's lookup column!!
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
August 24, 2012 at 7:27 am
It's hard to tell without knowing your data.
Partial cache is good if you know that certain rows are going to be looked up many times. But if the lookup is completely random and spread across all the rows, full cache mode is probably the best (though you should keep an eye on RAM while it is doing the caching - it's very easy to run out of memory with this component).
out of 5 million records, almost 4 million records are going to be looked up....in that case full cache is best correct?
One tip is to load only the columns that you need into the cache, rather than the whole table, via a query - that will potentially speed things up.
yes, i choose only those columns in the look up...
I would try executing the package in all three modes and comparing run times.
But before you do that, put an index on the lookup table's lookup column!!
ok, Phill... i will create index for the respective columns...
Thanks,
Charmer
August 24, 2012 at 7:34 am
If you're using full cache, I am not certain that the index will help (it's loading all the columns into memory anyway) - but worth a test.
The index would, however, help a lot for partial or no cache - and that is what my comment was aimed at.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
August 24, 2012 at 8:25 am
Phil Parkin (8/24/2012)
If you're using full cache, I am not certain that the index will help (it's loading all the columns into memory anyway) - but worth a test.The index would, however, help a lot for partial or no cache - and that is what my comment was aimed at.
ok better, i would try to test with each cache..and i will let you know Phil...
Thanks,
Charmer
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply