May 19, 2009 at 10:16 am
Hi
We are having 20 dimension tables and each table will be having around 20 million records.
These tables would be loaded on a daily frequency with 5 files, each of 3 million records.
We are currently using SCD transformation for TYPE2 load of data.( to maintain history in the dimension table.)
But SCD is taking a long time to insert the data and below are the statistics that I recorded when I executed the package with sample files:
Run1:File1(0.5 million records) —2 minutes (Dimension Table is empty)
Run2:File2(0.5 million records) —13 minutes (Table has 589,000 records)
Run3:File3(0.5 million records) —26 minutes (Table has 1,140,000 records)
Run4:File4(0.5 million records) —37 minutes (Table has 1,680,000 records)
Run5:File5(1 million records) —51 minutes (Table has 2,780,000 records)
Package elapsed time : 2 hr 9 min
1. How do i improve the performance of the SCD? If not, is there any way of loading a table parallely from file so that i can achive performance?
2. In informatica, we have a partitioning feature to load the data parallely which greatly improves performance. Is there any equivalent feature or workaround in SSIS?
Any help would be greatly appreciated.
Thanks,
May 19, 2009 at 5:36 pm
dasari_stupid (5/19/2009)
HiWe are having 20 dimension tables and each table will be having around 20 million records.
These tables would be loaded on a daily frequency with 5 files, each of 3 million records.
We are currently using SCD transformation for TYPE2 load of data.( to maintain history in the dimension table.)
But SCD is taking a long time to insert the data and below are the statistics that I recorded when I executed the package with sample files:
Run1:File1(0.5 million records) —2 minutes (Dimension Table is empty)
Run2:File2(0.5 million records) —13 minutes (Table has 589,000 records)
Run3:File3(0.5 million records) —26 minutes (Table has 1,140,000 records)
Run4:File4(0.5 million records) —37 minutes (Table has 1,680,000 records)
Run5:File5(1 million records) —51 minutes (Table has 2,780,000 records)
Package elapsed time : 2 hr 9 min
1. How do i improve the performance of the SCD? If not, is there any way of loading a table parallely from file so that i can achive performance?
2. In informatica, we have a partitioning feature to load the data parallely which greatly improves performance. Is there any equivalent feature or workaround in SSIS?
Any help would be greatly appreciated.
Thanks,
Have you seen Todd McDermid's SCD Component?
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply