April 10, 2012 at 6:42 am
Hi, I have some excel files. The number of files are not constant. They can increase or decrease. I m using for each loop to store data in excel files to one sql table. This is taking around one and half day to load all files. I want to reduce the time. Is there any way to do this? Please share your thoughts.
I was going through the Microsoft article where they have transferred 1 tb of data in less than 30 min. But this is not useful for me in my case( I think).
Thanks
Rajneesh
April 10, 2012 at 9:29 am
The 1TB in 30 minutes example was leveraging parallelism within SSIS. If you're using a foreach loop then you're making SSIS single-threaded. Since you have a varying number of files though a foreach loop is the natural choice to begin with.
From here, I would look to split up your workload into multiple foreach loops, each with a different mutually exclusive file masks.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
April 10, 2012 at 9:32 am
With no specifics, it's going to be awfully hard to give you anything meaningful. The only insight I could give you is to identify what part of the loading process is taking all of the time.
Is there more than one step involved? As in - are you loading AND scrubbing the data?
Are you enhancing the data in any way or just a straight import?
Could you benefit from running TWO instances of the same process in parallel? What about 3?
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 10, 2012 at 11:57 pm
How many files are we talking about here?
How many rows? How many megabytes?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 11, 2012 at 1:08 am
opc.three (4/10/2012)
The 1TB in 30 minutes example was leveraging parallelism within SSIS. If you're using a foreach loop then you're making SSIS single-threaded. Since you have a varying number of files though a foreach loop is the natural choice to begin with.From here, I would look to split up your workload into multiple foreach loops, each with a different mutually exclusive file masks.
As a matter of fact, such a design would introduce parallelism
Raunak J
April 11, 2012 at 2:31 am
Hi All,
A file can be of minimum 30000 KB in size. So let say there are around 400 files, ssis is taking around one and half day to load the data in the sql table using for each loop.
Thanks
Rajneesh
April 11, 2012 at 2:39 am
rajn.knit07 (4/11/2012)
Hi All,A file can be of minimum 30000 KB in size. So let say there are around 400 files, ssis is taking around one and half day to load the data in the sql table using for each loop.
That's a lot of data. In Excel.
Try to exploit parallellism as mentioned above.
Only take columns/rows you actually need.
Enlarge the network package size and enlarge the dataflow buffers.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 11, 2012 at 2:39 am
Give more info in terms of:
Kind of hardware, number of rows, full load or incremental load, and lastly a snapshot of the package design.
Raunak J
April 11, 2012 at 8:26 am
Raunak Jhawar (4/11/2012)
opc.three (4/10/2012)
The 1TB in 30 minutes example was leveraging parallelism within SSIS. If you're using a foreach loop then you're making SSIS single-threaded. Since you have a varying number of files though a foreach loop is the natural choice to begin with.From here, I would look to split up your workload into multiple foreach loops, each with a different mutually exclusive file masks.
As a matter of fact, such a design would introduce parallelism
Precisely. That's the point. Introducing multiple foreach loops introduces parallelism, which is how we can achieve maximum throughput.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply