for each loop container to loop through excel files

  • Hi, I have some excel files. The number of files are not constant. They can increase or decrease. I m using for each loop to store data in excel files to one sql table. This is taking around one and half day to load all files. I want to reduce the time. Is there any way to do this? Please share your thoughts.

    I was going through the Microsoft article where they have transferred 1 tb of data in less than 30 min. But this is not useful for me in my case( I think).

    Thanks

    Rajneesh

  • The 1TB in 30 minutes example was leveraging parallelism within SSIS. If you're using a foreach loop then you're making SSIS single-threaded. Since you have a varying number of files though a foreach loop is the natural choice to begin with.

    From here, I would look to split up your workload into multiple foreach loops, each with a different mutually exclusive file masks.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • With no specifics, it's going to be awfully hard to give you anything meaningful. The only insight I could give you is to identify what part of the loading process is taking all of the time.

    Is there more than one step involved? As in - are you loading AND scrubbing the data?

    Are you enhancing the data in any way or just a straight import?

    Could you benefit from running TWO instances of the same process in parallel? What about 3?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • How many files are we talking about here?

    How many rows? How many megabytes?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • opc.three (4/10/2012)


    The 1TB in 30 minutes example was leveraging parallelism within SSIS. If you're using a foreach loop then you're making SSIS single-threaded. Since you have a varying number of files though a foreach loop is the natural choice to begin with.

    From here, I would look to split up your workload into multiple foreach loops, each with a different mutually exclusive file masks.

    As a matter of fact, such a design would introduce parallelism

    Raunak J

  • Hi All,

    A file can be of minimum 30000 KB in size. So let say there are around 400 files, ssis is taking around one and half day to load the data in the sql table using for each loop.

    Thanks

    Rajneesh

  • rajn.knit07 (4/11/2012)


    Hi All,

    A file can be of minimum 30000 KB in size. So let say there are around 400 files, ssis is taking around one and half day to load the data in the sql table using for each loop.

    That's a lot of data. In Excel.

    Try to exploit parallellism as mentioned above.

    Only take columns/rows you actually need.

    Enlarge the network package size and enlarge the dataflow buffers.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Give more info in terms of:

    Kind of hardware, number of rows, full load or incremental load, and lastly a snapshot of the package design.

    Raunak J

  • Raunak Jhawar (4/11/2012)


    opc.three (4/10/2012)


    The 1TB in 30 minutes example was leveraging parallelism within SSIS. If you're using a foreach loop then you're making SSIS single-threaded. Since you have a varying number of files though a foreach loop is the natural choice to begin with.

    From here, I would look to split up your workload into multiple foreach loops, each with a different mutually exclusive file masks.

    As a matter of fact, such a design would introduce parallelism

    Precisely. That's the point. Introducing multiple foreach loops introduces parallelism, which is how we can achieve maximum throughput.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply