Two 'for each file' loops and one data flow task

  • All,

    I have a data flow task which I want to use in two 'for each file' loops that run in parallel.

    I know I can copy the data flow task from one loop to the other, but then any changes have to be made in each copy. Is it possible to 'share' or 'link' a data flow task between two loops so that any changes to the data flow only need to be made once?

    Thanks


  • Not directly possible, as far as I know.

    But there may be an alternative approach. It is possible to execute multiple instances of the same package, in parallel. With some careful design and by passing the appropriate parameters, you should be able to achieve the parallelism you require.
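    The multiple-instances approach can be driven from the command line with dtexec, launching one process per folder. A minimal sketch in Python follows; the variable name User::SourceFolder and the package path are hypothetical placeholders, so substitute the variable (or parameter) your own package actually exposes:

```python
import subprocess

def run_package(dtexec, package_path, folder):
    """Launch one instance of the package, overriding the folder variable.

    User::SourceFolder is a hypothetical variable name -- replace it with
    the variable or parameter your package really uses.
    """
    return subprocess.Popen([
        dtexec, "/F", package_path,
        "/Set", rf"\Package.Variables[User::SourceFolder].Properties[Value];{folder}",
    ])

def run_in_parallel(dtexec, package_path, folders):
    """Start one instance per folder, then wait for all; returns exit codes."""
    procs = [run_package(dtexec, package_path, f) for f in folders]
    return [p.wait() for p in procs]
```

    Because each folder gets its own process, the instances share nothing at run-time, which sidesteps the variable contention you can hit inside a single package.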

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Thanks for your help.

    For now I have it set up as in the attached (it doesn't look like I can paste an image into a reply?).

    All of the loops are 'for each file'. The first one splits the files into two folders and works fine.

    The other two loops should be processing a folder each. They each have their own variable names.

    The loop on the right (loop A) runs with no problem.

    As soon as the loop at the bottom (loop B) starts it fails with the error:

    at Microsoft.SqlServer.Dts.Runtime.Wrapper.IDTSVariables100.get_Item(Object Index)

    at SC_6458201212534d39b1243c5b61e12023.Variables.get_gmlFileA()

    at SC_6458201212534d39b1243c5b61e12023.ScriptMain.CreateNewOutputRows()

    at SC_6458201212534d39b1243c5b61e12023.UserComponent.PrimeOutput(Int32 Outputs, Int32[] OutputIDs, PipelineBuffer[] Buffers, OutputNameMap OutputMap)

    at Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost.PrimeOutput(Int32 outputs, Int32[] outputIDs, PipelineBuffer[] buffers)

    gmlFileA is the 'for each' loop variable for loop A and is not used in loop B. I've checked a few times, but it's always possible I've missed something. My assumption, from the error message, is that something in loop B is trying to use gmlFileA?

    If I run them in series they run fine, so logically it's something to do with them running in parallel.

    Should this setup work in theory or am I trying to fix something that will never work?

    I can also try setting it up as a package and executing it multiple times. I'm just interested, if possible, to know why this solution doesn't work.

  • Looks to me like it should run just fine and I agree with your assessment of what the error indicates.

    I suggest that you open up the package in Notepad++ and do a search for the problematic variable, just to double-check that it's not being referenced anywhere in the second loop.
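    That search can also be scripted. Since a .DTSX file is plain XML, a case-insensitive scan of its text is enough to list every line that mentions the variable; a sketch, assuming the package file has been read from disk:

```python
def find_variable_refs(dtsx_text, variable_name):
    """Return (line_number, line) pairs that mention the variable name.

    The match is case-insensitive so that expression references such as
    @[User::gmlFileA] are caught regardless of casing.
    """
    needle = variable_name.lower()
    return [
        (n, line.strip())
        for n, line in enumerate(dtsx_text.splitlines(), start=1)
        if needle in line.lower()
    ]
```

    Feed it the raw text of the package (e.g. open(path).read()) and every hit, including references buried inside script-component project XML, is listed with its line number.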


  • Thanks for your help.

    I didn't know about the possibility of reading the project file in Notepad++. I've now checked it and I can't find gmlFileA in loop B.

    I'll keep looking and post if I work out the cause.


  • as_1234 wrote:

    Thanks for your help.

    I didn't know about the possibility of reading the project file in Notepad++. I've now checked it and I can't find gmlFileA in loop B.

    I'll keep looking and post if I work out the cause.

    I suggested checking the package file (.DTSX) in Notepad++, not the project file.


  • Sorry, I used the wrong word. It was the package file I checked. Apologies for the error in my post.

  • I am not sure about the full process you are attempting here - but it looks to me like a situation that might work better as a master package and sub-packages in a single project.

    If you built it that way, project parameters would be shared across the whole project and package variables would be specific to each package.  This could be accomplished by using a single sub-package with the appropriate variables assigned at run-time, or separate sub-packages specific to each flow.

    Personally - I would use separate packages.  Once the basic package is built - it would then be very easy to copy/paste a new package...changing the package as needed for that specific process.  Then in the master package you add another call to the new sub-package.

    I have done many projects this way and development of additional packages is quite easy and takes minimal time.

    And new projects that are similar can easily be created from the existing project using the Integration Services Project Import Wizard, specifying the catalog or an .ispac file as the source.  Once the new project is created, it is just a matter of changing the parameters and variables, and usually making simple changes to the data flow for the new import/export process.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thank you for your help. I now have it set up using a sub-package and it works.

    Just in case anyone else finds this thread looking for details of parallel 'for each' loops: unfortunately, I've not managed to work out why it fails. Sometimes loop A fails complaining about variables from loop B, and sometimes vice versa. I converted it to a sub-package design instead.

  • Well done on getting round it.

    If you have the time and inclination, feel free to post a package here which would allow me to recreate the issue in my own environment.

    Simplify things as much as possible & remove as much reliance on external files and databases as you can. I do not understand why you had this issue & would be interested in seeing it for myself (and, hopefully, in being able to find a resolution).


  • Thanks for your reply. I'll see if I can post something. As I'm fairly new to SSIS, I want to be sure I exclude all usernames, passwords, etc.

  • Understood. There are a couple of things you can do to reassure yourself about that, should you decide to try:

    1. Set the package protection level to DontSaveSensitive. This forces VS not to save the values of passwords or 'Sensitive' parameters in the package.
    2. Open the package in a text editor, such as Notepad++, and check the text for anything which looks sensitive (.DTSX files are XML).
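    Step 2 can likewise be automated with a quick keyword scan of the XML. The pattern list below is only a starting point, not an exhaustive set of sensitive markers, so extend it with anything specific to your environment:

```python
# Common markers of credentials in connection strings and properties;
# this list is illustrative, not exhaustive.
SUSPECT_PATTERNS = ["password", "pwd", "user id", "connectionstring", "secret"]

def scan_for_secrets(dtsx_text):
    """Return (line_number, pattern) pairs for lines that look sensitive."""
    findings = []
    for n, line in enumerate(dtsx_text.splitlines(), start=1):
        lower = line.lower()
        findings.extend((n, p) for p in SUSPECT_PATTERNS if p in lower)
    return findings
```

    An empty result is not proof the file is clean, but any hit is a line worth inspecting by hand before posting the package.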

