What is most efficient (Task & Connections)

  • I was told once that it is better to have multiple tasks using a single DB connection

    vs.

    Having a connection per task...

    I haven't found too much on the subject (not sure if I am looking in the right places)...

    Is there a general rule of thumb which is the standard way of doing it or one being more efficient that the other?

  • Lee,

    Tasks using the same connections are run serially because only one task at a time can use a connection.  If you want to have tasks run in parallel you must have multiple connections with different names. 

    Whether the efficiency is noticeable or not depends on the complexity and number of the tasks and the amount of data that's being moved.  I've found that importing or exporting small amounts of data from multiple tables is fast even if they all run in sequence.  I tend to not have more than four tasks use a single connection.

    Greg

    Greg

  • OK that makes sense...

    The reason I brought this topic up is I have to move 6 tables from one database to another on separate servers daily. The current process had a connection from the parent server to the child server for each of the tables.

    I was curious if it was a good idea to condense that down to one connection for each server and 6 tasks between those two databases.

    The reason being, that in the package properties I saw that you can set how the maximum number of tasks to run in parallel when the packages executes which lead me to believe that if I were to set it to 6 all the tasks would execute at the same time.

    Am I wrong?

  • No, I don't think you are wrong Lee, I think that there might be more variables involved however.

    I have a package with 1 single SQL Source.  I have 20 destination objects and 20 transformations.  When I execute the entire package, it fires up multiple threads of execution and keeps at least 4 or 5 of them running concurrently.  This being said, I seems almost like each one is running sequentially.  Maybe there is a lock being placed on the db which prevents them from all gaining access to the db at the same time...I don't know.  I've never bothered to put a stop-watch on it to see if there is any speed improvement resulting from this either.

  • I kind of noticed the same thing as you described, even though I have 6 tasks going to the same connection, and the option to set the number of connections to run at the same time to 6, the do not all execute at the same time.

     

    I found some other literature that basically supports that, if you want to run stuff (tasks) in parallel you need to have multiple connections set up.

     

    Kind of defeats the purpose of the package property (setting for how many tasks to run at the same time) for one and the whole design concept for two, having to have multiple connections all over your DTS package just to get the tasks to run seperately but in parallel...but that is my opinion only, I am sure there is a good reason for it being set up that way.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply