The Assumption
It's a good question to ask. Just about any question is, because they all come with answers - sometimes more than one. On its face, it seems reasonable to think that manually arranging the data flow in a parallel manner will allow Integration Services to process the lookups at the same time. Especially because (as the package developer) you know that the lookups aren't going to touch the same columns - and Integration Services can't be expected to know that, can it? Once we're done doing the parallel lookups, we'll just stitch the results back together, easy-peasy, right?
Implementing Manual Parallelism
So what does that look like? Something like this to start...
And then you have to merge the results...
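To see the idea without the designer screenshots, here's a rough Python analogue of the pattern - purely illustrative, with made-up column names and reference data, and none of it SSIS code: copy the rows down two paths, run each lookup concurrently against its own reference set, then stitch the paths back together on the key.

```python
# A rough Python analogue of the "manual parallelism" package above - not SSIS,
# just the shape of the idea. All names and reference data here are made up.
from concurrent.futures import ThreadPoolExecutor

rows = [{"OrderID": i, "CustomerKey": i % 3, "ProductKey": i % 5} for i in range(10)]

customer_ref = {0: "Contoso", 1: "Fabrikam", 2: "Adventure Works"}   # reference data for lookup #1
product_ref = {k: "Product %d" % k for k in range(5)}                # reference data for lookup #2

def lookup_customers(batch):
    # Path 1: adds only customer columns
    return {r["OrderID"]: {"CustomerName": customer_ref[r["CustomerKey"]]} for r in batch}

def lookup_products(batch):
    # Path 2: adds only product columns
    return {r["OrderID"]: {"ProductName": product_ref[r["ProductKey"]]} for r in batch}

with ThreadPoolExecutor(max_workers=2) as pool:
    # Multicast: both lookups get the rows and run "at the same time"
    customers_future = pool.submit(lookup_customers, rows)
    products_future = pool.submit(lookup_products, rows)
    customers, products = customers_future.result(), products_future.result()

# Merge Join: stitch the two paths back together on the key
merged = [dict(row, **customers[row["OrderID"]], **products[row["OrderID"]]) for row in rows]
```

That last line is the part to notice - the merge is an extra step that only exists because we split the flow in the first place.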
Oh - I caught that. I see you questioning our premise! Merge Joins aren't very efficient, are they? Will the gains from manually parallelizing the lookups be washed out by the performance penalty associated with merging the datasets back together? Does it depend on how many lookups are being done? Or how big the rowsets are?
How Do You Spell "Ass u me"?
Perhaps a little harsh - there's no need to tarnish an inquiring mind! But the moral behind that statement remains - we're assuming that Integration Services needs our help to parallelize the workflow. Fortunately, it doesn't! We also assumed that "stitching the results back together" will actually be fast and easy. Far from it, as we've seen - but that ends up being pretty irrelevant in the end.
Sequential Parallelism - An Assembly Line
That makes no sense, does it? Maybe not - but it's the best way I could describe what Integration Services is actually doing. I've found that comparing the data flow pipeline to an automotive assembly line can be very enlightening. One of the great achievements of the industrial age was the development of the assembly line - the paradigm that allowed the product to be worked on by different people at different stations, rather than a single worker performing multiple tasks on a single product. The efficiencies gained from being able to train workers more thoroughly at specific tasks, and from being able to "tune the workflow" by placing more resources on the "slower" steps, allow modern manufacturers to smoothly push a consistent flow of product out of their plants.
A strength of the assembly line process is that almost any one step can be parallelized fairly easily. If step #10 in a process takes twice as long as the other steps, simply set up two stations to do the work, and split the "cars" between those two paths. Another strength is that the plant doesn't have a lot of inventory "tied up" in the manufacturing process - if there are one hundred steps in manufacturing the car, then there are only one hundred cars under construction inside the plant (disregarding parallel optimizations). Closely related is that it doesn't take long (or waste much product) to finish the first car - it only takes as long as the sum of the steps. (In the alternative workflow - a batch approach - it could take much longer to get the first product out the door, and leave lots of in-process chassis sitting in the workflow.)
Those same strengths are achieved in the Integration Services data flow. For example, a Derived Column transformation (or any of the non-blocking synchronous transforms) can easily be parallelized - multiple threads can process different buffers at the same time, coordinating amongst themselves. Similarly, non-blocking transformations mean that the buffers can get passed from transform to transform efficiently, such that there aren't that many buffers (memory) occupied at any one time. Finally, since we're working with buffers, it doesn't take all that long to get the first results out of the pipeline and into the hands of the storage engine to "put away" in the finished car lot.
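To put the analogy in more concrete terms, here's a toy model of that behavior - my own sketch, with made-up buffer sizes and a trivial derived-column calculation; it's not how the engine is implemented, just the shape of the idea:

```python
# A toy model of the pipeline described above (an illustration, not SSIS internals):
# rows travel in small buffers, a synchronous non-blocking "derived column" style
# transform works on one buffer at a time, and a pool of worker threads can be
# processing several different buffers at the same moment.
from concurrent.futures import ThreadPoolExecutor

BUFFER_SIZE = 4

def source_buffers(total_rows=20):
    # Source adapter: emit rows grouped into buffers
    buffer = []
    for i in range(total_rows):
        buffer.append({"qty": i, "price": 2.5})
        if len(buffer) == BUFFER_SIZE:
            yield buffer
            buffer = []
    if buffer:
        yield buffer

def derived_column(buffer):
    # Synchronous, non-blocking transform: modifies rows in its own buffer,
    # in place - it never needs to see the whole rowset
    for row in buffer:
        row["total"] = row["qty"] * row["price"]
    return buffer

with ThreadPoolExecutor(max_workers=4) as pool:
    # Several buffers are "in the plant" at once, and each finished buffer is
    # handed to the destination without waiting for the whole rowset
    for finished in pool.map(derived_column, source_buffers()):
        print("destination stored a buffer of %d rows" % len(finished))
```

The key point is the same as on the factory floor: no step waits for the entire rowset before handing its work downstream.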
Enough About Cars - What's Going On With The Data?
Rather than repeat what's already been covered in depth elsewhere, here are some excellent resources on how lookups behave inside the data flow:
- MSDN Whitepaper: Integration Services: Performance Tuning Techniques
- The "godfather" of SSIS MVPs Jamie Thomson: SSIS Lookup component tuning tips
- From the SSIS Team blog: Getting Optimal Performance with Integration Services Lookups
- In fact, all of the SSIS Team blog posts on Lookups
- Fellow MVP Phil Brammer: SSIS – Lookup Cache Modes – Full, Partial, None
- From the Data Storage & SQL Server Performance team blog: Getting Optimal Performance with Integration Services Lookups