June 28, 2012 at 5:19 pm
hi All,
I am using a single data flow that populates OLE DB staging tables from their respective OLE DB data sources, so they all run in parallel. My question: the next task is an Execute SQL Task that has to run only when the data flow task has completely finished, i.e. all staging tables have been populated. Can I assume the data flow is synchronous, in that the package will only proceed to the next control flow task when the data flow is complete, or do I have to do something else to check that it's done? If so, what is the de facto / best-practice way of doing this? Should I be wrapping the data flow in a Sequence Container and looking at precedence constraints? I'm in the early days of SSIS, so bear with me... I need a steer.
thx
Robin
June 29, 2012 at 12:42 am
The data flow will only finish when every component in it has finished, in other words, when every parallel path has completed. So you're good: just connect the Execute SQL Task to the data flow with a green (success) precedence constraint.
Be aware: having too many parallel flows in one data flow can hurt performance. Keep it limited to about five parallel flows.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
July 2, 2012 at 3:00 am
robinrai3 (6/28/2012)
hi All, I am using a single data flow that populates OLE DB staging tables from their respective OLE DB data sources, so they all run in parallel. My question: the next task is an Execute SQL Task that has to run only when the data flow task has completely finished, i.e. all staging tables have been populated. Can I assume the data flow is synchronous, in that the package will only proceed to the next control flow task when the data flow is complete, or do I have to do something else to check that it's done? If so, what is the de facto / best-practice way of doing this? Should I be wrapping the data flow in a Sequence Container and looking at precedence constraints? I'm in the early days of SSIS, so bear with me... I need a steer.
thx
Robin
Not sure whether it is 'best practice', but I would say that a single data flow serving multiple sources and destinations is likely to be difficult to maintain and troubleshoot.
My preference would be to build multiple data flows, one per source/destination pair, and put them all in a Sequence Container, preserving your required degree of parallelism. Then connect the Execute SQL Task to the Sequence Container to enforce your required processing order.
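Conceptually, the pattern both replies describe (independent loads running in parallel, then a single follow-up step that runs only after all of them succeed) looks like the sketch below. The table names and functions are hypothetical placeholders; in SSIS the engine and the green precedence constraint handle this ordering for you, so this is only an illustration of the control-flow semantics, not something you would write in a package:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical staging tables; in SSIS each would be one
# source -> destination path (or one data flow in the container).
STAGING_TABLES = ["StgCustomers", "StgOrders", "StgProducts"]

def load_staging_table(table):
    # Placeholder for an OLE DB source -> OLE DB destination load.
    return f"{table} loaded"

def run_post_load_sql():
    # Placeholder for the Execute SQL Task that must run last.
    return "post-load SQL executed"

# Run the loads in parallel, like parallel paths in a data flow
# or parallel data flows inside a Sequence Container.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(load_staging_table, STAGING_TABLES))

# This point is reached only after every load has finished: the
# equivalent of the green (success) precedence constraint.
status = run_post_load_sql()
```

The key property in both the sketch and SSIS is the same: the downstream step has a single dependency on the completion of the whole parallel group, not on any individual load.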
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel