April 11, 2011 at 3:07 am
Hi,
I am transferring data from one source to several destinations using parallel data flow task execution, but the five data flow tasks each read the same source, so it is read five times. I want to reduce this to a single read using a cache or buffer.
Has anyone experienced this?
Thanks
April 11, 2011 at 5:59 am
It is not possible to "share" data between different data flows.
You could create one data flow with one source and a Multicast component to create the different flows.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 11, 2011 at 1:37 pm
Thanks for your answer, but I need to do this at the control flow level.
April 11, 2011 at 2:47 pm
Could you explain why you want to do it this way? I think it would be useful to understand.
CEWII
April 11, 2011 at 11:34 pm
harun-974445 (4/11/2011)
Thanks for your answer, but I need it in Control Flow
You didn't mention the control flow in your question.
Can you explain the complete set-up, as Elliott has asked?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 12, 2011 at 12:42 am
Hi, thank you for your interest.
I want to optimize performance with the following in my SSIS package:
1. Reading the same source once and transferring the same data to several destinations using a buffer or cache
2. Using parallel execution
3. Using dynamic connection strings with scoped variables
4. Error handling
I succeeded with 1, 2 and 3 at the data flow level using a Multicast. But if any output of the Multicast component fails, all of the other outputs fail. If I run this scenario in the control flow with parallel data flow execution, the other tasks keep running successfully and only the one task fails. But in the control flow I can't use a cache or buffer.
April 12, 2011 at 8:56 am
harun-974445 (4/12/2011)
Hi, thank you for your interest. I want to optimize performance with the following in my SSIS package:
1. Reading the same source once and transferring the same data to several destinations using a buffer or cache
2. Using parallel execution
3. Using dynamic connection strings with scoped variables
4. Error handling
I succeeded with 1, 2 and 3 at the data flow level using a Multicast. But if any output of the Multicast component fails, all of the other outputs fail. If I run this scenario in the control flow with parallel data flow execution, the other tasks keep running successfully and only the one task fails. But in the control flow I can't use a cache or buffer.
Not sure if I understood correctly, however...
Could you stage your data in a temp table? That is:
1) Read from the source and move the data to a temp table (I'd prefer it to be a physical staging table...)
2) When inserting into the temp table, add a new column, such as EntryID IDENTITY(1,1)
3) When reading the data from the staging table, specify an integer range per data flow, e.g. 1..100 for the first data flow, 101..200 for the second, and so on, specifying NOLOCK on this staging table...
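A rough T-SQL sketch of the staging pattern described in the steps above (the table and column names other than EntryID are illustrative, not from the original post):

```sql
-- Steps 1 and 2: a physical staging table with an IDENTITY column
-- that is assigned automatically as rows are inserted
CREATE TABLE dbo.StagingTransfer
(
    EntryID    INT IDENTITY(1, 1) NOT NULL,  -- added on insert
    SourceCol1 INT           NOT NULL,       -- placeholder source columns
    SourceCol2 NVARCHAR(100) NULL
);

-- Step 3: each data flow's source query reads its own range with NOLOCK
SELECT SourceCol1, SourceCol2
FROM dbo.StagingTransfer WITH (NOLOCK)
WHERE EntryID BETWEEN 1 AND 100;    -- first data flow

SELECT SourceCol1, SourceCol2
FROM dbo.StagingTransfer WITH (NOLOCK)
WHERE EntryID BETWEEN 101 AND 200;  -- second data flow, and so on
```

The IDENTITY column gives each staged row a stable sequence number, so the range predicates let the parallel data flows read from the same table without overlapping.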
Regards
Brian Ellul
----------------------------------------------------------------------------------
Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
- Albert Einstein -
April 12, 2011 at 9:50 am
Why is one of the outputs of the Multicast failing? Answer that question and you will probably reveal what you need to do in terms of error handling so the package can finish successfully.
Basically, figure out why it fails, then shunt the data somewhere else to be dealt with differently in an error handler, so that the transform never fails in the first place.
April 13, 2011 at 3:09 am
Daniel Bowlin (4/12/2011)
Why is one of the outputs of the Multicast failing? Answer that question and you will probably reveal what you need to do in terms of error handling so the package can finish successfully. Basically, figure out why it fails, then shunt the data somewhere else to be dealt with differently in an error handler, so that the transform never fails in the first place.
In my scenario I have one management server and five different front-end servers, and my transfer package runs every 3 minutes. Sometimes one of the front-end servers is inaccessible because of Windows updates or a network problem. My SSIS package should not be interrupted when an error occurs there, but the whole process fails if I use a Multicast.
April 13, 2011 at 11:28 am
As it sits, the pipeline (buffers) from one data flow is not accessible from another, period; there is nothing that allows that.
With that said, I think brian118 was on the right track, provided that I understand what you are trying to accomplish. As I see it, you have a file that you want to push to 5 other servers at the same time while only reading the file once. You can *kind of* do that. You pull the file into a table in SQL; I would use a persisted table, probably in tempdb, since SSIS can get a little freaky with temp (#, ##) tables. Once you have the data in that table, it is really easy and fairly efficient to just suck the data out of the table and into each pipeline. It isn't exactly what you were looking for, but what you want to do can't be done efficiently with the tools available.
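A possible sketch of the persisted staging table described above (names are illustrative; note that a persisted table in tempdb disappears on service restart, so the package would need to recreate it if missing):

```sql
-- A persisted staging table (not a #temp table), e.g. in tempdb,
-- recreated if it does not exist
IF OBJECT_ID('tempdb.dbo.TransferStage') IS NULL
    CREATE TABLE tempdb.dbo.TransferStage
    (
        SourceCol1 INT           NOT NULL,  -- placeholder source columns
        SourceCol2 NVARCHAR(100) NULL
    );

-- At the start of each run: clear the table, then load the file once
-- (an SSIS data flow's destination would point at this table)
TRUNCATE TABLE tempdb.dbo.TransferStage;

-- Each of the five parallel data flows then uses the same simple source query
SELECT SourceCol1, SourceCol2
FROM tempdb.dbo.TransferStage;
```

Because each data flow has its own source component reading this table, one destination server being unreachable only fails that one data flow task in the control flow, not the others.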
CEWII
April 13, 2011 at 1:14 pm
Elliott Whitlow (4/13/2011)
As it sits, the pipeline (buffers) from one data flow is not accessible from another, period; there is nothing that allows that. With that said, I think brian118 was on the right track, provided that I understand what you are trying to accomplish. As I see it, you have a file that you want to push to 5 other servers at the same time while only reading the file once. You can *kind of* do that. You pull the file into a table in SQL; I would use a persisted table, probably in tempdb, since SSIS can get a little freaky with temp (#, ##) tables. Once you have the data in that table, it is really easy and fairly efficient to just suck the data out of the table and into each pipeline. It isn't exactly what you were looking for, but what you want to do can't be done efficiently with the tools available.
CEWII
If you are reading the same file 5 times, you can also use SSIS raw files instead of a SQL Server table. That is *probably* faster...
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 13, 2011 at 1:22 pm
harun-974445 (4/12/2011)
Hi, thank you for your interest. I want to optimize performance with the following in my SSIS package:
1. Reading the same source once and transferring the same data to several destinations using a buffer or cache
2. Using parallel execution
3. Using dynamic connection strings with scoped variables
4. Error handling
I succeeded with 1, 2 and 3 at the data flow level using a Multicast. But if any output of the Multicast component fails, all of the other outputs fail. If I run this scenario in the control flow with parallel data flow execution, the other tasks keep running successfully and only the one task fails. But in the control flow I can't use a cache or buffer.
I'm personally with Daniel on this: why is the node failing? Is the server down? Deadlock problems? What's the failure reason?
Also, for this kind of distribution, I'd personally load everything into one table as the system of record (SoR), then replicate that table to the other locations. This removes the multiple-server success issue, allows for a single point of knowledge, and uses tools more commonly associated with multicasting data to multiple locations.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA