April 26, 2010 at 12:59 pm
Hello,
Here is the situation: there is a package in collection that over time has grown over time to include more and more sources. Basically as each new source has been added to the data flow, a new copy of the destination table has also been added. We are at the point now where we have about 12 source flows inserting data into 12 OLE DB destinations where the target of each of those is the same table. When we added the last two destinations we found performance really took a hit. Turns out that those copies of the destination were using Table Locks. They are now all fast-load with no table locks.
My colleague and I wondering what the best way to clean this up is. What we don't know is does it matter how many flows you have loading the same target table. Should we put in a UNION ALL so that there one flow loading the table? Or, can we leave with 12 as it is now?
Thanks!
April 27, 2010 at 3:28 am
I wouldn't recommand the union all if you want to combine 12 flows. The union all is quite a nightmare in maintenance and since it is a semi-blocking transformation, it will still affect performance.
Doesn't deselecting the checkbox Table lock in the OLE DB Destination Editor help?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
April 27, 2010 at 7:58 am
Yes, removing the TABLOCK did help performance.
My question though is still: does it matter if you have the same table being loaded by 12 different flows simultaneously or should the table be loaded from a single flow? In my earlier post I forgot to mention that the table being loaded has a clustered index.
Thanks,
padre
April 28, 2010 at 6:59 am
I have run into the same type of scenario, and I do
use the Union All task.
The only trouble I find with it is when I need to modifiy
anything upstream the Union All gets wonky.
Solution? do what you need to do, and then remove the
Union All task and re-add it in.
Works great for me.
You also could consider, if performance is an issue, to
remove the clustered index, then load up, then re-add the
index. Of course that would depend on when you are processing
this package and so on.
I end up rebuilding my indexes on fact tables nightly as we add
250,000 + records per day and over time they degrade anyhow.
April 30, 2010 at 4:48 am
Best way to load the data
1) make the source as single instance
a) Use T-SQL UNION ALL if you source from more than one table
Advantages: It will hit the DB only once and perfrom better
When you need to add NEW source, just change the SQL no additional SSIS Task
DisAdvantage: Datatype conversion and keeping the order of the column
b) USE UNION ALL SSIS TASK
It does the same thing T-SQL Union ALL but it will hit the DB as many as instance (source selection) and you have to add new TASK for new source
2) load them into heap table if we're hiting multiple time
Performance will be hit if you are hitting the NON-EMPTY , indexed tables
Use dummy empty table if you are going to manipulate them before loading
3) change the properties of the Target DB like TABLock and FAST Load so on
April 30, 2010 at 5:29 am
mvelusamy (4/30/2010)
a) Use T-SQL UNION ALL if you source from more than one tableAdvantages: It will hit the DB only once and perfrom better
When you need to add NEW source, just change the SQL no additional SSIS Task
DisAdvantage: Datatype conversion and keeping the order of the column
I won't consider datatype conversion as a disadvantage for T-SQL. The datatypes must match also in the SSIS Union ALL component and it is much much easier to do datatype conversion in T-SQL then in SSIS.
To expand further on the topic: I would (almost) always recommand the T-SQL union above the SSIS variant, as the SSIS is semi-blocking and you can use the T-SQL union even to filter out duplicates. If you want to do that in SSIS, you must add a Sort component, making the dataflow completely blocked.
+ you can edit T-SQL in nice GUI's, but the SSIS union all, well, that's just plain ugly and stupid with all those dropdowns.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply