Issue with Lookups, Fast Loads and DTC transactions

  • Hello,

I am having a strange problem while performing some tests on an existing package under high load. I have the following setup:

    - 1 dimension table

    - 1 staging table that holds my staging data

- 1 update table that I use to stage the data before sending a bulk update to my dimension rows.

The package is set up as follows:

- 1 Data Flow task that reads from the staging table and computes a hash. Each row is then looked up against the dimension table. If the row is found and its hash differs from the one in the database, a row is added to my update table using an OLE DB Command (type 1 change). Otherwise (i.e. on lookup error), the row is inserted directly into the dimension table using a fast load with table lock

- the control flow then runs one big update statement from my update table into the dimension table
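As a rough illustration of that last step, the set-based update might look like the following. This is only a sketch: the table names (`DimCustomer`, `UpdCustomer`), the key (`BusinessKey`), and the columns are hypothetical stand-ins, since the original post does not name them.

```sql
-- Hypothetical schema: dbo.DimCustomer is the dimension table,
-- dbo.UpdCustomer is the update staging table populated by the Data Flow.
-- This applies all staged changes in one set-based statement (type 1 overwrite).
UPDATE d
SET    d.Name    = u.Name,
       d.City    = u.City,
       d.RowHash = u.RowHash
FROM   dbo.DimCustomer AS d
JOIN   dbo.UpdCustomer AS u
       ON u.BusinessKey = d.BusinessKey;

-- Clear the update staging table for the next run.
TRUNCATE TABLE dbo.UpdCustomer;
```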

This control flow container has its TransactionOption set to Required (a DTC transaction), which is something I really need.

My issue is that my dimension table has a lot of rows (about 25,000). Without any cache settings on the Lookup transform, the package is excruciatingly slow, so I added a memory caching restriction. Now the Data Flow task performs fast, but the commit of the Data Flow task fails on my new dimension row insert (not the update) with an error that says "The operation conflicts with another operation. The operation is cancelled."

I have SP3 installed, so it should not be the issue described here: http://support.microsoft.com/kb/949687, and everything runs locally on top of that...

So I'm a bit puzzled. Am I going to have to stage my inserts as well and then perform a bulk insert in a separate SQL task in the control flow? That just does not make much sense to me. Technically, my lookup is performed on rows that do not conflict with the ones I am about to insert, since the lookup is performed on the dimension and the insert is done only when the lookup fails...
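For reference, the staged-insert workaround being questioned above could be sketched like this. The table and column names (`InsCustomer`, `DimCustomer`, `BusinessKey`, etc.) are hypothetical, assuming the same schema as the dimension:

```sql
-- Hypothetical staged-insert variant: the Data Flow writes new rows to
-- dbo.InsCustomer instead of the dimension directly; an Execute SQL task
-- in the control flow then moves them in one statement, so no fast-load
-- table lock is taken on the dimension inside the DTC transaction.
INSERT INTO dbo.DimCustomer (BusinessKey, Name, City, RowHash)
SELECT i.BusinessKey, i.Name, i.City, i.RowHash
FROM   dbo.InsCustomer AS i;

-- Clear the insert staging table for the next run.
TRUNCATE TABLE dbo.InsCustomer;
```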

Any help regarding this issue would be greatly appreciated. I am lost here... I could always stage the inserts, but I feel there should be something that makes more sense than that... I will try removing the fast load/table lock to see how that works, but I don't want to impact performance too much either 🙁

  • Just curious, have you tried using the SCD transformation for this?

    John Rowan

    ======================================================
    ======================================================
Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden

  • Hello,

No, I have not. The main reason is that I wanted to use this exercise to really understand how packages, data flows, etc. work. I understand the SCD transform basically fulfills this purpose quite well, but I wanted to implement the same kind of logic myself.

In the end, the only way I could set a memory restriction on my lookup and still have my inserts work was to replace the fast load with a regular load operation. It kind of makes sense: one thread wants to cache data while another wants a table lock to insert into that same table, so the insert blocks the lookup cache, or vice versa. I just wish it were a bit "smarter" about figuring that out. Oh well, nothing new here compared to some of the DTC issues one can encounter while programming COM+ components 😉

I seem to remember running across this before. Create another connection that exactly matches the existing connection the Lookup uses (except for the name), then change the Lookup to use that new connection. I think that will solve it.

    CEWII
