Synchronizing Lookup and record inserts

  • New to SSIS and have what I'm sure is a "SSIS 101" question. I've searched to no avail, but may not be using the correct terminology.

    I have a Lookup transformation feeding it's "No Match Output" to an OLE DB Command (which calls a stored proc to add an item to the DB)

    Lookup for row 1 finds no match, OLE DB command adds an item

    Lookup for row 2, which should match the item just added by OLD DB command, finds no match...because the action of the OLE DB command hasn't yet finished/committed(?)

    I end up with duplicate items in the target DB. How can I ensure that the insert resulting from a "no match" lookup completes before the lookup for the next row fires?

    I have cache mode on the lookup set to "no cache" and a "Union All" bringing the match and no match legs of the lookup back together.

    Or do I need a different approach? I guess I could put all of the logic above in a script, but would like to handle it using the SSIS data flow components if feasible.

    Thanks in advance for helping a newbie out!

  • Have you tried putting an explicit transaction in the procedure?

  • Thanks for the quick reply and suggestion...within the stored proc, I wrapped the insert in an explicit transaction but the results are the same. The lookup for the second row of the stream is firing before the insert/transaction of the first row completes (at least that's what it looks like).

    My test source data has 20 rows, and the lookup should match on rows 2-20 (insert occurs on row 1). It looks like the lookup of all 20 rows completes even prior to the insert caused by the "no match" of the first row.

    Somehow, I need to halt the lookup for the second row until the insert/transaction of the first row has completed.

    I've verified my lookup is correct, because if I execute the package again the lookup finds the records inserted during the previous run.

  • On your dataflow task properties you can change the Isolation Level to Read Uncommitted or you could put a NoLock hint on the lookup's select.

  • The lookup task will not run synchronously with the following insert task. An entire buffer (or multiple buffers) will be processed by the lookup task before info is passed to the insert task. You would have to set the buffer size to one row at the data source to make this work, and probably take a big performance hit.

    The lookup task may also cache the reference data and not reflect the inserts.

  • You may be better off adding some logic to your data flow to ensure that there is only 1 instnace of any given row id in the pipeline. What is the data source?

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • How about the stored procedure be given a bit more intelligence. Add a section before the insert that does an existence check, if it already exists then you are done. So for all subsequent calls in that particular buffer (which is often up to 10,000 rows) it may get called 10K times but it will only insert once.

    Also you might feed lookup 2 from the output of the OLEDB command

    Since you reference the no match output you must be using 2008, is that correct?

    I think you might want to consider something like the attached picture..

    CEWII

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply