July 30, 2009 at 6:54 am
New to SSIS and have what I'm sure is an "SSIS 101" question. I've searched to no avail, but may not be using the correct terminology.
I have a Lookup transformation feeding its "No Match Output" to an OLE DB Command (which calls a stored proc to add an item to the DB)
Lookup for row 1 finds no match, OLE DB command adds an item
Lookup for row 2, which should match the item just added by the OLE DB command, finds no match...because the action of the OLE DB command hasn't yet finished/committed(?)
I end up with duplicate items in the target DB. How can I ensure that the insert resulting from a "no match" lookup completes before the lookup for the next row fires?
I have cache mode on the lookup set to "no cache" and a "Union All" bringing the match and no match legs of the lookup back together.
Or do I need a different approach? I guess I could put all of the logic above in a script, but would like to handle it using the SSIS data flow components if feasible.
Thanks in advance for helping a newbie out!
July 30, 2009 at 8:11 am
Have you tried putting an explicit transaction in the procedure?
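Something along these lines, with made-up proc and table names (dbo.AddItem / dbo.Items) since we haven't seen your schema:

ALTER PROCEDURE dbo.AddItem
    @ItemKey varchar(50),
    @ItemName varchar(100)
AS
BEGIN
    SET NOCOUNT ON;
    -- Wrap the insert so it is committed as a unit
    BEGIN TRANSACTION;
    INSERT INTO dbo.Items (ItemKey, ItemName)
    VALUES (@ItemKey, @ItemName);
    COMMIT TRANSACTION;
END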
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
July 30, 2009 at 8:54 am
Thanks for the quick reply and suggestion...within the stored proc, I wrapped the insert in an explicit transaction but the results are the same. The lookup for the second row of the stream is firing before the insert/transaction of the first row completes (at least that's what it looks like).
My test source data has 20 rows, and the lookup should match on rows 2-20 (insert occurs on row 1). It looks like the lookup of all 20 rows completes even prior to the insert caused by the "no match" of the first row.
Somehow, I need to halt the lookup for the second row until the insert/transaction of the first row has completed.
I've verified my lookup is correct, because if I execute the package again the lookup finds the records inserted during the previous run.
July 30, 2009 at 9:07 am
In the Data Flow task's properties you can change the IsolationLevel to ReadUncommitted, or you could put a NOLOCK hint on the lookup's SELECT.
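For the hint, something like this in the lookup's query (again using the hypothetical dbo.Items table from above):

SELECT ItemKey, ItemName
FROM dbo.Items WITH (NOLOCK) -- read rows even before they are committed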
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
July 30, 2009 at 9:08 am
The Lookup will not run row-for-row in lockstep with the insert that follows it. An entire buffer (or multiple buffers) is processed by the Lookup before any rows are passed to the insert. You would have to shrink the buffer to one row (the Data Flow task's DefaultBufferMaxRows property) to make this work, and you would probably take a big performance hit.
The lookup task may also cache the reference data and not reflect the inserts.
August 10, 2009 at 4:51 pm
You may be better off adding some logic to your data flow to ensure that there is only one instance of any given row ID in the pipeline. What is the data source?
August 10, 2009 at 9:33 pm
How about giving the stored procedure a bit more intelligence? Add a section before the insert that does an existence check; if the row already exists, you are done. So within that particular buffer (which is often up to 10,000 rows) the proc may get called 10K times, but it will only insert once.
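A minimal sketch of that check, reusing the hypothetical dbo.AddItem / dbo.Items names from earlier in the thread:

ALTER PROCEDURE dbo.AddItem
    @ItemKey varchar(50),
    @ItemName varchar(100)
AS
BEGIN
    SET NOCOUNT ON;
    -- Insert only if the key isn't already there; every later call
    -- with the same key becomes a no-op instead of a duplicate.
    IF NOT EXISTS (SELECT 1 FROM dbo.Items WHERE ItemKey = @ItemKey)
    BEGIN
        INSERT INTO dbo.Items (ItemKey, ItemName)
        VALUES (@ItemKey, @ItemName);
    END
END

The OLE DB Command fires row by row on a single connection, so a plain existence check is enough here; under concurrent loads you would want an UPDLOCK/HOLDLOCK hint (or MERGE) instead.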
Also, you might feed lookup 2 from the output of the OLE DB Command.
Since you reference the no match output, you must be using 2008. Is that correct?
I think you might want to consider something like the attached picture.
CEWII