Double Insert Possible ?

  • We have a job that runs this code:

    InsertMore:

    WAITFOR DELAY '00:00:05'

    insert top(2500) into TableA

    select * from TableB B

    where B.ID not in (select ID from TableA) -- don't insert same records again

    if @@rowcount > 0 goto InsertMore

    It takes a long time to run, but is written this way to prevent much locking during business hours.

    If a separate session runs the same code, with the hope of it completing faster, is there any chance that duplicate ID rows could get inserted since it only has a non-unique index on ID ?

  • Absolutely there's a chance. A good one.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My initial thoughts went towards explicit transactions and transaction isolation levels. What is the real-world scenario behind the question? And are you equally worried about locking on both tables, or is one one more important than the other?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • TableA, being inerted into, is a live production table, while TableB, being selected from is a temp holding table during a load process. Initially, the process did 1 big insert, but it could take a while and caused blocking, so we modified it to do batches of small inserts.

    The problem is that the overall load time now takes longer, although it doesn't cause blocking. We were trying to find a better way, and the other guy suggested running the same process twice at the same time since it checks to see that the records are not already inserted. Seemed to me that you might get duplicates though because the same batch could be getting inserted at the same time.

  • homebrew01 (6/25/2011)


    Seemed to me that you might get duplicates though because the same batch could be getting inserted at the same time.

    Absolutely. You'd need some nasty isolation level to prevent that (likely serialisable)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Serializable would keep you from getting dups. It is super max lockdown though so using it would defeat the purpose of running two instances of your code as it is written.

    Snapshot iso is the other level you could consider. It is the only other read-consistent iso level besides serializable. I say consider because I have never looked at it in this light and am having a hard time wrapping my head around what it would do for you in this scenario.

    That's all I had on iso levels.

    Here is a code change option for you:

    What if we introduced a temp table to hold the results of this...

    select * from TableB B

    where B.ID not in (select ID from TableA)

    ...to get our working set. Then we could delete 2500 rows from the temp table at a time and use the OUTPUT clause to insert those rows directly into TableA. I think that would solve your transaction issues because the DELETE would open write locks on those rows it was deleting and prevent a second process from reading those. If you had two delete process it would be worth coding them to worth on different sections of the temp table so they have less of a chance of blocking each other. Not sure what that would do to TableA in terms of fragmentation though. Just thinking out loud here, it would need to be run down but you get the idea.

    CREATE TABLE dbo.src (id INT)

    CREATE TABLE dbo.dest (id INT)

    INSERT INTO dbo.src ( id ) VALUES ( 0 ) ;

    DELETE

    FROM dbo.src

    OUTPUT DELETED.id

    INTO dest;

    SELECT * FROM dest ;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (6/26/2011)


    Snapshot iso is the other level you could consider. It is the only other read-consistent iso level besides serializable. I say consider because I have never looked at it in this light and am having a hard time wrapping my head around what it would do for you in this scenario.

    It'll almost guarantee duplicates.

    Snapshot gets you a consistent view of the data at the time of the transactin (snapshot) or statement (read committed snapshot) starting. If one insert starts then the other starts before the first finishes, it will see (for the not exists) a consistent view of the data before the first insert started and will hence happily insert duplicate rows.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/26/2011)


    opc.three (6/26/2011)


    Snapshot iso is the other level you could consider. It is the only other read-consistent iso level besides serializable. I say consider because I have never looked at it in this light and am having a hard time wrapping my head around what it would do for you in this scenario.

    It'll almost guarantee duplicates.

    Snapshot gets you a consistent view of the data at the time of the transactin (snapshot) or statement (read committed snapshot) starting. If one insert starts then the other starts before the first finishes, it will see (for the not exists) a consistent view of the data before the first insert started and will hence happily insert duplicate rows.

    Good to know. Thank you Gail.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks for the feedback. I'm heading on vacation for a bit, so I'm going to leave it alone for now, and it only runs once a month, so no rush.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply