Stored Procedure throws Violation of PRIMARY KEY constraint when called concurrently whit same parameters

  • Y.B. (6/13/2016)


    Do you have any examples/articles you can point me to so I can check it out

    You mean other than the one I linked in my first post here?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/13/2016)


    Y.B. (6/13/2016)


    Do you have any examples/articles you can point me to so I can check it out

    You mean other than the one I linked in my first post here?

    I caught it after my post, however I get a blank page when I hit that link.

    EDIT: Never mind...works in Chrome just fine, not so much with IE. Yes, I still need to use IE for testing purposes from time to time...don't hate me for it. lol

    Thanks for the link!


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • GilaMonster (6/13/2016)


    Y.B. (6/13/2016)


    Maybe something like this would be better suited...

    IF EXISTS (SELECT 1 FROM SomeTable WHERE ID = @id)

    UPDATE SomeTable SET SomeText = @SomeText

    ELSE

    INSERT INTO SomeTable VALUES (@ID, @SomeText)

    Not atomic, and hence will result in primary key violations, same with Merge.

    http://source.entelect.co.za/why-is-this-upsert-code-broken

    Sorry about the formatting on that page, I'm trying to get it fixed.

    thanks to your link I understand the reason the concurrency happens.

  • GilaMonster (6/13/2016)


    Y.B. (6/13/2016)


    Also Phil is right...better to make use of a MERGE here anyway.

    I disagree MERGE is NOT atomic (it should be, it isn't) and hence both it and the IF EXISTS form are prone to concurrency issues unless elevated isolation levels and/or locking hints are used.

    I should have mentioned something along those lines in my post, thanks Gail.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Y.B. (6/13/2016)


    works in Chrome just fine, not so much with IE. Yes, I still need to use IE for testing purposes from time to time...don't hate me for it. lol

    πŸ™ I'll file a bug, that shouldn't have happened.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That was a great article Gail! Thanks for sharing, It's one of those subtle things most probably learn the hard way.

    So with that in mind here is update code... πŸ˜€

    CREATE TABLE dbo.SomeTable

    (

    ID int NOT NULL,

    SomeText varchar(10) NOT NULL

    ) ON [PRIMARY]

    GO

    ALTER TABLE dbo.SomeTable ADD CONSTRAINT

    PK_SomeTable PRIMARY KEY CLUSTERED

    (

    ID

    ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE PROCEDURE [dbo].[InsertSomeText]

    @ID int,

    @SomeText varchar(10)

    AS

    BEGIN

    UPDATE dbo.SomeTable SET SomeText = @SomeText WHERE ID = @ID;

    INSERT INTO dbo.SomeTable (ID, SomeText)

    SELECT @ID, @SomeText

    WHERE NOT EXISTS (SELECT 1 FROM dbo.SomeTable WITH (XLOCK, HOLDLOCK) WHERE ID = @ID)

    END

    GO

    INSERT INTO SomeTable VALUES (1, 'Test1'), (2, 'Test2')

    GO

    --BEFORE

    SELECT * FROM dbo.SomeTable

    EXEC InsertSomeText 1, 'TestNew'

    EXEC InsertSomeText 3, 'Test3'

    --AFTER

    SELECT * FROM dbo.SomeTable

    GO

    DROP PROCEDURE dbo.InsertSomeText

    DROP TABLE dbo.SomeTable


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Y.B. (6/13/2016)


    That was a great article Gail! Thanks for sharing, It's one of those subtle things most probably learn the hard way.

    So with that in mind here is update code... πŸ˜€

    CREATE TABLE dbo.SomeTable

    (

    ID int NOT NULL,

    SomeText varchar(10) NOT NULL

    ) ON [PRIMARY]

    GO

    ALTER TABLE dbo.SomeTable ADD CONSTRAINT

    PK_SomeTable PRIMARY KEY CLUSTERED

    (

    ID

    ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE PROCEDURE [dbo].[InsertSomeText]

    @ID int,

    @SomeText varchar(10)

    AS

    BEGIN

    UPDATE dbo.SomeTable SET SomeText = @SomeText WHERE ID = @ID;

    INSERT INTO dbo.SomeTable (ID, SomeText)

    SELECT @ID, @SomeText

    WHERE NOT EXISTS (SELECT 1 FROM dbo.SomeTable WITH (XLOCK, HOLDLOCK) WHERE ID = @ID)

    END

    GO

    INSERT INTO SomeTable VALUES (1, 'Test1'), (2, 'Test2')

    GO

    --BEFORE

    SELECT * FROM dbo.SomeTable

    EXEC InsertSomeText 1, 'TestNew'

    EXEC InsertSomeText 3, 'Test3'

    --AFTER

    SELECT * FROM dbo.SomeTable

    GO

    DROP PROCEDURE dbo.InsertSomeText

    DROP TABLE dbo.SomeTable

    If the row already exists and the UPDATE successfully executes, is it necessary to attempt the INSERT?

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • DELETE + INSERT will probably result in more reads, writes, page fragmentation, and transaction logging, so for that reason I'd suggest an UPDATE | INSERT. The less "stuff" that's going on inside your transaction and behind the scenes by SQL Server, then the shorter and more isolated the transactions will be.

    Consider something like this instead:

    UPDATE SomeTable

    SET SomeText = @SomeText

    WHERE WHERE ID = @ID;

    IF @@ROWCOUNT = 0

    INSERT INTO SomeTable (ID, SomeText)

    VALUES (@ID, @SomeText);

    As for the occasional primary key violations, in this specific situation, a case could be made for simply ignoring them. Does it really matter from a business perspective that every transaction for a given row complete when that same row is updated repeatedly and concurrently?

    If Sue and John both edit the text of the same row at the same time, then both values are equally valid. So, let SQL Server's unique constraint mechanism resolve these race conditions. You can ignore the error at the application layer, or add the option WITH (IGNORE_DUP_KEY = ON) to the primary key on ID column. The only reason I can see for wanting both updates to complete is if there is a change data capture trigger in place to audit the history of changes over time.

    Also, you may want to set FILL FACTOR to something like 70% to help avoid page splits when SomeText increases in width.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I've often applied a slight variation that I think has a minor optimization (based on the assumption that many times a row will exist, so that we can test if the update worked so we don't need to try the insert anymore). Plus I've always used (updlock) instead of (xlock, holdlock). I'll post my alternative to see if you like it, or that I need to go through my code and do some rewrites πŸ˜‰

    CREATE PROCEDURE [dbo].[InsertSomeText]

    @ID int,

    @SomeText varchar(10)

    AS

    BEGIN

    UPDATE SomeTable

    SET SomeText = @SomeText

    WHERE ID = @id;

    IF NOT @@ROWCOUNT > 0

    INSERT INTO SomeTable( ID, SomeText)

    SELECT @ID, @SomeText

    WHERE NOT EXISTS (

    SELECT *

    FROM SomeTable WITH (UPDLOCK)

    WHERE ID = @ID

    )

    END



    Posting Data Etiquette - Jeff Moden[/url]
    Posting Performance Based Questions - Gail Shaw[/url]
    Hidden RBAR - Jeff Moden[/url]
    Cross Tabs and Pivots - Jeff Moden[/url]
    Catch-all queries - Gail Shaw[/url]


    If you don't have time to do it right, when will you have time to do it over?

  • Good explanation Eric πŸ™‚

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • If two users are changing the same text simultaneously and the stored procedure does not produce an error in that case, one update will be lost without further notice to that user. Maybe the software does use an algorithm to get a new ID that does not always produce a unique identification for a new row, so new data from one user will be replaced by new data from another user. From my experience I know it is very easy to do this wrong and very hard to do it right, you should leave it to the database to generate unique identifications throught IDENTITY columns or SEQUENCEs. If both users try to update the same existing row, at least one of them should receive a warning that his or her change to the text will not be saved. Either way this is not a problem within the stored procedure but within the logic of the program using it, and it cannot be solved by the stored procedure alone. Just my two cents ...

  • vliet (6/14/2016)


    If two users are changing the same text simultaneously and the stored procedure does not produce an error in that case, one update will be lost without further notice to that user. Maybe the software does use an algorithm to get a new ID that does not always produce a unique identification for a new row, so new data from one user will be replaced by new data from another user. From my experience I know it is very easy to do this wrong and very hard to do it right, you should leave it to the database to generate unique identifications throught IDENTITY columns or SEQUENCEs. If both users try to update the same existing row, at least one of them should receive a warning that his or her change to the text will not be saved. Either way this is not a problem within the stored procedure but within the logic of the program using it, and it cannot be solved by the stored procedure alone. Just my two cents ...

    Let's assume that Jack and Jill both attempt to edit the text of ID 321at approximately the same time. Jack's update is successfully committed, but then 500ms later Jill's update is committed on top of that. In that scenario, should Jack be notififed? The end result is actually identical to the scenario where Jill's update is committed first and then Jack's update fails with an duplicate key error. There are two round pegs and only one hole called 321, so someone is going to basically lose their update one way or the other.

    This sounds to me more like an application workflow issue. For example, in a cutomer support call center, you only want (1) user to have a customer's profile screen open in edit mode at any given time, which can be done using sp_getapplock | sp_releaseapplock. You can't depend on transactions and row isolation alone.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (6/13/2016)


    Consider something like this instead:

    UPDATE SomeTable

    SET SomeText = @SomeText

    WHERE WHERE ID = @ID;

    IF @@ROWCOUNT = 0

    INSERT INTO SomeTable (ID, SomeText)

    VALUES (@ID, @SomeText);

    Still not good enough.

    2 parallel processes could complete UPDATEs at the same time, both ending up with @@ROWCOUNT = 0.

    Both will proceed with INSERT causing PK violation.

    No place for "IF" in T-SQL:

    UPDATE SomeTable

    SET SomeText = @SomeText

    WHERE ID = @ID;

    INSERT INTO SomeTable (ID, SomeText)

    select @ID, @SomeText

    WHERE NOT EXISTS (select * from SomeTable where ID = @ID )

    _____________
    Code for TallyGenerator

  • Eric M Russell (6/13/2016)


    DELETE + INSERT will probably result in more reads, writes, page fragmentation, and transaction logging, so for that reason I'd suggest an UPDATE | INSERT. The less "stuff" that's going on inside your transaction and behind the scenes by SQL Server, then the shorter and more isolated the transactions will be.

    Consider something like this instead:

    UPDATE SomeTable

    SET SomeText = @SomeText

    WHERE WHERE ID = @ID;

    IF @@ROWCOUNT = 0

    INSERT INTO SomeTable (ID, SomeText)

    VALUES (@ID, @SomeText);

    As for the occasional primary key violations, in this specific situation, a case could be made for simply ignoring them. Does it really matter from a business perspective that every transaction for a given row complete when that same row is updated repeatedly and concurrently?

    If Sue and John both edit the text of the same row at the same time, then both values are equally valid. So, let SQL Server's unique constraint mechanism resolve these race conditions. You can ignore the error at the application layer, or add the option WITH (IGNORE_DUP_KEY = ON) to the primary key on ID column. The only reason I can see for wanting both updates to complete is if there is a change data capture trigger in place to audit the history of changes over time.

    Also, you may want to set FILL FACTOR to something like 70% to help avoid page splits when SomeText increases in width.

    yes you'r right, i dont need to fix the exception, i just wanted to know the reason it happens.

  • Eric M Russell (6/14/2016)


    This sounds to me more like an application workflow issue. For example, in a cutomer support call center, you only want (1) user to have a customer's profile screen open in edit mode at any given time, which can be done using sp_getapplock | sp_releaseapplock. You can't depend on transactions and row isolation alone.

    Thanks for your response, Eric.

    I do agree it is an application workflow issue, that can be solved with a combination of programming effort and database engine features. Web applications are particulary prone to this type of misbehaviour and usualy developers assume that the framework will handle these concurrency issues for them. If concurrent use leads to primary key violation either the number of concurrent actions must be quite high or the method of getting new [ID] values might be flawed, for example by fetching the maximum value plus one at the start of the 'create new item' action. The later fault has become quite common and should be solved in the application, not in the database.

    I am both a .NET developer and skilled SQL Server DBA, so I do not blame developers that cannot imagine multiple users using the same part of their application simultaneously because unlike most DBA's they are not trained to do that. Tests do not reveal these flaws and their source is not easy to track down when they happen. But it is a sad thing that most (web) application frameworks lack good support for concurrency control. How do you ensure your applications handle concurrency without data loss?

Viewing 15 posts - 16 through 30 (of 35 total)

You must be logged in to reply to this topic. Login to reply