Stored Procedure throws Violation of PRIMARY KEY constraint when called concurrently whit same parameters

Question

Stored Procedure throws Violation of PRIMARY KEY constraint when called concurrently whit same parameters

Viewing 15 posts - 16 through 30 (of 35 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Y.B. (6/13/2016)
Do you have any examples/articles you can point me to so I can check it out

You mean other than the one I linked in my first post here?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Y.B. SSChampion Points: 11649 More actions · Answer 2

GilaMonster (6/13/2016)
Y.B. (6/13/2016)
Do you have any examples/articles you can point me to so I can check it out
You mean other than the one I linked in my first post here?

I caught it after my post, however I get a blank page when I hit that link.

EDIT: Never mind...works in Chrome just fine, not so much with IE. Yes, I still need to use IE for testing purposes from time to time...don't hate me for it. lol

Thanks for the link!

SELECT quote FROM brain WHERE original = 1
0 rows returned

fabriziodb Mr or Mrs. 500 Points: 562 More actions · Answer 3

GilaMonster (6/13/2016)
Y.B. (6/13/2016)
Maybe something like this would be better suited...
IF EXISTS (SELECT 1 FROM SomeTable WHERE ID = @id)
UPDATE SomeTable SET SomeText = @SomeText
ELSE
INSERT INTO SomeTable VALUES (@ID, @SomeText)
Not atomic, and hence will result in primary key violations, same with Merge.
http://source.entelect.co.za/why-is-this-upsert-code-broken
Sorry about the formatting on that page, I'm trying to get it fixed.

thanks to your link I understand the reason the concurrency happens.

Phil Parkin SSC Guru Points: 246977 More actions · Answer 4

GilaMonster (6/13/2016)
Y.B. (6/13/2016)
Also Phil is right...better to make use of a MERGE here anyway.
I disagree MERGE is NOT atomic (it should be, it isn't) and hence both it and the IF EXISTS form are prone to concurrency issues unless elevated isolation levels and/or locking hints are used.

I should have mentioned something along those lines in my post, thanks Gail.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Y.B. (6/13/2016)
works in Chrome just fine, not so much with IE. Yes, I still need to use IE for testing purposes from time to time...don't hate me for it. lol

🙁 I'll file a bug, that shouldn't have happened.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Y.B. SSChampion Points: 11649 More actions · Answer 6

That was a great article Gail! Thanks for sharing, It's one of those subtle things most probably learn the hard way.

So with that in mind here is update code... 😀

CREATE TABLE dbo.SomeTable

(

ID int NOT NULL,

SomeText varchar(10) NOT NULL

) ON [PRIMARY]

GO

ALTER TABLE dbo.SomeTable ADD CONSTRAINT

PK_SomeTable PRIMARY KEY CLUSTERED

(

ID

) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO

CREATE PROCEDURE [dbo].[InsertSomeText]

@ID int,

@SomeText varchar(10)

AS

BEGIN

UPDATE dbo.SomeTable SET SomeText = @SomeText WHERE ID = @ID;

INSERT INTO dbo.SomeTable (ID, SomeText)

SELECT @ID, @SomeText

WHERE NOT EXISTS (SELECT 1 FROM dbo.SomeTable WITH (XLOCK, HOLDLOCK) WHERE ID = @ID)

END

GO

INSERT INTO SomeTable VALUES (1, 'Test1'), (2, 'Test2')

GO

--BEFORE

SELECT * FROM dbo.SomeTable

EXEC InsertSomeText 1, 'TestNew'

EXEC InsertSomeText 3, 'Test3'

--AFTER

SELECT * FROM dbo.SomeTable

GO

DROP PROCEDURE dbo.InsertSomeText

DROP TABLE dbo.SomeTable

SELECT quote FROM brain WHERE original = 1
0 rows returned

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

Y.B. (6/13/2016)
That was a great article Gail! Thanks for sharing, It's one of those subtle things most probably learn the hard way.
So with that in mind here is update code... 😀
CREATE TABLE dbo.SomeTable
(
ID int NOT NULL,
SomeText varchar(10) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE dbo.SomeTable ADD CONSTRAINT
PK_SomeTable PRIMARY KEY CLUSTERED
(
ID
) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE PROCEDURE [dbo].[InsertSomeText]
@ID int,
@SomeText varchar(10)
AS
BEGIN
UPDATE dbo.SomeTable SET SomeText = @SomeText WHERE ID = @ID;
INSERT INTO dbo.SomeTable (ID, SomeText)
SELECT @ID, @SomeText
WHERE NOT EXISTS (SELECT 1 FROM dbo.SomeTable WITH (XLOCK, HOLDLOCK) WHERE ID = @ID)
END
GO
INSERT INTO SomeTable VALUES (1, 'Test1'), (2, 'Test2')
GO
--BEFORE
SELECT * FROM dbo.SomeTable
EXEC InsertSomeText 1, 'TestNew'
EXEC InsertSomeText 3, 'Test3'
--AFTER
SELECT * FROM dbo.SomeTable
GO
DROP PROCEDURE dbo.InsertSomeText
DROP TABLE dbo.SomeTable

If the row already exists and the UPDATE successfully executes, is it necessary to attempt the INSERT?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Eric M Russell SSC Guru Points: 125562 More actions · Answer 8

DELETE + INSERT will probably result in more reads, writes, page fragmentation, and transaction logging, so for that reason I'd suggest an UPDATE | INSERT. The less "stuff" that's going on inside your transaction and behind the scenes by SQL Server, then the shorter and more isolated the transactions will be.

Consider something like this instead:

UPDATE SomeTable

SET SomeText = @SomeText

WHERE WHERE ID = @ID;

IF @@ROWCOUNT = 0

INSERT INTO SomeTable (ID, SomeText)

VALUES (@ID, @SomeText);

As for the occasional primary key violations, in this specific situation, a case could be made for simply ignoring them. Does it really matter from a business perspective that every transaction for a given row complete when that same row is updated repeatedly and concurrently?

If Sue and John both edit the text of the same row at the same time, then both values are equally valid. So, let SQL Server's unique constraint mechanism resolve these race conditions. You can ignore the error at the application layer, or add the option WITH (IGNORE_DUP_KEY = ON) to the primary key on ID column. The only reason I can see for wanting both updates to complete is if there is a change data capture trigger in place to audit the history of changes over time.

Also, you may want to set FILL FACTOR to something like 70% to help avoid page splits when SomeText increases in width.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

R.P.Rozema SSChampion Points: 12317 More actions · Answer 9

I've often applied a slight variation that I think has a minor optimization (based on the assumption that many times a row will exist, so that we can test if the update worked so we don't need to try the insert anymore). Plus I've always used (updlock) instead of (xlock, holdlock). I'll post my alternative to see if you like it, or that I need to go through my code and do some rewrites 😉

CREATE PROCEDURE [dbo].[InsertSomeText]

@ID int,

@SomeText varchar(10)

AS

BEGIN

UPDATE SomeTable

SET SomeText = @SomeText

WHERE ID = @id;

IF NOT @@ROWCOUNT > 0

INSERT INTO SomeTable( ID, SomeText)

SELECT @ID, @SomeText

WHERE NOT EXISTS (

SELECT *

FROM SomeTable WITH (UPDLOCK)

WHERE ID = @ID

)

END

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

kapil_kk SSC-Insane Points: 21316 More actions · Answer 10

Good explanation Eric 🙂

_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

vliet SSCommitted Points: 1998 More actions · Answer 11

If two users are changing the same text simultaneously and the stored procedure does not produce an error in that case, one update will be lost without further notice to that user. Maybe the software does use an algorithm to get a new ID that does not always produce a unique identification for a new row, so new data from one user will be replaced by new data from another user. From my experience I know it is very easy to do this wrong and very hard to do it right, you should leave it to the database to generate unique identifications throught IDENTITY columns or SEQUENCEs. If both users try to update the same existing row, at least one of them should receive a warning that his or her change to the text will not be saved. Either way this is not a problem within the stored procedure but within the logic of the program using it, and it cannot be solved by the stored procedure alone. Just my two cents ...

Eric M Russell SSC Guru Points: 125562 More actions · Answer 12

vliet (6/14/2016)
If two users are changing the same text simultaneously and the stored procedure does not produce an error in that case, one update will be lost without further notice to that user. Maybe the software does use an algorithm to get a new ID that does not always produce a unique identification for a new row, so new data from one user will be replaced by new data from another user. From my experience I know it is very easy to do this wrong and very hard to do it right, you should leave it to the database to generate unique identifications throught IDENTITY columns or SEQUENCEs. If both users try to update the same existing row, at least one of them should receive a warning that his or her change to the text will not be saved. Either way this is not a problem within the stored procedure but within the logic of the program using it, and it cannot be solved by the stored procedure alone. Just my two cents ...

Let's assume that Jack and Jill both attempt to edit the text of ID 321at approximately the same time. Jack's update is successfully committed, but then 500ms later Jill's update is committed on top of that. In that scenario, should Jack be notififed? The end result is actually identical to the scenario where Jill's update is committed first and then Jack's update fails with an duplicate key error. There are two round pegs and only one hole called 321, so someone is going to basically lose their update one way or the other.

This sounds to me more like an application workflow issue. For example, in a cutomer support call center, you only want (1) user to have a customer's profile screen open in edit mode at any given time, which can be done using sp_getapplock | sp_releaseapplock. You can't depend on transactions and row isolation alone.

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Sergiy SSC Guru Points: 110208 More actions · Answer 13

Eric M Russell (6/13/2016)
Consider something like this instead:
UPDATE SomeTable
SET SomeText = @SomeText
WHERE WHERE ID = @ID;
IF @@ROWCOUNT = 0
INSERT INTO SomeTable (ID, SomeText)
VALUES (@ID, @SomeText);

Still not good enough.

2 parallel processes could complete UPDATEs at the same time, both ending up with @@ROWCOUNT = 0.

Both will proceed with INSERT causing PK violation.

No place for "IF" in T-SQL:

UPDATE SomeTable

SET SomeText = @SomeText

WHERE ID = @ID;

INSERT INTO SomeTable (ID, SomeText)

select @ID, @SomeText

WHERE NOT EXISTS (select * from SomeTable where ID = @ID )

_____________
Code for TallyGenerator

fabriziodb Mr or Mrs. 500 Points: 562 More actions · Answer 14

Eric M Russell (6/13/2016)
DELETE + INSERT will probably result in more reads, writes, page fragmentation, and transaction logging, so for that reason I'd suggest an UPDATE | INSERT. The less "stuff" that's going on inside your transaction and behind the scenes by SQL Server, then the shorter and more isolated the transactions will be.
Consider something like this instead:
UPDATE SomeTable
SET SomeText = @SomeText
WHERE WHERE ID = @ID;
IF @@ROWCOUNT = 0
INSERT INTO SomeTable (ID, SomeText)
VALUES (@ID, @SomeText);
As for the occasional primary key violations, in this specific situation, a case could be made for simply ignoring them. Does it really matter from a business perspective that every transaction for a given row complete when that same row is updated repeatedly and concurrently?
If Sue and John both edit the text of the same row at the same time, then both values are equally valid. So, let SQL Server's unique constraint mechanism resolve these race conditions. You can ignore the error at the application layer, or add the option WITH (IGNORE_DUP_KEY = ON) to the primary key on ID column. The only reason I can see for wanting both updates to complete is if there is a change data capture trigger in place to audit the history of changes over time.
Also, you may want to set FILL FACTOR to something like 70% to help avoid page splits when SomeText increases in width.

yes you'r right, i dont need to fix the exception, i just wanted to know the reason it happens.

vliet SSCommitted Points: 1998 More actions · Answer 15

Eric M Russell (6/14/2016)
This sounds to me more like an application workflow issue. For example, in a cutomer support call center, you only want (1) user to have a customer's profile screen open in edit mode at any given time, which can be done using sp_getapplock | sp_releaseapplock. You can't depend on transactions and row isolation alone.

Thanks for your response, Eric.

I do agree it is an application workflow issue, that can be solved with a combination of programming effort and database engine features. Web applications are particulary prone to this type of misbehaviour and usualy developers assume that the framework will handle these concurrency issues for them. If concurrent use leads to primary key violation either the number of concurrent actions must be quite high or the method of getting new [ID] values might be flawed, for example by fetching the maximum value plus one at the start of the 'create new item' action. The later fault has become quite common and should be solved in the application, not in the database.

I am both a .NET developer and skilled SQL Server DBA, so I do not blame developers that cannot imagine multiple users using the same part of their application simultaneously because unlike most DBA's they are not trained to do that. Tests do not reveal these flaws and their source is not easy to track down when they happen. But it is a sad thing that most (web) application frameworks lack good support for concurrency control. How do you ensure your applications handle concurrency without data loss?