May 8, 2009 at 3:39 pm
Jamie Julius (5/8/2009)
Paul White (5/8/2009)
Jamie Julius (5/7/2009)
It seems to me that the proposed SQL would be considered an implicit transaction. It would be run under the default isolation level of READ COMMITTED. I don't think this is enough to prevent two concurrent sessions from both inserting the same UserID. If they each do the SELECT part at the same time, they will both conclude that the UserID is not in the table. I would be happy to be proved wrong about this.
Jamie,
You are correct that two concurrent transactions could both conclude, from a SELECT at READ COMMITTED, that no row exists, and both would attempt an INSERT.
Absent a unique constraint, unique index, primary key, or similar... you could end up with two INSERTs.
The window of mis-opportunity is usually very small, however.
. . .
Paul
Thank you, Paul.
Basically, this is all I wanted confirmed in my pursuit to understand what SQL can guarantee and what it cannot. I appreciate all the practical advice provided by the many contributors to this post. I apologize if I offended anyone by persisting with my question.
Jamie
Jamie,
You have nothing to apologize for in this regard. There was an excellent discussion from your question, and much information was shared.
What I'd like to encourage, however, is that when posting questions you provide as much information as possible rather than less. You will get much better answers to your questions. If you get a chance, please go to my blog, SQL Musings from the Desert, and read the article The Flip Side. Read the others also if you want, but that one should help you when dealing with us "difficult users" that are trying to help you.
May 8, 2009 at 4:02 pm
this is all I wanted confirmed in my pursuit to understand what SQL can guarantee and what it cannot
Jamie, just out of curiosity: Are you satisfied that creating a unique index over the userid and timestamp columns will be sufficient to prevent duplicate insertions? I guess what I'm asking is: what items of information did you, as the original poster, find valuable about the discussion?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
May 9, 2009 at 1:46 am
Bob Hovious (5/8/2009)
Paul, can we agree that if a unique index exists over the column(s) to be inserted, SQL will return an error for any subsequent "duplicate" inserts regardless of the transaction isolation level? Therefore, Jamie doesn't have to be concerned about anything other than creating such an index (or adding a constraint) and then providing logic to catch the error.
Hey Bob,
Yes, I think we can safely agree that a unique index will prevent duplicates :w00t:
The approach you posted - trying the INSERT and catching any error - is a good solution. I expanded on your approach slightly in my long reply to Jamie. I have just realized that I forgot to refer to your solution when posting that - sorry.
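For anyone following along, a minimal sketch of that pattern (SQL Server 2005 and later; it assumes a dbo.UserStats table with UserID and StatDate columns and a unique index over them - the names are mine, for illustration only):
declare @UserID int, @StatDate datetime;
select @UserID = 1, @StatDate = getdate();
begin try
-- Attempt the INSERT unconditionally and let the unique index
-- reject any duplicate.
insert dbo.UserStats (UserID, StatDate)
values (@UserID, @StatDate);
end try
begin catch
-- Any error (including a duplicate key) lands here.
print 'Insert failed: ' + error_message();
end catch;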
@everyone:
I think it is important to separate the 'enhancing understanding' aspects of this thread from the specific issue of preventing duplicates. The prevention of duplicates is best done explicitly, with a unique index, unique constraint, or primary key.
The question of understanding relates to the more general, and technical, considerations when using different isolation levels and locking hints.
Regardless of the specific question in this thread, it is my contention that Jamie was looking to enhance his understanding of how SQL Server works. I think the code was provided simply by way of an example.
Cheers,
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 9, 2009 at 5:57 am
Hi all,
Thanks again to all who contributed to this post. As the OP (original poster, I assume, which would be me), let me try to answer your last comments and wrap this up.
I am definitely satisfied that creating a unique index (or constraint) over the userid and timestamp fields will indeed prevent the duplicate entries.
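For instance (assuming the timestamp column is a datetime named StatDate; the names are mine, for illustration):
create unique index UQ_UserStats_UserID_StatDate
on dbo.UserStats (UserID, StatDate);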
I will use code to catch the duplicate error for those rare occasions where it might happen. As an aside, I will aim to write code that catches only this specific error rather than any error, as I believe that other errors should be handled differently (or not at all - i.e., bubble back to the source to be handled there).
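Something along these lines is what I have in mind, refining the sketch posted earlier. Error 2627 is a unique constraint violation and 2601 is a duplicate key in a unique index; the object names are again assumed:
begin try
insert dbo.UserStats (UserID, StatDate)
values (@UserID, @StatDate);
end try
begin catch
if error_number() in (2601, 2627)
print 'Row already exists - safe to ignore.';
else
begin
-- Re-raise anything unexpected so it bubbles back to the caller
-- (THROW does not exist yet; RAISERROR is the 2005/2008 idiom).
declare @msg nvarchar(2048);
select @msg = error_message();
raiserror(@msg, 16, 1);
end
end catch;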
I appreciate Paul's remark about separating the 'enhancing understanding' aspects from the practical advice. I personally gain confidence in my code only when I know why I'm writing it in a certain way.
So if the default isolation level is such that code like this:
insert into dbo.UserStats (UserID, StatDate)
select @UserID, @StatDate -- select list elided in my original post; columns assumed
where
not exists (select 1 from dbo.UserStats us1 where us1.UserID = @UserID);
does not provide a complete safeguard against duplicate entries, then I am of course happy to use the recommended unique index/constraint combined with the needed try/catch mechanism. On the other hand, if the above did provide the needed safeguard, then adding the catch in my opinion would be misleading.
I am relatively new to SQL development, but far from new to software development. Concurrency issues (thread safety and deadlocks) are always tricky to master. It's simply the nature of the beast. So perhaps it's not terribly surprising to find that some considered the above code safe while others knew that it was not. I'm glad that this point was clarified through discussion, for which I am grateful.
I have learned a lot from this thread! Thank you, all.
Jamie
P.S. I have just read the tips in Lynn's blog: The Flip Side. Points are well taken!
May 9, 2009 at 6:31 am
Paul, it was a good discussion from the standpoint of what a select would and would not find under various scenarios. Like you, I just wanted to be sure that Jamie did not confuse that with SQL's ability to prevent duplicate insertions through use of a primary key. Don't worry about attribution of who said what first. TRY/CATCH is hardly an original idea that I can claim 😉
Jamie, you are welcome. Discussions like these are one of the reasons I love SSC.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
May 9, 2009 at 8:06 am
Jamie,
Thank you for taking the time to read my blog. I can tell you, from my own point of view, that the more you tell me up front, the better answer you will get. It is simply because that means less I have to assume, and we all know what happens when we assume, right?
I'm glad that we were able to help you gain a better understanding with regard to your issue here. I do want to ask one more question to be sure of where you are now.
You do understand that it isn't the implicit or explicit transaction itself that creates the locks, but the action itself (INSERT, UPDATE, DELETE, and even SELECT), and that any locks held by a transaction aren't released until the transaction is committed or rolled back, correct?
May 9, 2009 at 8:15 am
Lynn Pettis (5/9/2009)
Jamie,
You do understand that it isn't the implicit or explicit transaction itself that creates the locks, but the action itself (INSERT, UPDATE, DELETE, and even SELECT), and that any locks held by a transaction aren't released until the transaction is committed or rolled back, correct?
Yes, this is clear.
Thanks,
Jamie
May 9, 2009 at 9:36 am
Jamie,
Thanks for the update and explanation. We're glad to help; it sounds like we did. Please feel free to come back if you have more questions.
May 9, 2009 at 10:26 am
Lynn Pettis (5/9/2009)
You do understand that it isn't the implicit or explicit transaction itself that creates the locks, but the action itself (INSERT, UPDATE, DELETE, and even SELECT), and that any locks held by a transaction aren't released until the transaction is committed or rolled back, correct?
Only the exclusive locks (and the related intent-exclusive locks at higher levels) required to protect modified rows/pages/table are guaranteed to be held to the end of the transaction, regardless of isolation level.
These exclusive locks are not taken at the start of the transaction (except for serializable) - they are only acquired when the row in question is actually modified.
The locks taken while reading rows to check eligibility for modification include: none at all (snapshot, read uncommitted); transient shared and update locks (read committed); held shared locks (repeatable read); and finally held range locks (serializable).
Importantly, other concurrent activity can mean that some rows which technically qualify for modification during the whole transaction can be missed or double-counted, depending on the locks taken while checking rows for a match against the data-modifying predicates.
I really want to be sure that everyone appreciates that fact.
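A quick way to see it, for anyone who wants to experiment (a sketch; dbo.UserStats and its columns are assumed names, and sys.dm_tran_locks requires SQL Server 2005 or later):
begin transaction;
-- A no-op change, just to take the locks:
update dbo.UserStats
set StatDate = StatDate
where UserID = 1;
-- Still inside the transaction: the exclusive (X) and intent-exclusive
-- locks from the UPDATE show as granted here, whatever the isolation level.
select resource_type, request_mode, request_status
from sys.dm_tran_locks
where request_session_id = @@spid;
rollback transaction; -- releases them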
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 10, 2009 at 1:46 am
Paul,
All I wanted to make sure was that it isn't the BEGIN TRANSACTION (explicit or implicit) that creates any locks. This part of a transaction does not know what will or needs to be locked. Any locks created during the transaction are held (usually/always?) until the transaction is committed or rolled back.
The finer details can be learned through further reading and experimentation.
May 10, 2009 at 3:23 am
Lynn Pettis (5/10/2009)
All I wanted to make sure was that it isn't the BEGIN TRANSACTION (explicit or implicit) that creates any locks.
Sure. BEGIN TRANSACTION doesn't even get logged, much less acquire any locks. In that sense, BEGIN TRANSACTION is a no-op. The next statement that uses a transaction actually starts it.
Lynn Pettis (5/10/2009)
Any locks created during the transaction are held (usually/always?) until the transaction is committed or rolled back.
I covered this in my last reply, but to summarize:
The exclusive locks resulting from a data modification are always held from the point they are acquired to the end of the transaction.
Locks taken while scanning records looking for rows to update vary depending on the isolation level:
READ UNCOMMITTED, READ_COMMITTED with READ_COMMITTED_SNAPSHOT ON, and SNAPSHOT do not take any locks (aside from schema-stability).
READ_COMMITTED with READ_COMMITTED_SNAPSHOT OFF locks and unlocks each record (usually just a Shared lock) as the record is checked. These locks are *not* held to the end-of-transaction.
REPEATABLE READ takes and holds Shared locks on all rows examined until the end of the transaction.
SERIALIZABLE takes range-locks to protect all potentially scanned values before the scan starts.
Tracing the locks acquired and released by even the simplest SELECT can be a surprising and enlightening experience!
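For example, the difference between transient and held shared locks is easy to see (a sketch; dbo.UserStats is an assumed table, and the Profiler events Lock:Acquired and Lock:Released will show the individual locks as they come and go):
set transaction isolation level repeatable read;
begin transaction;
select count(*) from dbo.UserStats;
-- Under REPEATABLE READ the shared locks are still held at this point;
-- rerun the batch at READ COMMITTED and they are already gone.
select resource_type, request_mode
from sys.dm_tran_locks
where request_session_id = @@spid;
rollback transaction;
set transaction isolation level read committed;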
Cheers,
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 10, 2009 at 11:36 am
Paul White (5/10/2009)
Lynn Pettis (5/10/2009)
All I wanted to make sure was that it isn't the BEGIN TRANSACTION (explicit or implicit) that creates any locks.
Sure. BEGIN TRANSACTION doesn't even get logged, much less acquire any locks. In that sense, BEGIN TRANSACTION is a no-op. The next statement that uses a transaction actually starts it.
I think that is something I'd want to verify before stating. You can use named transactions, which tells me something has to be written to the transaction log to mark the beginning of a transaction, whether named or unnamed. There are also marked transactions, but I know even less about those and would definitely need to read and investigate them more before commenting on them.
May 10, 2009 at 11:03 pm
Lynn,
Lynn Pettis (5/10/2009)
I think that is something I'd want to verify before stating.
I am wounded - you seem to suggest that I did not verify it...!
Lynn Pettis (5/10/2009)
You can use named transactions, which tells me something has to be written to the transaction log to mark the beginning of a transaction, whether named or unnamed.
Perhaps you should have verified this before stating? 😛
Lynn Pettis (5/10/2009)
There are also marked transactions, but I know even less about those and would definitely need to read and investigate those more before commenting on them.
Very wise.
You may find it useful to run the following demonstration:
use master;
create database [237D8E7D-BDBE-48E5-89C7-874F4D4A054D];
go
use [237D8E7D-BDBE-48E5-89C7-874F4D4A054D];
go
create table dbo.[B49B0C2A-FC9C-45E9-8325-3F710AFF04E8] (a int)
go
checkpoint;
go
begin transaction Bob with mark 'Bob Transaction Start'
-- Just the checkpoint records - no transaction yet
select *
from ::fn_dblog(null, null);
-- Generate some log activity
insert dbo.[B49B0C2A-FC9C-45E9-8325-3F710AFF04E8] values (1);
-- Lots of stuff including BEGIN TRAN (operation LOP_BEGIN_XACT, note also the transaction name and description)
select *
from ::fn_dblog(null, null);
rollback transaction Bob;
-- Lots more stuff including the ROLLBACK (operation LOP_ABORT_XACT)
select *
from ::fn_dblog(null, null);
drop table dbo.[B49B0C2A-FC9C-45E9-8325-3F710AFF04E8];
go
use master;
drop database [237D8E7D-BDBE-48E5-89C7-874F4D4A054D];
go
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi