February 16, 2008 at 3:55 pm
I have based some core logic in a web page on inserts into a table whose rows are identified by an ID column (int, identity, increment 1). The insert query runs in a transaction.
My understanding is that in a multi-user environment, the stored procedure executed by a particular user would always get a set of IDs that is contiguous. Here is what I mean:
User 1 and User 2 hit the page simultaneously and insert 5 records each, with each set running in its own transaction scope. The IDs returned to user 1 would be 1 through 5 and to user 2, 6 through 10 (this is my understanding).
Is there a possibility that user 1 would have records with IDs 1, 3, 4, 6, 7 and user 2 with IDs 2, 5, 8, 9, 10 (overlapping IDs)?
All my tests have proved otherwise, that is, the series of numbers is always a contiguous sequence. But many members of my team think the IDs can overlap in a multi-user environment. I am not convinced. Is there a way to prove it either way?
Table (Test_Table) structure
id int Identity (1,1) PK
Title varchar (50)
SessionID varchar(50)
Query
BEGIN TRAN A
-- Column list added for clarity; the identity column is populated automatically
INSERT INTO Test_Table (Title, SessionID)
SELECT SUBSTRING(Title, 1, 20), 'SessionId'
FROM dbo.AnotherTestTable
COMMIT TRAN A
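One way to check a test run for interleaving, assuming rows are tagged by SessionID as in the Test_Table above: if a session's IDs are contiguous, MAX(id) - MIN(id) + 1 must equal that session's row count. A sketch:
-- Per-session contiguity check (assumes the Test_Table defined above)
SELECT SessionID,
       MIN(id)  AS FirstId,
       MAX(id)  AS LastId,
       COUNT(*) AS RowCnt,
       CASE WHEN MAX(id) - MIN(id) + 1 = COUNT(*)
            THEN 'contiguous' ELSE 'interleaved' END AS Verdict
FROM Test_Table
GROUP BY SessionID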
Note: I have also posted this question on other sites because of the urgency. Thanks.
February 16, 2008 at 8:33 pm
I believe that to guarantee it you might need a table lock. Your code implies that the 5 records are inserted in a single statement... I've got a test case below that shows you need a table lock in the case where the records are not all inserted in a single statement...
Open two query windows in SSMS. Switch both to TEMPDB.
In the first window execute
create table test_table(
id int identity(1,1) primary key clustered,
title varchar(50),
sessionID varchar(50)
)
begin transaction
insert into test_table
select 'Insert #1', 'Session #1'
UNION ALL
select 'Insert #2', 'Session #1'
waitfor DELAY '00:00:10'
insert into test_table
select 'Insert #3', 'Session #1'
UNION ALL
select 'Insert #4', 'Session #1'
UNION ALL
select 'Insert #5', 'Session #1'
commit transaction
Whilst that's executing (you've got 10 seconds delay in the middle) execute the following in the second window
begin transaction
insert into test_table
select 'Insert #1', 'Session #2'
UNION ALL
select 'Insert #2', 'Session #2'
waitfor DELAY '00:00:10'
insert into test_table
select 'Insert #3', 'Session #2'
UNION ALL
select 'Insert #4', 'Session #2'
UNION ALL
select 'Insert #5', 'Session #2'
commit transaction
select * from test_table order by id
drop table test_table
You'll see, at least I did on my system, that the ranges are not contiguous.
However if you add a
with(tablock)
after the table names in the insert statements then you serialize access to the table and all is well.
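For example, here is how the hint would slot into the first insert of window #1 (a sketch against the same tempdb test_table as above):
-- Shared table lock: concurrent inserts are serialized for the duration of the statement
insert into test_table with (tablock)
select 'Insert #1', 'Session #1'
UNION ALL
select 'Insert #2', 'Session #1'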
In short, I'm not sure if you can rely on the behaviour of the query engine but I haven't proven it with a single insert statement. Perhaps create a test case that takes a while to run such as inserting 100000 records in two simultaneous query windows?
February 16, 2008 at 8:43 pm
Actually, I surprised myself and bothered to do it 🙂
Try this - same as before in tempdb with two query windows...
Window #1
create table test_table(
id int identity(1,1) primary key clustered,
num int,
sessionID int
)
insert into test_table( num, sessionID )
--select number, 1 from DoxKPH.dbo.numbers cross join numbers
select top 1000000 c1.id, 1 from syscolumns c1 cross join syscolumns c2 cross join syscolumns c3
Window #2
insert into test_table( num, sessionID )
select top 1000000 c1.id, 2 from syscolumns c1 cross join syscolumns c2 cross join syscolumns c3
select sessionID, min(id) from test_table group by sessionID order by sessionID
drop table test_table
Each inserts a million rows into test_table. The minimum ID of each insert is then retrieved. The results on my machine were
sessionID   Min Identity
1           1
2           436254
If your hypothesis were correct, the Min Identity for session #2 should have been 1000001, not 436254.
Sorry to say, you cannot rely on identity values being contiguous within a single insert on a busy table. 🙁 Good question though!!! 😀
February 17, 2008 at 4:29 am
Another issue you need to watch out for, if your requirement is a contiguous sequence of values in the identity column with no gaps in between:
The allocation of a new identity value is NOT rolled back with the transaction!
If the transaction fails for some reason, the new value is still consumed, but the row obviously will not appear in the table. That will leave you with gaps in the sequence.
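A minimal sketch of that effect, using a hypothetical gap_demo table in tempdb:
use tempdb
create table gap_demo( id int identity(1,1) primary key, txt varchar(10) )
insert into gap_demo( txt ) values( 'first' )   -- gets id 1
begin transaction
insert into gap_demo( txt ) values( 'gone' )    -- id 2 is allocated here
rollback transaction                            -- the row is undone, but id 2 is not given back
insert into gap_demo( txt ) values( 'second' )  -- gets id 3, leaving a gap at 2
select id, txt from gap_demo order by id
drop table gap_demo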
February 17, 2008 at 8:06 am
Hi Ian,
No matter what I do, I always get sequential IDs; the IDs do not overlap between connections. Note that in your last post you are not running the inserts in a transaction.
Here is what I did -
Con1
begin transaction
insert into test_table( num, sessionID )
select top 1000000 c1.id, 1 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
waitfor DELAY '00:00:02'
insert into test_table( num, sessionID )
select top 1000000 c1.id, 1 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
waitfor DELAY '00:00:02'
insert into test_table( num, sessionID )
select top 1000000 c1.id, 1 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
waitfor DELAY '00:00:02'
insert into test_table( num, sessionID )
select top 1000000 c1.id, 1 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
commit transaction
Con2
begin transaction
insert into test_table( num, sessionID )
select top 1000000 c1.id, 2 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
insert into test_table( num, sessionID )
select top 1000000 c1.id, 2 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
insert into test_table( num, sessionID )
select top 1000000 c1.id, 2 from syscolumns c1
cross join syscolumns c2 cross join syscolumns c3
commit transaction
Query results
select sessionID, min(id) as StartIndex, max(id) as EndIndex
from test_table group by sessionID order by sessionID
Result
sessionID   StartIndex   EndIndex
1           1            4000000
2           4000001      7000000
February 17, 2008 at 9:00 am
From "INSIDE MICROSOFT SQL SERVER 2005 - T-SQL QUERYING" by Itzik Ben-Gan, Solid Quality Learning, pg 429:
Another aspect of the IDENTITY property that can be considered a limitation in some cases is that identity values are assigned in an asynchronous manner. This means that multiple sessions issuing multirow inserts might end up getting nonsequential identity values.
February 17, 2008 at 9:13 am
Marios Philippopoulos (2/17/2008)
From "INSIDE MICROSOFT SQL SERVER 2005 - T-SQL QUERYING" by Itzik Ben-Gan, Solid Quality Learning, pg 429:Another aspect of the IDENTITY property that can be considered a limitation in some cases is that identity values are assigned in an asynchronous manner. This means that multiple sessions issuing multirow inserts might end up getting nonsequential identity values.
That's nice... and, although he's a famous and very well respected author, without demonstrable code to prove it one way or the other, it can still be wrong.
For example, one very famous and very well respected author said that you don't need to preallocate any disk space when setting up a new database... that the autogrow features would take care of everything...
... it sure does... including 73 disk fragments just to get to the first Gig.
And, how many times have you seen well known authors recommend the use of Table Variables because they "live in memory and Temp Tables don't", which of course, is incorrect.
Extraordinary claims require extraordinary proof... show me the code 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
February 17, 2008 at 9:49 am
Ok... here's some extraordinary proof that supports what Itzik said...
First, I made a table in a nice safe place... TempDB... I couldn't use an actual temp table because I wanted it to be as normal as possible and I wanted multiple connections to be able to get to it (and didn't want to use a global temp table, folks).
--===== Identify the database to use (Tempdb is a nice safe place)
USE TempDB
--===== Create the test table
CREATE TABLE DeleteMe
(
DeleteMeID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED WITH FILLFACTOR = 90,
SomeStringData VARCHAR(10),
Spid INT
)
Then, I opened 8 new windows in QA and entered/copied the following code to all 8 windows...
--===== Identify the database to use (Tempdb is a nice safe place)
USE TempDB
WAITFOR TIME '11:27:00'
INSERT INTO DeleteMe
(SomeStringData,Spid)
SELECT TOP 100
'xxxxxx' AS SomeStringData,
@@SPID
FROM Master.dbo.SysColumns
Basically, I have 8 connections that all inserted 100 rows each at the same time. And what Itzik said is true... you can't rely on the identity values being contiguous by connection... here's the proof of that I found in the DeleteMe table...
DeleteMeID SomeStringData Spid
----------- -------------- -----------
336 xxxxxx 69
337 xxxxxx 69
338 xxxxxx 69
339 xxxxxx 69
340 xxxxxx 69
341 xxxxxx 67
342 xxxxxx 71
343 xxxxxx 72
344 xxxxxx 64
345 xxxxxx 68
346 xxxxxx 70
347 xxxxxx 69
348 xxxxxx 65
349 xxxxxx 65
350 xxxxxx 65
(15 row(s) affected)
We could stop there because a single insert is an implicit transaction. But, what the heck... I modified the code in the 8 windows to do an explicit transaction, just to make sure...
--===== Identify the database to use (Tempdb is a nice safe place)
USE TempDB
WAITFOR TIME '11:40:00'
BEGIN TRANSACTION
INSERT INTO DeleteMe
(SomeStringData,Spid)
SELECT TOP 100
'xxxxxx' AS SomeStringData,
@@SPID
FROM Master.dbo.SysColumns
COMMIT
... truncated the DeleteMe table and, again, the 8 scripts all ran at the same time... and, here's what I got...
DeleteMeID SomeStringData Spid
----------- -------------- -----------
336 xxxxxx 65
337 xxxxxx 65
338 xxxxxx 65
339 xxxxxx 65
340 xxxxxx 65
341 xxxxxx 68
342 xxxxxx 72
343 xxxxxx 70
344 xxxxxx 71
345 xxxxxx 64
346 xxxxxx 69
347 xxxxxx 69
348 xxxxxx 69
349 xxxxxx 69
350 xxxxxx 69
(15 row(s) affected)
So, NOW we can say, with some great conviction ('cause we've seen it; it's no longer someone's word), that Itzik is right, in this case... whether you use an implicit or explicit transaction, there is no guarantee that identities will be contiguous for any given insert.
Sorry for all the hoo-hah about trusting authors... I've just been burned too many times by authors who have been wrong.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 17, 2008 at 9:53 am
Point well taken... 🙂
February 17, 2008 at 10:18 am
Jeff, the requirement in my case is not to have contiguous IDs but to be 100% sure that there is no overlap of IDs between the two transactions.
As an example: if for transaction 1 the IDs are spread between 51-150 and for transaction 2 between 151-250, I would stick to the logic I have built for the web page; otherwise I would have to change it.
The following is not acceptable -
First Transaction Id's - 51-80, 100-169
2nd Transaction Id's - 81-99 and 170-249
I hope I am being clear, otherwise I would be happy to clarify. Thanks.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE - what do you think of using the SERIALIZABLE isolation level? In that case the set of IDs would be sequential.
Any performance issues I need to worry about in using this?
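For concreteness, a sketch of how that would wrap the original insert (whether it actually prevents interleaving is the open question here; see the last reply in this thread):
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN A
INSERT INTO Test_Table (Title, SessionID)
SELECT SUBSTRING(Title, 1, 20), 'SessionId'
FROM dbo.AnotherTestTable
COMMIT TRAN A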
February 17, 2008 at 12:23 pm
atarafder (2/17/2008)
Jeff, the requirement in my case is not to have contiguous ID's but to be 100% sure that there are no overlap of Id's between the two transactions.
Um... same thing, I believe... Non-contiguous means there WILL be an overlap.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 18, 2008 at 12:17 am
Is there a possibility that user 1 would have records with IDs 1, 3, 4, 6, 7 and user 2 with IDs 2, 5, 8, 9, 10 (overlapping IDs)?
If the example you quote here is what you are worried about, then yes, this sort of thing can and will happen, as the examples have shown.
If by overlapping IDs you mean that two sessions will accidentally get the SAME identity value, then no, that will not happen.
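A quick way to confirm that on any of the test tables in this thread: a duplicate check on the identity column should always come back empty.
-- Sketch: returns no rows if every identity value was handed out exactly once
select id, count(*) as cnt
from test_table
group by id
having count(*) > 1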
February 18, 2008 at 3:55 pm
Hi Jeff
I am a frequent lurker and occasional poster on this site. Firstly, I would like to thank you and others for the invaluable info that you post here - I have learned a lot.
Having said that I have a potentially dumb question:
You frequently refer to TempDB as a nice safe place to test code etc. Given that it is a system database, and is used by other databases on the server, is it really a better place to test code than, say, a "test database" created specifically for the purpose of testing, where the "test database" could be dropped if necessary? I'm interested to learn the reasoning for this.
TIA
February 18, 2008 at 6:16 pm
First, I love your "handle"... "Ivana Noh"... that's outstanding. Someday, ya gotta tell me at least your first name.
Your observation is spot on... it would be MUCH better to have a test database... but there's the rub... what is the name of your test database? And, if I need to create a real table and drop it in code, I don't want to take the chance that someone is working from a test database. I have had people who aren't fully aware of what they're doing actually run test code in a production environment. If I were to write code that created, used, and dropped a table called "Employee" or "Customer", and that type of uninformed steward ran the code, there's a pretty good chance that my code would mess up someone's data.
So, to protect the, shall we say, lesser informed folks, I run the code over on TempDB just to keep the chances of damaging someone's data as low as I possibly can, even if they've made the mistake of logging in as "SA"... I know it's not my data, and anyone who runs test code from a forum on a production database is just begging for trouble, but data is data and, as corny as it sounds, I'm sworn to protect it even if it's not mine.
So, that's why... it's to protect folks who just might not know better. The worst they may have to do to get out of trouble is bounce the service so that TempDB will rebuild. And folks who are aware will simply change the USE statement to point to their correct database instead of TempDB... or not. Either way, things are pretty safe that way.
Thanks for the compliment, and don't lurk so much... you ask good questions and probably have some good answers of your own... join the "crowd".
--Jeff Moden
Change is inevitable... Change for the better is not.
June 9, 2009 at 4:52 am
Hi guys,
I know this is an old post, but I've got some more info.
As you would expect, setting the isolation level to SERIALIZABLE, or using HOLDLOCK or SERIALIZABLE hints on the INSERT statements, does not help, as the data being inserted by the different INSERT statements may be unrelated.
The TABLOCK hint causes deadlocks with the example code.
The only way I found to totally isolate the inserts and avoid interleaved identity IDs was to use a TABLOCKX hint on the INSERT statement, as sketched below. Obviously this has huge repercussions for concurrency.
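A sketch of the hint applied to the million-row test from earlier in the thread (same test_table; the exclusive table lock makes the second window wait until the first window's insert completes):
begin transaction
-- Exclusive table lock for the whole statement: full serialization, no concurrency
insert into test_table with (tablockx) ( num, sessionID )
select top 1000000 c1.id, 1
from syscolumns c1 cross join syscolumns c2 cross join syscolumns c3
commit transaction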
Cheers,
James