Identity Table vs Identity Column

Question

Identity Table vs Identity Column

jasonshaw

Old Hand

Points: 368
More actions
May 20, 2011 at 6:10 am

#238544

Hello Everybody.
I am trying to work out if it is a major problem to create a table with a single identity column to provide a way of find out the next job number for our Logging system. I am proposing to insert into the table, retrieve the the new number and then delete the row immediately. Then use this number to store directly in the main table which holds the rest of the data we need, rather than use an identity column on the main table itself.
The reason for this is I am coding in C# and need to store some information before I have all the data to fill in the main table. Having read about sequence tables it would appear that this method could lead to locking problems (mentioned by Jeff Moden) is this method flawed or would it work in this situation.
I have not got a problem with gaps appearing in the sequence when a identity is obtained but not completed...
Thank you in advance for your comments
Jason Shaw

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

In cases where C# or other applications need a row ID before the row is inserted, I recommend GUIDs generated by the app.

They'll take more space than an ID, but you can generate them in the application without a database trip at all. Just don't use them in the leading edge of any indexes on the table. No contention, reduced network traffic, and you have the ID any time you need it in the front end, for the cost of slightly more storage space. It's probably worth the trade off.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 2

I've seen designs like this in the past and they lead to serious bottlenecks. I'm with Gus. If you have to do this a GUID is your best bet.

But, be very careful when using GUID as the key on a clustered index because they can lead to performance bottlenecks caused by fragmentation.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Lowell SSC Guru Points: 323755 More actions · Answer 3

Jason you are using C#.NET, right? why not take advantage of the typed data sets and the relations?

for example, In .NET i can add parent and child records to any depth, and when it comes down to the final update, the dataadapter / provider swaps out my temporary Identity values that were -1,-2, etc for the real keys returned from the server, and then replaces those keys in the child tables, then updates those tables, and then their children , and so on.

I'd really recommend that instead of trying to maintain the ID's all on your own.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

jasonshaw Old Hand Points: 368 More actions · Answer 4

Thank you for your ideas.

GUID is a new concept to me and I am not sure that I works in this case as it is not good for indexing as this number is used as a reference number for a number of tables that link together.

The C# .Net data adaptor would be better but I am not sure that I feel happy enough with C# to achieve this correctly at the current time.

Going back to the original plan, what exactly causes the bottleneck. Would it be better to have a table to insert a number of entries and have a scheduled job to periodically remove all entries to keep the table size down to a minimum. We would probably be looking at a maximum of 300 entries per day, so is there likely to be a bottleneck ?

Jason

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 5

jasonshaw (5/23/2011)
Going back to the original plan, what exactly causes the bottleneck.

The fact that you have to serialise access to that table (one connection at a time) to avoid duplicate key values.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

jasonshaw Old Hand Points: 368 More actions · Answer 6

I am assuming that this serializing access to this table is handled by Sql Server internally.

Having a identity field on the main table would also cause the same serailization of entry of the main table, I would expect that this would be more of a bottle neck as this table has updates run on it as well so there would be more traffic to this table. This table is also highly accessed for reading only for reporting purposes so is the problem really with identity columns as a whole?

Jason

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 7

jasonshaw (5/23/2011)
I am assuming that this serializing access to this table is handled by Sql Server internally.

No, it's handled by you, in code.

If your code assumes there's only ever one row in that table, then you have to be completely sure that there is only one connection executing the insert, select, delete code at a time, otherwise you could have two connections insert a row, then both select, because the code assumes there's only one row in the table one of them gets the other's ID, tries to insert and you get duplicate key errors.

If you're going to use @@identity rather than selecting the new value it's a lot safer and the serialisation isn't necessary.

Having a identity field on the main table would also cause the same serailization of entry of the main table

Yes, but in that case it's handled as part of the insert operation and is optimised very well, not as a separate step as you're planning. Depends on how exactly you're going to implement it.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

jasonshaw Old Hand Points: 368 More actions · Answer 8

Thank you for that GilaMonster.

I was planning to use the @@IDENTITY function to get the job number back, I don't think I had made it clear enough.

jason

Grant Fritchey SSC Guru Points: 399379 More actions · Answer 9

jasonshaw (5/23/2011)
Thank you for your ideas.
GUID is a new concept to me and I am not sure that I works in this case as it is not good for indexing as this number is used as a reference number for a number of tables that link together.
The C# .Net data adaptor would be better but I am not sure that I feel happy enough with C# to achieve this correctly at the current time.
Going back to the original plan, what exactly causes the bottleneck. Would it be better to have a table to insert a number of entries and have a scheduled job to periodically remove all entries to keep the table size down to a minimum. We would probably be looking at a maximum of 300 entries per day, so is there likely to be a bottleneck ?
Jason

Gail already outlined the bottleneck.

You can use GUID as a PK and FK on tables just fine. It will work well. I just don't recommend it as a clustered index is all.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

aruzhans SSC Enthusiast Points: 105 More actions · Answer 10

I tested GUID column and run into perfromance problems for this exact reason to return the unique value for application that generates a service requests . My solution was to use custom written SQL sp that generates 10 positions string with random number and random sequence of chars. This lended into much more efficient process without using Identity or GUID and accomplished my goal. It is working already a few years with no problems.

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 11

jasonshaw (5/23/2011)
Thank you for that GilaMonster.
I was planning to use the @@IDENTITY function to get the job number back, I don't think I had made it clear enough.
jason

As a recommendation look into scope_identity() as well so you understand where @@IDENTITY can cause some issues.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Jeff Moden SSC Guru Points: 1004703 More actions · Answer 12

You CAN use a sequence table but not quite the way you have it. You don't need to do an insert/delete... you can do an UPDATE using the well documented 3 part UPDATE. You need to use the 3 part update to avoid the need for serialization and the very high potential for deadlocks that brings on.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)