Primary Key: GUID vs. INT

Question

Primary Key: GUID vs. INT

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 1

HowardW (3/27/2012)
The point of GUIDs is that they're globally unique, so you can have multiple unconnected client/middle tiers that can safely generate an ID without worrying about collision with other clients.

That's not true anymore. When MS was using Type 1 GUIDs, it was true. MS now uses Type 4 GUIDs which aren't much more than a huge pseudo-random number. While the chances of duplicating a GUID between two or more systems are quite small, it's still possible. Make sure the column that contains the GUID has a UNIQUE constraint on it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 2

wagner-670519 (3/26/2012)
Hi Folk,
is it a big performance loose when using GUIDs as Primary Keys?
Greetz Teutales

Yes, it is. Integers are 4 bytes, and CPUs and code are optimized for their processing. GUIDs are 16 bytes of storage and much bigger on presentation. Even if you avoid index fragmentation (not just clustered there, btw) there is still the NON-trivial fact that they are FOUR TIMES AS LARGE. If you need optimal performance avoid GUIDs.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

HowardW One Orange Chip Points: 29109 More actions · Answer 3

Jeff Moden (3/27/2012)
HowardW (3/27/2012)
The point of GUIDs is that they're globally unique, so you can have multiple unconnected client/middle tiers that can safely generate an ID without worrying about collision with other clients.
That's not true anymore. When MS was using Type 1 GUIDs, it was true. MS now uses Type 4 GUIDs which aren't much more than a huge pseudo-random number. While the chances of duplicating a GUID between two or more systems are quite small, it's still possible. Make sure the column that contains the GUID has a UNIQUE constraint on it.

Sorry, in my example, I'm referring to the UNIQUEIDENTIFIER data type, and not the NEWID() scalar function.

In my example, the ID's are generated from java web servers which use UUID - this is guaranteed to be unique across different physical machines.

Edit: This set me off on some research - even if using NEWID(), there are much more important things to worry about. Even if it's just random, to have a 1% chance of a single occurence of duplication, you'd need 2,600,000,000,000,000,000 rows of data. You may as well worry about quantum effects randomly flipping storage bits from 0's to 1's and generating duplicates that way 🙂

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 4

You still never answered the question. You said earlier you would do a round trip for purposes of caching a bunch of GUIDs:

Where I've said that?

Even, with using GUID's, if you want your client/middle layer to be super fast, you would want to cache pre-allocated range of keys.
In what scenario would that be a good idea when there are plenty of options available for generating a GUID upstream, in an application tier? If one were serious about saving CPU cycles, one would not do a round trip to the database for purposes of generating a GUID.

I've already said where! Auto-trading apps, they need to be super quick to find discrepancies in market prices and try to get some profit from it. As you can guess they don't use SQL Server. These apps developed to save time on anything. I've seen one which pre-generates range of GUID's and caches it in array for further use. Again, I've not suggested to do db round trip to do so.

4. Oh, me too, I've also seen several 'creative mess approaches'. I've seen them both: with or without using GUID's.
See, international outsourcing is not as bad as it seams. It makes us employed and justifies high rates paid! Someone need to do "dirty" clean up jobs... 😉
What are you getting at? That seems like quite the blanket statement.

Blanket statement, may be...

I've heard about very successful outsourcing solutions, however myself, working with many of them for last 12 years, never experienced the one unfortunately (except from fin. cost reporting point of view). Hope you have different experience.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

I've already said where! Auto-trading apps, they need to be super quick to find discrepancies in market prices and try to get some profit from it. As you can guess they don't use SQL Server. These apps developed to save time on anything. I've seen one which pre-generates range of GUID's and caches it in array for further use. Again, I've not suggested to do db round trip to do so.

You are incorrect on your statement about computerized trading not using SQL Server. They do exist, and have for at least a decade that I know of. There was a group of very smart people in Chicago that basically printed money this way. I suspect they were around before that too. There are AMAZING things you can do to make SQL Server run INCREDIBLY fast if you know what you are doing. Also, StreamInsight (which IS part of SQL Server, although I personally disagree with it getting lumped into the SQL Server ecosystem - just like I do with some other stuff too over the years) was developed in part exactly for this type of application.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Orlando Colamatteo SSC Guru Points: 182279 More actions · Answer 6

Eugene Elutin (3/28/2012)
You still never answered the question. You said earlier you would do a round trip for purposes of caching a bunch of GUIDs:
Where I've said that?
Even, with using GUID's, if you want your client/middle layer to be super fast, you would want to cache pre-allocated range of keys.
In what scenario would that be a good idea when there are plenty of options available for generating a GUID upstream, in an application tier? If one were serious about saving CPU cycles, one would not do a round trip to the database for purposes of generating a GUID.

Your original words from this post were provided as quoted text. See bolded text above. I can see how the int and GUID angles may have blended together, so let's say you did not mean to imply you would do a round trip to the database to get said GUIDs, and that you would use one of the many methods to generate GUIDs in your application language of choice. Why on Earth would you ever bother caching a range of GUIDs? For performance? I could probably come up with hundreds of application layer and database layer optimizations I would look to do before I would ever start worrying about the performance difference between generating a GUID on the fly versus having a bunch of them cached and ready for use in my application layer.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 7

...so let's say you did not mean to imply you would do a round trip to the database to get said GUIDs
...

Without "so let's say...". I didn't say anything which could imply that I advise to do sound trip to database.

Usually, it's not only just range of GUID's is cached, it can be kind of range of pre-populated, ready-to-use structures...

At the end, it's not a point for discussion.

Question was GUID vs INT as PK

Yes, there are architects who like to use GUID's as Keys, for one or another reasons, therefore there are many systems you may find that uses them.

However, no-one can much argue that they are much less convenient than int's from maintainability and easy-to-use database prospective.

So, unless you forced to do so or have no good idea about other options, I would not advise using GUID's for PK for your tables.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Orlando Colamatteo SSC Guru Points: 182279 More actions · Answer 8

Eugene Elutin (3/28/2012)
So, unless you forced to do so or have no good idea about other options, I would not advise using GUID's for PK for your tables.

Now that, you and I can agree on!

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

Eugene Elutin (3/28/2012)
...so let's say you did not mean to imply you would do a round trip to the database to get said GUIDs
...
Without "so let's say...". I didn't say anything which could imply that I advise to do sound trip to database.
Usually, it's not only just range of GUID's is cached, it can be kind of range of pre-populated, ready-to-use structures...
At the end, it's not a point for discussion.
Question was GUID vs INT as PK
Yes, there are architects who like to use GUID's as Keys, for one or another reasons, therefore there are many systems you may find that uses them.
However, no-one can much argue that they are much less convenient than int's from maintainability and easy-to-use database prospective.
So, unless you forced to do so or have no good idea about other options, I would not advise using GUID's for PK for your tables.

I would say that you would need a good reason to use GUIDs for a primary key. One possible reason would be replication where you need to have primary keys and using identities may be problematic. What I would also agree with when using GUIDs as a primary key is to NOT use them as the clustered index on the table(s).