GUID or Int is the best for primary key columns?

Question

GUID or Int is the best for primary key columns?

Viewing 15 posts - 16 through 30 (of 51 total)

You must be logged in to reply to this topic. Login to reply

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 1

David.Poole (7/15/2009)
Gaby A. (7/15/2009)
What's wrong with a bigint identity(1, 1).
Nothing provided you are running on a single server. The instant you go multi-site or employ a sharding strategy all sorts of challenges come to the fore.

True, and that's where newsequentialid() is a much better option if you need that option.

Here's a kluge someone showed me once:

create table #temp_seqguid

(

myguid uniqueidentifier default newsequentialid() -- necessary, as only available as a default setting

)

insert into #temp_seqguid default values

declare @GUID uniqueidentifier

select top 1 @GUID = myguid from #temp_seqguid

drop table #temp_seqguid

[EDIT] And here's a modified version based on David.Poole's suggestion below:

declare @seqGUID table ( myguid uniqueidentifier default newsequentialid() )

insert into @seqGUID default values

declare @GUID uniqueidentifier

select top 1 @GUID = myguid from @seqGUID

Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

David.Poole SSC Guru Points: 76039 More actions · Answer 2

Nice but I would use a table variable for this rather than a temporary table.

LinkedIn Profile

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 3

David.Poole (7/15/2009)
Nice but I would use a table variable for this rather than a temporary table.

True, for only few values, the streamlined code of table variables would be better. Code modified in previous post.

Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

GSquared SSC Guru Points: 260824 More actions · Answer 4

Dave Ballantyne (7/15/2009)
Not that i doubt you but, are you 100% sure that that can be the only explanation ? There were no other bugs that could of caused it to happen?
According to wikipedia
While each generated GUID is not guaranteed to be unique, the total number of unique keys (21^28 or 3.4×10^38) is so large that the probability of the same number being generated twice is infinitesimally small. For example, consider the observable universe, which contains about 5×10^22 stars; every star could then have 6.8×10^15 unique GUIDs.
The odds on 3 out of 2 Billion being duplicated are (infinitesimally small cubed) ???

The mathematical odds are quite small, but computers don't actually generate truly random numbers, which means the odds are much higher as soon as you have a lot of "random" numbers being generated in very short periods of time.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

GSquared SSC Guru Points: 260824 More actions · Answer 5

David.Poole (7/15/2009)
Gaby A. (7/15/2009)
What's wrong with a bigint identity(1, 1).
Nothing provided you are running on a single server. The instant you go multi-site or employ a sharding strategy all sorts of challenges come to the fore.

Easy enough to solve if you have a two-column PK. One is the identity value, the other is the source computer ID.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

EdVassie SSC Guru Points: 60481 More actions · Answer 6

Lots of good advice in this thread!

Another way that anything relying on a MAC address can produce duplicate results is that you are relying on all MAC addresses being globally unique.

For virtual servers with virtual networks this is not true. Also, manufacturers of some cheap network cards do not always give them unique MAC addresses.

The first problem you can solve with configuring your virtual network. The second problem is unlikely to affect servers as they seldom use bottom-end cards, but if you have multiple desktops or laptops feeding data to your server you may hit this issue.

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

GabyYYZ SSCertifiable Points: 7913 More actions · Answer 7

EdVassie (7/16/2009)
Lots of good advice in this thread!
Another way that anything relying on a MAC address can produce duplicate results is that you are relying on all MAC addresses being globally unique.
For virtual servers with virtual networks this is not true. Also, manufacturers of some cheap network cards do not always give them unique MAC addresses.
The first problem you can solve with configuring your virtual network. The second problem is unlikely to affect servers as they seldom use bottom-end cards, but if you have multiple desktops or laptops feeding data to your server you may hit this issue.

If you really need to make sure, when using newid() or newsequentialid(), have a binary variable added to the GUID made up of system unique variables. For example:

declare @GUID uniqueidentifier, @salt varbinary(100)

set @salt = cast(checksum(@@servername) as varbinary(100)) -- or additional system info

set @GUID = cast(newid() as varbinary(100)) + @salt

Yet another kluge, but this should have minimal to no impact on performance.

Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

Mauve SSChampion Points: 11316 More actions · Answer 8

SQL Server GUIDs are Version "4" (random) and not Version "1" (MAC address).

An advantage for using GUIDs in a .Net application is that the value can be generated anywhere within the layers. E.g., the .Net application can generate them and pass them to the database layer (as a parameter to the stored procedure). Or, if necessary, the database layer can generate them as needed. Lots of flexibility.

Using an IDENTITY means that the database has to return the value to the caller each time. And if multiple rows are inserted, it gets cumbersome to return the list of (IDENTITY) values back to the application and for it to update the data that it has in memory.

We also use XML as a method for passing "chunks" of data to/from the stored procedures vs. lots of scalar parameters. The .Net layer generates the unique (GUID) identifiers that are contained within the XML. Example: A Sales Order with line items for the merchandise being purchased. This would be a complete XML document that would be passed to the stored procedure which in turn would "shred" it into the proper relational tables.

Impossible to do with IDENTITY.

We use them extensively in the databases powering our SaaS offerings. In fact, all unique identifiers are GUIDs.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Mauve SSChampion Points: 11316 More actions · Answer 9

GUID is also known as a UUID. See: http://en.wikipedia.org/wiki/Uuid which also describes the Versions and how to identify them.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 10

Where possible (which is the vast majority of the cases even if you need uniqueness across multiple machines - just use a compound key) I recommend to my clients that they use some numeric and not a GUID for the PK if they want some surrogate PK. Datatype size is one very key feature, as is possible fragmentation if you (want to) cluster on it - and if you do cluster the key gets carried on every non-clustered index which can be another performance drain.

Oh, and I never take the default seeding. Always explicitly start the sequence at the low limit of whatever numeric you choose. Doing otherwise can lose you half of your potential key values. I have seen numerous cases where key types had to be changed because they ran out of numbers - with half the values going unused.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 11

JohnG (7/16/2009)
Impossible to do with IDENTITY.

Not with the OUTPUT operator.

N 56°04'39.16"
E 12°55'05.25"

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 12

JohnG (7/16/2009)
SQL Server GUIDs are Version "4" (random) and not Version "1" (MAC address).
An advantage for using GUIDs in a .Net application is that the value can be generated anywhere within the layers. E.g., the .Net application can generate them and pass them to the database layer (as a parameter to the stored procedure). Or, if necessary, the database layer can generate them as needed. Lots of flexibility.
Using an IDENTITY means that the database has to return the value to the caller each time. And if multiple rows are inserted, it gets cumbersome to return the list of (IDENTITY) values back to the application and for it to update the data that it has in memory.
We also use XML as a method for passing "chunks" of data to/from the stored procedures vs. lots of scalar parameters. The .Net layer generates the unique (GUID) identifiers that are contained within the XML. Example: A Sales Order with line items for the merchandise being purchased. This would be a complete XML document that would be passed to the stored procedure which in turn would "shred" it into the proper relational tables.
Impossible to do with IDENTITY.
We use them extensively in the databases powering our SaaS offerings. In fact, all unique identifiers are GUIDs.

I personally prefer a compound key for this situation - something like this

HeaderTable

PKID int identity

various fields

LineItemTable

PKID

Sequence

various fields

Sequence goes from 1-n for each sales header. Numerous benefits to this construct, including knowing the order of the line items which could be important for things like reproducing identical printouts.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

AmolNaik SSCarpal Tunnel Points: 4767 More actions · Answer 13

First of all do you really want your data to be globally unique? For example if you are trying to implement a merge replication that will transfer and receive data from/to different db servers, then probably makes sense to have GUIDs in your tables. If not then consider using INTEGERS as they are way efficient than GUIDs. Having GUIDs could also lead to heavy index fragmentation since they are random in nature. With this being said try using NEWSEQUENTIALID() function in SQL 2005 which is much efficient compared to GUID(), the only downside is you can't generate this using SELECT statement as you would do in case of NEWID(). You would need to use this function as a DEFAULT constraint in the table definition.

Hope this helps.

Thanks,

Amol

Amol Naik

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 14

AmolNaik, did you read the thread completely? I think everything you said had already been covered, most of it more than once. 🙂

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

David.Poole SSC Guru Points: 76039 More actions · Answer 15

The problem with a compound key where one of the fields is a ServerId is when you are running a disaster recovery site.

You have to remember NOT to synchronise the tables that tell you what the ServerId means.

It is not insurmountable but it is an easy mistake to make.

LinkedIn Profile