I came across an interesting question on SE last week. Guid vs INT – Which is better as a primary key? In addition to the quite good accepted answer I thought I would throw in my own take.
- Size
- GUIDs are 16 bytes and hold more values you then could ever use.
- With an identity column you can choose a data type dependent on your need.
- tinyint1 byte0-255
- smallint2 bytes-2^15 (-32,768) to 2^15-1 (32,767)
- int4 bytes-2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)
- bigint8 bytes-2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)
Remember that the size of your column affects not just how much space the table takes up but how many pages (both index and data) need to be read to perform a given operation. Bigger the column the less you can fit in a page, the more pages need to be read, the slower your queries. Even if by a very small amount.
- Uniqueness
- GUIDs are considered universally unique. This isn’t exactly true but if you look here and here you will see that it’s really close enough to true to work with.
- Identity columns are only as unique as you make them. If you put a unique constraint on the column (or PRIMARY KEY) then you will be at least guaranteed that you have a unique value in that table. But only in the table itself, when you compare to other tables, databases etc there is no uniqueness
You have to decide here how unique you need your column to be and if it’s worth the space it’s going to take up.
- Portability
- Because they are universally unique GUIDs are completely portable. You can move the values from place to place with no difficulty.
- Identity columns are not really portable. Anyone who has tried to merge two tables with identity columns, between prod and test for example, knows what a pain this is.
- Ease of use
- GUIDs require the use of NEWSEQUENTIALID() or NEWID() either as a default, part of the insert, a trigger etc
- Identity columns are created and then you actually have to avoid them to make them work properly.
Personally I find identity columns MUCH easier but on the other hand I use them far more often than GUIDs so I have a lot more experience there. They do say you tend to go with what you know.
If you are never going to merge data with another table/location then you are probably ok with an identity column. If on the other hand you expect to need to merge data from multiple tables/locations then you should probably think about GUIDs.
And last but not least here is BOLs take on each of them.
As with many design considerations this is an important decision. When deciding to use a GUID vs an integer Identity column you should balance the portability of a GUID vs it’s additional space required and the smaller size of the identity column vs the major pain that moving a row with an identity column from one table to another can be. The fact that integers are easier to work with when debugging is true but somewhat insignificant if you truly need a GUID. And to be honest I’ll bet you get used to it. Integers “looking” better when displayed to the end user is a factor but somewhat less important when compared to other considerations.
You, your co-workers and your replacement (there will always be one) will have to live with your decision so at least think about it before you decide. One of the things I dislike most is the “Always do it this way” mentality.
Filed under: Microsoft SQL Server, SQLServerPedia Syndication, T-SQL Tagged: code language, language sql, microsoft sql server, sql statements, T-SQL