Uniqueidentifier vs int

  • What is your preference for record identifiers; uniqueidentifer or int?

    We are in the process of a database redesign and looking for some opinions (or even facts if you have them)

    TIA

    --Skydyvyr

  • How about the actual key of the table??

  • I don't understand your reply...  What we are looking for opinions on are preferred data types for record ids in a table.

    For example, do you prefer:

    CREATE TABLE UserTable(

                UID uniqueidentifier,

                UserName varchar(15),

                FirstName varchar(20),

                LastName varchar(35))

    or:

    CREATE TABLE UserTable(

                UID int,

                UserName varchar(15),

                FirstName varchar(20),

                LastName varchar(35))

     

    Thanks

    --Skydyvyr

  • AS I said... that kind of table should have the username as the pk.

  • Because many tables will reference this table via Foreign keys, that would mean I would have to have Usernames spread throughout my entire database.  Additionally, if a user wanted to change their username, I would have to replicate that change across the entire database.  Neither of these are desirable side effects for us.

    While changing a username is unlikely, we have other tables within the database that will contain no fields that can be used as a key, but need some sort of key. so again, I ask, would you use an int or a uniqueidentifier in those cases?

     

  • UniqueIdentifiers are not sequential therefore you may run into some performance issues if your primary key is clustered.

    I agree with your using something other than UserName as the primary key, although I would use a unique constraint on it.

  • In that particular example, I would agrea with you re: the unique constraint, but that is not a table from our database...  just a poor example that I threw together on the fly to try and illustrate my question.

    The performance issues you mention would be on writes as opposed to reads, correct?

    --Skydyvyr

  • Yes, it would affect writes as data in clustered indices is is order.

  • I'd use an int in preference to a uniqueidentifier as an int uses 4 bytes and a uniqueidentifier uses 16. This has an impact on the storage requirement both the base table and any tables referencing it.

    Using the UserTable example, I'd define it this way:

    CREATE TABLE [UserTable] (

     [UID] [int] IDENTITY (1, 1) NOT NULL ,

     [UserName] [varchar] (15) NOT NULL ,

     [FirstName] [varchar] (20)  NULL ,

     [LastName] [varchar] (35)   NULL ,

     CONSTRAINT [PK_UserTable] PRIMARY KEY  CLUSTERED ([UID])  ON [PRIMARY] ,

     CONSTRAINT [CK_UserTable] UNIQUE  NONCLUSTERED ([UserName])  ON [PRIMARY]

    ) ON [PRIMARY]

    That is with the clustered (=ordering) index on the UID and a unique index on the Username.

    I'm assuming the most common usage would be like

    Select a.information from Another_Table a

    inner join UserTable u on a.UID = u.UID and u.UserName = 'Fred'

    Using an int as the key will allow an arithmetic comparison instead of a character comparison.

    Having the clustered index on the UID instead of the UserName will avoid page splits when records are inserted.

    Regards

    Otto



    Best Regards,

    Otto Schreibke

    The future is a foreign country, they do things differently there.
    (Stolen from Arthur C Clarke)

  • A GUID is not just 16bytes but binary data type that should be avoided if you can do so because a GUID will break your index and create problems while IDENTITY will not give you such problems.  Try the links below for more info and alternatives to GUID.  Hope this helps.

    http://www.informit.com/articles/article.asp?p=25862&rl=1

    http://sqldev.net/xp/xpguid.htm#Known_limitations

    Kind regards,

    Gift Peddie

    Kind regards,
    Gift Peddie

  • From a Client OOP point of view UID's solve many problems for the programmer:

    I work on systems where client software (usually VB.net) is responsible for creating new entities in the database. This client software generates XML from its object instances and sends this to the database where stored procedures take the XML and create /update records.

    The big benifit comes when the client software is creating a record, using a GUID as a primary key means the key can be created on the client! If an identity is used, a much more complicated procedure needs to be in place on the client, to maintain a "key placeholder" until the server has allocated a new primary key. Then somehow this value has to be sent back to the client - This is much more complex that you might first think, because: Generally the XML document sent to the server does not represent something simple like one user record. For example, it could be 3 new invoices, that has a new customer and modified products specified in the same document, consider how you scope_identity isn't going to help you too much in this scenario, even if you resort to a cursor iterate orders etc.

  • The big benefit of UniqueIdentifiers is that they ARE unique, which is why they are used in Merge Replication.

    I have used them where I have an upside down data structure, that is there are multiple types of parent recordset but a common set of child properties.

    Under normal conditions I wouldn't choose to use them for the reasons discussed above

  • The downfall of GUID's are that they're not guarenteed to be unique, mearly random and because of this they are a pain to use in any sort of PK as they fragment the hell out of a table.

    I saw a great article about this last week with an excellent work around by Adam Machanic (look further down the page for it), for those of you that are interested the link's below.

    http://sqljunkies.com/WebLog/odds_and_ends/archive/2005/08/31/16595.aspx

    PS. I also agree with Remi that you should go for the natural PK for this sort if at all possible of thing rather than generate something to make it unique.

  • They are, for all intents and purposes, unique.

    The link you refer to details what someone once described as the COMB - can't remember what it stood for or who did it, but it was a combination of GUID and binary date/time.  All very good - but this is really only a concern if you are using the GUID as your CLUSTERED index - which may not be the case.  If it is not your clustered index, then your table won't fragment because of the GUID being your primary key (although your index may fragment to a lesser extent).

    We use GUIDs quite extensively for the reasons given earlier - it is nice for the client to generate the key and have that key sent to the server.  If you ever want to have two independant databases merged together, either with SQL merge replication or via a "roll your own" method, GUIDs are handy because you won't have any conflicts.  Yes GUIDs are 16 bytes, but perhaps the small amount of extra disk space is worth it?

    There are MANY MANY threads debating this topic on these forums and many others - it is in some ways an Apple vs PC debate - both sides have good points and both have evangelists and at the end of the day, each can co-exist reasonably well....

    I'm a PC man by the way, although I can at least find my way around OS X last time I tried!  

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply