Are There That Many GUIDs?

  • GSquared (10/26/2010)


    Eric Russell 13013 (10/26/2010)


    This why I keep saying that some type of identifier unique to each instance of the application or user, like client_id or cashregister_id, when combined with a sequential integer or date/time stamp is the way to go (assuming the application database was initially setup correctly). If the operating environment wasn't setup correctly, then you need fix that and re-code the data. For example, if you have a POS system running locally at multiple retail outlets, and each sales transactions contains (store_id or cashregister_id) and sale_datetime, then you can merge those transactions into a central datawarehouse with no collisions. Since these columns already exist on the sales transaction record anyhow, there should be no need for any algorithm based surrogate key. If you tie the primary key with a physical entity in the real world, like an input device or a person, and also include date/time for context, then it's inherently unique.

    This works, but does have the disadvantage of things like accidentally re-assigning an existing register ID.

    Of course, either can suffer from someone doing something like copying a database from one register to another in order to get a crashed register up and running again, or something like that. (Less likely with cash registers, highly possible with other business applications.) In a case like that, collisions are inevitable, but can usually be handled easily.

    If someone has incorrectly configured the ID for a POS register or user, especially assigning them an ID belonging to someone or something else, then it's not just a technical problem but also a serious accounting problem. In this situation it's beneficial for the ETL process that loads the datewarehouse to reject the transactions as duplicates, because it clues us in to the fact that something is wrong. If the primary key is a GUID or even a sequential integer, then weeks or months of bad data could be imported into corporate database before anyone realizes that something is wrong.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • It would seem there are now more posts on the subject of GUIDS than there are GUIDS. 😀

  • Craig-315134 (10/26/2010)


    It would seem there are now more posts on the subject of GUIDS than there are GUIDS. 😀

    Not yet, but we're working on it.

  • Craig-315134 (10/26/2010)


    It would seem there are now more posts on the subject of GUIDS than there are GUIDS. 😀

    :laugh: Yes, this is always a hotly-debated topic. I'm not really sure why; GUIDs are a perfectly valid way to solve some problems. They're not the only solution, and there are pros and cons, but that's kinda true of most things!

  • Paul White NZ wrote:

    ... this is always a hotly-debated topic. I'm not really sure why; GUIDs are a perfectly valid way to solve some problems. They're not the only solution, and there are pros and cons, but that's kinda true of most things!

    Agreed on all counts, Paul!

  • I'm not intentionally trying to throw gasoline on the fire, but I'd just like to point out that, since a GUID is basically a combination of various attributes like machine id, timestamp, and/or sequence number, then it could be argued that this is a violation of First Normal Form. 🙂

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric Russell 13013 (10/26/2010)


    I'm not intentionally trying to throw gasoline on the fire, but I'd just like to point out that, since a GUID is basically a combination of various attributes like machine id, timestamp, and/or sequence number, then it could be argued that this is a violation of First Normal Form. 🙂

    Humorous, but of course incorrect because by definition the key is a surrogate and not meant to imply information about the entity. The inputs are simply a means of generating a meaningless value. In fact, one of the problems with identity columns is that developers try to impart meaning (sequence) on the value.

  • Paul White NZ (10/26/2010)


    Craig-315134 (10/26/2010)


    It would seem there are now more posts on the subject of GUIDS than there are GUIDS. 😀

    :laugh: Yes, this is always a hotly-debated topic. I'm not really sure why; GUIDs are a perfectly valid way to solve some problems. They're not the only solution, and there are pros and cons, but that's kinda true of most things!

    The response of the day. Of all the things we should strive to achieve as developers, one of the most important is to not be locked into solutions because of dogma. As the Merovingian from the Matrix might say, "Without the "why" you know nothing." Guids are not appropriate for all solutions but neither are integer columns. They help with certain types of problems but if the probability of having those types of problems are minute, then guids are a poor choice.

  • Thomas-282729 (10/26/2010)


    Eric Russell 13013 (10/26/2010)


    I'm not intentionally trying to throw gasoline on the fire, but I'd just like to point out that, since a GUID is basically a combination of various attributes like machine id, timestamp, and/or sequence number, then it could be argued that this is a violation of First Normal Form. 🙂

    Humorous, but of course incorrect because by definition the key is a surrogate and not meant to imply information about the entity. The inputs are simply a means of generating a meaningless value. In fact, one of the problems with identity columns is that developers try to impart meaning (sequence) on the value.

    Well, when you look behind the curtain, a GUID is similar in concept to an identity, except that it's adding a machine id or random number to the sequential id. Developers try to impart meaning to a GUID too, just in a more abstract way, while the identity is more transparent in how it functions.

    For most transactional records, soemthing like a client_id or cashregister_id would serve the same purpose as a machine id, and the transaction date/time column would serve the same purpose as a sequential number. In fact, the datetime datatype is an 8 byte integer that increments by 1 every few miliseconds.

    In a point of sale application, POS_ID + TRAN_DATETIME can serve the same functional purpose as a GUID, except it can be half the width, and when someone queries the SALES_HISTORY table, they're probably searching on POS_ID and/or TRAN_DATETIME anyhow, not some GUID or identity key, so making that the clustered index is very efficient.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric Russell 13013 (10/26/2010)


    I'm not intentionally trying to throw gasoline on the fire, but I'd just like to point out that, since a GUID is basically a combination of various attributes like machine id, timestamp, and/or sequence number, then it could be argued that this is a violation of First Normal Form. 🙂

    Just an FYI... That's what GUID's in SQL Server used to be which used to make them globally unique (if you believe that machine ID's are unique) as well as a security risk. Now aday's, MS uses a type 4 GUID which isn't much more than a random number on steroids.

    http://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_.28random.29

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/26/2010)


    Now aday's, MS uses a type 4 GUID which isn't much more than a random number on steroids.

    Not for NEWSEQUENTIALID - that uses version 1. I mentioned this a few posts back: http://www.sqlservercentral.com/Forums/FindPost1010492.aspx

    You can prove this for yourself by running:

    DECLARE @T TABLE (guid UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID());

    INSERT @T OUTPUT inserted.guid DEFAULT VALUES;

    Example output: 3C1F75B1-72E1-DF11-BEA5-5CAC4CE467D7

    Notice that guid does not follow the xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx pattern for version 4 ids, it follows the xxxxxxxx-xxxx-1xxx-xxxx-xxxxxxxxxxxx pattern for version 1 ids.

  • Haven't had the pleasure of working with sequential GUID's in 2k8, yet. You're correct... it's plain to see that they're not type 4's but are you sure they're actually type 1's which contain the MAC address of the system? I ask because MS went to type 4's to get away from the security risk of type 1's.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah... never mind. In a backhanded sort of fashion, I've answered my own question. They do appear to be type 1's based on a statement in the following article...

    http://msdn.microsoft.com/en-us/library/aa379322(VS.85).aspx

    ... which states...

    For security reasons, UuidCreate was modified so that it no longer uses a machine's MAC address to generate UUIDs. UuidCreateSequential was introduced to allow creation of UUIDs using the MAC address of a machine's Ethernet card.

    As a side bar, the article also states...

    Computers with ethernet/token ring addresses generate UUIDs that are guaranteed to be globally unique.

    So, does anyone know of an article that guarantees that ethernet addresses a globally unique because that's the only way such a guarantee could be made.

    I also don't like the idea of the MAC address being involved and need to find out if the MAC address could be reverse engineering on sequential GUIDS. I know they could be reverse engineered on true type 1 GUIDs, but I don't know if these are true type 1 GUIDS.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/26/2010)


    Haven't had the pleasure of working with sequential GUID's in 2k8, yet. You're correct... it's plain to see that they're not type 4's but are you sure they're actually type 1's which contain the MAC address of the system? I ask because MS went to type 4's to get away from the security risk of type 1's.

    Yes - the fixed '1' in the GUID generated is the version number.

  • Paul White NZ (10/26/2010)


    Jeff Moden (10/26/2010)


    Haven't had the pleasure of working with sequential GUID's in 2k8, yet. You're correct... it's plain to see that they're not type 4's but are you sure they're actually type 1's which contain the MAC address of the system? I ask because MS went to type 4's to get away from the security risk of type 1's.

    Yes - the fixed '1' in the GUID generated is the version number.

    Since the MAC address is buried in the sequential GUID...

    http://www.google.com/search?hl=en&source=hp&biw=1012&bih=550&q=uuidcreatesequential+mac+address&aq=3&aqi=g4g-m1&aql=&oq=UuidCreateSequential

    ... and a whole lot of people consider it to be a security risk, I'm thinking that I won't use sequential GUIDs.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 106 through 120 (of 169 total)

You must be logged in to reply to this topic. Login to reply