Are There That Many GUIDs?

  • SanDroid (10/12/2010)


    One little known fact about the GUID generation algorithm is the user's network card MAC address is used as a base for the last group of GUID digits so that data can be tracked back to the computer that created it.

    That's true only of version 1 GUIDs, and even then only for computers that have at least one MAC address; none of versions 2, 3, 4, and 5 use the MAC address for the node field, nor do the GUIDs used by RSS or by Atom (almost version 3 or 5 GUIDs but with the node field unhashed), nor COMB (version 4 with a generation fix to make them usable as a cluster key, probably obsolete since the introduction of NEWSEQUENTIALID, which does it with version 1); and Microsoft's WINAPI generates version 4 GUIDs.

    Shared MAC addresses are a problem for version 1 GUIDs, so virtualisation may in time consign them to history (except that the old COM and DCOM version 1 GUIDs will still be used after we stop generating new version 1 GUIDs, because those things are unlikely to go away while COM and DCOM are still with us).
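A minimal sketch of the version and node fields discussed above, using Python's standard `uuid` module rather than the WINAPI or SQL Server generators mentioned in the post. The node value `0x001122334455` is a made-up stand-in for a MAC address, passed explicitly so the result does not depend on the machine's hardware.

```python
import uuid

# Hypothetical 48-bit node value standing in for a network card's MAC address.
FAKE_NODE = 0x001122334455

v1 = uuid.uuid1(node=FAKE_NODE)                      # time-based, carries the node
v4 = uuid.uuid4()                                    # random
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")   # name-based (SHA-1)

for u in (v1, v4, v5):
    # The last group of hex digits is the 48-bit node field.
    last_group = str(u).split("-")[-1]
    print(f"version={u.version} last_group={last_group}")

# Only the version 1 value exposes the node that was supplied.
print(str(v1).endswith(f"{FAKE_NODE:012x}"))
```

Only the version 1 value ends in the supplied node; in the version 4 and 5 values the same bit positions hold random or hashed data.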

    Tom

  • David Walker-278941 (10/12/2010)


    As for the "are there that many" question, it's interesting to use an identifier that is essentially infinite.

    Essentially infinite is way over the top. Maybe we will generate GUIDs at the rate of 1 billion per second (the standards guys have spent effort on making sure there are ways a single system that generates more than 1% of that will not have serious problems even using version 1 GUIDs, and although the V1 structure is arranged so that a naive generator will be limited to 1% of that, V4 has no such limit), so across the whole world maybe that's not an unreasonable number. At that rate, in just 100 years we would have a better than even chance of having at least one duplicate - and if we continue to use version 1 GUIDs the chances will be much higher than that (because some fields are likely to have a smaller proportion of their possible values used than others, and at least one bit is far more likely to be 0 than 1 when we use V1 - V1 GUIDs are not well randomised).
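A rough check of the "better than even chance" figure, as a sketch only. It assumes every GUID is drawn uniformly at random from the 122 variable bits (the version 4 case) and uses the standard birthday approximation; the rate and time span are the ones quoted above, and the constant names are just for illustration.

```python
import math

RANDOM_BITS = 122                      # 128 bits minus the version and variant fields
RATE_PER_SECOND = 1_000_000_000        # the billion-per-second figure assumed above
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60
YEARS = 100

def collision_probability(n: float, bits: int) -> float:
    """Birthday approximation: P(duplicate) ~= 1 - exp(-n^2 / 2^(bits + 1))."""
    return -math.expm1(-(n * n) / 2.0 ** (bits + 1))

generated = RATE_PER_SECOND * SECONDS_PER_YEAR * YEARS
p = collision_probability(generated, RANDOM_BITS)
print(f"{generated:.3e} GUIDs generated, P(at least one duplicate) ~ {p:.2f}")
```

Under these assumptions the probability comes out at roughly 0.6, which agrees with the claim. Version 1 GUIDs are not uniformly distributed, so this approximation does not cover them.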

    But yes, a GUID has nearly twice as many variable bits as a bigint (122 bits if we exclude the reserved variant and version fields), so there are probably going to be enough of them for the foreseeable future (and since some duplications will be harmless - well done MS with COM and DCOM - it's better than it might have been).

    There are enough GUIDs to enumerate every atom in the known universe (at least THIS universe).

    That's just plain wrong, by a factor of at least a hundred duodecillion (in big - American - illions, not old style British ones). :hehe:

    Anyone who thinks that 122 bits is enough to enumerate a set with cardinality somewhere in the range 1E78 to 1E82 needs to learn some elementary mathematics (since power(2,122) is less than 1E37). 😉

    And anyone who thinks there are fewer than 1E37 atoms in the observable universe needs to learn some physics. :w00t:
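The comparison above can be checked with exact integer arithmetic. This is a sketch only: the atom counts are the 1E78 to 1E82 range quoted in the post, not independently sourced figures.

```python
GUID_VARIABLE_BITS = 122
guid_count = 2 ** GUID_VARIABLE_BITS            # exact: Python integers are unbounded

# Range for the number of atoms in the observable universe, as quoted above.
atoms_low, atoms_high = 10 ** 78, 10 ** 82

print(f"2^122 ~ {guid_count:.3e}")
print("2^122 < 1E37:", guid_count < 10 ** 37)

# How many atoms would have to share each GUID, at the low end of the range.
shortfall = atoms_low // guid_count
print(f"shortfall factor ~ {shortfall:.3e}")
```

Even at the low end of the quoted range, the shortfall is a factor of about 1.9E41.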

    Tom

  • Jeff Moden (10/27/2010)


    Paul White NZ (10/26/2010)


    So, does anyone know of an article that guarantees that ethernet addresses are globally unique because that's the only way such a guarantee could be made.

    Aside from the article you just read you mean? 😛

    BWAA-HAAA!!! Oh no... Not quite what I meant. I know that 48 bit MAC addresses are larger than they'll ever use and when they run out of those in about a billion years, they can switch over to the 64 bit version. 😉 What I meant was the safeguards a manufacturer uses to not accidentally repeat numbers. Some of the less popular brands may not be so careful.

    You may recall a little problem some time back when daisy-chaining certain disc brands (all the same brand or different brands, it didn't matter) was pretty much impossible with Windows and with several other platforms. There was a story that that was caused by duplicate GUIDs, but I'm not sure whether I believe it or not.

    Tom

  • I'm feeling a bit of an idiot for responding to posts from 5 years ago - oh, well, never mind. And thank heavens for Paul with his insertion of sanity into the earlier discussion! But now there's something current to comment on.

    WayneS (3/24/2015)


    Toby Harman (3/23/2015)


    ...

    There's no reason to have a GUID as there's no possibility of running out of INT let alone BIGINT.

    ...

    ...

    In your reply, the implication (emphasized by me above) is that a GUID has a larger capacity than a bigint. The actual number of unique values of a GUID is 2^122; for a bigint data type it is 2^126 (using the full range from the most negative value to the most positive value).

    That's quite an achievement, representing 2^126 values using only 64 bits. Have you perhaps multiplied 2^63 by 2 and got 2^126 instead of 2^64?

    It reminds me of the famous disc compression scam back in the early 90s (I think - but it may have been late 80s) when an American outfit was going round western Europe selling this wonderful compression firmware which was guaranteed to compress anything, even stuff it had already compressed (no matter how many times). We had a visit from them in ICL West Gorton (I've no idea who arranged for them to be allowed to present their stuff, the guilty person remained anonymous) and were both bemused (how had they got through the defenses) and amused (gales of laughter once they'd departed in confusion). What they were selling would have enabled us to represent even more than 2^126 values using only 64 bits.
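The bigint arithmetic can be spelled out in a few lines. This is only a sketch of the counting argument; the observation that 2^126 is what squaring 2^63 gives is a guess at where the figure came from, not something the quoted post states.

```python
BIGINT_BITS = 64

negative_values = 2 ** 63          # -2^63 .. -1
non_negative_values = 2 ** 63      # 0 .. 2^63 - 1
bigint_count = negative_values + non_negative_values

# Doubling 2^63 adds one to the exponent; it does not double the exponent.
print(bigint_count == 2 ** BIGINT_BITS)

# 2^126 is what squaring 2^63 gives, which would need 126 bits to represent.
print(2 ** 63 * 2 ** 63 == 2 ** 126)

# A GUID's 122 variable bits give 2^58 times as many values as a bigint.
guid_count = 2 ** 122
print(guid_count // bigint_count == 2 ** 58)
```

All three comparisons print `True`: 64 bits can distinguish exactly 2^64 values and no more, which is also why a compressor that shrinks every input cannot exist.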

    Tom

  • How many of my databases have used GUIDs? A couple that I inherited did but they don't now, so none. Do I look at what my databases need to do and seriously consider the use of GUIDs? Yes. Do I believe there are situations where GUIDs are needed? Yes: I used to work on distributed systems and on systems that co-operated with each other over networks in the days when there was no internet, and cooperative systems need something like GUIDs. There were several ideas sculling around the academic and industrial communities back then, but no standardisation efforts because no-one expected a really universal network to happen any time soon; still, some of the things that were being talked about wouldn't look too strange to someone familiar with UUIDs.

    It's really quite simple: UUIDs (or GUIDs) are pretty well essential for some things if we want genuine universality, and even if we merely want universality within a fairly large network we need pretty much the same thing. On the other hand, most database applications that I've actually seen don't need them. So the only sane thing to do is to look at each case and decide whether UUIDs or something else are the right tool for the job - and I expect that the number of cases where they are the right tool is on the increase. Anyone who believes they are always wrong is a fool, and so is everyone who believes they are always right.

    Are there enough? I doubt it. I suspect that in two or three decades we will have enough of them to be hit by occasional collisions unless we increase the length. And if we want a standard that will last "for ever" we need to encode the length in a form that unambiguously determines its own length as well as the length of the UUID containing this length indicator.
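One way to picture a length indicator that determines its own length is a variable-length integer prefix. The sketch below uses a LEB128-style encoding purely as an illustration; the post does not name any scheme, no UUID standard defines one, and the function names `encode_length`, `decode_length`, `wrap` and `unwrap` are hypothetical.

```python
from typing import Tuple

def encode_length(n: int) -> bytes:
    """Encode a non-negative length, 7 bits per byte, low bits first.

    The high bit of each byte says whether another prefix byte follows,
    so the prefix states its own length as well as the payload's.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low7, n = n & 0x7F, n >> 7
        if n:
            out.append(low7 | 0x80)   # more prefix bytes follow
        else:
            out.append(low7)          # final prefix byte
            return bytes(out)

def decode_length(buf: bytes) -> Tuple[int, int]:
    """Return (payload length, number of bytes the prefix occupied)."""
    value = shift = 0
    for i, b in enumerate(buf):
        value |= (b & 0x7F) << shift
        if not b & 0x80:
            return value, i + 1
        shift += 7
    raise ValueError("truncated length prefix")

def wrap(identifier: bytes) -> bytes:
    """Prefix an identifier of any size with its self-delimiting length."""
    return encode_length(len(identifier)) + identifier

def unwrap(buf: bytes) -> Tuple[bytes, bytes]:
    """Split a buffer into (identifier, remaining bytes)."""
    n, used = decode_length(buf)
    if len(buf) < used + n:
        raise ValueError("truncated identifier")
    return buf[used:used + n], buf[used + n:]

today = wrap(bytes(16))       # a 128-bit identifier: 1 prefix byte
future = wrap(bytes(300))     # a hypothetical 2400-bit one: 2 prefix bytes
print(len(today), len(future))
```

A reader that has never seen anything but 16-byte identifiers can still skip over or store the longer one, because the prefix tells it where the identifier ends.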

    It would be nice to have something that would not be just probability based to enable us to have guaranteed uniqueness without compromising security. I don't know a way of doing that that doesn't have either a central trusted entity or a central authority for issue - it looks like an interesting cryptological problem, but for all I know it's already been proved impossible (and the people who are likely to know probably aren't going to tell us as they are all in Fort George G Meade, or Malvern, or Tel Aviv - or maybe Moscow or Beijing).

    Tom
