Clustered index in uniqueidentifiers column

  • Hi All,

    Is it true that putting Clustered indexes on columns that are uniqueidentifiers is not good because this will cause major performance issues when storing the data ?

    thanksss

  • This was removed by the editor as SPAM

  • I would say yes. but it depends on Index maintenance.

    the yes is because an ideal clustered index should be small, unique, static and ever increasing(in a sequence). Unique identifiers are most definitely not small at 16 bytes they are 4 times as big as an int, but the real kicker is that they are not ever increasing but are random. This results in the index insert being out of order which in turn results in index fragmentation which is the big performance issue you have been warned about and this can happen very quickly.

    The impact of index fragmentation can be lessened by frequent de-fragmentation maintenance. if you have a look you'll find some really good index de-fragmentation code samples.

    If you have an int identifier for each row you are far better using that but i suspect that you may not have a choice depending on the database structure you have.

  • If you use NEWID there are two problems: the GUID is 16 bytes (which is a potential problem as it is a lot of strorage if the table is big) and the sequence of GUIDs is pretty random, not monotonic (which is a certain problem if the table is anything other than very small).

    If you use NEWSEQUENTIALID instead of NEWID there is only one problem, because teh GUID sequence is monotonic increasing so only the size remains as a potential problem.

    Tom

  • Yes, it's true.

    If absolutely necessary, create a nonclustered index(es) on the GUID(s).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Thanks all ! Appreciate your responses!!

  • Just some extra info just in case you're thinking about it...

    It's not such a hot idea to use a Clustered Index on NEWSEQUENTIALID either. From BOL 2014... (http://msdn.microsoft.com/en-us/library/ms189786.aspx)

    Creates a GUID that is greater than any GUID previously generated by this function on a specified computer since Windows was started. After restarting Windows, [font="Arial Black"]the GUID can start again from a lower range[/font], but is still globally unique.

    Because the NEWSEQUENTIALIDs can be lower after a reboot, you could (most likely WILL) experience some pretty good page splitting after a reboot.

    Neither is it necessarily "globally unique". If you read the the article I provided the link for, you'll come across a note that says...

    NEWSEQUENTIALID is a wrapper over the Windows UuidCreateSequential function.

    ... and clicking on that link leads you to an article that contains the following...

    The UuidCreateSequential function returns RPC_S_UUID_LOCAL_ONLY when the originating computer does not have an ethernet/token ring (IEEE 802.x) address. In this case, the generated UUID is a valid identifier, and is guaranteed to be unique among all UUIDs generated on the computer. [font="Arial Black"]However, the possibility exists that another computer without an ethernet/token ring address generated the identical UUID. [/font]Therefore you should never use this UUID to identify an object that is not strictly local to your computer. Computers with ethernet/token ring addresses generate UUIDs that are guaranteed to be globally unique.

    Most of us have a card in our machines that has an ethernet/token ring address but I thought I'd bring the possibility up. I'll also mention that it might be possible to reverse engineer your address from such things like you could in the old TYPE 1 GUIDs that SQL Server used to have. That could be a security risk. Admittedly, I've not done a deep dive on that aspect but that's only because I don't use them. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply