primary keys

  • I don't believe there can be any sound basis for saying that primary keys should be clustered "most of the time" because it depends on what criteria are used to select primary keys in the first place. Some textbook criteria commonly used in the design and choice of good keys are : stability, simplicity, familiarity (uniqueness, non-nullability and irreducibility being given pre-requisites). Do those things make a good choice of cluster key? Not necessarily. Even if they did, given a choice of possible keys only one of which can also be clustered, what basis is there for saying that the one chosen as "primary" should also be the clustered one? The only basis I can think of is the totally uhelpful, self-justifying one: "make x your primary key because x also makes a good cluster key". Oddly, that apparently futile self-justifying argument seems to get used over and over again in the SQL Server world.

    Lynn's answer is indeed the only prudent one: it depends. Microsoft has a lot to answer for in making the primary key default to a clustered index.

  • Lynn's answer is indeed the only prudent one: it depends. Microsoft has a lot to answer for in making the primary key default to a clustered index.

    Well, of course it depends... I still don't see your justification for Microsoft having a lot to answer for. Its likes saying that Nissan has "a lot to answer for" because their default color for the car is black, but you want white. You can always change it? Default is just that... Default unless otherwise specified. So specify what you want. I'll concede that "primary keys should be clustered most of the time" is a debatable statement... However, I will argue that "primary keys ARE clustered most of the time" is a safe statement. Even if they "should not" be.

    Data structures in different industries are going to follow different patterns. Fine. However, look at the world in general and how we describe things. Many times, the only way to make individual aspects of data unique is to force them to be unique. We can do that in many ways, like we do with names. However, as intelligent beings we have come to realize that the simplest way to guarantee the uniqueness of an object, subject, entity, whatever... is to give it a number or id. This is because, in the sense of having to assign that id, we can always guarantee uniqueness by incrementing that number by a specified amount: John the 1st, John the 2nd. Subject 1001, subject 1002. Prisoner 40892 (Rod Blagojevich lol) or 40893. Product 10067... Now, your specific business case may not fit this model, or previous design or business decisions may not follow this. However, "IN GENERAL" we tend to think in sequences as humans and so design our models this way.

    The only logical default (assuming that there should even be a default index assigned to a primary key) is to have it clustered. Maybe there should not even be a default. However, when you have the options of clustered or non-clustered index for a primary key, I would find it a nuisance to have it default to non-clustered and I think a large percentage of the users would also find that a nuisance.

    Jared
    CE - Microsoft

  • SQLKnowItAll (5/15/2012)


    Well, of course it depends... I still don't see your justification for Microsoft having a lot to answer for. Its likes saying that Nissan has "a lot to answer for" because their default color for the car is black, but you want white. You can always change it? Default is just that... Default unless otherwise specified. So specify what you want. I'll concede that "primary keys should be clustered most of the time" is a debatable statement... However, I will argue that "primary keys ARE clustered most of the time" is a safe statement. Even if they "should not" be.

    Data structures in different industries are going to follow different patterns. Fine. However, look at the world in general and how we describe things. Many times, the only way to make individual aspects of data unique is to force them to be unique. We can do that in many ways, like we do with names. However, as intelligent beings we have come to realize that the simplest way to guarantee the uniqueness of an object, subject, entity, whatever... is to give it a number or id. This is because, in the sense of having to assign that id, we can always guarantee uniqueness by incrementing that number by a specified amount: John the 1st, John the 2nd. Subject 1001, subject 1002. Prisoner 40892 (Rod Blagojevich lol) or 40893. Product 10067... Now, your specific business case may not fit this model, or previous design or business decisions may not follow this. However, "IN GENERAL" we tend to think in sequences as humans and so design our models this way.

    The only logical default (assuming that there should even be a default index assigned to a primary key) is to have it clustered. Maybe there should not even be a default. However, when you have the options of clustered or non-clustered index for a primary key, I would find it a nuisance to have it default to non-clustered and I think a large percentage of the users would also find that a nuisance.

    Keys are also a means of identification however, not just arbitrary uniqueness. It is only a certain subset of keys that are simple, orderly, non-composite, non-alpha numbers. While it's true that numbers are often used as a means of identification, that doesn't necessarily make them the best cluster keys in a table e.g. they may not be guraranteed to arrive in the database in numerical order or there may be a more stable and useful alternative key.

    If you are talking about surrogate keys then presumably you are making the assumption that the surrogate will also be designated the primary key. My point in my first post in this thread was that there is no good basis for that assumption because the choice of "primary" key makes no practical difference.

    The reason I say that Microsoft has a lot to answer for is this: The default-to-clustered "feature" achieves nothing much for wise developers but its unintended consequence is that too many SQL Server users (even experienced ones who should know better) just assume that choosing a clustered index and choosing a primary key are the same thing. Microsoft have muddled the important logical / physical independence that is so central to good database design.

  • sqlvogel (5/15/2012)


    SQLKnowItAll (5/15/2012)


    Well, of course it depends... I still don't see your justification for Microsoft having a lot to answer for. Its likes saying that Nissan has "a lot to answer for" because their default color for the car is black, but you want white. You can always change it? Default is just that... Default unless otherwise specified. So specify what you want. I'll concede that "primary keys should be clustered most of the time" is a debatable statement... However, I will argue that "primary keys ARE clustered most of the time" is a safe statement. Even if they "should not" be.

    Data structures in different industries are going to follow different patterns. Fine. However, look at the world in general and how we describe things. Many times, the only way to make individual aspects of data unique is to force them to be unique. We can do that in many ways, like we do with names. However, as intelligent beings we have come to realize that the simplest way to guarantee the uniqueness of an object, subject, entity, whatever... is to give it a number or id. This is because, in the sense of having to assign that id, we can always guarantee uniqueness by incrementing that number by a specified amount: John the 1st, John the 2nd. Subject 1001, subject 1002. Prisoner 40892 (Rod Blagojevich lol) or 40893. Product 10067... Now, your specific business case may not fit this model, or previous design or business decisions may not follow this. However, "IN GENERAL" we tend to think in sequences as humans and so design our models this way.

    The only logical default (assuming that there should even be a default index assigned to a primary key) is to have it clustered. Maybe there should not even be a default. However, when you have the options of clustered or non-clustered index for a primary key, I would find it a nuisance to have it default to non-clustered and I think a large percentage of the users would also find that a nuisance.

    Keys are also a means of identification however, not just arbitrary uniqueness. It is only a certain subset of keys that are simple, orderly, non-composite, non-alpha numbers. While it's true that numbers are often used as a means of identification, that doesn't necessarily make them the best cluster keys in a table e.g. they may not be guraranteed to arrive in the database in numerical order or there may be a more stable and useful alternative key.

    If you are talking about surrogate keys then presumably you are making the assumption that the surrogate will also be designated the primary key. My point in my first post in this thread was that there is no good basis for that assumption because the choice of "primary" key makes no practical difference.

    The reason I say that Microsoft has a lot to answer for is this: The default-to-clustered "feature" achieves nothing much for wise developers but its unintended consequence is that too many SQL Server users (even experienced ones who should know better) just assume that choosing a clustered index and choosing a primary key are the same thing. Microsoft have muddled the important logical / physical independence that is so central to good database design.

    Point taken. Are you suggesting, then, that there should be no default index created?

    Jared
    CE - Microsoft

  • ...

    The reason I say that Microsoft has a lot to answer for is this: The default-to-clustered "feature" achieves nothing much for wise developers but its unintended consequence is that too many SQL Server users (even experienced ones who should know better) just assume that choosing a clustered index and choosing a primary key are the same thing. Microsoft have muddled the important logical / physical independence that is so central to good database design.

    I guess one of the reason Microsoft decided to make PK clustered by default, is to ensure that table will have clustered index and will not be a heap one. I have a feeling that more "beginner" database developers are aware of PK then of importance to have clustered index. So, in my humble opinion it is the least one from "has a lot to answer for" Microsoft list.

    I would ask them another one: Why we have second month of rain here in UK started from the day when a "draught" was declared? :hehe:

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (5/15/2012)


    ...

    I would ask them another one: Why we have second month of rain here in UK started from the day when a "draught" was declared? :hehe:

    I blame this on the government and also put it forward as an excuse as to why England will not do very well in the Euros. Though hopefully the country will recover to get a few medals in the Olympics.:-P

  • thanks all

    more than enough information to be considering

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply