Indexing and Primary Keys

  • If I have one table with a Primary Key on columns B and C

    CREATE TABLE Main

    (B uniqueidentifier NOT NULL,

    C int NOT NULL,

    D varchar(24) NULL,

    E bit NOT NULL,

    CONSTRAINT MainKey PRIMARY KEY CLUSTERED (B, C))

    and a second table with a Primary Key on columns A, B and C

    CREATE TABLE Sub

    (A int NOT NULL,

    B uniqueidentifier NOT NULL,

    C int NOT NULL,

    F varchar(42) NULL,

    G smallmoney NOT NULL,

    H bit NULL SPARSE,

    CONSTRAINT SubKey PRIMARY KEY CLUSTERED(A, B, C),

    CONSTRAINT FK_Sub_Main FOREIGN KEY (B, C) REFERENCES Main(B, C))

    The second table has a Foreign Key to the first on B and C

    I then query both tables

    SELECT M.B, M.C, M.E, S.F, S.G

    FROM Main M

    INNER JOIN Sub S

    ON M.B = S.B

    AND M.C = S.C

    Should I have an index on B and C on the second table?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • The way it's written, yes. However, if you can make the PK be on (B, C, A), then you wouldn't need to.

    Edit: also, the index on B,C isn't for just the query. When you delete from the first table, it has to verify that there aren't any child records in the second table. Having an index on those columns will make that faster also.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (9/14/2010)


    The way it's written, yes. However, if you can make the PK be on (B, C, A), then you wouldn't need to.

    Edit: also, the index on B,C isn't for just the query. When you delete from the first table, it has to verify that there aren't any child records in the second table. Having an index on those columns will make that faster also.

    Just to make sure I have it right, you're saying if I make the second table

    CREATE TABLE Sub

    (A int NOT NULL,

    B uniqueidentifier NOT NULL,

    C int NOT NULL,

    F varchar(42) NULL,

    G smallmoney NOT NULL,

    H bit NULL SPARSE,

    CONSTRAINT SubKey PRIMARY KEY CLUSTERED(B, C, A),

    CONSTRAINT FK_Sub_Main FOREIGN KEY (B, C) REFERENCES Main(B, C))

    then I don't need an index on B, C on the second table for the stated query, but I should have one there anyway for checks on the table for deletes from the main table.

    If I add an index on the Sub table on columns B and C, should I still reconfigure hte Primary Key to B, C, A?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Stefan Krzywicki (9/14/2010)


    then I don't need an index on B, C on the second table for the stated query, but I should have one there anyway for checks on the table for deletes from the main table.

    No, because an index on B, C will be redundant with the PK, being a left-based subset of the index key

    If I add an index on the Sub table on columns B and C, should I still reconfigure hte Primary Key to B, C, A?

    Do one or the other. Unless there are queries that filter only on A, it's probably more efficient to re-order the PK than to add a second index.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Just a general point, you should be careful using a uniqueidentifier as part of a Clustered Index. Due to the random nature of Unique Identifiers, you can end up with a lot of fragmentation as new rows can be inserted at any location within the clustered index, rather than at the end.

  • Thanks Gail, appreciate the advice. I'll reorder the PK today.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • HowardW (9/15/2010)


    Just a general point, you should be careful using a uniqueidentifier as part of a Clustered Index. Due to the random nature of Unique Identifiers, you can end up with a lot of fragmentation as new rows can be inserted at any location within the clustered index, rather than at the end.

    Yeah, this one I know, but thanks! Always appreciate the help. Unfortunately, I'm stuck with uniqueidentifiers for the time being. I'm re-indexing the main table every day after the once-a-day bulk insert. I have to! Fragmentation in that table goes up to over 40% every day.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • One other point to make: the keys that make up the clustered index will be used in all other indexes. You might want to consider having an identity column that is the clustered PK, and then using an UNIQUE index on B,C,A. A uniqueidentifier in all indexes sounds to me like an awful lot of IO on updates, and more IO when querying due to wider pages, etc.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (9/15/2010)


    One other point to make: the keys that make up the clustered index will be used in all other indexes. You might want to consider having an identity column that is the clustered PK, and then using an UNIQUE index on B,C,A.

    It is a little wider than I would typically recommend for a clustered index. That plus the uniqueidentifier... Failing on 2 of the 4 recommendations for a cluster at least.

    That said, you can keep the 3 columns as the pk, just make it a nonclustered pk and put the cluster elsewhere.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (9/15/2010)


    WayneS (9/15/2010)


    One other point to make: the keys that make up the clustered index will be used in all other indexes. You might want to consider having an identity column that is the clustered PK, and then using an UNIQUE index on B,C,A.

    It is a little wider than I would typically recommend for a clustered index. That plus the uniqueidentifier... Failing on 2 of the 4 recommendations for a cluster at least.

    That said, you can keep the 3 columns as the pk, just make it a nonclustered pk and put the cluster elsewhere.

    I normally stay away from multi-column keys. I don't like them at all. However, these tables are partitioned and you have to include a partitioned column in the primary key. At best, I could use an IDENTITY plus the partitioned column (B) as the PK and then a unique index on the original B, A. I'm also getting this data from a 3rd party that I have no control over, so I can't get rid of the uniqueidentifier or change the subtable key from the combination uniqueidentifier+int that it currently is. Perhaps in time I'll be able to shift that to a foreign key with a bigint replacing it, but not for the forseeable future.

    Once we've finished the current upgrade, I'll bring up that possibility, but my manager won't want to change that yet because of all the underlying changes that would have to go along with it.

    The good thing is that there are no tables that reference this table.

    Do you have a link for the recommendations for a cluster that you mentioned? I'd love to read more on it.

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • Stefan Krzywicki (9/15/2010)


    Do you have a link for the recommendations for a cluster that you mentioned? I'd love to read more on it.

    Is it a valid reference if I wrote it?

    http://www.sqlservercentral.com/articles/Indexing/68563/

    If you want my sources, try Kimberly Tripp's blog (http://www.sqlskills.com/blogs/kimberly) I'm sure it's there somewhere.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (9/15/2010)


    Stefan Krzywicki (9/15/2010)


    Do you have a link for the recommendations for a cluster that you mentioned? I'd love to read more on it.

    Is it a valid reference if I wrote it?

    http://www.sqlservercentral.com/articles/Indexing/68563/

    If you want my sources, try Kimberly Tripp's blog (http://www.sqlskills.com/blogs/kimberly) I'm sure it's there somewhere.

    I'd say it likely is. Thanks!

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply