Clustered Index Key vs Unique Row Identifier

  • I'm rereading this section in the MCTS Exam 70-433 book for SQL Server Development titled Techniques to Improve Query Performance. The section "Using Clustered Indexes" I beleive had some misleading language and I just wanted to bounce it off your guys so I know I'm not going crazy.

    If a clustered index exists on a table, all nonclustered indexes on that table use the key of the clustered index as the row pointer from the nonclustered index to the table. If a clustered index does not exist, the row identifier is used, which takes up 8 bytes of storage in each row of each nonclustered index. This can significantly increase the index size for larger tables.

    Then it gives a scenario of a large table with a covering clustered index vs a 4 byte clustered index to compare the space savings. My confusion comes in where there were apparently saying that not having a clustered index and just using a row identifier can increase the index size for larger tables.

    In actuality however, it looks to me that a nonclustered index utilizing an 8 byte row identier is analagous (in terms of space) to having the nonclustered index point to a 4 byte clustered index key (since 4 bytes are added in case of duplicates in the clustered index).

    Am I correct in my assumption?

    P.S. Also, if the clustered index key is unique (say a primary key), then does SQL server still tack on the extra 4 bytes when nonclustered indexes reference it?

    Keith Wiggans

  • These may help.

    http://www.sqlservercentral.com/articles/Indexing/68439/

    http://www.sqlservercentral.com/articles/Indexing/68563/

    Part 3 will be published tomorrow

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • kwiggans (11/17/2009)


    Then it gives a scenario of a large table with a covering clustered index vs a 4 byte clustered index to compare the space savings.

    Clustered indexes are always covering indexes, since they have the entire data row at the leaf level

    In actuality however, it looks to me that a nonclustered index utilizing an 8 byte row identier is analagous (in terms of space) to having the nonclustered index point to a 4 byte clustered index key (since 4 bytes are added in case of duplicates in the clustered index).

    The 4 byte uniquifier is only present when the cluster is not defined as unique. If it is there, it's part of the clustering key, hence present at all levels of the clustered index and whereever the clustering key is in the nonclustered indexes. Furthermore, it's only given a value for the duplicate values of the clustering key.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • You are awesome... Gold Star!

    Later,

    😀

    Keith Wiggans

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply