Necessity of having clustered index in huge tables

  • Hi All,

    We have a fact table containing 28489211 rows and 18997272 KB.

    We have created a few non clustered indexes based on the queries those are regularly used.

    But there is not much of unique values in any of the tables.

    In spite of combining 3-4 columns we are getting around 270 repetitive rows for some combination.

    In this scenario is it a wise decision to go for clustered index.

    Does it have any adverse effect on performance of having such a huge table in Heap?

  • Knowledge Hunter (10/24/2011)


    Hi All,

    We have a fact table containing 28489211 rows and 18997272 KB.

    We have created a few non clustered indexes based on the queries those are regularly used.

    But there is not much of unique values in any of the tables.

    In spite of combining 3-4 columns we are getting around 270 repetitive rows for some combination.

    In this scenario is it a wise decision to go for clustered index.

    Does it have any adverse effect on performance of having such a huge table in Heap?

    Yes, you have adverse performance with the table in a HEAP. Remember, Primary Key != Clustered Index key.

    I highly recommend watching Kimberly Tripp's Indexing MCM readiness videos on technet.

    Here's a couple of reasons to get you started:

    1. In the leaf level of all non-clustered indexes is a pointer to the data. If the table has a clustered index, it the clustered index key. If the table is a heap, it's the RID (combination of File, Page, Slot-- 8 bytes). If you added an INT Identity column, and made a clustered index on it, you'd add 28 million rows * 4 bytes = 112 MB of DATA to the Table, but each non-clustered index would be smaller (4 byte int key vs 8 byte RID), so you'll save 112 MB per non-clustered index. (less data to scan, backup, hold in memory, smaller transaction logs, etc)

    2. You can rebuild a clustered index, and the non-clustered indexes don't change. With a heap, since the RID is the physical location on disk, if you do something that moves the data, you'll force a rebuild on all of your non-clustered indexes.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply