Implementing Clustered Indexes After the Fact on Huge Tables

  • So, I've inherited various databases, and find that many of them have no clustered indexes. The field that should be a primary key just has a non-clustered index on it. These tables have from 5 million to 100 million rows, and a some have up to 10 other non-clustered indexes. I'd like to fix some performance issues and think this might be one area to improve. Any suggestions about the best way to go about this, or other things to consider ?

    Edit: These are on 2000 databases, not 2005

  • Just one thing to think about. The Primary Key isn't always the one that should be the Clustered Index. As you look at the processing that occurs, you need to look at the various queries that are being processed, and when creating the Clustered Index, may the appropriate choice based on the workloads on the databases.

  • I've been looking at some of the queries, and many of them join on RWA_ID which is an application wide field used as a primary key, but not actually specified as such. It is unique in all tables (except for a couple where it should be unique). I was thinking that would be a good candidate in most cases.

  • That may be a good primary key, but remember that doesn't mean it should necessarily be the clustered index.

  • Also, you want to make sure you are not creating a clustered index in an order in which adding new data will fragment your index. If you do, it will force you to re-index regularly.

  • So, it seems that RWA_ID is created by 2 different processes. One increments numbers starting at 6000000000, and another increments starting at 8000000000. A bit strange design, since eventually the 6+ numbers will catch up to the 8+ numbers, but it looks like it will take about 50 years. So, the records are not getting inserted in sequence since the 6+ and 8+ records are interspersed

    Anyway, would there be a benefit to creating a primary key on RWA_ID (but not a clustered index) even though there's already a non-clustered index ?

    How should I go about determining if a clustered index is appropriate at this point ?

  • I'd say by looking at what types of queries are being run against the database. You can als look at the missing index data management views and see what SQL thinks is missing. That doesn't mean to make all those indexes, just look at them for now. You may find a pattern in the missing indexes that may point to a clustered index and other nonclustered indexes.

    A clustered index would be good if you find a significant number of queries that do ranged selects over the same or nearly the same columns.

    There are also those out there that say you should always have a clustered index. I agree, just be sure to select the one that makes the most sense.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply