Indexing Question - Moving from Heap to Clustered

  • Currently in the process of adding a Primary Key and taking advantage of its clustered index for some heap tables in our DB. On one of the tables we are looking to convert already has a non clustered index which includes some columns. E.G.

    Table Customer
    ________________
    CustomerID
    FName
    LName
    etc.

    Non clustered index on CustomerID includes FName and LName.

    When we add the primary key to CustomerID, creating the clustered index, should we just keep the current non clustered index or is there a better way to deal with the include of FName and LName; I believe we added it to speed up some of the searches. It seems silly to have two indexes on CustomerID, one clustered and one non just to accommodate the includes part.

  • It depends on the overall width of your table and the usage patterns.  If your non-clustered index is only a fraction of the size of your clustered index, and you have a frequently used query that's covered by the non-clustered index, it may be worth keeping it.  You might want to bear in mind also that an ID column is frequently not the best choice for a clustered index.  If you have an ascending column such as CreateDate, you might consider using that as the clustering key, and creating the primary key non clustered.

    John

  • Thanks for the reply, the CustomerID is an auto increment integer column. The CustomerID is used regularly in our ERP to bring up the record, passed when making updates....

    SELECT Cols FROM CUSTOMER WHERE CustomerID=1
    UPDATE CUSTOMER SET Cols=Vals WHERE CustomerID=1

    I was under the impression that it would be the best col to cluster on.

    Josh

  • Josh

    Sorry, I mentioned that the clustering key ideally needs to be increasing, but another good-to-have is if it's used for range searches.  IDs rarely are used for that; dates are more likely to be.  That's why a date often makes a better clustered index.  I'm not saying that'll necessarily be true on your table, but it's worth considering.

    John

  • OK, I gotcha. So no harm other than space used in having the non-clustered on CustomerID with the includes as well as the Clustered on CustomerID?

    Josh

  • josh-1127203 - Thursday, March 30, 2017 8:32 AM

    OK, I gotcha. So no harm other than space used in having the non-clustered on CustomerID with the includes as well as the Clustered on CustomerID?

    Josh

    So both indexes will have CustomerID as the first column?

    In that case, you may find that the non-clustered index is very seldom used. This is because one of the primary driving factors for index selection is the histogram, and the histogram is only ever on the first column. If they're the same histogram (yeah, sampling etc, means they won't be identical, but they'll probably be very close) and one of them is the clustered index, which means the columns are always available at the page level (similar to INCLUDE), the clustered index will generally get picked over the non-clustered index.

    However, testing is your friend here. Validate the behavior in your situation by checking the performance and the execution plans for your most frequently used queries.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply