Identifying Fields for Clustered Indexes

  • From what I understand I should try to create a clustered index on all tables in the DB. Having an index on the primary key is not always a best practice. However, I’ve not read anything that tells me when having it on the primary key is ok.

    I was also told that looking at the fields in the where clause could identify good candidates for indexes.

    Below is a list of tables and the fields i'm thinking about placing an index on. For tables with no Clustered? notes, I had no ideas.

    To see the entity relationship diagram, go to

    http://www.geocities.com/jacobpressures/

    Employees

    EmpID*

    FirstName

    LastName 

    BadgeNumber Clustered?

    DeptID

    CenterID

    Centers

    CenterID*

    Center

    Departments

    DeptID*

    Department

    Tests

    TestID*

    TestName

    Categories

    CategoryID*

    TestID Clustered?

    Category

    CategoryPurpose

    Resources

    ResourceID*

    CategoryID

    ResourceTitle

    FileName

    FileType

    LastUpdated

    Learning

    LearningID*

    CategoryID

    LearningObjective

    LastUpdated

    Questions

    QID*

    CategoryID Clustered?

    Question

    References

    LastUpdated

    Answers

    AnswerID*

    QID

    Answer

    Correct

    Diagrams

    DiagramID*

    QID

    FileName

    DiagramDesc

    LastUpdated

    EmployeeTests

    EmployeeTestID*

    TestID Clustered?

    EmpID Clustered?

    EmpTestStartDate Clustered?

    EmpTestFinishDate

    Completed_Categories

    CCID*

    EmployeeTestID Clustered?

    CategoryID

    StartTime

    FinishTime

    Comment

    Completed_Questions

    CQID*

    QID

    CCID Clustered?

    Comment

    Responses

    ResponseID*

    CQID Clustered?

    AnswerID

    Smallest Tables

    Centers, Departments, Tests, Categories, Resources, Diagrams will all probably be less than 100 rows.

    Employees table will have no more than 300 entries.

    Here are some queries from this section.

    "SELECT * From Employees WHERE BadgeNumber='" & sBadge & "'"

    "SELECT e.EmpID From Employees e WHERE e.BadgeNumber='" & gEmployee.BadgeNumber & "' AND e.FirstName = '" & gEmployee.FirstName & "' AND e.LastName = '" & gEmployee.LastName & "'"

    I am assuming that the following queries provide no value at all in regards to indexes since they serve as lookups with no Where statement

    "SELECT * From Departments"

    "SELECT * From Centers"

    "SELECT Department FROM Departments"

    "SELECT Center FROM Centers"

    Questions

    I was thinking about placing a Clustered Index on First and Last Name but I wonder if BadgeNumber might be more appropriate, since it is the most often used in where and BadgeNumber is the most unique.

    The other tables seem so small I wonder if an index is necessary. We only have about 11 centers, less than 30 departments, 5 tests. Does it really make sense to place an index on the small tables?

    The Questions and Answers Tables

    These tables are larger.

    There are at present 200 questions with the potiential of getting larger. For each Question, there are at the most 4 or 5 possible answers (at least 1,000 total possible answers so far).

    Since the Clustered Index should be as unique as possible, I’m debating whether the Index should be on the PK or the FK.

    There are at least 20 questions per category. The FK in the Questions table is CategoryID. There can be about 100 answers for each QID, FK, in Answers Table.

    What should the Clustered Index be?

    Here are some sample queries.

    "SELECT c.CategoryID, c.TestID, c.Category, c.CategoryPurpose, c.MaterialsNeeded, l.LearningObjective FROM Categories c LEFT JOIN Learning l ON c.CategoryID = l.CategoryID WHERE c.TestID = 1"

    "SELECT CategoryID, Category FROM Categories WHERE TestID =" & intTestID(intTestCounter)

    "SELECT q.QID, q.Question, a.AnswerID, a.Answer, a.Correct FROM Questions q LEFT JOIN Answers a ON q.QID = a.QID WHERE CategoryID =" & gTestCategory(giCategoryNumber).CategoryID

    "SELECT q.QID, q.CategoryID, q.Question, q.References, a.AnswerID, a.Answer, a.Correct FROM Questions q LEFT JOIN Answers a ON q.QID = a.QID" ' WHERE q.CategoryID IN (" & sCategoryIDs & " )"

    Based on these queries, i would probably put the clustered index on TestID in the Categories table and CategoryId in the Questions table.

    The Largest Tables

    The largerest tables are EmployeeTests, Completed_Categories, Completed_Questions and Responses.

    They increase in size respectively.

    The 200+ employees can take numerous tests with up to 10 categories apiece to complete with numerous questions.

    The question here is similar to the one above. Which field is the best? PK or one of the FKs?

    Here are some queries.

    "SELECT * FROM EmployeeTests WHERE EmpID = " & gEmployee.EmployeeID & " AND TestID = " & gTestCategory(giCategoryNumber).TestID & " ORDER BY EmpTestStartDate ASC"

    "SELECT EmployeeTestID, EmpTestFinishDate FROM EmployeeTests WHERE EmpID = " & gEmployee.EmployeeID & " AND TestID = " & intTestID(x) & " ORDER BY EmpTestFinishDate ASC"

    “SELECT CQID, QID FROM Completed_Questions WHERE CCID = " & intID(x, 0)

    "SELECT CQID FROM Completed_Questions WHERE CCID = " & IDHolder & " AND QID = " & gEmpCompletedCat(x).TestQuestion(y).QuestionID

    "SELECT AnswerID FROM Responses WHERE CQID = " & intID2(y, 0)

    "SELECT CCID, StartTime, FinishTime FROM Completed_Categories WHERE EmployeeTestID = " & IDHolder & " AND CategoryID = " & gTestCategory(giCategoryNumber).CategoryID & " ORDER BY StartTime ASC"

    Thanks very much. If anyone knows of a list of guidelines for selecting Clustered Indexes, please let me know. Thanks again.

  • Here is a start:

    Tips on Optimizing SQL Server Clustered Indexes

     


    * Noel

  • Thanks for the article Noel. I read it. But i guess i'm not going to get any suggestions on this. I didn't think it was that difficult at least for someone with experience. The only thing i knew to do was provide some queries and explain my line of thinking. I thought someone could at least tell me if i was on the right track or not. Or give me some ideas for the tables i have no clue where to put an index.

    Thanks again!

  • Some consider indexing more of an art than a science, it can be very challenging placing indexes on a table when sooooo many queries can be effected by those indexes.

    The article noeld posted is a rule of thumb. Every index you create, or do not create has a cost benefit associated with it.

    One option that usually yields descent results is Run a load test on your application, and use the index tuning wizard to help you find helpful indexes on your tables.

    Some companies with high volume and high response time requirements have full time dba's working on nothing but query performance. If that tells you anything about how complex it can be.

    Yet more about performance tuning. http://www.sql-server-performance.com/optimizing_indexes.asp

    Your on the right track because you have done some analysis on what your table structures will look like.

    But yes index fields commonly used in where clauses, if the table is of any substance the put a clustered index on at least the primary key.

    Smaller static, or almost static tables you can probably get away without using clustered index. also remember index padding, if your going to put a clustered index on a non-Identity, column, then you will want to pad the index so when inserts happen for the record it does cause unnecessary time to commit the transaction becuse of torn index pages

  • Thanks that information was helpful.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply