Identifying Fields for Clustered Indexes

Question

Post reply

Identifying Fields for Clustered Indexes

Jacob Pressures

SSCertifiable

Points: 5804
More actions
July 26, 2005 at 1:50 pm

#165436

From what I understand I should try to create a clustered index on all tables in the DB. Having an index on the primary key is not always a best practice. However, I’ve not read anything that tells me when having it on the primary key is ok.
I was also told that looking at the fields in the where clause could identify good candidates for indexes.
Below is a list of tables and the fields i'm thinking about placing an index on. For tables with no Clustered? notes, I had no ideas.
To see the entity relationship diagram, go to
http://www.geocities.com/jacobpressures/
Employees
EmpID*
FirstName
LastName
BadgeNumber Clustered?
DeptID
CenterID
Centers
CenterID*
Center
Departments
DeptID*
Department
Tests
TestID*
TestName
Categories
CategoryID*
TestID Clustered?
Category
CategoryPurpose
Resources
ResourceID*
CategoryID
ResourceTitle
FileName
FileType
LastUpdated
Learning
LearningID*
CategoryID
LearningObjective
LastUpdated
Questions
QID*
CategoryID Clustered?
Question
References
LastUpdated
Answers
AnswerID*
QID
Answer
Correct
Diagrams
DiagramID*
QID
FileName
DiagramDesc
LastUpdated
EmployeeTests
EmployeeTestID*
TestID Clustered?
EmpID Clustered?
EmpTestStartDate Clustered?
EmpTestFinishDate
Completed_Categories
CCID*
EmployeeTestID Clustered?
CategoryID
StartTime
FinishTime
Comment
Completed_Questions
CQID*
QID
CCID Clustered?
Comment
Responses
ResponseID*
CQID Clustered?
AnswerID
Smallest Tables
Centers, Departments, Tests, Categories, Resources, Diagrams will all probably be less than 100 rows.
Employees table will have no more than 300 entries.
Here are some queries from this section.
"SELECT * From Employees WHERE BadgeNumber='" & sBadge & "'"
"SELECT e.EmpID From Employees e WHERE e.BadgeNumber='" & gEmployee.BadgeNumber & "' AND e.FirstName = '" & gEmployee.FirstName & "' AND e.LastName = '" & gEmployee.LastName & "'"
I am assuming that the following queries provide no value at all in regards to indexes since they serve as lookups with no Where statement
"SELECT * From Departments"
"SELECT * From Centers"
"SELECT Department FROM Departments"
"SELECT Center FROM Centers"
Questions
I was thinking about placing a Clustered Index on First and Last Name but I wonder if BadgeNumber might be more appropriate, since it is the most often used in where and BadgeNumber is the most unique.
The other tables seem so small I wonder if an index is necessary. We only have about 11 centers, less than 30 departments, 5 tests. Does it really make sense to place an index on the small tables?
The Questions and Answers Tables
These tables are larger.
There are at present 200 questions with the potiential of getting larger. For each Question, there are at the most 4 or 5 possible answers (at least 1,000 total possible answers so far).
Since the Clustered Index should be as unique as possible, I’m debating whether the Index should be on the PK or the FK.
There are at least 20 questions per category. The FK in the Questions table is CategoryID. There can be about 100 answers for each QID, FK, in Answers Table.
What should the Clustered Index be?
Here are some sample queries.
"SELECT c.CategoryID, c.TestID, c.Category, c.CategoryPurpose, c.MaterialsNeeded, l.LearningObjective FROM Categories c LEFT JOIN Learning l ON c.CategoryID = l.CategoryID WHERE c.TestID = 1"
"SELECT CategoryID, Category FROM Categories WHERE TestID =" & intTestID(intTestCounter)
"SELECT q.QID, q.Question, a.AnswerID, a.Answer, a.Correct FROM Questions q LEFT JOIN Answers a ON q.QID = a.QID WHERE CategoryID =" & gTestCategory(giCategoryNumber).CategoryID
"SELECT q.QID, q.CategoryID, q.Question, q.References, a.AnswerID, a.Answer, a.Correct FROM Questions q LEFT JOIN Answers a ON q.QID = a.QID" ' WHERE q.CategoryID IN (" & sCategoryIDs & " )"
Based on these queries, i would probably put the clustered index on TestID in the Categories table and CategoryId in the Questions table.
The Largest Tables
The largerest tables are EmployeeTests, Completed_Categories, Completed_Questions and Responses.
They increase in size respectively.
The 200+ employees can take numerous tests with up to 10 categories apiece to complete with numerous questions.
The question here is similar to the one above. Which field is the best? PK or one of the FKs?
Here are some queries.
"SELECT * FROM EmployeeTests WHERE EmpID = " & gEmployee.EmployeeID & " AND TestID = " & gTestCategory(giCategoryNumber).TestID & " ORDER BY EmpTestStartDate ASC"
"SELECT EmployeeTestID, EmpTestFinishDate FROM EmployeeTests WHERE EmpID = " & gEmployee.EmployeeID & " AND TestID = " & intTestID(x) & " ORDER BY EmpTestFinishDate ASC"
“SELECT CQID, QID FROM Completed_Questions WHERE CCID = " & intID(x, 0)
"SELECT CQID FROM Completed_Questions WHERE CCID = " & IDHolder & " AND QID = " & gEmpCompletedCat(x).TestQuestion(y).QuestionID
"SELECT AnswerID FROM Responses WHERE CQID = " & intID2(y, 0)
"SELECT CCID, StartTime, FinishTime FROM Completed_Categories WHERE EmployeeTestID = " & IDHolder & " AND CategoryID = " & gTestCategory(giCategoryNumber).CategoryID & " ORDER BY StartTime ASC"
Thanks very much. If anyone knows of a list of guidelines for selecting Clustered Indexes, please let me know. Thanks again.

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

noeld SSC Guru Points: 96590 More actions · Answer 1

Here is a start:

Tips on Optimizing SQL Server Clustered Indexes

* Noel

Jacob Pressures SSCertifiable Points: 5804 More actions · Answer 2

Thanks for the article Noel. I read it. But i guess i'm not going to get any suggestions on this. I didn't think it was that difficult at least for someone with experience. The only thing i knew to do was provide some queries and explain my line of thinking. I thought someone could at least tell me if i was on the right track or not. Or give me some ideas for the tables i have no clue where to put an index.

Thanks again!

Ray M SSC-Insane Points: 21093 More actions · Answer 3

Some consider indexing more of an art than a science, it can be very challenging placing indexes on a table when sooooo many queries can be effected by those indexes.

The article noeld posted is a rule of thumb. Every index you create, or do not create has a cost benefit associated with it.

One option that usually yields descent results is Run a load test on your application, and use the index tuning wizard to help you find helpful indexes on your tables.

Some companies with high volume and high response time requirements have full time dba's working on nothing but query performance. If that tells you anything about how complex it can be.

Yet more about performance tuning. http://www.sql-server-performance.com/optimizing_indexes.asp

Your on the right track because you have done some analysis on what your table structures will look like.

But yes index fields commonly used in where clauses, if the table is of any substance the put a clustered index on at least the primary key.

Smaller static, or almost static tables you can probably get away without using clustered index. also remember index padding, if your going to put a clustered index on a non-Identity, column, then you will want to pad the index so when inserts happen for the record it does cause unnecessary time to commit the transaction becuse of torn index pages

Jacob Pressures SSCertifiable Points: 5804 More actions · Answer 4

Jacob Pressures

SSCertifiable

Points: 5804

July 27, 2005 at 8:09 am

#577554

Thanks that information was helpful.