April 29, 2008 at 8:05 am
I know the general rule for a non-clustered index is ~95% uniqueness or whatever different people say, but what about a clustered index? I've got 3 tables with 500k-5.5mil rows and there's data all the way back to 2003-ish. I need to filter on the current year and I was wondering if it would be worth the overhead to create the index.
If that won't be a good idea, I was thinking about puting a clustered index on the student IDs. The biggest table is a class roster, so each student ID will be repeated in the table for each class that the student is in and for each year the student is enrolled. about 50k students * ~10 classes * 8 years. Even though the ID isn't anywhere unique on this table, would it be worth the sort performed by the index creation?
Oh, and this is for a daily 1 time import of raw data, but these queries will be used several times during the import. For every import, we have to truncate the tables before the DTS runs.
Thanks
April 29, 2008 at 9:13 am
Just my personal point of view.
Adding a cluster on academic year may not be a bad idea if you often retrieve data based on years.
Normally, studentIDs use SSN. It is not a good candidate for a cluster index in your case.
You need to do some tests before determining which one is better.
April 29, 2008 at 9:35 am
Since I asked this question, this is what I came up with and some extra info.
I have 3 columns that I always filter for the same. Version is always '00' for the "active" version, district is always '1000' and schoolyear rarely changes. I made a composite clustered index for (Version, District, School Year) in that order. There are about 6 different versions, 3 districts, and ~6 school years. I then made a separate index on studentID. about 90% of my queries HAVE to always filter on (Version, District, School Year) and about 10% of my queries for studentID, but when I join on studentID, it's a BIG join.
I know there's only so much a person can speculate without my showing table structure and understanding my queries, but does this sound fairly good in theory?
Any constructive criticism is warmly welcomed.
April 29, 2008 at 9:42 am
SQL ORACLE (4/29/2008)
Just my personal point of view.Adding a cluster on academic year may not be a bad idea if you often retrieve data based on years.
Normally, studentIDs use SSN. It is not a good candidate for a cluster index in your case.
You need to do some tests before determining which one is better.
Thanks for your input, reading your post triggered a few thoughts, even some unrelated but still helpful.
On a side note, SSN as ID in general sounds bad to me. Anyway, I have access to tens of thousands of schools and having student SSN info and names would probably be a bit of a privacy issue for a 3rd party company to be storing.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply