December 16, 2011 at 11:36 am
Does anyone have a good script or method of determining the best candidate columns for the key of a clustered index. I have inherited a bunch of databases that had developers that didn't understand the need for a clustered index and the fields are non-descriptive and I need to figure out which columns are best used as the clustering key. Any thoughts?
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
December 16, 2011 at 11:39 am
Kimberly Tripp - The Clustering Index Debate is what you're looking for.
December 19, 2011 at 1:34 pm
Thanks. I understand the concepts when you know the data but what if you don't know the data? Is there a quick script that will show each columns data uniqueness?
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/
December 19, 2011 at 1:40 pm
SQLJocky (12/19/2011)
Thanks. I understand the concepts when you know the data but what if you don't know the data? Is there a quick script that will show each columns data uniqueness?
it's not just uniqueness, but the most commonly searched for columns as well; for example, in a dictionary, the clustered index should probably be on the word, so scans are efficient, even though one word might be repeated a dozen times,each row with a different definition right?
but what if the dictionar'y's "Latin root word" was the most unique , per your example. putting the clustered index on that column would not help much if we never search for that column;
I think you have to analyze or predict what columns whill be searched,a dn compliment that with how unique the statistics are.
putting
Lowell
December 19, 2011 at 3:50 pm
If you don't know hour data, and how your data is used, you cannot determine your best clustering key. You can run database engine tuning advisor (DTA), but I would definitely not blindly trust it.
December 19, 2011 at 9:23 pm
SQLJocky (12/19/2011)
Thanks. I understand the concepts when you know the data but what if you don't know the data? Is there a quick script that will show each columns data uniqueness?
The best clustered index doesn't have to be unique (don't confuse a clustered index with a PK... they DON'T have to be the same) although, as you'll see in Kimberly's presentation, usually make the best choice. Frequently, the best clustered index is something that will keep the data in the table in a temporal order (DateTime column or IDENTITY {guaranteed to be unique} column) of one type or another or at least the same order that inserts are expected in to keep the underlying table from becoming heavily fragmented by page splits in short periods of time. If you do chose non temporal keys for the clustered index, then make sure you have a good index maintenance plan to keep the fragmentation due to page splits low.
Shifting gears, I agree with what the others stated. If you don't know what the data is in the tables, you could actually do more harm than good.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply