Find best key columns for Clustered Index

Question

Post reply

Find best key columns for Clustered Index

ImYourDBA

SSCertifiable

Points: 5654
More actions
December 16, 2011 at 11:36 am

#264881

Does anyone have a good script or method of determining the best candidate columns for the key of a clustered index. I have inherited a bunch of databases that had developers that didn't understand the need for a clustered index and the fields are non-descriptive and I need to figure out which columns are best used as the clustering key. Any thoughts?
Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

okbangas SSChampion Points: 11773 More actions · Answer 1

Kimberly Tripp - The Clustering Index Debate is what you're looking for.

Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

Concatenating Row Values in Transact-SQL[/url]

ImYourDBA SSCertifiable Points: 5654 More actions · Answer 2

Thanks. I understand the concepts when you know the data but what if you don't know the data? Is there a quick script that will show each columns data uniqueness?

Owner & Principal SQL Server Consultant
Im Your DBA, Inc.
https://www.imyourdba.com/

Lowell SSC Guru Points: 323518 More actions · Answer 3

SQLJocky (12/19/2011)
Thanks. I understand the concepts when you know the data but what if you don't know the data? Is there a quick script that will show each columns data uniqueness?

it's not just uniqueness, but the most commonly searched for columns as well; for example, in a dictionary, the clustered index should probably be on the word, so scans are efficient, even though one word might be repeated a dozen times,each row with a different definition right?

but what if the dictionar'y's "Latin root word" was the most unique , per your example. putting the clustered index on that column would not help much if we never search for that column;

I think you have to analyze or predict what columns whill be searched,a dn compliment that with how unique the statistics are.

putting

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

okbangas SSChampion Points: 11773 More actions · Answer 4

If you don't know hour data, and how your data is used, you cannot determine your best clustering key. You can run database engine tuning advisor (DTA), but I would definitely not blindly trust it.

Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

Concatenating Row Values in Transact-SQL[/url]

okbangas SSChampion Points: 11773 More actions · Answer 5

Double post by Safari 🙁

Ole Kristian Velstadbråten Bangås - Virinco - Facebook - Twitter

Concatenating Row Values in Transact-SQL[/url]

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 6

SQLJocky (12/19/2011)
Thanks. I understand the concepts when you know the data but what if you don't know the data? Is there a quick script that will show each columns data uniqueness?

The best clustered index doesn't have to be unique (don't confuse a clustered index with a PK... they DON'T have to be the same) although, as you'll see in Kimberly's presentation, usually make the best choice. Frequently, the best clustered index is something that will keep the data in the table in a temporal order (DateTime column or IDENTITY {guaranteed to be unique} column) of one type or another or at least the same order that inserts are expected in to keep the underlying table from becoming heavily fragmented by page splits in short periods of time. If you do chose non temporal keys for the clustered index, then make sure you have a good index maintenance plan to keep the fragmentation due to page splits low.

Shifting gears, I agree with what the others stated. If you don't know what the data is in the tables, you could actually do more harm than good.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)