November 2, 2012 at 10:56 am
If you don’t have a clustered index on a table, you can have serious problems with empty pages that never get de-allocated. I once found a heap table that had 125 small rows, but was using 16 GB of storage. Performance was horrible when the query had to scan the table.
Another reason to have a clustered index is to reduce the space used by the table. If you don’t have a clustered index, each index will usually consume more space.
November 2, 2012 at 11:10 am
Michael Valentine Jones (11/2/2012)
If you don’t have a clustered index on a table, you can have serious problems with empty pages that never get de-allocated. I once found a heap table that had 125 small rows, but was using 16 GB of storage. Performance was horrible when the query had to scan the table.Another reason to have a clustered index is to reduce the space used by the table. If you don’t have a clustered index, each index will usually consume more space.
Yeah, but you can get around that by doing your deletes with tablock 😉
Don't get me wrong, i'm all in favour of clustered indexes, it just seemed to me that there would be a point on a small table (caveats of data, usage, etc...) at which the clustered index wouldn't be of benefit and I wondered if there was a rule of thumb and judging by the response there isn't other than "always use a clustered index".
November 2, 2012 at 11:20 am
Seems that clustered vs heap consideration is more a product of usage rather than size.
I also suspect that any difference in overheads/benefits/etc on small tables would be so minuscule as to be irrelevant.
November 2, 2012 at 11:20 am
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.
November 2, 2012 at 11:24 am
One thing to note is that sql server defaults to a clustered primary key. If the tables have a Primary key they most likely (Note: not always and certainly could have been created with a nonclustered primary key) are clustered on that key if they have no other indexes. i know when im creating tables i just declare a primary key and let sql server use it as its clustering key unless i know what data will be going into the table and that a different column will make a better clustering key.
EDIT: After posting and then looking at sys.indexes after creating a test table the primary key index is shown as clustered. so depends on where you looked as to the validity of my comment to the OP's situation.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
November 2, 2012 at 11:33 am
You should always add a clustered index on a table, unless you can show a REALLY good reason not to.
If you believe otherwise, it simply exposes your lack of understanding of SQL server internals.
November 2, 2012 at 2:19 pm
scogeb (11/2/2012)
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.
Did you post here hoping to get an answer that you didn't get?
If you don't feel like putting clustered indexes on the tables, don't bother.
November 2, 2012 at 4:02 pm
scogeb (11/2/2012)
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.
What kind of "harm"?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 2, 2012 at 4:34 pm
scogeb (11/2/2012)
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.
Given that a clustered index doesn't cost you much space (just for the upper/non-leaf levels of the index), I tend to cluster log tables as well. However, unlike most people I suspect, instead of identity I use log_datetime. The keys are still inserted sequentially, and typically when you do search a log it's by date and time. (Yes, it's not inherently unique, so SQL has to add a "uniquifier".)
In general, you'll often see an 80-20 rule bandied about by many people in SQL Server -- i.e., 80%(+) of tables should use identity as the clus index, 20% something else. When tuning, I've found that's backwards, and that generally 10-20% of tables should use identity and the rest a more useful column(s).
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 3, 2012 at 2:13 am
Stueyd (11/2/2012)
I do find it hard to believe there's not a table size where it's actually beneficial not to bother (1 row even?).
Nevertheless, it is the case. A clustered index may not be beneficial on a one row table, but it will definitely not be harmful (ie beneficial not to bother).
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 3, 2012 at 7:14 am
arnipetursson (11/2/2012)
You should always add a clustered index on a table, unless you can show a REALLY good reason not to.If you believe otherwise, it simply exposes your lack of understanding of SQL server internals.
Easy... no need to bring it down to that level. That's what this thread is in the process of doing... helping folks understand the internals more. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
November 3, 2012 at 7:34 am
ScottPletcher (11/2/2012)
Given that a clustered index doesn't cost you much space (just for the upper/non-leaf levels of the index), I tend to cluster log tables as well. However, unlike most people I suspect, instead of identity I use log_datetime. The keys are still inserted sequentially, and typically when you do search a log it's by date and time. (Yes, it's not inherently unique, so SQL has to add a "uniquifier".)
I take it one step further on my log tables (which are the largest tables in the system). I have two columns in the cluster index, the entry date column and the IDENTITY column. The reason why is that there's also a huge benefit from the clustered index being UNIQUE. Kimberly Tripp has an outstanding "movie" presentation on the subject. Like any good movie, you'll need to watch it more than once to pick up all the details but it's worth every minute.
http://technet.microsoft.com/en-us/sqlserver/gg508879.aspx
--Jeff Moden
Change is inevitable... Change for the better is not.
November 3, 2012 at 10:15 am
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.
And the table itself would be a complete waste of space. Data is only as useful as it gets used. That means "selects" somewhere along the way.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 3, 2012 at 10:19 am
Jeff Moden (11/3/2012)
ScottPletcher (11/2/2012)
Given that a clustered index doesn't cost you much space (just for the upper/non-leaf levels of the index), I tend to cluster log tables as well. However, unlike most people I suspect, instead of identity I use log_datetime. The keys are still inserted sequentially, and typically when you do search a log it's by date and time. (Yes, it's not inherently unique, so SQL has to add a "uniquifier".)I take it one step further on my log tables (which are the largest tables in the system). I have two columns in the cluster index, the entry date column and the IDENTITY column. The reason why is that there's also a huge benefit from the clustered index being UNIQUE. Kimberly Tripp has an outstanding "movie" presentation on the subject. Like any good movie, you'll need to watch it more than once to pick up all the details but it's worth every minute.
I do the same. Never saw KT's data on it, figured it out myself about 10 years ago based on a database that was about 5% ETL data and about 95% audit logs.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 5, 2012 at 9:21 am
Not my intention. Just trying to point the conversation in the right direction.
Will be more careful re tone...
The answer to why we should cluster is all about internals (storage and how the optimizer works).
Viewing 15 posts - 16 through 30 (of 43 total)
You must be logged in to reply to this topic. Login to reply