How many rows in a table before using a clustered index?

Question

How many rows in a table before using a clustered index?

Viewing 15 posts - 16 through 30 (of 43 total)

You must be logged in to reply to this topic. Login to reply

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 1

If you don’t have a clustered index on a table, you can have serious problems with empty pages that never get de-allocated. I once found a heap table that had 125 small rows, but was using 16 GB of storage. Performance was horrible when the query had to scan the table.

Another reason to have a clustered index is to reduce the space used by the table. If you don’t have a clustered index, each index will usually consume more space.

Stueyd Ten Centuries Points: 1315 More actions · Answer 2

Michael Valentine Jones (11/2/2012)
If you don’t have a clustered index on a table, you can have serious problems with empty pages that never get de-allocated. I once found a heap table that had 125 small rows, but was using 16 GB of storage. Performance was horrible when the query had to scan the table.
Another reason to have a clustered index is to reduce the space used by the table. If you don’t have a clustered index, each index will usually consume more space.

Yeah, but you can get around that by doing your deletes with tablock 😉

Don't get me wrong, i'm all in favour of clustered indexes, it just seemed to me that there would be a point on a small table (caveats of data, usage, etc...) at which the clustered index wouldn't be of benefit and I wondered if there was a rule of thumb and judging by the response there isn't other than "always use a clustered index".

Gazareth One Orange Chip Points: 27737 More actions · Answer 3

Seems that clustered vs heap consideration is more a product of usage rather than size.

I also suspect that any difference in overheads/benefits/etc on small tables would be so minuscule as to be irrelevant.

scogeb SSCrazy Points: 2059 More actions · Answer 4

Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.
A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?

Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.

Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.

CapnHector SSCoach Points: 16143 More actions · Answer 5

One thing to note is that sql server defaults to a clustered primary key. If the tables have a Primary key they most likely (Note: not always and certainly could have been created with a nonclustered primary key) are clustered on that key if they have no other indexes. i know when im creating tables i just declare a primary key and let sql server use it as its clustering key unless i know what data will be going into the table and that a different column will make a better clustering key.

EDIT: After posting and then looking at sys.indexes after creating a test table the primary key index is shown as clustered. so depends on where you looked as to the validity of my comment to the OP's situation.

For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]

arnipetursson SSCertifiable Points: 6557 More actions · Answer 6

You should always add a clustered index on a table, unless you can show a REALLY good reason not to.

If you believe otherwise, it simply exposes your lack of understanding of SQL server internals.

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 7

scogeb (11/2/2012)
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.
A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.

Did you post here hoping to get an answer that you didn't get?

If you don't feel like putting clustered indexes on the tables, don't bother.

Jeff Moden SSC Guru Points: 1004414 More actions · Answer 8

scogeb (11/2/2012)
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.
A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.

What kind of "harm"?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ScottPletcher SSC Guru Points: 101126 More actions · Answer 9

scogeb (11/2/2012)
Michael Valentine Jones (11/2/2012)
scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.
A table that never has any selects is a waste with or without a clustered index. If there are never any selects, what use would the table be since the data isn't used?
Some kind of log table. A table that would only be used on a very rare occaision, and hopefully never, but the data would be there if needed.
Say you have a table like that, that is constantly getting inserts, but you many never read from it. What would be the benefit of a clustered index? I see more harm then good.

Given that a clustered index doesn't cost you much space (just for the upper/non-leaf levels of the index), I tend to cluster log tables as well. However, unlike most people I suspect, instead of identity I use log_datetime. The keys are still inserted sequentially, and typically when you do search a log it's by date and time. (Yes, it's not inherently unique, so SQL has to add a "uniquifier".)

In general, you'll often see an 80-20 rule bandied about by many people in SQL Server -- i.e., 80%(+) of tables should use identity as the clus index, 20% something else. When tuning, I've found that's backwards, and that generally 10-20% of tables should use identity and the rest a more useful column(s).

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

Stueyd (11/2/2012)
I do find it hard to believe there's not a table size where it's actually beneficial not to bother (1 row even?).

Nevertheless, it is the case. A clustered index may not be beneficial on a one row table, but it will definitely not be harmful (ie beneficial not to bother).

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1004414 More actions · Answer 11

arnipetursson (11/2/2012)
You should always add a clustered index on a table, unless you can show a REALLY good reason not to.
If you believe otherwise, it simply exposes your lack of understanding of SQL server internals.

Easy... no need to bring it down to that level. That's what this thread is in the process of doing... helping folks understand the internals more. 🙂

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004414 More actions · Answer 12

ScottPletcher (11/2/2012)
Given that a clustered index doesn't cost you much space (just for the upper/non-leaf levels of the index), I tend to cluster log tables as well. However, unlike most people I suspect, instead of identity I use log_datetime. The keys are still inserted sequentially, and typically when you do search a log it's by date and time. (Yes, it's not inherently unique, so SQL has to add a "uniquifier".)

I take it one step further on my log tables (which are the largest tables in the system). I have two columns in the cluster index, the entry date column and the IDENTITY column. The reason why is that there's also a huge benefit from the clustered index being UNIQUE. Kimberly Tripp has an outstanding "movie" presentation on the subject. Like any good movie, you'll need to watch it more than once to pick up all the details but it's worth every minute.

http://technet.microsoft.com/en-us/sqlserver/gg508879.aspx

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

GSquared SSC Guru Points: 260824 More actions · Answer 13

scogeb (11/2/2012)
I like to think there is always an exception to every rule. Take for instance a table that ALWAYS has inserts and NEVER has any selects. Putting a clustered index on that table would be a waste.

And the table itself would be a complete waste of space. Data is only as useful as it gets used. That means "selects" somewhere along the way.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

GSquared SSC Guru Points: 260824 More actions · Answer 14

Jeff Moden (11/3/2012)
ScottPletcher (11/2/2012)
Given that a clustered index doesn't cost you much space (just for the upper/non-leaf levels of the index), I tend to cluster log tables as well. However, unlike most people I suspect, instead of identity I use log_datetime. The keys are still inserted sequentially, and typically when you do search a log it's by date and time. (Yes, it's not inherently unique, so SQL has to add a "uniquifier".)
I take it one step further on my log tables (which are the largest tables in the system). I have two columns in the cluster index, the entry date column and the IDENTITY column. The reason why is that there's also a huge benefit from the clustered index being UNIQUE. Kimberly Tripp has an outstanding "movie" presentation on the subject. Like any good movie, you'll need to watch it more than once to pick up all the details but it's worth every minute.
http://technet.microsoft.com/en-us/sqlserver/gg508879.aspx

I do the same. Never saw KT's data on it, figured it out myself about 10 years ago based on a database that was about 5% ETL data and about 95% audit logs.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

arnipetursson SSCertifiable Points: 6557 More actions · Answer 15

Not my intention. Just trying to point the conversation in the right direction.

Will be more careful re tone...

The answer to why we should cluster is all about internals (storage and how the optimizer works).