November 12, 2009 at 10:33 am
Thanks for the reply Gail, I may indeed start a new post topic as I would like to understand the unused space issue better. Would you mind expanding on your comment that the problem is not SQL 2000 specific, is it fixed in SQL 2008 for instance?
Cheers,
Mark.
November 12, 2009 at 10:46 am
mark.stringer (11/12/2009)
Would you mind expanding on your comment that the problem is not SQL 2000 specific, is it fixed in SQL 2008 for instance?
Technically, it's not a bug. It's a side-effect of some of the architectural details of a heap and how it reuses space. It is not version-specific. (at least not in the recent versions)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 12, 2009 at 8:44 pm
I second that! Thanks Gail. I can't wait to read Part 3.
November 13, 2009 at 8:17 am
In OLTP environment, where read is more frequent than write, the Cluster index is your biggest gun and your major weapon. Over the decades its best usage is well known to most.
If your cluster is like a giant sword and your other indexes are small knives, it will not be wise to use the giant sword to cut vegetables and use your small knives to defend yourself. Of course there are exceptions but to say I prefer to use that giant sword to do cut up small things or not use it at all or “organize” so it looks pretty just seems a bit over the top.
Your statement for page splits, direct updates, and narrow index does not negate this.
So it won’t be a totally negative criticism, I would say you should attempt to make one (as there can be but one) of your most frequent or most important (depends on your need) “to be” your cluster and you’ll be surprised with the result. Using non-intelligent keys, fix length, and not null defaulted columns should help you eliminate the page splits, direct updates, and narrow index. Of course there are exceptions, but in performance tuning it is always the maximum bang for the buck wins the day.
November 13, 2009 at 10:11 am
Then we are going to have to agree to disagree on this. I understand your approach and I've used it before but in general I prefer the one that I described in the article
Generally, for OLTP (where there are lots of small queries run frequently and lots of data modification), I would want 2-3 (rarely more than 5) small indexes to handle those queries. Since the clustered index is the biggest index, I prefer to use nonclusters where possible. Won't be noticeable for singleton seeks, but range scans it usually is, if the nonclusters are covering and the queries are only returning small portions of the columns. Now, if I can get the cluster onto the column that's the most frequent access path (often the pk) and have the other considerations satisfied then that's a major win.
A data warehouse, where reads greatly exceed writes and queries are very complex has other rules for indexing. With a data warehouse I'll happily put the cluster onto the most common access path regardless of what that is, because page splits and split updates don't matter so much if the data's read-only for the majority of the time.
I said nothing about making the table look good. I don't care how it looks. I care that it's not having page splits frequently (very expensive operations), I care that the cluster is not deeper than necessary (slowing down clustered index seeks and lookups). I care that my nonclusters are not deeper than necessary. I care that rows are not moving all over the place (requiring an update to be split into a delete/insert pair, again slower than necessary).
As I said, we're going to have to agree to disagree here.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 19, 2009 at 5:52 am
Hi Gail
Very interesting and useful article on clustered index architecture. I've recently noticed a clustered index in our system with a high (43%) level of fragmentation. It's based on an identity column which is the PK, so fits your qualifications for a CI pretty well. We don't update the key column but rows can be deleted. Would this deletion be the cause of the fragmentation?
Also, how is row space allocated on the page? If a row expands beyond the space available (e.g. due to updating varchar columns) can this cause splits & thus fragmentation?
John
November 19, 2009 at 7:08 am
Johnc (11/19/2009)
We don't update the key column but rows can be deleted. Would this deletion be the cause of the fragmentation?
No. Deletes can't cause fragmentation. They can result in pages that are partially (mostly) empty, but they cannot cause logical fragmentation
If a row expands beyond the space available (e.g. due to updating varchar columns) can this cause splits & thus fragmentation?
Absolutely yes. The rows are stored at the leaf level of the clustered index, if a row grows (update of varchar or a null to non-null value) and the new row won't fit, it will cause a page split, resulting in fragmentation.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2009 at 2:26 pm
Great Article, Gail. Keep up the good work!
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
November 27, 2009 at 2:54 am
OK that makes sense. What about free space? I assume SS allocates a default % headroom to each row to allow for growth and that the % is adjustable so we could reduce the fragmentation caused by updates to column data?
November 27, 2009 at 3:02 am
Johnc (11/27/2009)
What about free space? I assume SS allocates a default % headroom to each row to allow for growth and that the % is adjustable so we could reduce the fragmentation caused by updates to column data?
It doesn't unless you tell it to. Free space on a page increases the number of IOs required to get a range of rows. Can do, but it's a trade off, page splits vs larger than necessary tables.
The setting is Fill Factor.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 27, 2009 at 3:08 am
Gail;
I am not sure but another issue on FILL FACTOR. If you do not spacify PAD_INDEX then I discovered that the mechanism will not create a fill factor by the specified %age.
November 27, 2009 at 3:18 am
Would you like to expand on that? (details or example)
PAD_INDEX controls whether the fill factor applies to just the leaf levels or the entire index (leaf and non-leaf levels)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 27, 2009 at 3:26 am
Can do, but it's a trade off, page splits vs larger than necessary tables.
True, but if you know there will be updates to varchars I guess it's a choice between using a fill factor or using chars instead of varchars?
November 27, 2009 at 4:27 am
Not any update to a varchar, one that increases the size of the varchar, hence increasing the size of the row.
It's an option, just not a free one. Tradeoffs, just like with virtually everything else.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 19, 2010 at 5:53 am
bob 54859 (11/12/2009)
Of course, dev manager vs architect... you know who wins. That's ok, I have bigger fish to fry. 😉BOb
Actually, I don't know who wins. It will depend on whether the dev manager is (a) a real manager or (b) the result of the Peter principle, and in case (a) also on whether the architect is (a1) a real architect or (a2) the result of a post Peter "lateral promotion", and in case (a1) it will also depend on whether there is time for proper discussion and if there isn't on the strength of each person's convictions.
Of course in case (a1) - the only sane case - the idea that there is a winner and a loser doesn't occurr to anyone: the only thing consideration is what is best for the project, for the whole team; so the answer to "who wins" is "eveyone".
bob 54859 (11/11/2009)
I'm am scared that the answer will be "it depends".
For your particular question, of course, the answer realy IS "it depends" so all your fears have come true! - it depends on whether cost of the bookmark lookups caused by having by the clustered index added to the cost of the extra space required by the (meaningless) identity column outweigh the cost of having to update other indexes when records change when you don't have a clustered index.
Tom
Viewing 15 posts - 61 through 75 (of 122 total)
You must be logged in to reply to this topic. Login to reply