October 15, 2009 at 6:27 pm
The Scenario is this:
We maintain a roster table, tblMembers, which contains approximately 400,000 records. Members are identified by the group to which they belong via a foreign key to tblGroup.
Periodically we get a file from a group with an update for the members of their group. Basically, we run an insert query to add new members, an update query to deactivate members in our table who are no longer in the group, and an update query to refresh address information for current members.
Recently, we added members from three new groups. This required only the first query, the insert to add them to our member table.
The Problem:
After importing the first member file, I ran a couple of queries to verify the new members were added correctly. The query timed out. After puzzling over it for a while, I decided to rebuild the indexes on the member table, and sure enough, performance went right back to normal. I repeated this with the other two group files and confirmed the same behavior with each new set of members. Since I'd never seen this problem with previous updates, in which the final step was an update against the table, I'm assuming the update query caused the indexes to be rebuilt.
The Question:
Obviously, and counter to my expectation, the insert query did not rebuild the indexes; that only happened when I rebuilt them manually. The general question is: what events cause indexes to get rebuilt, and what events don't (insert doesn't, update does?)
And second, more specifically, am I correct in my interpretation of what happened?
Thanks.
George
October 15, 2009 at 8:45 pm
There is only one circumstance in which an index will not be updated, and that is when the index is disabled. If it is disabled, it might as well not exist, since it will not be considered by the optimizer.
If you rebuild the indexes, it is automatically enabled and will be used by the optimizer.
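For reference, a minimal sketch of that cycle (the index name `IX_tblMembers_GroupID` is hypothetical; substitute an index from your own table):

```sql
-- Disabling a nonclustered index drops its data pages and keeps
-- only the metadata definition; the optimizer will ignore it.
ALTER INDEX IX_tblMembers_GroupID ON dbo.tblMembers DISABLE;

-- Rebuilding re-creates the index data and automatically re-enables it.
ALTER INDEX IX_tblMembers_GroupID ON dbo.tblMembers REBUILD;
```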
October 15, 2009 at 9:34 pm
The indexes are continuously updated during inserts, updates, and deletes. What is not updated are the statistics, and the statistics are what the optimizer uses to choose the optimal query plan. If you change (or add) enough records, the stats will no longer be good; I *believe* the threshold is around 20% of the table's rows.
Statistics rebuilds can often be long, painful, time-consuming tasks for the system, and as such they are not done in real time.
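One quick way to see whether stale statistics are plausible in your case is to check when each statistics object on the member table was last updated (a sketch, using `dbo.tblMembers` from this thread):

```sql
-- List every statistics object on the member table and the date
-- it was last rebuilt or auto-updated.
SELECT name AS stats_name,
       STATS_DATE(object_id, stats_id) AS last_updated
FROM   sys.stats
WHERE  object_id = OBJECT_ID('dbo.tblMembers');
```

If `last_updated` predates your big insert, the optimizer was working from a picture of the table that didn't include the new rows.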
CEWII
October 16, 2009 at 12:33 am
Thanks for the suggestions. I appreciate the insight.
October 16, 2009 at 6:25 am
It's also possible that your statistics are being updated, but the standard statistics update is sampled. You might need to update the statistics with a full scan, which is what happens when you rebuild an index. Experimentation is the only way to be sure which method of updating statistics is sufficient. I still haven't found a way to measure the validity of statistics except by examining the execution plans they help generate.
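The two variants look like this (a sketch against the table from this thread):

```sql
-- Default: SQL Server picks a sample rate, the same behavior
-- you get from an automatic statistics update.
UPDATE STATISTICS dbo.tblMembers;

-- Full scan: reads every row, which is the quality of statistics
-- an index rebuild produces as a side effect.
UPDATE STATISTICS dbo.tblMembers WITH FULLSCAN;
```

On a 400,000-row table the full scan costs more but may be what it takes after a large targeted insert.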
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 16, 2009 at 7:03 am
happycat59 (10/15/2009)
There is only one circumstance when an index will not be updated and that is when the index is disabled. If it was disabled, it might as well not exist since it will not be considered by the optimizer.
A disabled (non-clustered) index only exists as metadata (the definition of the index), so for practical purposes it doesn't exist.
A non-clustered index will also not generally be updated if none of the columns it references change value.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
October 16, 2009 at 7:11 am
To add to Grant's reply, it is also possible that the statistics just aren't covering the new values at all.
Consider a table with an indexed column which always increases when new records are added. A simple example might be an IDENTITY column. Say column values 1-1000 existed when the statistics were last built. Now let's say you add 500 new records with values 1001-1500. According to the statistics, no rows exist above 1000, so any query plan involving a join is likely to be optimized for a single row (an estimate of zero rows is never used).
There are a couple of trace flags available to help with this (2389 and 2390) - see http://blogs.msdn.com/ianjo/archive/2006/04/24/582227.aspx for brief details.
If you consider using these, make sure you test extra thoroughly - I haven't used them for a good while now, but on one memorable occasion I did see a very large machine brought to its knees (constant 100% CPU across all CPUs) through enabling these flags...
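For the record, enabling and disabling them looks like this (a sketch; `-1` applies the flag globally rather than to the current session only, and the caution above about thorough testing applies):

```sql
-- 2389: track statistics on ascending-key columns and adjust
-- estimates for values beyond the last sampled key.
-- 2390: apply similar behavior to columns not yet branded ascending.
DBCC TRACEON (2389, 2390, -1);

-- To turn them off again:
DBCC TRACEOFF (2389, 2390, -1);
```

They can also be set as startup parameters (`-T2389 -T2390`) so they survive a service restart.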
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
October 16, 2009 at 7:43 am
Again, Thanks. I believe I'm getting a clearer picture of what happens here with indexes and statistics. I have one more group to add sometime in the next week. I'll try setting those trace flags to see if I can get a better idea what's going on while I process and add them.
George