January 21, 2013 at 9:40 am
Hi all:
Early Sunday morning, our daily check of database integrity failed on one of our databases. The task is not set to auto repair. Then, early this morning the check passed without error. Even though the issue seems to have resolved itself, I am still concerned, so I wanted to post to get other comments.
Other information: According to the error log, just before the integrity check completed there were a couple of entries indicating some I/O requests were taking longer than 15 seconds and another saying "Worker 0x03FA60E8 appears to be non-yielding on scheduler 1." I've done a Google search on this, but I don't see anything obvious so far that relates to my situation. Also, there were a couple of "timeout while waiting for buffer latch - type 2" messages.
I'll continue to google search some of this, but I'd be interested in any comments you guys may have based on the above.
January 21, 2013 at 9:49 am
Got an index rebuild after the integrity check?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 21, 2013 at 9:52 am
GilaMonster (1/21/2013)
Got an index rebuild after the integrity check?
Yes, we sure do. So, are you saying the index rebuild probably fixed the corruption?
January 21, 2013 at 10:14 am
Fixed, no. Deallocated the corrupt page, yes.
CheckDB only checks allocated pages. If an index rebuild deallocated a corrupt page as part of the rebuild, the corruption would 'vanish', because the page is no longer part of the allocated set of pages.
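A minimal sketch of how one might look for traces of corruption that has since "vanished": rerun the integrity check with full error output, then list any pages SQL Server has logged as suspect. `YourDatabase` is a placeholder, not a name from this thread, and whether the original failure left a row in `msdb.dbo.suspect_pages` depends on the type of error that was raised.

```sql
-- Placeholder database name; substitute the database that failed the check.
DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- Pages that raised 823/824-type errors may be recorded here,
-- even if a later index rebuild deallocated them.
SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       event_type,
       error_count,
       last_update_date
FROM msdb.dbo.suspect_pages
ORDER BY last_update_date DESC;
```

An empty result from the second query does not prove the storage is healthy; it only means no suspect pages are currently logged.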
Gail Shaw
January 21, 2013 at 1:35 pm
GilaMonster (1/21/2013)
If an index rebuild deallocated a corrupt page as part of the rebuild, the corruption would 'vanish', because it's no longer part of the allocated set of pages
Thank you, but I'm not clear on my next course of action. I've read the following links and am still unclear:
To me, it sounds like the corruption is still there and can reappear when that page or pages become "reallocated." Secondly, one of the articles tells how to enable checksums, but doesn't say how to check if they are already enabled. I've been here less than a year, so I have no way of knowing if the database was originally created in SQL Server 2005 or in an earlier version. Thirdly, if I've only seen this once and not repeatedly, is it necessary to look into possible problems with the I/O subsystem?
Thanks for your time and help.
January 21, 2013 at 1:53 pm
Del Lee (1/21/2013)
To me, it sounds like the corruption is still there and can reappear when that page or pages become "reallocated."
No. When a page is reallocated, it's rewritten, so anything on that page is overwritten.
Secondly, one of the articles tells how to enable checksums, but doesn't say how to check if they are already enabled. I've been here less than a year, so I have no way of knowing if the database was originally created in SQL Server 2005 or in an earlier version.
page_verify_option_desc in sys.databases
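As a quick sketch, the setting can be read for every database on the instance with a query like the one below. The commented-out `ALTER DATABASE` line uses a placeholder database name and is only an illustration of how the option is changed.

```sql
-- page_verify_option_desc is NONE, TORN_PAGE_DETECTION, or CHECKSUM.
SELECT name,
       page_verify_option_desc
FROM sys.databases
ORDER BY name;

-- To switch a database to checksums (placeholder name):
-- ALTER DATABASE [YourDatabase] SET PAGE_VERIFY CHECKSUM;
```

Note that switching to CHECKSUM does not add a checksum to existing pages right away; a page gets one the next time it is written to disk.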
Thirdly, if I've only seen this once and not repeatedly, is it necessary to look into possible problems with the I/O subsystem?
Yes. Corruption doesn't appear from nowhere for no reason. Something went wrong to cause it at all.
Gail Shaw
January 31, 2013 at 10:08 am
I have not seen any additional corruption since my initial post. My server specialist indicated there were warnings related to I/O latency that evening. He has been monitoring things since then, and his comments are below.
However, I am still seeing warnings related to I/O latency on SQL02. The warning happens when the average I/O latency increases.
It is normally running in the range of 2ms write and 5ms read.
The spike is consistent and predictable, occurring every night during our maintenance window starting at 12am.
The latency jumps to 50-70ms for reads and 15-20ms for writes.
As soon as everything is done, it drops back to normal.
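For comparison, the latency figures above can be cross-checked from inside SQL Server. The query below is only a sketch: it computes average read and write stall per I/O for each database file from `sys.dm_io_virtual_file_stats`. The counters are cumulative since the instance started, so to isolate the maintenance window you would need to capture the values before and after it and compare the two snapshots.

```sql
-- Average I/O stall per read/write for each database file.
-- Counters are cumulative since instance start.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY avg_read_ms DESC;
```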
I'm not sure there's anything I can do about the maintenance jobs that are running (these are CheckDB, backups, index rebuilds, etc.). Since we are not experiencing any errors, should I be content or take some other action?
Thanks...