DBCC Question

Viewing 13 posts - 16 through 27 (of 27 total)

Gail Shaw SSC Guru Points: 1004504 · Answer 1

So to summarise...

Corruption appearing repeatedly in the same table across different databases on different servers?

Corruption 'disappearing' between a maintenance plan running checkDB and a manual run of checkDB with no index rebuilds or other large page-deallocating operations between the two?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

dfrome Mr or Mrs. 500 Points: 533 · Answer 2

I would rather just focus on the one server that keeps generating the error messages. I haven't heard from them after getting a clean DBCC, so once they post the maintenance plan results (again) I will let you know. Yes, it is the same table and the same db, though it appears to be different indexes now.

Thanks for your help on this! I already checked the system for disk related errors in the event log, and the only event was when they swap out disks on a different drive (they rotate backups).

I also told them not to run chkdsk on the drive that SQL dbs are located on.
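
As a complement to the event log check above, SQL Server keeps its own record of pages that hit 823/824 errors in msdb.dbo.suspect_pages. A minimal sketch of a query against it follows. The table and its columns are standard, but whether anything is recorded on this server is not known from the thread, and an empty result does not rule out corruption.

```sql
-- List pages SQL Server has flagged as suspect, most recent first.
-- event_type: 1 = 823 or unspecified 824 error, 2 = bad checksum, 3 = torn page,
--             4 = restored, 5 = repaired by DBCC, 7 = deallocated by DBCC.
SELECT  DB_NAME(sp.database_id) AS database_name,
        sp.file_id,
        sp.page_id,
        sp.event_type,
        sp.error_count,
        sp.last_update_date
FROM    msdb.dbo.suspect_pages AS sp
ORDER BY sp.last_update_date DESC;
```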

dfrome Mr or Mrs. 500 Points: 533 · Answer 3

That's correct as well: no index rebuilds. The only thing defined is an index reorganize, but that only runs once a month, and I'm not sure when it last ran.

dfrome Mr or Mrs. 500 Points: 533 · Answer 4

They just finished rerunning the plan, and it came back totally clean again. Very odd!

Microsoft(R) Server Maintenance Utility (Unicode) Version 10.50.1600 Report was generated on "ServerName".

Maintenance Plan: SV MaintenancePlan

Duration: 00:20:29

Status: Succeeded.

Details:

Check Database Integrity (ServerName)

Check Database integrity on Local server connection Databases that have a compatibility level of 70 (SQL Server version 7.0) will be skipped.

Databases: All databases

Include indexes

Task start: 2012-04-16T13:05:42.

Task end: 2012-04-16T13:26:07.

Success

Command:USE [master]

GO

DBCC CHECKDB(N''master'') WITH NO_INFOMSGS

GO

USE [model]

GO

DBCC CHECKDB(N''model'') WITH NO_INFOMSGS

GO

USE [msdb]

GO

DBCC CHECKDB(N''msdb'') WITH NO_INFOMSGS

GO

USE [ReportServer]

GO

DBCC CHECKDB(N''ReportServer'') WITH NO_INFOMSGS

GO

USE [ReportServerTempDB]

GO

DBCC CHECKDB(N''ReportServerTempDB'') WITH NO_INFOMSGS

GO

USE [Foo]

GO

DBCC CHECKDB(N''Foo'') WITH NO_INFOMSGS

GO
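
The report above shows the maintenance plan running CHECKDB with NO_INFOMSGS only. For a manual comparison run, something like the following sketch could be used. Foo is the database name from the report. Checking the page verify option and adding ALL_ERRORMSGS are assumptions about what might be useful here, not steps the thread prescribes.

```sql
-- Which page verification option is each database using?
-- CHECKSUM lets SQL Server detect pages damaged after they were written.
SELECT name, page_verify_option_desc
FROM   sys.databases;

-- Manual integrity check of the user database from the report,
-- asking for every error per object rather than a shortened list.
DBCC CHECKDB (N'Foo') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```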

chrisfradenburg SSCrazy Eights Points: 9592 · Answer 5

The application doesn't happen to drop and recreate that table as part of operation, does it? Or is it possible a vendor tech got in due to a report from a user and handled it? It's a long shot but would explain this.

dfrome Mr or Mrs. 500 Points: 533 · Answer 6

Heck no, no drop/creates. All maintenance is done outside the application, and their application tech messaged me about it because he didn't know what to do either. The last time any index was entirely rebuilt was in 3/2011; ever since then it's just been index reorganization.

And nothing was done (no reboots either) between the run that came back with errors and the repeated reruns that came back clean.

I even had them run DBCC DROPCLEANBUFFERS prior to rerunning the maintenance plan. The only thing I can think of, even remotely, is something happening in memory. Even the first error message in post #1 is weird: it says REPAIR_ALLOW_DATA_LOSS, but it's index id 2 (just a rebuild would fix it!).
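
For reference, a sketch of the sequence described above: clearing the buffer pool and then rechecking only the table in question, so the pages are read from disk again. dbo.SuspectTable is a hypothetical name, since the table is not named on this page of the thread, and the CHECKPOINT step is an addition that the post does not mention.

```sql
USE [Foo];

-- Write dirty pages to disk so that DROPCLEANBUFFERS can evict them too.
CHECKPOINT;

-- Empty the buffer pool (server-wide; expect slower queries until the cache warms up).
DBCC DROPCLEANBUFFERS;

-- Recheck just the affected table and all of its indexes.
-- dbo.SuspectTable is a placeholder, not a name from the thread.
DBCC CHECKTABLE (N'dbo.SuspectTable') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

On a 24x7 system, dropping the cache has a real performance cost, so this is something to schedule rather than run casually.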

Gail Shaw SSC Guru Points: 1004504 · Answer 7

It has been suggested to me that the problem may lie in the SAN caches. Can you get the SAN admin to check that there are no errors (dropped pages is what I was told) and maybe see if the caches can be disabled for a while (though that may seriously impact performance, so take care)?

dfrome Mr or Mrs. 500 Points: 533 · Answer 8

This particular server isn't on a SAN; it's just your ho-hum normal DAS RAID (I think it's RAID 5, actually). It was a different server that was on a SAN (I've not had any messages from that server since).

dfrome Mr or Mrs. 500 Points: 533 · Answer 9

I can check if write-caching has been enabled on the hardware RAID though.

Gail Shaw SSC Guru Points: 1004504 · Answer 10

Do so, and see if the read cache can be disabled for testing.

Is it possible to shut that SQL Server down and stress-test the IO subsystem?

dfrome Mr or Mrs. 500 Points: 533 · Answer 11

It's not possible to shut down the server right now :/ It's one of those 24x7 applications. I have passed on the information and will let you know about the write-back cache; they are going to send a tech out to look at the drives anyway.

Gail Shaw SSC Guru Points: 1004504 · Answer 12

Make sure that the tech checks everything: drives, cache, controllers, etc. I doubt the actual physical disk is the problem, but it could easily be something else in the IO stack. Also check for new versions of firmware and drivers.

Finally, I'd suggest seeing if you can get the DB onto alternate storage, even if just so you can stress test the DAS.

Perry Whittle SSC Guru Points: 234013 · Answer 13

Had something similar to this on an Exchange server some years ago; the repeated corruption turned out to be a whacko RAID backplane\BIOS. Check and double-check all areas of the storage subsystem.

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉