April 19, 2012 at 1:29 pm
The db is working as normal, but in the SQL logs I see these 3 messages
The operating system returned error incorrect checksum (expected: 0x1b0a0fbe; actual: 0x1b0a0fbe) to SQL Server during a read at offset 0x00000ae7e9c000 in file 'D:\Data\MyData.mdf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Error: 823, Severity: 24, State: 7.
Operating system error 'incorrect checksum (expected: 0x1b0a0fbe; actual: 0x1b0a0fbe)' resulted from attempt to read the following: sort run page (3:5717838), in file 'D:\Data\MuData_data.mdf', in database with ID 23. Sort is retrying the read.
I've run dbcc checkdb(MyData) but that said 0 errors found 0 errors repaired.
This is a sql 2005 instance and the DB page verify is set to checksum
Any ideas from a SQL viewpoint? whilst I also ask the infrastructure team to check the D: drive
April 19, 2012 at 1:54 pm
It's a sort page, so temporarily allocated for a sort operation, probably an index rebuild, and deallocated afterwards. Hence by the time CheckDB ran, the page had long since been deallocated. Deallocated pages can't be checked with checkDB as they are not part of the consistent database structure.
That said, something's up with the IO subsystem if that could happen at all, check carefully.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 20, 2012 at 3:00 am
thanks for the reply.
Yes the times this has been noted is during an maintenance task that rebuilds indexes and when new application objects are deployed/compiled.
So my understanding is that we have no immediate danager as dbcc checkdb says all ok, but a 'small' problem on the disk somewhere so does need to be investigate as quickly as possible.
Can you elaborate on 'check carefully', I mean is there anything from SQL I can do or just ask our hardware team to fully test the D: drive.
Thanks
April 20, 2012 at 3:42 am
Not necessarily a small problem on the disk, you can't conclude that. Could be a huge problem that's only shown up once so far.
No, nothing really from SQL. Check windows logs, check raid logs, san logs, etc.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 24, 2012 at 7:30 am
Hi Gail,
I have an update from my hardware guys.
The D: drive is very fragmented and chdkdsk failed, so their solution is to create a new LUN on clean disks format as X: then take SQL offline, copy the entire contents of D: (just MDf/NDF files really) to X: then drop D: rename X: to D: then bring SQL back online.
I wanted to use detach/attach as we have many Db's for different countries so downtime would be just for that country being moved rather than server down until all are copied, although my way would mean we cannot use D: drive
Thanks for youe help.
April 24, 2012 at 8:30 am
Make sure you have backups of all of those databases first.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 24, 2012 at 8:33 am
Hi,
Yes I am asking/insisting for this (as their is a fussy line as to responsilbilites due to geography and where its a SQL or infrastructure issue) the last reply I had from them was this
As we copy the data and do not write anything to the LUN in question, I see no need for (extra)backups – the risk of the old LUN to fail ultimately is not bigger as right now in this moment or any other.
He misses the whole point of a backup.
I shall insist though. 🙂
April 24, 2012 at 8:40 am
Ask him if he wants to bet his job on there being no additional problems (because without backups, that's what you would be doing)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 19, 2012 at 11:00 am
GilaMonster (4/24/2012)
Ask him if he wants to bet his job on there being no additional problems (because without backups, that's what you would be doing)
+1000 Nobody has ever gotten fired for having backups and not needing them!
But then why are they even involved in the backup process. That should be the DBA's duty.
February 5, 2014 at 1:28 pm
sotn ,
I'm curious if moving your data to a new LUN has solved your issue. I'm running into the same issue with checksums on my tempdb and like you my expected and actual checksums are matching in the output. I'm using a SAN so wondering if I just create a new drive partition from my datastore and move all my dbs over. Thanks!
May 26, 2014 at 6:18 am
Sometimes due to high page level corruption the DBCC CHECKDB command refused to repair the SQL database and shows consistency failure errors. So if you have good backup then restore it otherwise in this condition to resolve it the use of external agent is more admirable.
SSMS Expert
July 22, 2014 at 1:06 pm
I've run dbcc checkdb(MyData) but that said 0 errors found 0 errors repaired.
Did you disconnect all users/connections from the database first?
http://msdn.microsoft.com/en-us/library/ms345598(v=sql.105).aspx"> http://msdn.microsoft.com/en-us/library/ms345598(v=sql.105).aspx
I imagine there are sections that dbcc could have missed. It could have been a temporary allocation of memory as mentioned earlier and the bad sector being deallocated for the moment as Gail mentioned.
----------------------------------------------------
July 22, 2014 at 1:11 pm
Please note: 2 year old thread.
Edit: and checkDB does not require users to be disconnected, it's an online operation.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 22, 2014 at 2:34 pm
I did find something interesting on Books Online regarding CHECKDB : Just wanted to note here >
"databases that are created on SQL Server 2005 and later should never contain incorrect counts; however, databases that are upgraded to SQL Server 2005 and later might. "
Wow, another fine detail to memorize 🙂
http://msdn.microsoft.com/en-us/library/ms176064(v=sql.105).aspx
----------------------------------------------------
July 22, 2014 at 3:13 pm
Hence the standard recommendation to run DBCC UpdateUsage when upgrading a DB. And 2008 still has incorrect counts occasionally.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply