May 12, 2005 at 2:25 pm
This is my first post to this forum. We are running SQL Server 2000 SP3a on Windows 2000 Advanced Server. In the last few days we have experienced corruption in a few of our databases, all in the same SQL Server instance. On the first day we were getting the following errors on three of the databases in the instance:
Error: 823, Severity: 24, State: 2
I/O error (bad page ID) detected during read at offset 0x0000006a320000 in file 'E:\Program Files\Microsoft SQL Server\MSSQL$.......
I was able to restore the databases from a good backup with some data loss. For one of the databases, I ran DBCC CHECKTABLE with REPAIR_ALLOW_DATA_LOSS and was able to recover the data. Another database returned the error below during DBCC CHECKDB, and I was unable to recover its data:
Table error: Allocation page (1:218376) has invalid PFS_PAGE page header values. Type is 0. Check type, object ID and page ID on the page.
After the databases were restored as far as possible, DBCC CHECKDB was run on them without reporting any errors. Then last night one of the same databases started getting the 823 errors again. This was the same database that reported the Msg 8946 error during DBCC CHECKDB, and it reported the same error again during DBCC CHECKDB after the second corruption.
This time I was able to restore the most recent database backup on our development server and roll the logs forward without the database reporting corruption. When I attempted this after the first corruption, the database reported corruption after the restore. I then tried to restore the database in production from the same backup that had worked on the development server, but it still reported corruption after the restore. So I dropped the database in production, backed up the database I had restored successfully in development, and used that backup to restore the database in production. DBCC CHECKDB then came back clean.
My question is: what can cause the Msg 8946:... invalid PFS_PAGE.... error? We see no disk or hardware errors in the event log and have no idea what is causing the corruption. Any help would be very much appreciated. Thanks in advance.
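As a side note on reading the two messages above: the offset in the 823 error is a byte position in the data file, and SQL Server data pages are 8 KB, so dividing the offset by 8192 gives the page number. PFS pages sit at page 1 and then at every 8,088th page. The sketch below only does that arithmetic on the values quoted in this post; the function names are illustrative, not part of any SQL Server tool.

```python
PAGE_SIZE = 8192      # SQL Server data pages are 8 KB
PFS_INTERVAL = 8088   # one PFS page tracks 8,088 pages


def offset_to_page(offset: int) -> int:
    """Convert a byte offset in a data file to a page number."""
    return offset // PAGE_SIZE


def is_pfs_page(page_id: int) -> bool:
    """True if page_id is a position where a PFS page is expected."""
    return page_id == 1 or (page_id > 0 and page_id % PFS_INTERVAL == 0)


def covering_pfs_page(page_id: int) -> int:
    """Return the PFS page that tracks the given page."""
    if page_id < PFS_INTERVAL:
        return 1
    return (page_id // PFS_INTERVAL) * PFS_INTERVAL


# Values taken from the error messages quoted in the post.
failing_page = offset_to_page(0x0000006A320000)
print(failing_page)                     # 217488
print(is_pfs_page(218376))              # True
print(covering_pfs_page(failing_page))  # 210288
```

So page 218376 in the 8946 message is a position where a PFS page is expected, which is consistent with CHECKDB complaining that the page header there has type 0 rather than a PFS page type. The page from the 823 message (217488) is an ordinary page a little earlier in the same file.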
May 12, 2005 at 4:04 pm
Even though you have already checked the logs, I would suggest you run a complete hardware check.
May 13, 2005 at 6:37 am
Thank you very much for the response. As a matter of fact we are planning on running a disk repair on the volume where the corrupted databases are located next week. Fortunately, we have a backup server that can be used in the interim. I was just wondering if anyone else had experienced the invalid PFS_PAGE error before and how it was corrected.
May 13, 2005 at 12:20 pm
Error 823 usually points to a hardware or I/O subsystem problem. Did you check Event Viewer for any disk or controller errors? You might also run CHKDSK and review its output.
May 13, 2005 at 12:25 pm
I did look in the event viewer for any disk or hardware errors, but didn't find any. We will be running disk repair on the affected volume next week, and in the meantime we've directed the application to use a backup database server.
May 13, 2005 at 12:35 pm
Michelle, we just ran into this same issue last week and lost almost a week of data. Look carefully at the System event log. There will most likely be a drive failure or power event. Our cause was a temporary power loss.
dab
May 13, 2005 at 12:52 pm
Thank you all for the responses. One of the first things I did was look in the event log, as I thought it was probably a hardware issue. Unfortunately, the event log had filled up, so errors were not being logged when the corruption first started. That also meant the corruption was not noticed immediately, because our alert for severity 24 errors was not triggered. Since then the log has been cleared and we have had more corruption errors, but nothing has been written to the event log indicating disk or hardware errors. The corrupt databases are on a SQL Server cluster, and the corruption started after fiber card maintenance was done on the cluster. The maintenance caused the SQL Server instance to fail over twice, and we suspect something in that time frame caused the problem. We are researching disk and hardware problems now and will be doing a disk repair of the affected volume early next week. Thanks again for the info.
Michelle
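Since the severity 24 alert was missed while the event log was full, one fallback is to scan the text of SQL Server's own error log for lines in the "Error: N, Severity: N, State: N" format shown in the first post. This is only a rough sketch: the function name and the threshold default are assumptions, and it works on lines already read from whatever log file you point it at.

```python
import re

# Matches lines such as "Error: 823, Severity: 24, State: 2"
ERROR_LINE = re.compile(r"Error:\s*(\d+),\s*Severity:\s*(\d+),\s*State:\s*(\d+)")


def find_severe_errors(lines, min_severity=24):
    """Return (error, severity, state) tuples at or above min_severity."""
    hits = []
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            number, severity, state = map(int, match.groups())
            if severity >= min_severity:
                hits.append((number, severity, state))
    return hits


# Sample line copied from the first post in this thread.
sample = [
    "Error: 823, Severity: 24, State: 2",
    "I/O error (bad page ID) detected during read at offset 0x0000006a320000",
]
print(find_severe_errors(sample))  # [(823, 24, 2)]
```

Running something like this on a schedule does not depend on the Windows event log having free space, so it could act as a second check alongside the existing alert.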