July 28, 2010 at 7:34 am
Hi all - I have the following errors from DBCC Check DB
DBCC results for 'AccrualSnapshotCCRN'.
Msg 8928, Level 16, State 1, Line 1
Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data): Page (1:18504) could not be processed. See other errors for details.
Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data). Page (1:18504) was not seen in the scan although its parent (1:16125) and previous (1:18499) refer to it. Check any previous errors.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data). Page (1:18508) is missing a reference from previous page (1:18504). Possible chain linkage problem.
Msg 2570, Level 16, State 3, Line 1
Page (1:360972), slot 19 in object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type "In-row data"). Column "SnapshotDate" value is out of range for data type "datetime". Update column to a legal value.
Msg 2570, Level 16, State 3, Line 1
Page (1:2316625), slot 18 in object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type "In-row data"). Column "SnapshotDate" value is out of range for data type "datetime". Update column to a legal value.
There are 13079579 rows in 530136 pages for object "AccrualSnapshotCCRN".
CHECKDB found 0 allocation errors and 7 consistency errors in table 'AccrualSnapshotCCRN' (object ID 181575685).
Here is what I have done so far:
1. Update column to a legal value.
I have used SQl and DBCC PAGE to try and figure out what data is missing. What we have found is that if I do a count(*) of the rows using the data from DBCC page I get 16 rows, if I use the data to perform a full select I get an error and only 15 rows returned. We have scrutinised the data and its looks like no data is missing. This is confusing as where has this extra row come from, has anyone else has this situation?
2. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed.
Reading Gails troubleshooting guide at http://www.sqlservercentral.com/articles/Corruption/65804/ it seems there is little I can do about this unless we have a successful full backup as the corruption is in the clustered index (we don't - something I have remedied). Therefore the only thing will be to put the database into single user and run a REPAIR_ALLOW_DATA_LOSS, and investigate the hardware logs?
3. Possible chain linkage problem.
I am not sure about this one, is it again a page corruption due to some I/O issue, and is the solution that same as in point 2.
Thanks for any help
July 28, 2010 at 8:05 am
Kwisatz78 (7/28/2010)
1. Update column to a legal value.I have used SQl and DBCC PAGE to try and figure out what data is missing. What we have found is that if I do a count(*) of the rows using the data from DBCC page I get 16 rows, if I use the data to perform a full select I get an error and only 15 rows returned. We have scrutinised the data and its looks like no data is missing. This is confusing as where has this extra row come from, has anyone else has this situation?
What queries are you running? Any query that selects or filters the date column will cause an error, because it's out of range of the datetime.
You might be able to cast it to varchar and retrieve it that way. If you can post the table structure and the queries that you've been running I might be able to help you more.
2. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed.
Reading Gails troubleshooting guide at http://www.sqlservercentral.com/articles/Corruption/65804/ it seems there is little I can do about this unless we have a successful full backup as the corruption is in the clustered index (we don't - something I have remedied). Therefore the only thing will be to put the database into single user and run a REPAIR_ALLOW_DATA_LOSS, and investigate the hardware logs?
And accept that you've lost a page of data from the AccrualSnapshotCCRN table. I assume that the minimum level to repair was repair_allow_data_loss? Without a full backup from before the corruption, there's no real way to recover without data loss.
3. Possible chain linkage problem.
I am not sure about this one, is it again a page corruption due to some I/O issue, and is the solution that same as in point 2.
These are side effect errors. If you know your index strucure, you'll know that each page has a pointer to it's next and previous page and that the next, previous and the page above it in the index tree have a pointer to it.
What's happening here is that usually page 18504 is referenced by 3 pages, the previous page in the index, the next page in the index and the 'parent' page, and it references two pages, the previous page in the index and the next page in the index.
Because page 18504 can't be read, when CheckDB checks that all the pointers in the index are valid, it finds that there are links from previous and parent to page 18504 that are not valid, because the page they point to is damaged and unreadable, and it finds that a page is missing a link from its previous page, because the previous page (page 18504) is damaged and unreadable.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 28, 2010 at 9:03 am
Hi Gail 🙂
In response to your questions, the queries I am using are below
--This returns a count of 16, where has extra row come from?
select count(*) from dbo.AccrualSnapshotCCRN
where StudyID = 4134
and StudyRecruitID = 20542
and SubTopicID = 359
and LRNID = 1146
and TrustID = 15
and SEID = 1239093
-- This returns 15 rows which after analysis is the correct number of rows for this data
select * from dbo.AccrualSnapshotCCRN
where StudyID = 4134
and StudyRecruitID = 20542
and SubTopicID = 359
and LRNID = 1146
and TrustID = 15
and SEID = 1239093
order by snapshotdate
The table structure itself has no primary key but has a clustered index on StudyID. The values used are taken from running DBCC Page.
I guess I will have to accept data loss, however it was odd that the CHECKDB command did not return a recommended repair level, what I posted was the full error log. weird?
Thanks also for the explanation on page linkage. One question however why is the page after 18504 page number 18508, should they not be sequential in number? or is this a fragmentation issue, which this table also has %60 fragmentation.
July 28, 2010 at 9:34 am
Kwisatz78 (7/28/2010)
Hi Gail 🙂In response to your questions, the queries I am using are below
The second is not going to work properly. The order by on the date column will make SQL look at that column as a date and, if there are invalid values in that column, will fail with a data type conversion.
Try this:
select <all columns in the table OTHER THAN snapshotdate>,
CAST(snapshotdate AS VARCHAR(500)) AS BadDate
from dbo.AccrualSnapshotCCRN
where StudyID = 4134
and StudyRecruitID = 20542
and SubTopicID = 359
and LRNID = 1146
and TrustID = 15
and SEID = 1239093
I guess I will have to accept data loss, however it was odd that the CHECKDB command did not return a recommended repair level, what I posted was the full error log. weird?
Thinking about it, no, it's not weird. Data purity errors (the invalid value in the datetime column) are not reparable errors. (as in CheckDB with repair can't fix them), hence it reports no minimum repair level because there's no level that you can run checkDB with that will fix all the errors.
I'd recommend that you fix the data purity error first, then run CheckDB again and see what it reports.
One question however why is the page after 18504 page number 18508, should they not be sequential in number? or is this a fragmentation issue, which this table also has %60 fragmentation.
Could have several meanings. Could be that pages 18505, 18506, 18507 are allocation pages, could be they're allocated to another object, could be they're out-of-sequence pages in this table.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 28, 2010 at 10:01 am
No - it will still say repair_allow_data_loss, and most especially because of the corrupt clustered index.
Is the database read-only? Or are you running CHECKDB against an explicitly-created database snapshot?
How exactly are you running CHECKDB, or more specifically, where did you get the output that you posted? You didn't post the full output as, for instance, there's no summary message for the whole database in what you posted. If it's from a SQL Agent job, SQL Agent only captures the first 1000 bytes or so of job output so the trailing messages will be missing.
Thanks
Paul Randal
CEO, SQLskills.com: Check out SQLskills online training!
Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005
July 29, 2010 at 2:31 am
Hi Paul - I have run the command CheckDB with no_infomsgs against my database through a query window. To be sure I have done this again this morning, and the full output is below.
Msg 8928, Level 16, State 1, Line 1
Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data): Page (1:18504) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 2570, Level 16, State 3, Line 1
Page (1:360972), slot 19 in object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type "In-row data"). Column "SnapshotDate" value is out of range for data type "datetime". Update column to a legal value.
Msg 2570, Level 16, State 3, Line 1
Page (1:2316625), slot 18 in object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type "In-row data"). Column "SnapshotDate" value is out of range for data type "datetime". Update column to a legal value.
CHECKDB found 0 allocation errors and 5 consistency errors in table 'AccrualSnapshotCCRN' (object ID 181575685).
Msg 8928, Level 16, State 1, Line 1
Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data): Page (1:1296889) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1296889), row 18. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 20100 and 327.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1296889), row 18. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 20100 and 327.
Msg 8928, Level 16, State 1, Line 1
Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data): Page (1:1976845) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1976845), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10424 and 267.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1976845), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10424 and 267.
CHECKDB found 0 allocation errors and 6 consistency errors in table 'AccrualSnapshot' (object ID 357576312).
CHECKDB found 0 allocation errors and 11 consistency errors in database 'ukcrn_reporting'.
There is 2 tables with issues here but I was only asking about 1 as the problems looked to be similar. But to be more specific there is no recommended repair level :blink:
July 29, 2010 at 4:31 am
The second is not going to work properly. The order by on the date column will make SQL look at that column as a date and, if there are invalid values in that column, will fail with a data type conversion.
Try this:
select <all columns in the table OTHER THAN snapshotdate>,
CAST(snapshotdate AS VARCHAR(500)) AS BadDate
from dbo.AccrualSnapshotCCRN
where StudyID = 4134
and StudyRecruitID = 20542
and SubTopicID = 359
and LRNID = 1146
and TrustID = 15
and SEID = 1239093
Hi Gail - Unfortunately when I tried that select all I get is an error of
Msg 542, Level 16, State 1, Line 1
An invalid datetime value was encountered. Value exceeds the year 9999.
July 29, 2010 at 4:52 am
Then take the reference to the snapshotdate out of that query entirely. You should be able to read the rest of the columns. Once you;ve got that, you can decide whether to update or just delete the row.
When you've fixed the data purity error, could you run another CheckDB with no_infomsgs please?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 29, 2010 at 7:58 am
Hi Gail
I would love to fix the data purity error so we can see if the recommended repair level is stated, but at the moment this is proving difficult.
I have used the query we talked about above and found the row out of the 16 returned that must be the one thats corrupt. It has a different ID field (not a unique field) to others with the same conditions in the where clause.
The problem is if I try a select to include this ID no rows are returned, nor will it update or delete. So to be clear whats happening is if I run:
select StudyID,InvestigatorID,AccrualSiteID,StudyRecruitID,StudyEntryDate,EntryEvent,EntryEventNo,RecruitType,TopicID,SubTopicID,LRNID,TrustID,OrganisationID,Acronym,TopicName,TopicShortName,LRNName,LRNShortName,OrganisationName,OrganisationShortName,PortfolioEligibilityID,PortfolioEligibilityName,SEID,isLRN --,CAST(snapshotdate AS VARCHAR(500)) AS BadDate
from dbo.AccrualSnapshotCCRN
where StudyID = 4134
and StudyRecruitID = 20542
and SubTopicID = 359
and LRNID = 1146
and TrustID = 15
and SEID = 1239093
I get 16 rows returned of which the corrupted row has an OrganisationID of 1026.
If I run the select to include this condition as below
select StudyID,InvestigatorID,AccrualSiteID,StudyRecruitID,StudyEntryDate,EntryEvent,EntryEventNo,RecruitType,TopicID,SubTopicID,LRNID,TrustID,OrganisationID,Acronym,TopicName,TopicShortName,LRNName,LRNShortName,OrganisationName,OrganisationShortName,PortfolioEligibilityID,PortfolioEligibilityName,SEID,isLRN --,CAST(snapshotdate AS VARCHAR(500)) AS BadDate
from dbo.AccrualSnapshotCCRN
where StudyID = 4134
and StudyRecruitID = 20542
and SubTopicID = 359
and LRNID = 1146
and TrustID = 15
and SEID = 1239093
and OrganisationID = 1026
Then no rows are returned!!! :crazy: Nor can I update or delete using these conditions as it says 0 rows affected!!!
Can we go in at a page level to remove it?
July 29, 2010 at 8:16 am
Odd. Could be an effect of the other corruptions on the table.
Think I'm going to defer to Paul here. Paul, repair first (no backup) then see about the data purity errors?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 29, 2010 at 8:28 am
Kwisatz78 (7/29/2010)
Can we go in at a page level to remove it?
Not unless you want a whole load more corruption errors. The chance that you could successfully edit the row on the page, the page checksum, the page header and all the related components is essentially 0.
What's the clustering key?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 29, 2010 at 8:36 am
The clustering key is on the StudyID column, however its not unique and there is some pretty heavy fragmentation. %60.
As a side note I think I have fixed the second data purity error, using the techniques discussed, am running another CHECKDB now to see.
July 29, 2010 at 8:51 am
Ok the output is now
Msg 8928, Level 16, State 1, Line 1
Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data): Page (1:18504) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 2570, Level 16, State 3, Line 1
Page (1:360972), slot 19 in object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type "In-row data"). Column "SnapshotDate" value is out of range for data type "datetime". Update column to a legal value.
CHECKDB found 0 allocation errors and 4 consistency errors in table 'AccrualSnapshotCCRN' (object ID 181575685).
Still no repair recommendation and one outstanding data purity error.
July 29, 2010 at 9:03 am
That's not the full output of checkDB. The database-level summary line is missing.
For the query to ID the damaged row, does anything change if you add WITH (INDEX = 0) hint to the from clause?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 29, 2010 at 9:48 am
Brilliant that worked, I was also able to perform an update using that hint.
I ran the CHECKDB again and this is the full output
Msg 8928, Level 16, State 1, Line 1
Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data): Page (1:18504) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 181575685, index ID 1, partition ID 72057614513340416, alloc unit ID 72057614701428736 (type In-row data), page (1:18504), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10420 and 267.
CHECKDB found 0 allocation errors and 3 consistency errors in table 'AccrualSnapshotCCRN' (object ID 181575685).
Msg 8928, Level 16, State 1, Line 1
Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data): Page (1:1296889) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1296889), row 18. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 20100 and 327.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1296889), row 18. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 20100 and 327.
Msg 8928, Level 16, State 1, Line 1
Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data): Page (1:1976845) could not be processed. See other errors for details.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1976845), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10424 and 267.
Msg 8944, Level 16, State 17, Line 1
Table error: Object ID 357576312, index ID 1, partition ID 72057614513078272, alloc unit ID 72057614701166592 (type In-row data), page (1:1976845), row 22. Test (columnOffsets->offTbl [varColumnNumber] <= (nextRec - pRec)) failed. Values are 10424 and 267.
CHECKDB found 0 allocation errors and 6 consistency errors in table 'AccrualSnapshot' (object ID 357576312).
CHECKDB found 0 allocation errors and 9 consistency errors in database 'ukcrn_reporting'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (ukcrn_reporting).
This time with the recommended repair level.
I guess all thats left is to run it and ensure our backups routines are in place. Many thanks for all your help its been extremely useful.
Thanks again
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply