April 16, 2012 at 8:11 pm
Msg 8933, Level 16, State 1, Line 1
Table error: Object ID 467961935, index ID 1, partition ID 72057599455330304, alloc unit ID 72057599500812288 (type In-row data). The low key value on page (1:42120649) (level 0) is not >= the key value in the parent (1:26128664) slot 39.
Msg 8933, Level 16, State 1, Line 1
Table error: Object ID 467961935, index ID 1, partition ID 72057599455330304, alloc unit ID 72057599500812288 (type In-row data). The low key value on page (1:42120650) (level 0) is not >= the key value in the parent (1:26128664) slot 40.
Msg 8934, Level 16, State 2, Line 1
Table error: Object ID 467961935, index ID 1, partition ID 72057599455330304, alloc unit ID 72057599500812288 (type In-row data). The high key value on page (1:43290246) (level 0) is not less than the low key value in the parent (0:1), slot 0 of the next page (1:42120649).
Msg 8934, Level 16, State 3, Line 1
Table error: Object ID 467961935, index ID 1, partition ID 72057599455330304, alloc unit ID 72057599500812288 (type In-row data). The high key value on page (1:43290246) (level 0) is not less than the low key value in the parent (0:1), slot 0 of the next page (1:42120649).
CHECKDB found 0 allocation errors and 4 consistency errors in table 'SummaryHeader' (object ID 467961935).
CHECKDB found 0 allocation errors and 4 consistency errors in database 'COREISSUE'.
repair_rebuild is the minimum repair level for the errors found by DBCC CHECKDB (COREISSUE).
This is a replicated table and it seems like the clustered index is hosed (sitting on the subscriber) and unless there's no other way to avoid it, dropping the subscription and recreating the publication really isn't an option.
From what I recall about running a DBCC CHECKDB(Table, REPAIR_REBUILD), I believe I would have to stop/pause replication, put the entire DB in single user model then run the above statement...
Can anyone help with alternatives? Steps for success?
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 16, 2012 at 10:36 pm
Check the publisher and make sure that table is fine there.
You can remove the article 'SummaryHeader' from replication and re-add it without generating snapshot for other articles in the publication.
Another alternative is to do page restore of page 1:42120649
April 16, 2012 at 10:47 pm
The CHECKDB is fine on the Publisher database. No errors.
I am not confident on how to drop an article from replication WITHOUT regenerating the entire snapshot for the entire publication. This is production env and it takes several hours to snapshot everything...
I've set the immediate sync to "NO" but everytime I try to add a new article the darn thing attempts to re snapshot the entire publcation. Do you have a step-by-step for accomplishing this?
EXEC sp_changepublication
@publication = 'COREISSUE_TABLES',
@property = 'immediate_sync',
@value = 'false'
GO
I have no idea how to restore a page...do you have TSQL for this?
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 17, 2012 at 12:02 am
1) Run the following (with your database and publication name):
USE <Database>
GO
EXEC sp_changepublication
@publication = 'MainPub',
@property = N'allow_anonymous',
@value = 'false'
GO
EXEC sp_changepublication
@publication = 'MainPub',
@property = N'immediate_sync',
@value = 'false'
GO
2) Add new article to the publication using UI.
3) Right click on the publication. Select "View Snapshot Agent Status".
Click on "Start" button.
This will generate snapshot for only the newly added articles.
Page restore example:
restore database repl page = '1:300' from disk = 'D:\BACKUP\page.bak'
backup log repl to disk = 'D:\BACKUP\page.trn'
restore log repl from disk = 'D:\BACKUP\page.trn'
April 17, 2012 at 4:29 am
REpair_rebuild should work, however the DB has to be in single user mode for the duration of the repair. That'll be at least as long as a normal checkDB, possibly longer.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 17, 2012 at 11:07 am
Thanks for the responses! So if I do the repair_rebuild option, since this is a db at the subscriber the subscription and repl_commands are going bomb while they attempt to deliver them to the DB. Can I just simply stop the logreader and related Agents on the publisher while I do this maintenance or do I need to do something else to preserve the publication/subscription?
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 17, 2012 at 12:11 pm
Don't stop the log reader unless you want a full log on the publisher to boot. No, there's no problems from taking a subscriber offline, replication is designed to handle that.
Its the users of the subscriber you need to worry about.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 17, 2012 at 2:34 pm
I'm sure this is probably a stupid question...but is there a preferred method to taking a subscriber offline?
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 17, 2012 at 2:41 pm
ALTER DATABASE ... SET SINGLE USER (and make sure you are that single user) 🙂
Replication's more robust than most people give it credit for. The distributor knows what's been sent and it will retry. As long as you don't keep the subscriber down for longer than the retention period (which is usually days or weeks), you should not have any problems.
Do another checkDB after you finish the repair, make sure everything is fixed.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 18, 2012 at 9:28 pm
Okay, last night I tried the ALTER DB SET SINGLE USER method and it ended up not being able to repair everything (I accidentally overwrote the error output so i can't post that (sorry)).
Tonight I thought I'd try the dropping of the existing article, re-adding it, then starting the snapshot agent. Everything appeared to go okay, the snapshot completed, the files are present at the UNC share, but I immediately began to see these errors from the log reader agent
The process could not execute 'sp_replcmds' on 'SERVER'. (Source: MSSQL_REPL, Error number: MSSQL_REPL20011)
Could not locate text information records for the column "FraudMemo", ID 840 during command construction. (Source: MSSQLServer, Error number: 18773)
The Log-Scan Process failed to construct a replicated command from log sequence number (LSN) {00066b3f:00004ead:0001}. Back up the publication database and contact Customer Support Services. (Source: MSSQLServer, Error number: 18805)
The process could not execute 'sp_replcmds' on 'SERVER'. (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)
Now, FraudMemo isn't even in the table/article I dropped and re-added, it's in a completely different table!
Seems like it screwed up the entire publication 🙁
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 18, 2012 at 11:32 pm
After I calmed down (stopped freaking out) a quick search showed me all the objects that had this "bad column - FraudMemo" and this time stopped the Log Reader Agent before making any changes (not sure if this helped or not). I located the two tables and dropped their corresponding article from the publication (one-by-one) then started up the log reader agent. After the first dropped article I still got the exact same errors. Dropped the second article, and errors went away.
Fishy. :unsure:
Replication for this publication is still currently behind (about 650,000 commands) but I do see the record counts increasing within the replicated tables - so I believe I can assume this is progress.
I wish I knew more about the inner workings of T-replication (if only there were 30 hours in the day) - could the log reader agent running have caused this or is it more likely this was all just bad timing and something else occurred at the same time I was doing this? In the SQL error logs I did see this (which occurred in the same timeframe as my replication tweaks):
During a maintenance window
A fatal error occurred while reading the input stream from the network. The session will be terminated (input error: 64, output error: 0).
There's really not much else in the logs...
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 19, 2012 at 11:01 pm
For anyone wondering about this thread:
The steps taken from the prior attempt were the proper steps. It would appear that the "network blip" that occurred while I was doing the replication changes caused an issue with a different article (which I assume corrupted the repl_commands for that table en route to the distributor. Dropping the corrupted article and then re-adding it back in went smoothly.
I followed Suresh's steps again today (the only exception being I did temporarily stop the log reader agent) on a different article and everything completed successfully.
My errors have disappeared from the error logs, and hopefully things will continue to stay that way!
Thanks Suresh and Gail for your assistance. As always, it's GREATLY appreciated!!!
______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience
April 19, 2012 at 11:23 pm
Thanks for your updates. Happy to know that you are able to fix the issues.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply