July 3, 2008 at 8:43 am
I've been a SQL DBA for close to six years and I've never seen this exact issue.
I have a database that is the target of a log shipping setup, the past two days the database has restored seemingly fine from full backups and tlogs, however when I try to run a query on a specific table in the restored database I get this error:
----------------------------------------------------------------
Server: Msg 3624, Level 20, State 1, Line 1
Location: recbase.cpp:1378
Expression: m_offBeginVar < m_SizeRec
SPID: 103
Process ID: 384
Connection Broken
----------------------------------------------------------------
I have it narrowed down to a single row in the table that is causing the problem.
When I run a repair on the db it finishes and fixes it by deleting the single record.
My concern is that the subsequent backups coming from the remote database are going to continue to have this problem since the last two days have both done it. Prior to this, the backup from three days ago is fine and contains the same row but has no issues.
My question, has anyone seen this before? Is there something I can do to fix the problem going forward? What might be the source of the issue? Obviously I've already googled like crazy but there doesn't seem to be much suggestions other than do a repair and hope it doesn't happen again.
Any help is greatly appreciated.
Ben
July 3, 2008 at 11:37 am
July 3, 2008 at 12:18 pm
Thanks for the links Mani. I checked out the articles you listed but none apply specifically to my problem. I don't get the error when I back up the DB or when I restore from the backup. I don't have any clustered indexes and I'm not using any lock hints.
Any other ideas on this?
July 3, 2008 at 12:28 pm
Can you run DBCC CHECKDB (dbname) WITH ALL_ERRORMSGS, NO_INFOMSGS and post the RESULT here..
Maninder
www.dbanation.com
July 3, 2008 at 12:34 pm
Here's what I get:
-----------------------------------------------------------------------
Server: Msg 8928, Level 16, State 1, Line 1
Object ID 1371151930, index ID 0: Page (1:2948) could not be processed. See other errors for details.
Server: Msg 8944, Level 16, State 1, Line 1
Table error: Object ID 1371151930, index ID 0, page (1:2948), row 2. Test (VarColOffsets + (int)(sizeof (class VarColOffset) * nVarCols) <= (nextRec - pRec)) failed. Values are 1245 and 863.
Server: Msg 8976, Level 16, State 1, Line 1
Table error: Object ID 1371151930, index ID 1. Page (1:2948) was not seen in the scan although its parent (1:3045) and previous (1:3114) refer to it. Check any previous errors.
Server: Msg 8978, Level 16, State 1, Line 1
Table error: Object ID 1371151930, index ID 1. Page (1:3116) is missing a reference from previous page (1:2948). Possible chain linkage problem.
CHECKDB found 0 allocation errors and 4 consistency errors in table 'invhdr' (object ID 1371151930).
CHECKDB found 0 allocation errors and 4 consistency errors in database 'cm_hoo_mpg'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (cm_hoo_mpg ).
-----------------------------------------------------------------------
If you can make sense of that I applaud your intimate knowledge of SQL Server!
Thanks Again for you help.
July 3, 2008 at 2:26 pm
Pl can you post the result for
dbcc page (cm_hoo_mpg, 1, 2948, 3)
If you are ready to Take a HIT at possible Data Loss (Maybe a record or someting)
use this:
ALTER DATABASE cm_hoo_mpg SET SINGLE_USER
DBCC CHECKDB (cm_hoo_mpg, REPAIR_ALLOW_DATA_LOSS)
Maninder
www.dbanation.com
July 3, 2008 at 2:35 pm
select name,dbid from sys.sysdatabases
get the DBID from above result then
select object_name(id,yourdatabase id),id from sys.sysobjects
here id=1371151930
check the name of the Object and figure out if this is a index or other object... and post the results
Maninder
www.dbanation.com
July 3, 2008 at 2:39 pm
That statement doesn't seem to run, here's what comes back:
------------------------------------------------------------------------------
[Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionCheckForData (CheckforData()).
Server: Msg 11, Level 16, State 1, Line 0
General network error. Check your network documentation.
Connection Broken
------------------------------------------------------------------------------
I've tried running DBCC CHECKDB (cm_hoo_mpg, REPAIR_ALLOW_DATA_LOSS)
and it did successfully fix the problem by deleting one row from the table. The problem is this is a log shipping database so when the next full backup gets applied it just recreates the original problem. The original database does not have any problems that I can tell so all I can think is something is getting messed up suring the backup process or when it's being sent over the network.
July 3, 2008 at 3:11 pm
Have you done a checkDB on the source database?
While it's not unheard of for a backup to have an error, to have multiple backups producing the same error is 'unusual' to say the least. If there were network errors I would more expect there to be different errors each time.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 3, 2008 at 3:21 pm
Mani Singh (7/3/2008)
select name,dbid from sys.sysdatabasesget the DBID from above result then
select object_name(id,yourdatabase id),id from sys.sysobjects
here id=1371151930
check the name of the Object and figure out if this is a index or other object... and post the results
The object with that ID is a table.
July 3, 2008 at 3:22 pm
GilaMonster (7/3/2008)
Have you done a checkDB on the source database?While it's not unheard of for a backup to have an error, to have multiple backups producing the same error is 'unusual' to say the least. If there were network errors I would more expect there to be different errors each time.
I just ran a check on the source database and it found no issues.
July 3, 2008 at 3:25 pm
Very strange...
Do you have space on the source server to restore one of the full backups that produces an error on the standby server? If so, please restore the backup there and see if the corruption's present. If not, see if you can find some other server somewhere and see if the backup will restore clean on that machine.
Perhaps also run some disk checks on the standby server?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 3, 2008 at 3:38 pm
GilaMonster (7/3/2008)
Very strange...Do you have space on the source server to restore one of the full backups that produces an error on the standby server? If so, please restore the backup there and see if the corruption's present. If not, see if you can find some other server somewhere and see if the backup will restore clean on that machine.
Perhaps also run some disk checks on the standby server?
I can't really do a restore directly back onto the source server until off hours.
I just did a restore of the backup onto a test server and the exact same thing happens when I try to query the table mentioned in the DBCC CHECK results. So that at least makes me feel better that my log shipping target server probably doesn't have a hardware issue. I would've had a rough time relaxing this weekend worrying about that:w00t:
Good idea though, thanks for the help everyone!
July 3, 2008 at 4:12 pm
Unless anyone else chimes in this weekend with anything else to try I'll probably end up opening a ticket with MS on Monday. Hopefully they will be able to figure this out.
Thanks again everyone.
July 3, 2008 at 5:16 pm
you will have to run this on the production database and not on the backup database. because that is the source of the issue.
See if you can get a outage on the production database....
Maninder
www.dbanation.com
Viewing 15 posts - 1 through 15 (of 21 total)
You must be logged in to reply to this topic. Login to reply