Corrupted DB Help Needed

Question

Post reply

Corrupted DB Help Needed

beechpilot

Right there with Babe

Points: 779
More actions
June 15, 2004 at 9:01 am

#102066

OK, after a truly harrowing day yesterday, I was sitting in my lawyer's office for some more harrowing discussion (non-criminal and no, not in trouble), when the call comes in: The website is down, the database server (remote, about 60 miles away) won't respond.
He took the website offline (releasing the several hundred connections that were pending on that box - Solaris 8 running FreeTDS to connect to SQL Server 7.0), and could finally get to the server with Terminal Services. Upon inspection of the error logs, we find a stack dump as well as Error 3314 and 3414. In checking this site, I found a reference to Error 605 (no help it seems).
I had him redirect the website traffic to a known, good, server (but not exactly "up to date" with data to be replicated from the source (and now corrupted) database.
The short of it - it failed while trying to rollback about a million rows being inserted via a DTS copy. They apparently hard-booted the server during this rollback and according to the stack dump, that's exactly where the logfile went down the tubes (here's the short version if the dump - spaces around some colons inserted intentionally to keep smileys out of it):
MemoryLoad = 15% TotalPhys = 2047 MB AvailPhys = 1725 MB
TotalPageFile = 4095 MB
Dump thread - spid = 8, PSS = 0x4127e084, EC = 0x4127e234
Stack Dump being sent to D:\MSSQL7\log\SQL00001.dmp
Short Stack Dump
0x77f94956 Module(ntdll+14956) (ZwGetContextThread+b)
0x0078c44a Module(sqlservr+38c44a) (utassert_fail+1a0)
0x005ad5ce Module(sqlservr+1ad5ce) (Page : : DeleteRow+38)
0x0042eab7 Module(sqlservr+2eab7) (PageRef : : DeleteRows+367)
0x005b756e Module(sqlservr+1b756e) (XDES::UndoPageOperation+124)
0x004346b7 Module(sqlservr+346b7) (XDES::RollbackToLsn+229)
0x005b4fc6 Module(sqlservr+1b4fc6) (RecoveryMgr::UndoPass+1c8)
0x004c181b Module(sqlservr+c181b) (RecoveryMgr::RecoverDb+369)
0x004ca910 Module(sqlservr+ca910) (DBTABLE::Startup+630)
0x004c9eed Module(sqlservr+c9eed) (DBMgr::StartupDB+2c4)
0x004d07dc Module(sqlservr+d07dc) (StartDBsInParallel+147)
0x41092cff Module(ums+2cff) (ProcessWorkRequests+102)
0x41092a6a Module(ums+2a6a) (FiberStartRoutine+ae)
0x7c57438b Module(KERNEL32+438b) (TlsSetValue+f0)
0x7c571608 Module(KERNEL32+1608)
To really mess things up, instead of doing the obvious (DBCC CHECKDB with options), they DETACHED the <expletive> files!
Now I'm left with two database files (a data file and a log file), which cannot be attached either via sp_attach_db or sp_attach_single_file_db, and there's data in there that I need to get. The logfile is apparently useless to update this database to a point where I can pull the affected transactions from the db and apply them to the stable one I built last night (and the wee hours of the morning). The data loss is unacceptable, and I know that I'm going to be taking a bit of heat this afternoon.
So - if anyone has some options I can try, to get these files reattached (with or without the log - I can whittle through the data and find what I need), I'd really be appreciative. (No advice on what to do with the perpetrators of this monumental screw-up -- I have that well in hand... they will be mowing my 2 acre back yard with push mowers this weekend (just kidding)...
Thanks in advance for any help anyone can provide!

-- Joe

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

SQLBill SSC Guru Points: 51440 More actions · Answer 1

Gotta ask you:

(a data file and a log file), which cannot be attached either via sp_attach_db or sp_attach_single_file_db,

Why not? Did you get error messages when you tried it?

-SQLBill

SQLBill SSC Guru Points: 51440 More actions · Answer 2

If you HAVEN'T tried this already:

Rename the .ldf file (this is important) to something like old_dbname_log.ldf (just add old_ to the name).

Then try the sp_attach_single_file_db command (or it's SQL7 equivelant).

If you don't rename or move the .ldf, SQL will still try to attach it even when using the sp_attach_single_file_db command.

-SQLBill

Johan Bijnens SSC Guru Points: 135369 More actions · Answer 3

you've already been advised to use sp_attach_db ... (see bol)

Then perform the checkdb

Any backup-files ??? (full / diff / log)

All ddl scripted nicely ?

Review you backup-schema !

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

beechpilot Right there with Babe Points: 779 More actions · Answer 4

Hey SQLBill

Here's the logs from ERRORLOG and error message (and I also renamed the logfile as you suggested). "dbname" substituted for the actual database name. Also, since a database of the same name is already on the server, I was trying to use a suffix of 'bad' on the dbname (see below).

Command Issued:

sp_attach_single_file_db 'dbname_bad', 'd:\mssql7\data\dbname_d1.mdf'

Error Message from MSSQL/QA:

Server: Msg 945, Level 14, State 2, Line 1

Database 'dbname_bad' cannot be opened because some of the files could not be activated.

Server: Msg 1813, Level 16, State 2, Line 1

Could not open new database 'dbname_bad'. CREATE DATABASE is aborted.

Logfile entries:

2004-06-15 15:50:02.99 spid40 Opening file d:\mssql7\data\dbname_d1.mdf.

2004-06-15 15:50:03.03 spid40 Closing file d:\mssql7\data\dbname_d1.mdf. 2004-06-15 15:50:03.10 spid40 Starting up database 'dbname_bad'.

2004-06-15 15:50:03.10 spid40 Opening file d:\mssql7\data\dbname_d1.mdf.

2004-06-15 15:50:03.18 spid40 File d:\mssql7\data\dbname_L1.ldf does not exist - unable to activate.

2004-06-15 15:50:03.21 spid40 Closing file d:\mssql7\data\dbname_d1.mdf.

So apparently, it's still looking for that logfile, even though it's name is now "bad_dbname_L1.mdf"

Same thing if plain "sp_attach_db" used.

To alzdba:

As for backups and such, the databases are backed up to disk every day at 3:00 AM, and the backup files are taken off to tape the next morning. These are retained for two weeks. Full schemas are kept at various locations so recreating the database is not a problem.

Our problem was not recovery of most of the data (all 10G of it) - it was a small subset from the database in question from about 3:00 PM EDT to about 7:00 PM EDT. This data was never replicated to the various copies we have in other locations. Recovery of all of the data except that in the timeframe above, was done by DTS magic pulling data from another location over a high-speed link and 2.5 hours later, we were back up again...

We just need to get into these files from the corrupted db, so we can extract the data from the timeframe where data was lost. Hope that clarifies things a bit.

Cheers,

-- Joe

hoo-t SSChampion Points: 12713 More actions · Answer 5

I wonder if a tool like Lumigent Log Explorer would work? I don't know if it works on a log file that is not attached to a db.

Steve

Allen Cui-55137 SSC Guru Points: 51650 More actions · Answer 6

Create new database with same name, stop SQL Server, replace the MDF file from original database file and restart SQL Server to see whether your database is back in suspect mode. If it does, reset the database to emergency mode and run DBCC Rebuild_Log to recreate the log file. Once database is back to normal, run DBCC checkdb to ensure there is no integrity error, otherwise, you have to BCP the data out.

Undocumented DBCC Command REBUILD_LOG

EXEC sp_configure 'allow updates', 1

RECONFIGURE WITH OVERRIDE

GO

BEGIN TRAN

UPDATE master..sysdatabases

SET status = status | 32768

WHERE name = 'MyDatabase'

IF @@ROWCOUNT = 1

BEGIN

COMMIT TRAN

RAISERROR('emergency mode set', 0, 1)

END

ELSE

BEGIN

ROLLBACK

RAISERROR('unable to set emergency mode', 16, 1)

END

GO

EXEC sp_configure 'allow updates', 0

RECONFIGURE WITH OVERRIDE

GO

-- Restart SQL Server at this point.

DBCC REBUILD_LOG('MyDatabase','C:\MyDatabase.ldf')

/*Perform physical and logical integrity checks at this point.

Bcp data out if your integrity checks demonstrate that problems exist.

*/

ALTER DATABASE MyDatabase SET MULTI_USER

GO

-- Set database options and recovery model as desired.

GO

beechpilot Right there with Babe Points: 779 More actions · Answer 7

Thanks for everything folks - got the data I needed late last night, and it's been safely merged back into a new production db.

Just a quick reflection: I had a hard time selling SQL Server to the Unix guys here, as they thought the DB they were using under the Open Source license was absolutely fantastic -- until it started slowing down to a crawl. A failure last November (and a 41 day recovery period!) allowed me to slip this in and get it through.

With this failure - a weird one at that - and only a little over a day to recover, we were able to continue production while recovering, and the amount of data lost was miniscule but fully recoverable. It goes to show that not only is SQL Server a good DB to use and better than some of the other DBs out there - but the adage "you gets what you pays for" definately applies here

Cheers to all - and thanks!

-- Joe

SQLBill SSC Guru Points: 51440 More actions · Answer 8

You say you got it to work. Great news.

However, what enabled you to solve the issue? Was it one of our suggestions? Or one of your own?

Post the solution here or say which solution helped you. That way someone else who may have the same issue and finds this thread will know what solved it for you.

Thanks,

-SQLBill

Jeff Babcock Grasshopper Points: 11 More actions · Answer 9

I had the same error and the suggestions posted by Allen Cui worked - putting the database into emergency mode, then rebuilding the log, etc.

Cheers

Jeff