June 18, 2005 at 8:13 am
Dear All,
Greetings!
I'm new to backup/restore strategies.
I have the following queries with respect to backup and restore strategies for disaster recovery:
1. As per SQL BOL, the transaction log cannot be backed up while a full database backup is in progress. Suppose I have a big database (30 GB) that takes roughly 1.5 hours to back up. Say the system crashes an hour into that backup, i.e. the blue screen of death, and I cannot recover anything; I cannot even boot the system by any means. In this case, how can I recover the transactions that were done during that hour while the full database backup was running?
2. Apart from clustering, is there a way to ensure I don't lose even one minute of data? I know we can have a standby server and log shipping, but there is a time delay in that methodology too. Can replication, linked servers, or custom triggers be used to tackle this?
3. How do I simulate a real production database crash on my test servers?
Waiting for expert thoughts on these questions.
Thanks and Regards,
Dilip
June 18, 2005 at 10:52 am
I'm sure you'll get a lot of responses on these. Here's a quick stab at them though.
1. If you had a failure in the middle of the full backup, you would lose everything back to the last complete transaction log backup. Generally, you want to run the full backups in a "maintenance window" when there is less data to lose. If you lose the database but still have the transaction log files, you may be able to recover all transactions using a tool like Lumigent Log Explorer. See the sketch after this list for a schedule that keeps that exposure small.
2. Clustering doesn't help you with loss of data; it helps you with loss of servers. Log shipping, SQL Server replication, and hardware replication, mirroring, snapshots, etc. provide high availability of data. Clustering provides high availability of server hardware. Replication is a decent solution. You might also want to look at SAN snapshots, mirroring, cloning, hardware replication, etc. if you are in a SAN environment.
3. Come up with a scenario test plan. Hardware failure is easy to simulate; use your imagination on that one. Your DR test scripts should have scenario groupings that also test loss of data (set up test accounts, corrupt data at various points, and test recovery), etc. A real production crash can be a LOT of different things. Any time data the business is expecting is unavailable, they consider it a crash; keep that in mind when constructing your plans. Also, be sure to weigh the cost of implementation, support, and testing against the benefit to the business. Make sure they know those metrics and make the final decisions.
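As a rough illustration of keeping the exposure window small, here is a minimal sketch; the database name, paths, and schedule are assumptions, not a tested script:

-- Nightly full backup (database name and paths are made up)
BACKUP DATABASE MyERPDb
    TO DISK = 'E:\Backups\MyERPDb_full.bak'
    WITH INIT

-- Transaction log backup, run on a short schedule (e.g. every 15 minutes)
BACKUP LOG MyERPDb
    TO DISK = 'E:\Backups\MyERPDb_log.trn'
    WITH NOINIT

-- If the data files are lost but the .ldf survives, a tail-log backup may
-- still capture the most recent transactions
BACKUP LOG MyERPDb
    TO DISK = 'E:\Backups\MyERPDb_tail.trn'
    WITH NO_TRUNCATE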
Derrick Leggett
Mean Old DBA
When life gives you a lemon, fire the DBA.
June 20, 2005 at 9:42 am
Derrick,
Thanks for the info, but I think I'm looking for a deeper look into these:
1. I don't have any "maintenance window"; please assume that my app runs 24x7. In this case I don't want to lose any data. If a backup is going on and has completed, say, 50% (i.e. about one hour in), no tlog backups are taking place during that time, so how can I recover that one hour of data?
2. Replication has its overheads too, I believe. I'm thinking of log shipping. Would it be an appropriate solution? Even log shipping implementations, I have heard, run into lots of errors, especially with tlog restoration.
3. I would like some scenarios. If anyone can please write them down, e.g. how do I corrupt the ldf or system database files? Should I delete them, or how else do I test a database crash on my test system? I want some examples.
Thanks in advance
Bye
Dilip
June 20, 2005 at 3:27 pm
1) You cannot recover this data. If a crash happens in the middle of a full backup, you cannot recover the data that was updated/inserted/deleted since the last tlog backup. This is a limitation of SQL Server, and without a substantial investment in some hardware-level mirroring you are hosed.
2) Log shipping is a very nice solution with very low overhead, but its limitation is that you risk some data loss. Essentially you risk losing N minutes of data, assuming that you are shipping logs every N minutes. See the sketch after this list for the basic idea.
3) a. Pull the plug on your production server. This will replicate a power failure.
b. Kill the sqlservr.exe process.
c. Manually corrupt the master database.
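Since log shipping keeps coming up, here is a rough sketch of the mechanics behind it; the server names, paths, and interval are assumptions, not a working configuration:

-- On the primary, every N minutes (names and paths are made up):
BACKUP LOG MyERPDb
    TO DISK = '\\StandbyServer\LogShip\MyERPDb_tlog.trn'
    WITH INIT

-- On the standby, shortly afterwards; STANDBY leaves the database readable
-- between restores
RESTORE LOG MyERPDb
    FROM DISK = 'D:\LogShip\MyERPDb_tlog.trn'
    WITH STANDBY = 'D:\LogShip\MyERPDb_undo.dat'

-- If the primary dies, you lose at most the transactions since the last
-- shipped log, i.e. up to N minutes of work.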
June 20, 2005 at 9:23 pm
Jason, thanks for the inputs. They will do me a lot of good.
But I have some more queries in my mind regarding backup strategies.
1. I'm curious to know what happens while a backup is taking place. I know the data is inserted in the tables, but at the same time the DML and other operations are also stored in the tlog. Can we read this, or is there a system table where this info is stored at the time of a full backup? Another thing that comes to mind: during the two-hour window while the full backup is running, I could place some triggers so that, in parallel, the data is also inserted into another database for those two hours, i.e. implementing triggers or setting up replication. Is there any other approach? Mine is an ERP database which hosts 1,200 tables, so I think I cannot set triggers because it may cause quite a lot of overhead. Merge replication in this scenario, I think, would do good, i.e. just for the two hours during the full backup window. Any other approach or ideas?
Thanks for the simulation techniques for crashing a database.
Regards,
Dilip
June 21, 2005 at 12:22 pm
Replication or triggers will give you what you're looking for, but at a substantial overhead price. The only solutions I can think of are either expensive in price, for example hardware mirroring or snapshot technologies for high availability, or expensive in database performance. The sketch below shows why the trigger route gets expensive.
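A very rough sketch of the trigger idea, just to show where the overhead comes from; the table, database, and trigger names are invented, and a real setup would also need update/delete triggers and identity-column handling on every one of your 1,200 tables:

-- Copy every new row into a second database (all names are assumptions)
CREATE TRIGGER trg_Orders_CopyToStandby
ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON
    INSERT INTO StandbyDb.dbo.Orders
    SELECT * FROM inserted
END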
Your best bet is to find ways to reduce your backup window. Two hours is a very long time to back up a database that is only 30 GB. You should try backing up locally or backing up to a faster disk system for improved performance. I also recommend taking a look at a software utility that will compress your backup files; this will give you a substantially faster backup as well. Take a look at LiteSpeed (www.imceda.com). Striping the backup across several local disks, as sketched below, can also shorten the window.
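For what it's worth, a minimal sketch of a striped local backup; the drive letters and file names are assumptions:

-- Write the backup to two local disks in parallel
BACKUP DATABASE MyERPDb
    TO  DISK = 'E:\Backups\MyERPDb_stripe1.bak',
        DISK = 'F:\Backups\MyERPDb_stripe2.bak'
    WITH INIT, STATS = 10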
Does anyone else have any advice for "cheap" high availability solutions that do not impose a significant performance hit?
June 23, 2005 at 2:11 am
Jason,
Thanks for the inputs. I had another look at the backupset table; it actually takes an average of 56 minutes, not 2 hours, to complete the backup of the 31 GB database. I will try LiteSpeed, thanks for the info. Can snapshot backups be taken by Veritas? Is there any disadvantage to snapshot backups? There is little info on snapshot backups in SQL BOL. Please suggest.
Regards,
Dilip
June 23, 2005 at 7:31 am
I'm not sure whether snapshot backups can be done by Veritas or not. There are a few companies that do snapshot backups, but it is typically an expensive solution. Snapshot backups are great for fast backups but do have their holes. For example, if you ever have a logical failure in your database (someone drops a table, mistakenly deletes a row, or corrupts the system DBs), your snapshot will be a duplicate of the logical failure; essentially you will have two copies of the bad database. With backup/recovery you have the ability to do things like point-in-time recovery, should you ever need to recover up to the second before the failure, and greatly reduce your data loss. A sketch of that kind of restore is below.
I think of snapshot solutions as more of a high availability solution rather than an all-inclusive DR solution.
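Roughly, the point-in-time restore mentioned above would look like this; the database name, paths, and timestamp are made up:

-- Restore the last full backup but leave the database unrecovered
RESTORE DATABASE MyERPDb
    FROM DISK = 'E:\Backups\MyERPDb_full.bak'
    WITH NORECOVERY

-- Restore the log backups, stopping just before the bad transaction
RESTORE LOG MyERPDb
    FROM DISK = 'E:\Backups\MyERPDb_log.trn'
    WITH STOPAT = '2005-06-23 07:29:00', RECOVERY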
June 23, 2005 at 8:40 am
OK, I understand. Snapshots can be used in combination with the existing complete backup and tlog backups in a DR strategy, but they cannot be used as the only solution and cannot be relied upon alone. Thanks for the useful info.
Dilip
June 24, 2005 at 1:26 am
You could reduce your exposure slightly by doing a full backup only once a week and doing differential backups on the other days of the week, which should take less time than the full backup. A sketch of that schedule is below.
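As a rough sketch of that weekly schedule (database name and paths are assumptions); remember that a restore then needs the full backup, the latest differential, and any log backups taken after it:

-- Sunday: full backup
BACKUP DATABASE MyERPDb
    TO DISK = 'E:\Backups\MyERPDb_full.bak'
    WITH INIT

-- Monday through Saturday: differential backup (changes since the last full)
BACKUP DATABASE MyERPDb
    TO DISK = 'E:\Backups\MyERPDb_diff.bak'
    WITH DIFFERENTIAL, INIT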
Alternatively... wait for SQL 2005, I remember reading somewhere that a full backup no longer blocks log backups, and it also supports database mirroring, i.e. log shipping in real-time.
June 24, 2005 at 4:51 am
Right. In SQL 2005 you are able to take a log backup while a full backup is going on. This will revolutionize backup strategies ;}
I also agree with the idea of taking differential backups more frequently. These should dramatically reduce your backup window, especially if the database is fairly static. If all 30 GB of data changes frequently, they won't save you very much.
June 27, 2005 at 10:06 am
I recently came across file backups mentioned in BOL, which allow tlog backups to keep running while a file backup is going on. But as far as my theoretical knowledge goes, file backups involve a lot of complexity in terms of arranging your tables onto different disks, determining the frequently modified versus infrequently modified data, etc. Since mine is an ERP database which hosts around 1,200 tables, determining the most frequently accessed tables versus the somewhat accessed tables is a bit difficult, because ERP tables are very much interrelated and I'm not sure there's an easy way to organize the tables into different filegroups. (The kind of backup sequence I mean is sketched at the end of this post.)
If anyone has real hands-on experience using file backups on an enterprise-level database, it would be great if you could write down your experiences with file backups on huge databases that are online almost 24 hours. Thanks in advance.
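For reference, the sequence BOL describes would look roughly like this; the database, filegroup, and path names are invented, and the real complexity is in deciding which tables go into which filegroup:

-- Back up one filegroup at a time instead of the whole 30 GB
BACKUP DATABASE MyERPDb
    FILEGROUP = 'FG_Transactions'
    TO DISK = 'E:\Backups\MyERPDb_fg_trans.bak'
    WITH INIT

-- Log backups continue on their normal schedule alongside the file backups
BACKUP LOG MyERPDb
    TO DISK = 'E:\Backups\MyERPDb_log.trn'
    WITH NOINIT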