DR for large database

  • I am planning DR for a 300 GB database. I intend to restore a full backup to the DR server once a month and use log shipping to keep the DR database as up to date as possible - so far, so good.

    But what happens if the live server fails halfway through the full restore of the monthly backup? I do not want to have to wait 12 hours for the restore to complete. I did think of having two databases - one (the primary DR db) that keeps on receiving the logs while the second does the restore - and then renaming the dbs. But this doesn't work, as you can't rename a db that is in a restoring state.

    Has anyone planned for this situation and what do you recommend?

    Thanks

  • Why are you going to restore a full backup once a month?

    With log shipping, you usually do a full backup/restore once, and then keep the secondary up to date with just the transaction logs. If you need to bring the DR site online, it's just a question of restoring the last log backup(s).
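
    For what it's worth, bringing the secondary online is only a couple of statements. A rough sketch - the database name and file path here are made up for illustration:

        -- Apply any outstanding log backups, leaving the database restoring
        RESTORE LOG DRDB
            FROM DISK = 'D:\LogShip\DRDB_latest.trn'
            WITH NORECOVERY;

        -- Then recover the database to bring the DR copy online
        RESTORE DATABASE DRDB WITH RECOVERY;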

    As to your original question... if you are backing up a database called "TOM", why not restore it to one called "DICK", and have the second DR database called "HARRY"? To use either of your DR databases, you are going to have to bring them online, and you can rename whichever one it is to "TOM".
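
    As a sketch of what I mean (the paths, logical file names and the rename step are just for illustration - check them against your own setup):

        -- Restore the monthly full backup of TOM as a database called DICK
        RESTORE DATABASE DICK
            FROM DISK = 'D:\Backups\TOM_Full.bak'
            WITH MOVE 'TOM_Data' TO 'E:\Data\DICK.mdf',
                 MOVE 'TOM_Log'  TO 'F:\Logs\DICK.ldf',
                 NORECOVERY;

        -- When you need DICK as the live database, bring it online and rename it
        RESTORE DATABASE DICK WITH RECOVERY;
        EXEC sp_renamedb 'DICK', 'TOM';  -- may need single-user mode first on SQL 2000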

  • The full backup/restore once a month is a customer requirement - I won't go into detail....

    I like the TOM, DICK and HARRY approach - it will need a clever approach with the restore operations in the log shipping schedules, but that should not be too difficult to sort out.

    Thanks for your comments.

  • Hi John.

    I am assuming that the monthly restore is to prove that the backups actually work. If this is the case, gold star to the client. If not, perhaps you can persuade them otherwise by using the following.

    TOM is your working server, which you hopefully back up more than once a month.

    DICK is your DR server, which you log ship to from TOM.

    HARRY is used as a test restore machine to prove your backups work. You can also push logs to it without processing them. This gives you extra redundancy, as you can start log shipping as and when you need to.

    You could also run HARRY as a named instance on DICK.

    To summarise:

    * Breaking your log shipping every month will expose you to potential downtime and loss of service. If TOM fails just as you start the restore, the client's system is down!!

    * If TOM cannot be fixed, you will need to wait for the database restore and log shipping to complete before the client can resume normal working.

    * This could be hours!! Are they willing to accept this?

    One last thing. Consider setting log shipping up so that you can ship both ways. If you fail over to the DR site, you can log ship back to TOM, which should make your failback easier.

    Let us know how you get on.

    Hope this helps.

  • Thanks for the excellent feedback. The client wants more than just a gold star and the comment about proving that the backups are working is 100% correct!

    However, he is looking for a Belt, Braces and Elasticated Trousers approach:-).

    All things considered, I think that I may go with the following:

    TOM - Primary DB log ships to DICK, secondary DB on DR server.

    TOM - also has monthly full backups and daily differentials

    HARRY - Tertiary DB on DR server will have a full restore once a month and a daily differential.

    DICK will also have log shipping set up (but disabled) back to TOM.
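
    In T-SQL terms, the HARRY part of the plan would look something like this - a sketch only, with file names, paths and logical names invented:

        -- Monthly: restore the full backup of TOM as HARRY
        RESTORE DATABASE HARRY
            FROM DISK = 'D:\Backups\TOM_Full.bak'
            WITH MOVE 'TOM_Data' TO 'E:\Data\HARRY.mdf',
                 MOVE 'TOM_Log'  TO 'F:\Logs\HARRY.ldf',
                 NORECOVERY;

        -- Daily: apply the most recent differential; STANDBY leaves HARRY
        -- read-only so the restore can be checked with queries
        RESTORE DATABASE HARRY
            FROM DISK = 'D:\Backups\TOM_Diff.bak'
            WITH STANDBY = 'E:\Data\HARRY_undo.dat';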

    A question though:

    If TOM fails and we cut across to DICK and do a few hours of work on DICK, won't we have to back up DICK, copy the backup to TOM and restore it before the reverse log shipping can be enabled?

    If this is the case we are again vulnerable if DICK fails before log shipping is fully restored from DICK to TOM.

    Am I fully understanding the issue?

  • Monthly Fulls and daily Diffs ... Ouch! :hehe: Take diffs in between daily backups by all means but ...

    Personally, I would be looking at daily full backups, or weekly fulls with daily diffs as a compromise. Rip their gold star back ...
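
    For the record, the backup side of that is nothing exotic. A sketch - the paths are made up, and in practice you would write to dated file names rather than overwrite one file:

        -- Full backup (daily if you can manage it, weekly as a compromise)
        BACKUP DATABASE TOM
            TO DISK = 'D:\Backups\TOM_Full.bak'
            WITH INIT;

        -- Differential backup on the days in between
        BACKUP DATABASE TOM
            TO DISK = 'D:\Backups\TOM_Diff.bak'
            WITH DIFFERENTIAL, INIT;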

    If you are without either node (or the network between them) for any length of time, you are at risk. This is the time you really start to earn your money ...

    If you have to fail over from TOM to DR, your priorities are:

    * Recover the database

    * Start a full backup

    * Start pushing log files out, even if this is to another computer near TOM

    * When the backup completes, whisk it, with all good speed, over to TOM

    * Restore the database and start importing the logs

    When the logs have caught up, you can start to relax. 🙂
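
    In very rough T-SQL, assuming the live database is TOM and its log-shipped copy on the DR server is DICK (all paths made up), the sequence above is something like:

        -- 1. On the DR server: recover the log-shipped copy and bring it online
        RESTORE DATABASE DICK WITH RECOVERY;

        -- 2. Start a full backup of the newly live copy
        BACKUP DATABASE DICK TO DISK = 'D:\Backups\DICK_Full.bak' WITH INIT;

        -- 3. Keep taking log backups and pushing the files towards TOM's site
        BACKUP LOG DICK TO DISK = 'D:\LogShip\DICK_0001.trn';

        -- 4/5. Back on TOM: restore the full backup, then start importing the logs
        -- (add MOVE clauses if the file paths differ between the two servers)
        RESTORE DATABASE TOM
            FROM DISK = 'D:\Backups\DICK_Full.bak'
            WITH NORECOVERY, REPLACE;
        RESTORE LOG TOM
            FROM DISK = 'D:\LogShip\DICK_0001.trn'
            WITH NORECOVERY;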

    I would be very forceful against breaking a perfectly good log shipping system to run a test restore. Test the restores, yes, but on a test machine please ...

    Hope this helps.

  • The big problem is the size of the db and the full backup. Currently the DB is 350 GB and growing at 2 GB per day.

    The full backup takes about 5 hours, and copying to DR across the WAN takes 8 hours. And then we have another 8 hours for the restore - you can see the issue. And disk space, as always, is at a premium, especially as we are on a SAN and ££££ are an issue.

    Given these stats, daily or weekly full backups are probably out of the question.

    I have got to build a DR strategy for the next two years, by which time the DB will be 2 TB. We are on SQL 2000, and an upgrade to SQL 2008 with compression and mirroring is in scope. However, log shipping has to be implemented before a decision on an upgrade can be made.
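
    For reference, when we do evaluate the 2008 upgrade, I gather native backup compression (Enterprise edition only in 2008, I believe) is just an option on the backup - a sketch with an illustrative path:

        -- SQL Server 2008 native backup compression
        BACKUP DATABASE TOM
            TO DISK = 'D:\Backups\TOM_Full.bak'
            WITH COMPRESSION, INIT;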

  • Sometimes you win, sometimes you lose. I used to run daily backups and log ship a 1 TB database, but requirements vary.

    Can you not run a test restore on a machine at the working end? Saves all that copying ...

    I still wouldn't go breaking a perfectly good log shipping system to run a test restore, though.

    Good luck.

  • If I plan things properly I will not break log shipping - I agree that I should not play around with the way that something has been designed to work.

    How long did your 1 TB db take to back up?

  • A few more belated comments...

    The client wants more than just a gold star and the comment about proving that the backups are working is 100% correct!

    If the monthly restore is just to prove that everything is working, you don't need to jump through hoops with a third database to do that. Just restore one of the logs on the DR server WITH STANDBY (instead of NORECOVERY - unless you are already restoring WITH STANDBY!!). You then have your DR database in read-only mode, and you can run as many queries as you want to prove that the restores have been working. You won't break the log shipping sequence, and can just carry on as normal afterwards.
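
    Something like this on the DR server - database name, log file name and undo file path are all illustrative:

        -- Restore a routine log backup WITH STANDBY instead of NORECOVERY
        RESTORE LOG DICK
            FROM DISK = 'D:\LogShip\TOM_20091008_1200.trn'
            WITH STANDBY = 'E:\Data\DICK_undo.dat';

        -- The database is now read-only; run whatever checks you like, e.g.
        -- SELECT COUNT(*) FROM dbo.SomeLargeTable;

        -- The next scheduled log restore carries on the sequence as normal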

    If TOM fails and we cut across to DICK and do a few hours of work on DICK, won't we have to back up DICK, copy the backup to TOM and restore it before the reverse log shipping can be enabled?

    No... if you use the built-in stored procedures to reverse the roles, everything is done automatically and you don't need to start with a full backup/restore. The act of making TOM a secondary leaves it in NORECOVERY, ready to accept log backups from DICK. Making DICK the new primary starts taking log backups of DICK and shipping them back to TOM. Unless you have actually lost TOM through a disaster, reversing it all after a few hours of using DICK as the primary is just as simple and quick as the initial reversal of roles.

  • Did you evaluate database mirroring as an alternative to log shipping? It depends somewhat on the distance between the servers, but it might serve your purposes in terms of making sure that the databases are in sync.

  • Ian Scarlett (10/8/2009)


    You then have your DR database in read only mode, and you can run as many queries as you want to prove that the restores have been working. You won't break the log shipping sequence, and can just carry on as normal afterwards.

    The test restore is about testing the full backup, not the log files. And yes, you can restore a database WITH STANDBY = <filename> to test that the logs have taken.

    I was under the impression that once you restore with recovery, you cannot go back and start restoring from logs again. Never tried it, though.

    The 1 TB database took about 8 hours to back up, if I remember correctly.

  • Mark_Pratt (10/8/2009)


    I was under the impression that once you restore with recovery, you cannot go back and start restoring from logs again. Never tried it, though.

    It is possible... it's something I tested (endlessly!) as part of a previous log shipping installation.

    The syntax sounds a bit strange, but if you BACKUP LOG WITH NORECOVERY, it backs up the tail of the log and then leaves your database in a restoring state to allow you to RESTORE LOGs.
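
    A sketch of that, with made-up file names (TOM is the old primary, DICK the database you failed over to):

        -- On TOM (the old primary): back up the tail of the log; NORECOVERY
        -- leaves the database in a restoring state, ready for log restores
        BACKUP LOG TOM
            TO DISK = 'D:\LogShip\TOM_tail.trn'
            WITH NORECOVERY;

        -- That tail backup gets restored on DICK before DICK is recovered, so
        -- the LSN chains line up; DICK's subsequent log backups can then be
        -- restored straight into TOM
        RESTORE LOG TOM
            FROM DISK = 'D:\LogShip\DICK_0001.trn'
            WITH NORECOVERY;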

    Have a look in BOL for a bit more detail; it specifically mentions its use when failing over to a secondary.

  • Of course!! As the LSNs match, the logs will apply.

    Good call Captain!

  • john.woods (10/8/2009)


    The big problem is the size of the db and the full backup. Currently the DB is 350 GB and growing at 2 GB per day.

    The full backup takes about 5 hours, and copying to DR across the WAN takes 8 hours. And then we have another 8 hours for the restore - you can see the issue. And disk space, as always, is at a premium, especially as we are on a SAN and ££££ are an issue.

    Invest in a database backup program such as RedGate backup, Quest Litespeed, etc. that compresses your backups, saving both (lots of) space and (lots of) time. We use the latter, and we back up our 450 GB database in a little over half an hour. Copying files over the WAN takes about 1.5 hours (over a 1.5 MBps line) since the files are nice and small thanks to the compression. The restore takes about 30 minutes as well. (We're using SQL 2005.)

    As far as DR goes, why not go for mirroring the database? This is what we do and works like a charm.
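
    For what it's worth, once the mirror has been seeded with a full backup restored WITH NORECOVERY, the basic setup (SQL 2005+) is only a couple of statements. The server names, port and the use of asynchronous mode below are purely illustrative, and the mirroring endpoints are assumed to exist already on both servers:

        -- On the mirror server (database restored WITH NORECOVERY first)
        ALTER DATABASE TOM SET PARTNER = 'TCP://liveserver.example.com:5022';

        -- On the principal server
        ALTER DATABASE TOM SET PARTNER = 'TCP://drserver.example.com:5022';

        -- Optional: high-performance (asynchronous) mode over a slow WAN link
        ALTER DATABASE TOM SET PARTNER SAFETY OFF;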
