Disaster Recovery Planning

  • We are currently running a high-priority web application on a SQL Server 2008 two-node cluster in our data center. The application relies on three databases on the same instance. We are discussing our disaster recovery options and how to minimize our downtime in the event of a disaster. Currently our maximum acceptable downtime is defined as 2 hours with zero data loss. We've been asked whether we could run a live version of the application at our DR data center (a second location on our network) and fail over automatically without users being aware. Is this possible?

    I've looked at peer-to-peer replication, where we would run two live versions of the application (one at each site), but it appears that some application changes may be necessary for that option to work (primary key concerns). The application itself may not be the best designed, and changes would be difficult and costly.

    So it seems that running two live versions of the database (and application) is not really possible. I've looked at mirroring, and it seems like the more suitable solution(?). We could run a second two-node failover cluster (or even a standalone server) and fail over to it. Could the failover occur automatically? I think we could use DNS to connect to the primary cluster; if the primary cluster failed, we could bring the mirror online and change DNS to point to the mirror. But wouldn't existing connections to the application be dropped? (See the setup sketch at the end of this post.)

    Any general thoughts? Curious how others are accomplishing the same goals...

    Thanks!
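
    To make the mirroring option concrete, here is a minimal sketch of the T-SQL involved. Automatic failover needs high-safety mode plus a third witness server; all server names, the database name, and the port below are placeholders, not details from this thread:

        -- On each instance (principal, mirror, witness): create a mirroring
        -- endpoint. Port 5022 is conventional but arbitrary.
        CREATE ENDPOINT Mirroring
            STATE = STARTED
            AS TCP (LISTENER_PORT = 5022)
            FOR DATABASE_MIRRORING (ROLE = ALL);

        -- On the mirror: restore the database and a log backup WITH NORECOVERY,
        -- then point it at the principal (for a clustered instance, use the
        -- SQL Server network name of the cluster).
        ALTER DATABASE AppDB SET PARTNER = N'TCP://principal.example.local:5022';

        -- On the principal: point at the mirror, then add the witness that
        -- enables automatic failover. SAFETY FULL (synchronous) is the default
        -- and is required when a witness is present.
        ALTER DATABASE AppDB SET PARTNER = N'TCP://mirror.example.local:5022';
        ALTER DATABASE AppDB SET WITNESS = N'TCP://witness.example.local:5022';

    Note this covers automatic failover for the database tier only; the application tier still has to reconnect (more on that further down).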

  • I agree the mirror is probably the easiest to set up. As long as you set up the mirror in high-safety mode you will not lose any data. Just be careful: the downside is that the principal has to wait for each transaction to be written and committed on the failover server before the commit completes on the principal, so you will have some latency to deal with (the two safety modes are shown below).
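
    For reference, the tradeoff described above is a single setting on the principal; the database name is a placeholder, and note the asynchronous mode requires Enterprise Edition:

        -- High-safety (synchronous): the principal waits for the mirror to
        -- harden each transaction's log records before the commit completes.
        ALTER DATABASE AppDB SET PARTNER SAFETY FULL;

        -- High-performance (asynchronous, Enterprise Edition): no commit-time
        -- wait, but a failover can lose the most recent transactions.
        ALTER DATABASE AppDB SET PARTNER SAFETY OFF;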

  • Do you think there is a significant performance impact on the principal server?

  • The biggest thing to think about is this:

    The principal writes a transaction and sends it to the mirror; the mirror must commit it and tell the principal that it succeeded before the principal itself commits. This means the principal can have lag time. Think about it: for us, the server is in a data center in Texas and the DR site is on the east coast, so I need all of this to happen across a fair distance. It does not cause a problem, since I am not running something large like 500 GB; I am running only about 150 GB, and not a lot of transactions.

    At the end of the day there are lots of things to consider, but my guess is you should be okay. Only one way to find out: set it up, give it a try, and watch the numbers (a couple of monitoring queries follow below).
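
    One way to measure how much of that round-trip latency you are actually paying is to watch the mirroring metadata and performance counters. A couple of stock queries; nothing here is specific to this poster's setup:

        -- Mirroring state and safety level for each mirrored database
        SELECT DB_NAME(database_id) AS database_name,
               mirroring_state_desc,
               mirroring_safety_level_desc
        FROM sys.database_mirroring
        WHERE mirroring_guid IS NOT NULL;

        -- Commit delay and queue sizes from the mirroring counters
        SELECT instance_name, counter_name, cntr_value
        FROM sys.dm_os_performance_counters
        WHERE object_name LIKE '%Database Mirroring%'
          AND counter_name IN ('Transaction Delay', 'Log Send Queue KB', 'Redo Queue KB');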

  • You have a lot of options for HA.

    Two hours of downtime isn't too bad, but zero data loss can be expensive.

    The price depends on your environment and the solution you use.

    Log shipping and replication are generally cheaper than clustering. Clustering will get you the HA, but not if Wile E. Coyote drives a bucket truck through your data center. If you need offsite replication, and it's a vendor-supported app, check with the vendor first.

    Synchronous mirroring can be a little tricky, because in high-safety mode the principal won't commit a transaction until it hears a commit from the mirror, and road crews will dig exactly where the utility company sprays the paint line showing where the communication lines are, so you may lose communication with the outside world for a day or two (happened to me previously).

    Log shipping nightly, hourly, or at another specified interval could work. It's a little less complicated to set up.

    Then again, VMware is nice too. vMotion is flat-out awesome. VMware is probably the most cost-effective, because any SQL box set up in VMware that uses vMotion will be moved around if there is an issue with the underlying hardware.

    But as with anything, the VMware environment needs to be set up correctly and the SAN needs to be set up correctly (assuming you have a SAN). Some people will bad-mouth VMware instances of SQL Server; a lot of what you may hear is just not true.

    Read some of Brent Ozar's stuff and watch his videos; his depth of knowledge is unreal.

    Some SQL databases may not be good candidates for virtualization, but that can be app-related, a result of poor database development, or because the workload is insanely OLTP-intensive.

    Bottom line here - test before you make the switch.

    We use VMware for some SQL boxes, clusters for others, and replication and log shipping for the rest.

    It really depends on what you have and how much you want to spend to guarantee zero data loss.

    The cheapest (automated) solution I can think of would be log shipping to a remote location at a specified interval, since you need a remote location anyway (a rough sketch follows at the end of this post).

    So I guess what I'm trying to say is what 456789psw just said: set it up, give it a try, and see if the results are acceptable to the end users.

    Hope this helps.

    -Chris
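
    For anyone new to it, log shipping boils down to the following loop; the built-in feature just wraps these steps in SQL Agent jobs. Paths and the database name here are made up:

        -- On the primary, on a schedule: back up the log to a network share
        BACKUP LOG AppDB TO DISK = N'\\drserver\logship\AppDB_001.trn';

        -- On the DR secondary, on a schedule: apply each backup in sequence,
        -- leaving the database in a restoring state for the next one
        RESTORE LOG AppDB FROM DISK = N'\\drserver\logship\AppDB_001.trn'
            WITH NORECOVERY;

        -- At failover time only: bring the secondary online
        RESTORE DATABASE AppDB WITH RECOVERY;

    The recovery point is only as good as the last log backup that made it across, which is why log shipping alone cannot meet the zero-data-loss requirement stated at the top of the thread.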

  • I should probably read a little more carefully.

    Here's a link for SQL Server 2008 R2 on setting up a mirrored cluster.

    It says the connections can drop if the cluster fails over to the mirror.

    The clusters we have here are all two-node, active/passive and active/active. When they fail over, the app freezes for about 10-15 seconds, but nobody gets booted. It depends on how the connection is made from the app to the database (see the note on the Failover Partner keyword below).

    Hope this helps.
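
    On the connection question above: with mirroring, the usual alternative to a DNS swap is the Failover Partner connection-string keyword supported by ADO.NET and SQL Native Client. In-flight connections are still dropped at failover; the keyword just lets the client retry against the mirror automatically. Server and database names below are hypothetical:

        -- Hypothetical client connection string:
        --   Server=SQLCLUS1;Failover Partner=SQLDR1;Database=AppDB;Integrated Security=SSPI;
        -- You can confirm the partner name clients will be redirected to from
        -- the mirroring metadata on the principal:
        SELECT DB_NAME(database_id) AS database_name,
               mirroring_role_desc,
               mirroring_partner_instance
        FROM sys.database_mirroring
        WHERE mirroring_guid IS NOT NULL;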

  • Thanks for all the info. We actually do run most of our SQL instances on VMware. This instance is quite sensitive, though, so we are following the vendor's recommendation of running on physical hardware.

    Our current cluster is an active/passive cluster. I just want to mirror the cluster (I think). I'll do some more research and come back with any questions.
