May 28, 2002 at 1:15 am
Hi All
Just wanting to see other people's ideas on how to cover off a particular scenario that's being put forward.
A company I am doing some consulting work for want to implement a SQL Cluster running in Active/Passive mode. This cluster will reside at the PRIMARY site.
They also want to have a second SQL installation at a REMOTE disaster recovery site.
They will allow a maximum of 1 hour downtime, but there can be no loss of data.
I'm thinking this will entail setting up transactional replication. I'm still investigating other methods such as log-shipping and 3rd party products like SQl-Up and DataMirror.
Anyone care to put in their 2 cents worth?
Thanks in advance
Phill
--------------------
Colt 45 - the original point and click interface
May 28, 2002 at 6:08 am
quote:
They will allow a maximum of 1 hour downtime, but there can be no loss of data.
Unfortunately no mater what you hear you can never truely guarantee no loss of data as uncomiitted transactions at the time of failure may be lost.
However for your situation I would go with Transactional Replication to get as close to realtime backup as possible. You should then make sure alreats and other state information for TR is sent to you via page or netsend to make sure you are aware the moment a replcation failure occurrs. Also you can run into issues with really large databases on the initial setup of replication so it is best to setup in an offpeak hour and if at all possible should consider another machine for distributor. That way if the main server fails the distributor should be able to get most of the data out still. Definently make sure you have your log files on a seperate drive array from your data so that either can be retrieved indenpendent of each other, should an array fail, thus allowing a possible closer no loss situation. Have a good fast link between servers to get large quantities of data moved as quickly as possible (especially for large DBs with high number of transactions). Keep at least 2 or more spare hard drives on hand to make sure you can replace any that go bad, ASAP. Keep a spare power supply for your server on hand (if changeable). IMHO it is better to have more hardware availble doing nothing than it is to wait for the service tech to bring it in.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)
May 28, 2002 at 1:22 pm
Tend to agree with replication. It works well. I prefer log shipping, but you have to have the tolerance for some data loss.
If you have a high speed link, then a product like doubletake may actually work and handle a clustered situation.
Steve Jones
May 29, 2002 at 11:00 am
We found that rolling foward with doubletake was fine but pushing back onto the original equipment was not as easy and could have resulted in excess downtime.
Wes
June 2, 2002 at 9:14 pm
Thanks for all your suggestions.
Here's what I'm am proposing,
SQL Cluster running Active/Active, One node doing database processing the other doing acting as the distributor. Each node can fail across to the other.
The remote site will have a single server that gets the data from the distributor.
I do have one question hanging around though. Nothing I've read seems to give a straight answer.
Say the Cluster has failed and we've switched to the remote site. Now we've been working on the remote site for a couple of hours, what's the process of getting the data that's at the remote site back to the primary site?
Thanks again
Phill
--------------------
Colt 45 - the original point and click interface
June 3, 2002 at 8:52 am
Painful. usually in this case, you will need another outage window to do a backup/restore to the primary site.
Steve Jones
June 21, 2002 at 4:55 pm
It is painful, but not that bad. Once on your remote site and the issue with your cluster has been resolved, use the cluster as your warmstandby until you have time for the short outage where you kick everyone off the remote site, copy the last log across and apply it, then reverse the log shipping process treating the cluster as your primary.
Shorter down time (than a complete backup-copy-restore), but means you are using the remote site for longer than you might really feel comfortable with while waiting for a suitable opportunity to move back to your cluster.
June 21, 2002 at 8:47 pm
I like replication myself. Just make sure you think through your switchover/back plan. Merge or transactional with queued updates I think are easiest/most flexible, though both are a bit more complex/intense than plain transactional.
Andy
June 24, 2002 at 9:28 am
I agree a log would be the outage window, but keep in mind if you are out for longer than the window for a transaction log, you will need to move a full backup first. This doesn't have to take the 2nd site down, but if the time for a backup, copy, restore exceeds the window for a transaction log, then you will have to do differentials, or copy multiple logs.
i.e.
You switch to the remote site, which runs t-logs every hour and a full once a day.
The next day you are ready to come back. You take the full backup from the night and copy that over to the cluster. The remote site is still working. If this takes less than an hour, you then shut down the remote site, take a t-log backup, and restore that and you are up.
If it takes more than an hour, you will have to copy multiple t-logs, or perhaps use a differential to shorten the time.
Steve Jones
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply