July 2, 2013 at 9:59 am
Hi,
We have an ecommerce site whose primary infrastructure (DB and web servers) is based in Chicago. We have a disaster recovery site at a data center in Dallas and would like to know how replication should be set up between the Chicago DB server and the Dallas DB server. So far we have it set up as log shipping with a 15-minute delay, and we do a manual failover if we have to switch traffic to the DR site. Being an ecommerce company, it's not good for us to lose 15 minutes' worth of data, because once the Dallas DB server becomes the primary it starts its record numbering from where the last restore finished, so we end up with duplicates. We could set up a manual process to re-enter that 15 minutes of data into the Dallas DB server, but my goal is to find an ideal replication method if one exists. We consulted the DBA at our data center provider as well, and they told us there is no other option due to the latency between the two sites. Currently the replication traffic runs over a VPN between our firewalls.
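For reference, you can check how far behind the log-shipped secondary is directly on the Dallas server. This is a sketch using the msdb log shipping monitor table, which is populated when log shipping is configured through the standard wizard/monitor:

```sql
-- Run on the Dallas secondary: current restore lag per log-shipped database
SELECT  secondary_database,
        last_restored_date,
        last_restored_latency   -- minutes between backup and restore at last restore
FROM    msdb.dbo.log_shipping_monitor_secondary;
```

This gives a concrete number for the data-loss window you are currently exposed to, which is useful when discussing RPO with the business.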
Our programmer suggests that we should set up transactional replication between the two sites, which would replicate live, per transaction, and would also let us fail over automatically between the sites. I have read through the details of transactional replication, but I am not sure whether that is the way to go or whether log shipping is the only solution.
Happy to hear everyone's suggestions. Thank you for your help in advance.
July 2, 2013 at 11:54 pm
it's not good for us to lose 15 minutes' worth of data, because once the Dallas DB server becomes the primary it starts its record numbering from where the last restore finished, so we end up with duplicates
Duplicates?
Replication is object-level; you cannot transfer an entire database at a time.
How many databases are running on it, and what is the size of each?
July 4, 2013 at 7:30 am
Log shipping and replication are not HA or DR solutions; both features have their limitations. You did not mention anything about your current environment, SQL Server edition, etc. Why did you not consider database mirroring?
July 5, 2013 at 3:37 am
We use SQL P2P replication to provide scale-out, high availability, and resilience. It does what it says on the can.
There are some warnings in the P2P documentation about quiescing the system before doing schema changes and some other operations, but this is only necessary if you are initiating these changes from multiple P2P hubs. You also need a strategy for managing identity column ranges.
We designate one of our P2P hubs as the 'master write server' and send all our updates to this hub. This avoids any need to manage identity column ranges during normal use. All schema changes are also initiated on the master hub, and because of this we have never needed to quiesce the system. We use a DNS vanity name to identify the master server, so that no application changes are needed when we change the vanity name to point to the other hub.
The main 'gotcha' we found with P2P replication is managing identity column ranges at failover. The high-water mark for an identity column is only ever updated at the server that ran the INSERT statement. We have developed a process where the high-water marks for all identity columns are regularly copied into a control table that is available at all P2P nodes. At failover time we run a process that interrogates this table and runs a RESEED operation on all tables with identity columns, so that the high-water mark on the new master is set to a value higher than we had recorded for the old master.
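The capture-and-reseed process described above could be sketched roughly as follows. The control table `dbo.IdentityHighWaterMark` and the 10,000-value safety gap are hypothetical names/choices for illustration, not the original poster's actual implementation:

```sql
-- Hypothetical control table, replicated to all P2P nodes:
-- CREATE TABLE dbo.IdentityHighWaterMark (table_name sysname PRIMARY KEY, last_value bigint);

-- 1. Scheduled job on the current master: record each identity high-water mark
UPDATE  hwm
SET     last_value = CONVERT(bigint, IDENT_CURRENT(hwm.table_name))
FROM    dbo.IdentityHighWaterMark AS hwm;

-- 2. At failover, on the new master: reseed every identity column above the
--    last value recorded for the old master, leaving a safety gap
DECLARE @tbl sysname, @seed bigint, @sql nvarchar(max);
DECLARE reseed_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT table_name, last_value + 10000
    FROM   dbo.IdentityHighWaterMark;
OPEN reseed_cur;
FETCH NEXT FROM reseed_cur INTO @tbl, @seed;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'DBCC CHECKIDENT (' + QUOTENAME(@tbl, '''')
             + N', RESEED, ' + CONVERT(nvarchar(20), @seed) + N');';
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM reseed_cur INTO @tbl, @seed;
END;
CLOSE reseed_cur; DEALLOCATE reseed_cur;
```

The safety gap matters because the control table is itself replicated asynchronously, so the recorded mark may lag behind the true high-water mark on the failed master.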
The use of P2P does not guarantee no data loss.
At a macro level, if the network fails between your sites, then obviously the data gets out of synchronisation. When the network comes back P2P will automatically resynchronise, but if your master site goes down during the network outage then the second site cannot have all the data.
At a micro level, SQL replication is a batch-based process, kicked off at 5-second intervals. It should be expected that your secondary site is 5 to 10 seconds behind the primary site, plus any network delay. Depending on the type of failure you have, there is a window of opportunity for this amount of data loss. BTW, seek Microsoft advice before reducing the time interval between replication batches; this can have unwanted side-effects.
If you need zero data loss then you should use synchronous mirroring.
Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.
When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
July 8, 2013 at 8:34 am
Hi,
Apologies for the delayed response.
@ramasankar - If the last restore happened at 2:15pm and the server fails at 2:20pm, and during those 5 minutes 10 new orders were placed, those 10 orders will not have been replicated. The DR server in Dallas will also start its record count from the last one it received, which will eventually produce duplicate entries that need to be sorted out manually once the primary server is back online.
One DB running.
The DB is about 20 GB.
Thank you.
July 8, 2013 at 8:35 am
@ Murali
Database mirroring cannot work due to the latency; this is cross-site. The SQL Server edition is Enterprise.
July 10, 2013 at 12:44 am
If latency is the only problem, you can increase the partner timeout from 10 seconds to a longer duration. Since your concern is that the data should be consistently up to date even in the worst case of a disaster, I think mirroring is the suggested method for HA/DR, assuming the connectivity is reliable apart from the latency. It will certainly put load on the bandwidth, because mirroring operates over TCP, and you can then fail back whenever the primary is back.
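Raising the partner timeout is a one-line change on the principal. The database name below is a placeholder, and the value should be tuned to the latency you actually observe between Chicago and Dallas:

```sql
-- Run on the principal: tolerate up to 30 seconds of partner silence
-- before mirroring declares the partner connection lost (default is 10)
ALTER DATABASE YourECommerceDB SET PARTNER TIMEOUT 30;
```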
July 15, 2013 at 3:23 am
Database mirroring cannot work due to the latency; this is cross-site. The SQL Server edition is Enterprise.
Asynchronous mirroring will work over any distance.
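Switching an established mirroring session to asynchronous (high-performance) mode is also a single statement on the principal; SAFETY OFF requires Enterprise Edition, which you have. Database name is a placeholder:

```sql
-- Run on the principal: asynchronous (high-performance) mirroring.
-- Commits no longer wait for the mirror to harden the log,
-- so latency to Dallas stops affecting transaction response time.
ALTER DATABASE YourECommerceDB SET PARTNER SAFETY OFF;
```

The trade-off is exactly the RPO discussion below: asynchronous mode removes the latency penalty but accepts a small window of potential data loss at failover.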
Your choices are limited by the technology available and the speed of light. You need to discuss with your management what Recovery Point Objective you want, and the Recovery Time Objective. This will help guide you to the appropriate technology. Unless and until you have your RPO and RTO defined you cannot rule out or rule in any given solution.
You need to let the business know that zero data loss is almost impossible to achieve, and getting from 5 seconds data loss to 0.5 seconds data loss could cost orders of magnitude more than getting from 5 minutes data loss to 5 seconds data loss.
Eventually you should get to a solution that your business can afford, both in implementation cost and business risk from data loss.