Blog Post

SQL Server 2008 Replication: High Availability Solution Part One – by Abi Chapagai

,

Introduction 

Replication is the process of copying data between two databases on the same server or different servers on the same database. This is one of the methods to maintain the redundant database site for disaster recovery purpose.  Replication methodology maintains copies of main database from the primary server on the secondary sever, both of the databases will be in synchronization state and this technology is used to provide high availability. The main concept of replication in SQL Server 2008 is whatever data changes made on one databases are transferred to one or more target databases for redundancy. Snapshot, merge, transactional, and peer-peer replication are the types of replication used in SQL Server 2008. 

SQL Server Editions: 

In SQL Server replication set up, enterprise, standard or developer edition can be used. 

Replication Components:  

Very basic architecture looks like in the following diagram:- 

 

There are several components that are used in replication technology. Following are the main replication components that are used in replication set up. 

Article: In replication set up, article is a main component of replication model. An article can be a table, stored procedure, functions, or a view. 

Publication: There can be more than one article used in the replication. Therefore, publication is a collection of one or more than one articles that are used for data distribution. 

Distributor: Distributor is the main component of replication architecture. Distributor is one of the replication roles that maintains distribution database and all replication agents run here by default. 

Publisher: Publisher is a source of the database that makes the data available for replication in another server.  Publisher keeps track of what data has been changed in the subscriber during the data synchronization process. 

Subscriber: Subscriber server hold the subscriber database that receives the from the publication. 

Replication Types: 

SQL Server replication engine has three different ways of distributing the data between two databases. These methods can be applied based on the need of the business and how often the data synchronization is needed. Following are the methods of data distribution that replication engine has: 

  • Snapshot Replication:
  • Merge Replication:
  • Transactional Replication:
  • Peer-Peer Replication:

Snapshot Replication:  Snapshot replication is initiated at firs by all types of replication to initialize subscribers.  In this method of replication, entire data will be captured and sent to the subscriber. What happens in this replication is, all the changes that occur in the publisher will be captured and then later sent to the subscriber when the snapshot agent runs again.  Snapshot agent and distribution agents are used in the snapshot replication. 

Merge Replication: Merge replication is the process of distributing data from publisher to the subscriber. In the merge replication, updates are allowed even when subscriber or publisher are connected or disconnected. If they are disconnected, the updates will be applied when both get connected.  Merge agent and snapshot agent is used in merge replication. Merge replication is the most difficult replication method to implement and manage. 

Transactional Replication: In transactional replication, all changes made in the publisher are captured and stored in the distribution database, and then these changes are transferred to the subscriber database. The best situation to use transactional replication is when data changes frequently and data is transferred to the subscriber immediately as in almost real time scenario. 

Peer-Peer Replication: Peer-Peer replication is built on the foundation of the transactional replication. This replication method is introduced in SQL Server 2005. This replication is available only in SQL Server Enterprise Edition. Peer-to-Peer replication provides high availability solution by providing multiple copies of databases across multiple instances. 

Replication Topology: 

Replication topology is based on the requirement of how the replication set up is required by the business. In ideal environment, it is very important and well recommended to keep the subscriber, publisher and distributor in a separate physical hardware set up.  Central publisher, central subscriber, and publishing subscriber are the most popular replication topologies used today. In basic terms, replication is a topology or replication model that provides a process flow diagram and shows how the data flows between publisher and subscriber during the synchronization process. 

There are several components that are used in replication technology. Following are the main replication components that are used in replication set up.  

Article: In replication set up, article is a main component of replication model. An article can be a table, stored procedure, functions, or a view.  

Publication: There can be more than one article used in the replication. Therefore, publication is a collection of one or more than one articles that are used for data distribution.  

Distributor: Distributor is the main component of replication architecture. Distributor is one of the replication roles that maintains distribution database and all replication agents run here by default. 

Publisher: Publisher is a source of the database that makes the data available for replication in another server.  Publisher keeps track of what data has been changed in the subscriber during the data synchronization process.  

Subscriber: Subscriber server hold the subscriber database that receives the from the publication.  

 

Replication Types: 

SQL Server replication engine has three different ways of distributing the data between two databases. These methods can be applied based on the need of the business and how often the data synchronization is needed. Following are the methods of data distribution that replication engine has: 

·         Snapshot Replication: 

·         Merge Replication: 

·         Transactional Replication: 

·         Peer-Peer Replication: 

Snapshot Replication:  Snapshot replication is initiated at firs by all types of replication to initialize subscribers.  In this method of replication, entire data will be captured and sent to the subscriber. What happens in this replication is, all the changes that occur in the publisher will be captured and then later sent to the subscriber when the snapshot agent runs again.  Snapshot agent and distribution agents are used in the snapshot replication.  

Merge Replication: Merge replication is the process of distributing data from publisher to the subscriber. In the merge replication, updates are allowed even when subscriber or publisher are connected or disconnected. If they are disconnected, the updates will be applied when both get connected.  Merge agent and snapshot agent is used in merge replication. Merge replication is the most difficult replication method to implement and manage. 

Transactional Replication: In transactional replication, all changes made in the publisher are captured and stored in the distribution database, and then these changes are transferred to the subscriber database. The best situation to use transactional replication is when data changes frequently and data is transferred to the subscriber immediately as in almost real time scenario. 

Peer-Peer Replication: Peer-Peer replication is built on the foundation of the transactional replication. This replication method is introduced in SQL Server 2005. This replication is available only in SQL Server Enterprise Edition. Peer-to-Peer replication provides high availability solution by providing multiple copies of databases across multiple instances. 

Replication Topology: 

Replication topology is based on the requirement of how the replication set up is required by the business. In ideal environment, it is very important and well recommended to keep the subscriber, publisher and distributor in a separate physical hardware set up.  Central publisher, central subscriber, and publishing subscriber are the most popular replication topologies used today. In basic terms, replication is a topology or replication model that provides a process flow diagram and shows how the data flows between publisher and subscriber during the synchronization process.  

   

Central publisher: 

The most commonly used replication topology is to have single publisher with one or more subscribers. In this topology, distribution database can be on the same or different server. The main idea of this topology is that the data changes occur at publisher and then flows the data to one or more subscribers.  Basic architecture of central publisher replication topology is shown below with one publisher and two subscribers. There can be more than one subscriber in central publisher. 

 Central subscriber: 

In the central publisher replication topology, there will be one subscriber and multiple publishers. In this replication topology, whatever the data changes made in the publisher are consolidated into one subscriber.  Replication topology with central subscriber is shown in the figure below:  

   

 

Subscriptions:

Whatever the changes made in the publisher database can be applied to the subscriber by using subscription methods. There are two types of subscriptions used in SQL Server Replication.   They are: Pull subscription and Push subscription.  

 

Push subscription:  In the push subscription method, publisher database server takes the full responsibility of applying all the changes to the subscriber database. In this process, publisher does not ask the subscribers for any changes, Push subscription is basically used only when the subscribing database need the changes as soon as they are made in the publisher database. Data changes on the publisher can be replicated to the subscriber based on demand, continuously or on a scheduled task. Distribution agent and merge agent runs at the distributor server.

 

 Pull subscription: In pull subscription method, subscriber sends the request to the publisher to send the data to the subscriber whenever the data has been changed. The main advantage of using pull subscriptions is it allows users to check and make sure that when the data changes are synchronized between publisher and subscriber.  In this subscription, distribution agent and merge agent runs at the subscriber server.

 

Replication Agents:

It is very essential to understand that replication technology in SQL Server is not a part of core SQL Server engine  and it runs outside of the SQL Server engine with the set of external executables and these executables are referred to as Replication Agents. Replication agents are the driving force of  replication technology that connects with SQL Server and keep data processing smooth and easy.  Following are the replication agents used in replication technology:

 

Snapshot Agent: Snapshot agent is used in all types of replication. This agent takes the snapshot of all the database schema and data that needs to be replicated from publisher to the subscriber. Snapshot.exe executes in the distributor server.

 

 Log Reader Agent: Log reader agent is used only in transactional replication. All the committed transactions from the transaction log of publisher database will be extracted. Once the data is being extracted from the transaction log of the publisher database, log reader agent makes sure that each transaction is written into the distribution database  of the distributor server, and the data should be written in the same sequence as that of the transaction sequence made in the publisher database. This sequence is very critical to keep the data consistency and keep the transactions out of order. Logread.exe is the executable for this agent.

 

Distribution Agent:

Distribution agent is used in snapshot and transactional replication. Distrib.exe is the executable for this agent. Distribution agent serves two purpose: Apply snapshots at the initial stage of replication initialization process and to send transactions from distribution database to the subscriber.

 

Merge Agent:

 Merge agent is used only with merge replication. The main function of merge agent in merge replication method is to apply the snapshot that is generated when the subscriber is initialized at first duringthe replication process. Replmerg.exe is an executable for this agent and this agent runs behind the scene to finish the replication process successfully.

 

 Queue Reader Agent:

 

The main function of queue reader agent is to transfer or send the queue from the subscriber to the publisher. Qrdrsvc.exe is an executable for this agent.

 

Conclusion:

Replications is very powerful tool that is used to transfer or synchronize database in real time or near real time in either one direction or bi-direction. Single direction replication is the process of replicating the data between subscriber and publisher in one direction whereas bi-directional replication is the process of replicating the data from publisher to the subscriber and back from subscriber to the publisher. Good understanding of business requirement is needed; good planning is needed for implanting replication as an high availability option. I will be discussing about the steps of setting up and configuring each of the replication methods in the Part Two.

 

References:

http://msdn.microsoft.com/en-us/library/ms151196.aspx

Rate

5 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (1)

You rated this post out of 5. Change rating