For my next series of blog posts I intend to write about SQL Server replication. This is a pretty large topic and I make no claim to be an expert in replication, but over the last couple of years I have been working with it more and more. I think it is time I wrote some posts on the topic to document and reinforce my understanding of the subject, and hopefully learn a lot along the way.
SQL Server replication is a technology that can be used to copy and distribute data from one database to another, and then synchronise the databases to maintain consistency.
SQL Server offers the following types of Replication:
Snapshot Replication - a snapshot of the database is taken at a point in time and the whole data set, rather than individual transactions, is applied to the subscriber. This is not a continuous process, so the subscribing database always has some latency when compared to the publishing database.
Transactional Replication - allows incremental changes to be applied to the subscribing databases. These changes can be applied either continuously or at set intervals. Transactional replication is normally used when there is a high number of write transactions against the database (INSERT, UPDATE, DELETE).
Merge Replication - I struggled to put merge replication into words, but Books Online describes it as "data changes and schema modifications made at the Publisher and Subscribers are tracked with triggers. The Subscriber synchronizes with the Publisher when connected to the network and exchanges all rows that have changed between the Publisher and Subscriber since the last time synchronisation occurred." I think merge replication is probably the most complicated and difficult type of replication to configure and manage.
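To make the difference between snapshot and transactional replication a bit more concrete, here is a minimal Python sketch. It is only a toy model and not how SQL Server implements replication: the `Publisher` class, `apply_snapshot` and `apply_changes` are names I have made up for illustration, and a plain dictionary stands in for a database.

```python
import copy


class Publisher:
    """Toy stand-in for a publishing database: a dict of rows plus a change log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # ordered list of (operation, key, value)

    def insert(self, key, value):
        self.rows[key] = value
        self.log.append(("INSERT", key, value))

    def update(self, key, value):
        self.rows[key] = value
        self.log.append(("UPDATE", key, value))

    def delete(self, key):
        del self.rows[key]
        self.log.append(("DELETE", key, None))


def apply_snapshot(publisher):
    """Snapshot style: hand over a full copy of the data as it is right now."""
    return copy.deepcopy(publisher.rows)


def apply_changes(subscriber_rows, publisher, last_position):
    """Transactional style: replay only the log entries after last_position."""
    for operation, key, value in publisher.log[last_position:]:
        if operation == "DELETE":
            subscriber_rows.pop(key, None)
        else:
            subscriber_rows[key] = value
    return len(publisher.log)  # new position to resume from next time


publisher = Publisher()
publisher.insert(1, "alpha")
publisher.insert(2, "beta")

# Both subscribers start out in step with the publisher.
snapshot_subscriber = apply_snapshot(publisher)
transactional_subscriber = {}
position = apply_changes(transactional_subscriber, publisher, 0)

# More writes arrive at the publisher.
publisher.update(1, "ALPHA")
publisher.delete(2)

# The transactional subscriber catches up by replaying just the two new changes.
# The snapshot subscriber stays as it was until a new snapshot is taken.
position = apply_changes(transactional_subscriber, publisher, position)
```

After the last line the transactional subscriber matches the publisher again, while the snapshot subscriber still holds the older data. That gap is the latency mentioned in the snapshot description above.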
The type of replication you choose for a given application will vary according to a number of factors. This MSDN article http://msdn.microsoft.com/en-us/library/ms152565.aspx details what type of replication is best suited for a given scenario.
I have worked mainly with snapshot and transactional replication, so I will start with some posts on how to configure these two types of replication. I have not done much with merge replication in truth, but I will be looking at it here too. All of these will be covered in posts over the next few weeks and months, but for the purpose of this post I am going to explain some of the components of replication:
The Distributor - can generally be thought of as the link between the components involved in replication. It plays a key role in 'distributing' data from the publisher to the subscribers.
Publisher - the source database that makes its data available to those wishing to subscribe to it. The publisher holds the data to be replicated.
Subscriber - stores the replicated data and receives updates.
Article - “a grouping of data to be replicated”.
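The way these components fit together can be sketched in a few lines of Python. Again this is only a toy model under my own assumptions, not SQL Server's actual mechanism: the `Distributor` class, the `publisher_write` function and the table names `Customers`, `Orders` and `AuditLog` are all hypothetical.

```python
from collections import deque


class Distributor:
    """Toy store-and-forward link between a publisher and its subscribers."""

    def __init__(self):
        self.queues = {}  # subscriber name -> queue of pending changes

    def subscribe(self, name):
        self.queues[name] = deque()

    def publish(self, article, row):
        # Every subscriber gets its own copy of the pending change.
        for pending in self.queues.values():
            pending.append((article, row))

    def deliver(self, name, subscriber_db):
        # Hand all pending changes to one subscriber, oldest first.
        pending = self.queues[name]
        delivered = 0
        while pending:
            article, row = pending.popleft()
            subscriber_db.setdefault(article, []).append(row)
            delivered += 1
        return delivered


# Hypothetical articles: only these tables are offered for replication.
PUBLISHED_ARTICLES = {"Customers", "Orders"}


def publisher_write(distributor, table, row):
    """Forward a change only if the table is published as an article."""
    if table in PUBLISHED_ARTICLES:
        distributor.publish(table, row)


distributor = Distributor()
distributor.subscribe("reporting")
reporting_db = {}

publisher_write(distributor, "Customers", {"id": 1, "name": "Ann"})
publisher_write(distributor, "AuditLog", {"id": 99})  # not an article, so not replicated
publisher_write(distributor, "Orders", {"id": 10, "customer_id": 1})

delivered = distributor.deliver("reporting", reporting_db)
```

The point of the sketch is that the publisher never talks to the subscriber directly. Changes to published articles are handed to the distributor, which holds them until they are delivered to each subscriber.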
There are several agents that play a key role in replication. These are:
- Snapshot Agent
- Log Reader Agent
- Queue Reader Agent
- Distribution Agent
- Merge Agent
These agents are standalone programs. By default they are run as jobs by the SQL Server Agent, so bear in mind that the SQL Server Agent needs to be running for these jobs to run successfully.
Next we’ll be looking at configuring replication.