Handling Conflicts in Merge Replication
Introduction
Merge replication is a bi-directional replication where the
data changes that occur on the publisher and the subscriber are merged at the
time of synchronization. This article primarily focuses on how merge replication
detects conflicts and how they are resolved using the default and custom conflict
handlers.
Conflict Detection
Conflict occurs whenever data changes both on the publisher
and the subscriber at the time of synchronization. You can specify whether you
want merge replication to recognize conflicts at a row-level or at a
column-level based on your business requirements. This is illustrated in figure
1. When row-level conflict detection is enabled for the authors table, changes
to a particular row in the table on the publisher and the subscriber will be
considered a conflict even though the columns changed may be different. However,
in the case of column-level conflict detection, the data changes must happen to
the same columns in a particular row on both the publisher and the subscriber.
Figure 1. Choosing the conflict detection type
For instance, the last name for the author Michel has
changed from DeFrance to Olivier on the publisher, while the phone number has
changed from 219- 547-9982 to 340-545-6677 on the subscriber. In this case,
merge replication will detect a conflict only if row-level conflict detection
has been enabled for the table Authors. If column-level detection is enabled,
then the changes will be merged without any conflict. Regardless of the conflict
type, when a conflict occurs, the entire row from the publisher will overwrite
the one on the subscriber if the default conflict resolver is selected (Figure
2.) and if the publisher wins the conflict.
Figure 2. Choosing the conflict resolver type
In merge replication, all updates and inserts are tracked in
the MSmerge_contents system table and deletes in MSmerge_tombstone system table
in the publication and subscription databases. For each row modified in a base
table there is only one row created in these two system tables. The lineage
column in the two system tables plays the role of version tracking for all the
changes that have occurred to a row so far. At the time of synchronization, the
merge agent compares the lineage values in the system tables on the publisher
and the subscribers. If they are different then the row has been updated since
the last synchronization.
If conflict detection is enabled at the column level for a
table, then colvl column (varbinary data type) in MSmerge_contents system table
tracks version numbers for all the columns in the base table. Suppose if the
lastname column in the authors table is updated, the lineage value for the row
changes and the colvl correspondingly gets a new version number to reflect this
change. So column-level tracking results in more overhead compared to the
row-level as changes to every single column in a table have to be tracked. After
the synchronization, the lineage and the colvl values are updated so that the
versions values look identical on the publisher and the subscriber.
Conflict Resolution
SQL Server comes with a wide range of conflict resolvers.
Let us look at some of the available options
1. Default Resolver
The default resolver is a priority-based resolver, where you
can assign different priority values to the subscribers to determine who would
win in the case of a conflict.
When setting up a subscriber using the wizard, you will see
a screen as illustrated in figure 3 where you can set the subscription priority.
There are two options you can choose from. The first one creates a local subscriber,
so that the merge agent uses the publisher as a proxy for the subscriber when
resolving conflicts. A local subscriber is one whose priority is 0.00. The second
option creates a global subscriber with the priority value you assign. You can
assign priority values between 0.00 and 99.99 to the global subscribers.
Subscribers with higher values win over those with lower values when a conflict
occurs.
Figure 3. Choosing the subscription priority
When you want all your subscribers to have the same level of
priority then, you can create local subscribers. After the first local subscriber
synchronizes with the publisher, the changes assume the priority value of the
publisher. Since the publisher has the highest priority by default, the first local
subscriber’s changes at the publisher will always win over any subscriber
conflicts during later synchronizations.
When you want different subscribers to have different
priorities, you can manually assign the priority values. Let us look at a few
examples to understand how conflicts between subscribers with different
priority levels are resolved. Let us assume that Company ABC has its
headquarters in Dallas (Publisher) with four other office locations as
illustrated in the table below. We have couple of local subscribers and couple
of global subscribers at Chicago and New York with priority values of 90 and 65
respectively. Data changes made at any location will be merged with the rest.
Location | Priority |
Dallas (Publisher) | 100.00 |
Atlanta (Local Subscriber) | 0.00 |
Chicago (Global Subscriber) | 90.00 |
New York (Global Subscriber) | 65.00 |
Phoenix (Local Subscriber) | 0.00 |
Let us assume that a Customer David Smith lives in zip 77523.
Initially all the subscribers have the same value.
Location | Zip |
Dallas (Publisher) | 77523 |
Atlanta (Local Subscriber) | 77523 |
Chicago (Global Subscriber) | 77523 |
New York (Global Subscriber) | 77523 |
Phoenix (Local Subscriber) | 77523 |
Scenario 1: Publisher and Local Subscriber Conflict on update
Publisher updates the zip for David Smith to 77644 and the
Local subscriber at Atlanta changes it to 76222. The next time Atlanta synchronizes with the publisher (Dallas), a conflict is detected. Since the
publisher has the highest priority, the changes at the publisher win. The rest
of the subscribers subsequently synchronize with the publisher and receive the
changes.
Location | Zip |
Dallas (Publisher) | 77644 |
Atlanta (Local Subscriber) | 77644 |
Chicago (Global Subscriber) | 77644 |
New York (Global Subscriber) | 77644 |
Phoenix (Local Subscriber) | 77644 |
Scenario 2: Publisher and Global Subscriber Conflict on update
Publisher now updates the zip for David Smith to 72233 and
the Global subscriber at Chicago changes it to 72111. The next time Chicago synchronizes with the publisher (Dallas), a conflict is detected. Since the
publisher has the highest priority, the changes at the publisher win. The rest
of the subscribers subsequently synchronize with the publisher and receive the
changes.
Location | Zip |
Dallas (Publisher) | 72233 |
Atlanta (Local Subscriber) | 72233 |
Chicago (Global Subscriber) | 72233 |
New York (Global Subscriber) | 72233 |
Phoenix (Local Subscriber) | 72233 |
Scenario 3: Global and Local Subscriber Conflict on update
Suppose the Global Subscriber at New York updates the zip
for David Smith to 55223 and the Local subscriber at Phoenix updates it to
43232. There are two possibilities here as to who wins the conflict based on
who synchronizes with the Publisher (Dallas) first.
Case 1: Global Subscriber Synchronizes first
Suppose the Global Subscriber at New York synchronizes first with the publisher.
The priority value of 65 for this subscriber is also stored
at the publisher. When the local subscriber synchronizes now, it loses since
the global subscriber has a higher priority level.
Location | Zip |
Dallas (Publisher) | 55223 |
Atlanta (Local Subscriber) | 55223 |
Chicago (Global Subscriber) | 55223 |
New York (Global Subscriber) | 55223 |
Phoenix (Local Subscriber) | 55223 |
Case 2: Local Subscriber Synchronizes first
Suppose the Local Subscriber at Phoenix synchronizes first
with the publisher.
The changes take the priority of the publisher. When the
global subscriber synchronizes now, it loses since the Publisher has the
highest priority level.
Location | Zip |
Dallas (Publisher) | 43232 |
Atlanta (Local Subscriber) | 43232 |
Chicago (Global Subscriber) | 43232 |
New York (Global Subscriber) | 43232 |
Phoenix (Local Subscriber) | 43232 |
Scenario 4: Global Subscribers Conflict on update
Suppose the Global Subscriber at New York updates the zip
for David Smith to 76063 and the Global subscriber at Chicago updates it to 66345.
New York (priority value 65.00) synchronizes first with the publisher. The
priority value of 65 is stored at the publisher to resolve any conflicts. When Chicago (priority value of 90.00) synchronizes now, the subscriber with the highest
priority value wins the conflict regardless of whether it is the first one to
synchronize
Location | City |
Dallas (Publisher) | 66345 |
Atlanta (Local Subscriber) | 66345 |
Chicago (Global Subscriber) | 66345 |
New York (Global Subscriber) | 66345 |
Phoenix (Local Subscriber) | 66345 |
2. Custom Resolvers
If the default resolver does not meet your business
requirements you can choose to use a custom resolver. Custom Resolvers are
specific to a table. If you alter your table, make sure that you modify the
resolver accordingly.
Figure 4. Choosing a Custom Resolver
You can choose a custom resolver by clicking on the Table
Properties button (…) and selecting the Resolver tab on the Properties screen
as shown in figure 4. You can also use the merge stored procedures sp_addmergearticle
and sp_changemergearticle to specify the custom resolvers.
SQL Server comes with a number of custom resolvers. In most
cases you will specify a column name as the required input for the resolver.
For instance in the case of “Maximum Conflict Resolver”, the publisher or the
subscriber with the larger column value will be the conflict winner. However if
you choose “Subscriber Always Wins Conflict Resolver”, you do not have to
provide a column name as an input since the subscriber row wins each time there
is a conflict. Here is a list of the custom resolvers that come with SQL Server
installation.
- Microsoft SQL Server Additive Conflict Resolver
- Microsoft SQL Server Averaging Conflict Resolver
- Microsoft SQL Server DATETIME (Earlier Wins) Conflict
Resolver
- Microsoft SQL Server DATATIME (Later Winds) Conflict
Resolver
- Microsoft SQL Server Maximum Conflict Resolver
- Microsoft SQL Server Merge Text Conflict Resolver
- Microsoft SQL Server Minimum Conflict Resolver
- Microsoft SQL Server Subscriber Always Wins Conflict
Resolver
- Microsoft SQL Server Upload Only Conflict Resolver
COM based Conflict Resolver
If the above mentioned conflict resolvers do not suit your
needs you can write a COM based custom resolver using Visual Basic or VC++ and
register the DLL at the server where the merge agent runs. For a push
subscription the merge agent runs at the distributor and for pull at the
subscriber.
Stored Procedure Conflict Resolver
You can write a stored procedure based custom conflict
resolver that uses T-SQL to implement your business logic which determines who
the winner will be. Stored procedure resolvers are always created on the
publisher and only for update conflicts. Stored procedure resolvers return a
single row that is identical to the schema of the base table. You can also
change the values in any of the columns when you return the final result set.
This row will be the winning row and the values will be used to update both the
publisher and the subscriber.
The custom stored procedure that you create uses the
following required parameters
Parameter | Type | Comments |
@tableowner | sysname | Owner for the conflicting table |
@tablename | sysname | Name of the conflicting table |
@rowguid | uniqueidentifier | Rowguid for the row having the conflict |
@subscriber | sysname | Subscriber server that is synchronizing |
@subscriber_db | sysname | Name Database on the conflicting subscriber |
@log_conflictOUTPUT | INT | Merge logs the conflict based on the value provided. |
@conflict_messageOUTPUT | nvarchar(512) | Message to be logged |
Let us take a simple example of an ordering system for a
retail company ABC. The publisher will be the database server at the corporate
office. The subscribers will be the servers at the individual stores. We will
just look at one table “Products” from the Northwind database for our
discussion purposes. The schema for the products table has been slightly altered
and is as follows
Products
Column | Type |
ProductID | int |
ProductName | nvarchar (40) |
SupplierID | int |
CategoryID | int |
UnitPrice | money |
UnitsInStock | smallint |
UnitsOnOrder | smallint |
ReorderLevel | smallint |
Discontinued | bit |
OrderQuantity | smallint |
rowguid | uniqueidentifier |
Let us assume that the retail stores can request products by
updating the OrderQuantity column. Company ABC wants to write a custom stored
procedure that handles conflicts when multiple stores (subscribers) change the
same data. ABC has decided to code the following business logic in the conflict
resolver.
- Each time a conflict occurs, add the values of the OrderQuantity
column from the publisher and the subscriber. Basically if store x orders
50 and store y 100, the final value in the OrderQuantity column should be
150 (x+y)
- Compute the Units to be ordered based on the units already
ordered and Units in Stock
- If the Units that need to be ordered is greater than the
ReorderLevel, place an order and update the UnitsonOrder column
/*** Custom Stored procedure conflict resolver for products table ***/create procedure sproc_products_conflicts_handler @tableowner sysname, @tablename sysname, @rowguid varchar(36), @subscriber sysname, @subscriber_db sysname, @log_conflict INT OUTPUT, @conflict_message nvarchar(512) OUTPUT AS set nocount on DECLARE @OrderQuantity smallint, @Reorderlevel smallint, @UnitsOnOrder smallint, @UnitsInStock smallint, @reorderQuantity smallint, @SQL_TEXT nvarchar(2000) /*** Temp table to hold OrderQuantity value from the conflicting subscriber ***/ create table #tempSubscriber ( OrderQuantity smallint, rowguid varchar(36) ) SET @SQL_TEXT ='insert into #tempSubscriber (OrderQuantity, rowguid) Select OrderQuantity, rowguid From ['+@subscriber+'].'+@subscriber_db+'.'+@tableowner+'.'+@tablename+ ' Where rowguid='''+@rowguid+'''' EXEC sp_executesql @SQL_TEXT Select @OrderQuantity = OrderQuantity From #tempSubscriber where rowguid=@rowguid /*** Get the current values from the publisher. The new OrderQuantity will be sum of the column values from the publisher and the subscriber ***/select @OrderQuantity =@OrderQuantity + OrderQuantity, @UnitsOnOrder=Unitsonorder, @UnitsInStock= UnitsInStock, @Reorderlevel= Reorderlevel from Products where rowguid=@rowguid /*** Calculate the ReorderQuantity based on the Units already ordered and Units in stock ***/Set @ReorderQuantity = (@OrderQuantity - @unitsonorder-@UnitsInStock) /*** If the Units required is greater than the reoderlevel, place an order ***/If @reorderQuantity > @Reorderlevel Begin /*** Place an order for the product ***/ Insert into orders (productid, quantity, DateOrdered) Select productid, @reorderQuantity, getdate() from Products where rowguid=@rowguid /*** Reset the Unitsonorder value ***/ Set @UnitsOnOrder = @UnitsOnOrder + @reorderQuantity End /***Update the publisher and Subscriber with the new column values ***/select productid, productname, supplierid, categoryid, quantityperunit, unitprice, unitsinstock, @unitsonorder, Reorderlevel, discontinued, rowguid, specialorder, @OrderQuantity from products where rowguid=@rowguid drop table #tempsubscriber GO
Figure 5. Choosing a Stored procedure Custom Resolver
Once you create your stored procedure in the publication
database, you can choose the Custom resolver option from the Products table
properties and enter the name of the stored procedure as illustrated in figure
5. Alternatively, you can set the values of @article_resolver and
@resolver_info arguments in the merge sp_addmergearticle stored
procedure while creating the publication.
Conclusion
Hopefully this article provides enough information for you to
choose among the various conflict resolvers and detection types that would meet
your business requirements. For DBAs with little or no programming background
in VB or C++ my recommendation is to implement the stored procedure custom
resolver. However keep in mind that you will need to change the final SQL
statement in the procedure every time you modify the structure of your base
table.
Satia Madimchetty
Sr. Consultant, Scalability Experts