In an earlier article I introduced the statement delivery options and this
continues the discussion. If you're new to replication I recommend reading that article
first and then returning here. Just as quick background I've created a
simple transactional publication consisting of a single table called 'SomeNewTable'
and a single subscriber. Before I added the subscriber I changed the insert
delivery format to insert statement as show below.
Let's look at what that created on the subscriber:
As you can see there is no insert stored procedure. If we run Profiler and do
an insert on the publisher we see this sent our subscriber:
exec sp_executesql N'insert into [dbo].[SomeNewTable]([EmailAddress],[FirstName],[LastName]) values (@P1,@P2,@P3)', N'@P1 nvarchar(7),@P2 nvarchar(4),@P3 nvarchar(6)',N'5@6.com',N'Andy',N'Warren'
A good parameterized insert statement, the only unusual thing is that even
though my test table has firstname and lastname defined as varchar, the
parameters are typed as nvarchar - but it still works! The upside of this
technique is we don't have to worry about code on the subscriber and because the
insert columns are specified, we don't have to worry about them changing the
order of the columns. The downside is that we lose our ability to inject logging
code or otherwise change the behavior on the subscriber. We're still calling
this once for every row we insert on the publisher. It doesn't matter how we do
the insert statement (fully qualified columns or values only), the same syntax
as shown above will be used to send the insert to the publisher.
If you look back at the insert options above it looks like we can also use a
setting called 'Insert statement without column list'. In my testing I was
unable to get replication to persist the setting even though it would let me
select it and click ok without an error message. I couldn't find any references
when I searched on the web, and digging into BOL for sp_addarticle shows only
three supported options for insert command; no action, stored procedure, or
insert statement). It looks like this is just a quirk in the UI where they
planned to support the less verbose but more fragile style of insert statement
and it wasn't implemented elsewhere...or I'm missing something!
The final option is to use the 'Do not replicate..' setting which is
occasionally useful. This setting doesn't create a stored procedure on the
subscriber or send inserts using sp_executesql, it just ignores them. This is
better than the option often used in earlier versions where the subscriber
insert proc would be modified to return successfully without actually doing the
insert. It achieve the behavior but still incurred the overhead of saving and
distributing all the inserts that occurred on the publisher. It's not a common
scenario, but this is a nice clean way to accomplish it.
Of the insert options which should you use? The default is the stored
procedure created on the subscriber and it has two advantages. One is that it
gives you an entry point to tweak the replication process during inserts if
needed, and the other is that it's just the default and easier to use it than
not. The advantages of the insert statement option would be that you don't need
to create or update an insert stored procedure on the subscriber. As far as
performance they should both perform about the same, you're still performing
inserts one row at a time. The do not replicate option is a definite win if you
need that behavior, it saves a lot of overhead between not putting the inserts
into the distribution database, not sending them over the network, and not
applying them to the subscribers.
Now let's look at the delete options.
The first option is to 'Do not replicate DELETE statements' and obviously
does just that. This is more common usage because you'll have cases where
subscribers wish to keep history longer than the publisher does. The second
option is to send over a TSQL delete statement, let's take a look at what would
be for our example:
exec sp_executesql N'delete from [dbo].[SomeNewTable] where [EmailAddress] = @P1' , N'@P1 nvarchar(7)',N'2@3.com'
Nice clean syntax and parameterized so we should get good query plan reuse.
Now let's change our options to the final choice, XCALL, and then see what
happens when we delete a row:
ALTER procedure [dbo].[sp_MSdel_dboSomeNewTable] @c1 varchar(150),@c2 varchar(50),@c3 varchar(50) as begin delete [dbo].[SomeNewTable] where [EmailAddress] = @c1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end
This is very similar to the default stored procedure, the difference is that it
passes two additional parameters (our other two columns) even though they are
not used in the delete statement. I guess potentially you could modify the
procedure to use those values for something but it really don't seem like a
useful option.
Now let's look at update options. We'll start by looking at the choices available in the UI:
The first option is 'Do not replicate UPDATE statements' and is easy to use.
Just as we discussed for inserts and deletes, setting this option will tell SQL
to totally ignore updates, saving the overhead of moving them to the
distribution database and then to the subscribers. No stored procedure for
updates will be created on the subscriber. Next is the UPDATE statement option. I'll set it and re-snapshot, then we can
see what a replicated update looks like with this setting:
exec sp_executesql N'update [dbo].[SomeNewTable] set [FirstName] = @P1 where [EmailAddress] = @P2', N'@P1 nvarchar(1),@P2 nvarchar(14)',N'2',N'test5@test.com'
This is a good clean update. I modified the first name column and that is the
only column being modified on the subscriber, and it's using a primary key to
identify the update. Now let's move on to to the CALL syntax. This creates a new stored
procedure on the subscriber that either updates all the non primary key columns,
or updates all columns including the primary key. This will cause all columns to
be updated regardless of whether they were actually changed.
ALTER procedure [dbo].[sp_MSupd_dboSomeNewTable] @c1 varchar(150),@c2 varchar(50),@c3 varchar(50),@pkc1 varchar(150) as begin if not ( @c1 = @pkc1 ) begin update [dbo].[SomeNewTable] set [EmailAddress] = @c1 ,[FirstName] = @c2 ,[LastName] = @c3 where [EmailAddress] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end else begin update [dbo].[SomeNewTable] set [FirstName] = @c2 ,[LastName] = @c3 where [EmailAddress] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end end
Let's skip down to look at XCALL. If you look at the next stored procedure
you'll see that it is getting passed both the original and the current values,
but we're only using the current values. This matches what we saw for XCALL
deletes and while it might give us the ability to do some additional stuff in
the update procedure based on the original values there's not much reason to use
it on a day to day basis.
ALTER procedure [dbo].[sp_MSupd_dboSomeNewTable] @c1 varchar(150),@c2 varchar(50),@c3 varchar(50),@c4 varchar(150),@c5 varchar(50),@c6 varchar(50) as begin if not ( @c4 = @c1 ) begin update [dbo].[SomeNewTable] set [EmailAddress] = @c4 ,[FirstName] = @c5 ,[LastName] = @c6 where [EmailAddress] = @c1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end else begin update [dbo].[SomeNewTable] set [FirstName] = @c5 ,[LastName] = @c6 where [EmailAddress] = @c1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end end
That leaves us with SCALL and MCALL to explore. One interesting point is that
if the primary key is updatable you will get the syntax show in part 1 of this
series, but if you choose a table with a non updatable primary key (an identity
column) you get the simpler stored procedures shown below for the SCALL syntax.
The second procedure shows the MCALL syntax. The only difference is on the
parameter line; SCALL defaults the parameters to null, MCALL does not.
--SCALL syntax for not updatable primary key ALTER procedure [dbo].[sp_MSupd_dboEmployees] @c1 int = null ,@c2 datetime = null ,@c3 varchar(50) = null ,@c4 varchar(50) = null ,@pkc1 int ,@bitmap binary(1) as begin update [dbo].[Employees] set [DateAdded] = case substring(@bitmap,1,1) & 2 when 2 then @c2 else [DateAdded] end ,[FirstName] = case substring(@bitmap,1,1) & 4 when 4 then @c3 else [FirstName] end ,[LastName] = case substring(@bitmap,1,1) & 8 when 8 then @c4 else [LastName] end where [EmployeeID] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end
--MCALL syntax, non updatable primary key ALTER procedure [dbo].[sp_MSupd_dboEmployees] @c1 int ,@c2 datetime ,@c3 varchar(50) ,@c4 varchar(50) ,@pkc1 int ,@bitmap binary(1) as begin update [dbo].[Employees] set [DateAdded] = case substring(@bitmap,1,1) & 2 when 2 then @c2 else [DateAdded] end ,[FirstName] = case substring(@bitmap,1,1) & 4 when 4 then @c3 else [FirstName] end ,[LastName] = case substring(@bitmap,1,1) & 8 when 8 then @c4 else [LastName] end where [EmployeeID] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end
There is one other option for updates, but you won't find it in the UI. You can
run sp_scriptdynamicupdateproc to generate a stored procedure that will build
and execute dynamic SQL on the subscriber. Running the procedure just generates
the script, it is up to you to apply it to all of your subscribers. The tradeoff
when using this is that you only touch truly changed columns which may increase
performance if many of the columns are indexed, in return you incur the overhead
of building the statement and executing it each time. Here's what we what we
would get from our original test table after running the proc on the publisher:
if object_id(N'[sp_MSupd_dboSomeNewTable]', 'P') > 0 drop proc [sp_MSupd_dboSomeNewTable] go if object_id(N'dbo.MSreplication_objects') is not null delete from dbo.MSreplication_objects where object_name = N'sp_MSupd_dboSomeNewTable' go create procedure [sp_MSupd_dboSomeNewTable] @c1 varchar(150) ,@c2 varchar(50) ,@c3 varchar(50) ,@pkc1 varchar(150) ,@bitmap binary(1) as begin declare @stmt nvarchar(4000), @spacervar nvarchar(1) select @spacervar =N'' select @stmt = N'update [dbo].[SomeNewTable] set ' if substring(@bitmap,1,1) & 1 = 1 begin select @stmt = @stmt + @spacervar + N'[EmailAddress]' + N'=@new1' select @spacervar = N',' end if substring(@bitmap,1,1) & 2 = 2 begin select @stmt = @stmt + @spacervar + N'[FirstName]' + N'=@2' select @spacervar = N',' end if substring(@bitmap,1,1) & 4 = 4 begin select @stmt = @stmt + @spacervar + N'[LastName]' + N'=@3' select @spacervar = N',' end select @stmt = @stmt + N' where [EmailAddress] = @1 ' exec sp_executesql @stmt, N' @1 varchar(150) ,@new1 varchar(150) ,@2 varchar(50) ,@3 varchar(50) ', @pkc1 ,@c1 ,@c2 ,@c3 if @@rowcount = 0 if @@microsoftversion>0x07320000 exec sp_MSreplraiserror 20598 end go if columnproperty(object_id(N'dbo.MSreplication_objects'), N'article', 'AllowsNull') is not null exec ('insert dbo.MSreplication_objects (object_name, publisher, publisher_db, publication, article, object_type) values ( + N''sp_MSupd_dboSomeNewTable'' , N''EARLGREY'' , N''ReplTest'' , N''Test'' , N''SomeNewTable'' ,''P'')')
I know that's a lot of code and a lot of options, so here is a quick recap that
may be easier to make sense of:
Insert | Update | Delete | |
Do Not Replicate | Yes - inserts not captured or transmitted | Yes - updates not captured or transmitted | Yes - deletes not captured or transmitted |
XCALL | NA | Yes - Passes original and current values, uses stored procedure | Yes - passes original and current values, uses stored procedure |
MCALL | NA | Yes - Passes original and current values plus a changed bitmap, uses stored procedure, updates columns to either new or existing value, will not support primary key updates if table has identity column | NA |
SCALL | NA | Yes - Passes current values for changed columns, passes DEFAULT for non changed columns plus a changed bitmap, uses stored procedure, updates columns to either new or existing value(DEFAULT), will not support primary key updates if table has identity column | NA |
UPDATE | NA | Yes - Generates TSQL update statement | NA |
INSERT | Yes - Uses sp_executesql to send column names and values | NA | NA |
CALL | Yes - passes column names to a stored procedure (DEFAULT) | Yes - passes current values and updates all columns, uses stored procedure | Yes, passes primary key to stored procedure (DEFAULT) |
sp_scriptdynamicupdateproc | NA | Yes - has to be created and applied to subscribers manually, only updated columns are changed | NA |
The final result? For the most part the defaults do the job and there isn't a
lot of reason to change them. The do not replicate options are definitely nice
and should be used when appropriate, the others are all for special cases and
the main thing is to just know they are there. The sp_scriptdynamicupdateproc
could be interesting if you're having performance problems on the subscriber.