How to stop Multi-row I/U/D from splitting into many single I/U/D during replication?

  • dbirchok (6/24/2008)


    I did a lot of research about this and your answers make no sense. Let me go “step-by-step” for you.

    1.Your reason for doing one row at a time would be equally valid when updating a table in the original publishing database. We all know you can write multi-row I/U/D statements against SQL server tables.

    2.When any one of the many single-row inserts fails, the entire multi-row insert fails replication. Any successful single-row inserts before the failure are rolled back. You can’t “verify that the row is not there” because none of the rows are there.

    3.And let’s just say that we could keep the successful rows, you still would have trouble identifying the problem row. For example, 5 rows where updated, 2 made it through replication. You don’t know the order the updates happened so you don’t know which of the last 3 caused the problem. Maybe, in this case, you could look at each of three rows manually. However, multiply these counts by 1000, and that option flies out the window.

    4.Multi-row Updates can be replicated as Multi-row updates when the primary key isn’t changed. It is currently happening in our replication process (As I stated in my earlier posts).

    5.And before you say it, a primary key change does not guarantee that a row you want to update is still in the replicated table.

    6.When using stored procedures (auto generated or custom) to do replication, they have a few fixed set of input parameters to choose from. All are based on processing one row at a time. It seams to me, the only chance to get multi-row replication behavior is to use I/U/D statements for replication. (I could be wrong on this one)

    Saying "By DESIGN" is a cop-out. I am sure replication is working as Microsoft has it currently designed. And its design is good enough for most replication. I am just pushing the edge of the envelope. Microsoft has changed designs when enough of a need is there.

    I sorry that my posts may be a bit terse. I apologize to the many experts on this site that have helped many people solve their problems. I posted this to see if any of you have the in depth knowledge about my topic. Not to get people who are trying to inflate there stats by regurgitating pat answers from Microsoft’s anemic documentation.

    Maybe I was not clear enough. You said you use transactional replication, right?

    When you update MULTIPLE rows with one statement the LogReader creates AS MANY Stored procedure calls as ROWS were affected on the Distribution database (MSrepl_commands table and one row per transaction boundary on MSrepl_transactions table). Then the distribution agent picks then up applies them ONE-BY-ONE wrapped on a SINGLE transaction (USING TRANSACTION BOUNDARIES FOR THE STATEMEMT that affaceted the primary). When a data consistency error is detected the entire transaction rollsback on the replica and replication logs the Error on the distribution database!! But on the Error recorded by the agent you will get the exact row that failed to apply at the replica. This shreading of multiple commands happens even when you call bcp on the primary.

    When I said by-design I did not meant so say that M$ did a good or a bad job at doing this. I just meant to say that they chose to do it as I just described above.

    If you want an statement being passed as such, the preffered method, Like I indicated before is to use replication of stored procedure execution. This will create in the distribution database a single call to the stored procedure replicated from the primary and you will get the single call also on the replica.

    An example:

    assume there is a procedure that delete names starting with "K" and you replicate it's execution

    What happens is that even if it deletes 2 million rows, only one row is added in the distribution database and this procedure will be called ONCE at the replica.

    I could not care less about what you call statistics or "your opinion" of my trying to help you understand how the process really works.

    Good luck!


    * Noel

  • Yes we are using transaction replication.

    I miss understood what you meant by stored procedure replication. I though you meant using stored procedures in table replication. The straight-up stored procedure a different concept. I looked into this and it won’t work for my case. The mass insert is buried inside a huge stored procedure (I didn’t write it). The store procedure uses many tables in a couple of different databases, not all of which are replicated. We are just replicating a portion of the tables for reporting purposes.

    One other thing, you keep saying that transactional replication always splits up multi-row I/U/D statements into single-row statements. What I describe happening for some of my Multi-row updates directly contradicts that statement. They are being applied to the replicated table as one Multi-row update. Note: there is no stored procedure replication that you mentioned happening in our database. Are you sure of this information? Or, are your simplifying for brevity?

    Sorry if you were offend at the last paragraph. It wasn’t aimed at you. I have just been getting frustrated at the level of what some people think is help. You saw the two posts before yours. I did apologize to true experts before I said it.

  • dbirchok (6/24/2008)


    Yes we are using transaction replication.

    I miss understood what you meant by stored procedure replication. I though you meant using stored procedures in table replication. The straight-up stored procedure a different concept. I looked into this and it won’t work for my case. The mass insert is buried inside a huge stored procedure (I didn’t write it). The store procedure uses many tables in a couple of different databases, not all of which are replicated. We are just replicating a portion of the tables for reporting purposes.

    One other thing, you keep saying that transactional replication always splits up multi-row I/U/D statements into single-row statements. What I describe happening for some of my Multi-row updates directly contradicts that statement. They are being applied to the replicated table as one Multi-row update. Note: there is no stored procedure replication that you mentioned happening in our database. Are you sure of this information? Or, are your simplifying for brevity?

    Sorry if you were offend at the last paragraph. It wasn’t aimed at you. I have just been getting frustrated at the level of what some people think is help. You saw the two posts before yours. I did apologize to true experts before I said it.

    This is BOL explantion: Specifying How Changes Are Propagated for Transactional Articles

    The relevant part:

    By default, transactional replication propagates changes to Subscribers through a set of stored procedures that are installed on each Subscriber. When an insert, update or delete occurs on a table at the Publisher, the operation is translated into a call to a stored procedure at the Subscriber. The stored procedure accepts parameters that map to the columns in the table, allowing those columns to be changed at the Subscriber.

    Note that you could change the default to use U/I/D straight SQL commands but they all use the primary key for the operation

    I am not sure why you are seeing multi-row updates replicated as such. Maybe I am the one missing something. I am not aware if there is a better way.

    By the way if the procedure needs non-replicated data for its operation then, you are correct, procedure execution replication is not for you.

    I know this does not add a lot to what you already know but I thought I should reply.


    * Noel

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply