What are the major factors effecting Merge Replication time, chalanging to see why it takes so long with minor changes ?

  • Hello,

    I am a newbie to these forum so welcome all Gurus. I am also new to replications and know it only by some limited experience and doing some research. I took over the project that has complex replications setup across 100s of clients. I do not want to make any changes before I get familiar with it since I do not want to effect real time clients. It looks like lately replication performance has gone down and after some research and reading I was able to identify some issue where others are still unknown to me. I hoped some of you with more experience can help me identify and clarify some of the principals and issues.

    1. My transaction log file is growing pretty fast and it looks like it was never set to be truncated for 3 years and now is as big as 40G. I want to setup truncation and before I do this I was wondering if doing this will improve merge replication time, just not sure if it has anything to do with size of the log. Also was wondering if log truncation will effect any clients that will have to be redone from scratch and data that is stored on the server will have to be replicated to the client from getgo.

    2. The merge replication setup on the server is filtered by Client ID what has all the information related to the client. I am a little puzzled to understand on why it takes nearly 24 hours to run initial merge replication sync for newly setup clients that have no data yet on both ends. Does it go through all other clients replications? Is there any way I can improve the first sync by downloading snapshot or by archiving old data in DB?

    3. One more puzzle relative to #2. When client doe snot do replication synch for a while but has no data changes since last sync on both server and client, it takes a long time to synch with not data to be synched. I think I am failing to understand on how replication work on filtered articles. I thought it only picks up changes for specified client, but results shows that changes efected by other clients effect each other synch time. Can smobody help me understand this

    Thank you so much in advance. Any tips will be greatly appreciated.

  • Hello,

    After reading our post, I can't help noticing you have quite a large subscriber base against your publications. However, I also noticed that your lack of basic replication knowledge may get you into trouble soon. Here's some replies to your post, however I suggest you really dig into replication deep before you start working / altering your environment.

    As for your questions / remarks :

    1. No, your transaction log size will not afect your merge replication. Merge replication does not use the transaction log to synchronize data between your databases. Instead, it will use metadata tables (created automatically during the replication setup process in your publisher and subscriber databases). These metadata tables are system tables and should not be altered by you directly. The metadata in these tables comes from triggers which are created on your published tables (at both publisher/subscriber side).

    2. The initial synchronisation of published tables is no more then copying over the initial snapshot to the subscriber. The larger your dataset, he more time it takes. You use filtered sets. Parametrized filtering creates a snapshot for each partition of the data. Your initial set will only synchronize the appropriate partition to the corrsponding subscriber. Here are some factors that may affect the duration of that initial synchronization process :

    a. Size of the snapshots / parametrized partitions

    b. Speed of your network and specifically the connection between your server(s) and the client(s)

    c. Heavy server / client load on either side

    3. As for your last remark, there may be a couple of reasons why that synchronization takes a while :

    a. Your subscriptions expired and required a new snapshot (i.e. a full base load) because the last synchronization was beyond the retention period (= the max number of days of merge metadat history in your replication)

    b. You have a huge set of changes on your data with massive metadata tables (check their size to confirm) hence slowing down the whole operation since even if there's nothing to change, it still walks through the full metadata set.

    As a last remark, I can only stimulate you to really dig into the replication topics in BOL, online resources or MSDN. Replication is a very nice technology but a bitch to keep it working smoothly, especially in large environments. A prerequisite to get it setup and configured decently is that you'd fully understand all ins and outs of all replication types you're going to use. Additionally, I'd suggest youdig into the tables/procs behind replication to figure out how it works technically. It learns you a lot about what is happening when something goes wrong during configs or synchronizations.

    With kind regards,

    Koen Matthijs

  • Hi Koen,

    Thank you very much for your time and great answer. I do agree with you that I need to learn more on how Replications work and what is behind the hood. I am the person who do not like to work with things that I do not understand. However, I am being faced challenging situation where I was thrown this project and many others like support, development for both windows base and web base environment and I just physically have not time and can not take time off to get inside the replications since it is small part of my duties but important to maintain the project. This was set up years ago and worked with little issues here and there under different people's supervision. Our client base is growing so number of subscribers what effects the performance and I do not want to be at the situation where it fails one day or at least need to understand what is involved in it to cure current issues, and I do not want to make any changes to initial setup even if it was wrong without thorough understanding of it so I will brake nothing . When my boss does not understand what is involved and does not want to, I try to do my bets by garbing different information from here and there and using Forums to get some help from experienced people like you so I can get my hands on it.

    I tried to analyze your answers and do some reading and was wondering if you can help me with the questions that came up and confirm correctness of my conclusions.

    1. I checked publication properties and found out that is set to never expire subscription so it looks like meta tables keep all the data from very beginning, is it correct assumption?

    The speed of the network is decent and never change, client has no load so the only thing what can be related to performance degrade is probably data load and server load increase.

    What happens with newly setup subscriber clients is that the part of data that is partitioned for this client does not really have any data on neither server or a client since it is brand new client and still it takes over 24 hours. The same happens with existing clients if they do not synch for long time and even if there is no data being updated on both server and client. Logically looking at it I assume that every time client and server synchronize it looks into meta data tables from the last synch and goes through entire data set regardless if it belongs to this partition or not what takes a while since other clients added much data to metadata table during the time synch was not performed by particular client or for new clients it is entire time from getgo. Is it correct understanding. Now as far as solution go

    a. I can make subscription expire after period of time but I think it will make it even worse

    b. The only solution I see is to recommend the client to clients to synch more often and as far as intion setups just bear with how long it takes, not sure if

    there is any remedy.

    d. As far as metadata tables I found many that are responsible for replication process, which once should I be looking at?

    2. The worst part about it is when error comes and it fails to synch and I am not sure what actions to take. I try many things and most of the time it fixes but I was not able to establish methodological steps on how to handle it since it is different every time. Most of the time it is just solves itself by running it over and over. One major thing I fail to understand is on how to check if synch is really running. I have setup and done many experiments but still have no answer. Synchronization process is triggered programmaticly by the application. However sometimes it is throwing different errors in synchronization and SQL subscription status is very misleading

    a. I looked at the client subscription what is indicated as RUNNING and I thought that it might be doing something even though it failed through application so decided to leave it for few days and it is still indicates as RUNNING so I checked the processes what did nto give me the clue either, so I made a conclusion that it must be some MS bug.

    b. I can not use Server Replication Monitor either because it uses refresh time set(I have it for 60) to judge the status if not response form client comes. When replication is actually running successfully on the client side, it shows as failed after that set period of time(60) what the after 24 hours when it is done on the client it changes to successful. So I am making logical conclusion that it must be long task running on the client if I understand it correctly but client is new and has no data what makes it confusing.

    The bottom line is I am not sure on how to really check is synchronization is doing anything on either client or server, because the status seems to be misleading or buggy. Do I just look for certain process? I just have a feeling that even it fails on programatic level it is still doing something on SQL server level

    Sory for very long post.

    Thanks for any tips very much.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply