April 22, 2007 at 10:38 am
I am wondering how to set up replication. Say you have a very slow connection to a subscriber, so you want to minimize the amount of data piped across the network, and the subscribers only need the data updated once a day. The publisher and distributor are on the same server, but the subscriber is at a remote location. If the articles being published were small, it seems that snapshot replication run once a night would be a good option. But what if the amount of data associated with my article were quite large? Would transactional replication, with the subscriptions scheduled to run once a night, be better than snapshot replication? I assume it would be quicker and require less network traffic if the number of updates was small. Would that assumption be correct? And if the number of updates was significant, at what point do you think snapshot replication would be better? Let me know your thoughts and advice here. Thanks.
Gregory A. Larsen, MVP
April 23, 2007 at 6:58 am
I faced similar questions a couple of months ago. The decision you make depends on the bandwidth and your databases. We know that you have a slow connection, so if you have a small or mostly static database, I would do snapshot replication and schedule the snapshot at the intervals you need. If you have a large database with heavy updates, I would set up transactional replication and schedule the transactions to be sent to subscribers twice a day, most likely. One thing you should be aware of: once you set up transactional replication, your tlog file won't be truncated when you take a tlog backup while there are transactions that haven't been sent on yet. So how big is your data, how many updates are happening, and how large a tlog file you can live with are some of the questions you will have to answer. Did I just answer your questions with more questions? 🙂
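For example, assuming SQL 2005 and made-up names, a rough sketch of setting up a push subscription whose Distribution Agent runs on a schedule instead of continuously (see BOL for sp_addpushsubscription_agent; for twice a day look at the @frequency_subday parameters):

-- Run at the Publisher, in the published database.
EXEC sp_addsubscription
    @publication = N'MyPub',
    @subscriber = N'REMOTESRV',
    @destination_db = N'MySubDb',
    @subscription_type = N'push';

-- Schedule the Distribution Agent daily instead of continuously
-- (frequency_type 4 = daily; the time is HHMMSS).
EXEC sp_addpushsubscription_agent
    @publication = N'MyPub',
    @subscriber = N'REMOTESRV',
    @subscriber_db = N'MySubDb',
    @frequency_type = 4,
    @frequency_interval = 1,
    @active_start_time_of_day = 010000;  -- 01:00:00

-- And a quick way to see what is pinning the tlog:
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'MyPubDb';  -- REPLICATION means replication is holding the log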
Good day,
Bulent
April 23, 2007 at 7:15 am
With heavy updates you might end up better off with snapshot replication, because the volume of proc calls to apply the changes might exceed the cost of just shipping over a clean copy. Do they only want it updated once a day, or does it have to be updated at least once a day? With transactional you can just let it run during the day, and it will probably keep up pretty well even over a slow link.
It wouldn't take that long to benchmark in the real environment, but without that I'd tend to go with transactional - for no scientific reason!
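If you do want a quick number first, running this at the publisher gives a feel for the change rate (SQL 2005; see BOL for sp_replcounters):

-- Rough throughput and latency per published database:
EXEC sp_replcounters;
-- Returns replicated transactions, replication rate (trans/sec)
-- and replication latency - enough to ballpark the daily volume.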
May 3, 2007 at 4:46 pm
The Log Reader Agent [on the Distributor] reads changes from your user db's tranlog and posts them into the distribution db [and should keep up with your change rate]. The user db's tranlog will NOT stay large if the Distribution Agent (DA) to a sub is slow/stopped, but the distribution db may get large [hence the 72-hr default retention, to discourage getting silted up].
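If you throttle delivery to once a night you might want some headroom on that retention - a sketch, run at the Distributor (see BOL for sp_changedistributiondb):

-- Raise max retention from the 72-hr default to 96 hrs so one
-- missed nightly run doesn't get the subscription deactivated.
EXEC sp_changedistributiondb
    @database = N'distribution',
    @property = N'max_distretention',
    @value = N'96';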
As other posts have mentioned, the DA will use sprocs to replay changes at the sub. This may not be efficient [depends on indexes, as ever, and on the command format - search BOL for SCALL etc]. Contrast this with the fast BULK INSERT used to deliver the initial snapshot.
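You can see (or change) the command format per article with something like this - publication/article names invented, and the generated proc name will be whatever replication created for your article:

-- At the Publisher: show the ins/upd/del commands for an article.
EXEC sp_helparticle @publication = N'MyPub', @article = N'MyTable';

-- Switch the update delivery format to SCALL (may need
-- @force_invalidate_snapshot = 1 on an active publication).
EXEC sp_changearticle
    @publication = N'MyPub',
    @article = N'MyTable',
    @property = N'upd_cmd',
    @value = N'SCALL sp_MSupd_dboMyTable';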
I concur that site-specific tests should be done (change/churn rate, WAN bandwidth, etc.) - your mileage may vary!
FWIW many shops run an overnight batch process, and these are unlikely to be written to be "replication-friendly". Supposing each row has delete(1), insert(1) and update(8) done to it, a transactional pub will have to ship all 10 changes to each sub, whereas a snapshot taken after the batch has completed needs just one.
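One way to see that churn directly is to count what the Log Reader has queued per article - a rough sketch against the distribution db (strictly you would also match on the publisher database):

USE distribution;
-- Commands queued/retained per article:
SELECT a.article, COUNT(*) AS command_count
FROM dbo.MSrepl_commands AS c
JOIN dbo.MSarticles AS a ON a.article_id = c.article_id
GROUP BY a.article
ORDER BY command_count DESC;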
Consider doing all your heavy work-in-progress in another db which has Recovery=Simple [to reduce tranlog pressure and local Log-Shipping traffic to D/R], and then put finished results in the published db.
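E.g. something along these lines (names invented):

-- One-time setup: a scratch db that keeps tranlog pressure low.
CREATE DATABASE Scratch;
ALTER DATABASE Scratch SET RECOVERY SIMPLE;
GO
-- Nightly: do the heavy delete/insert/update churn in Scratch,
-- then move only the finished rows into the published db, so
-- replication ships one change per row instead of ten.
INSERT INTO PublishedDb.dbo.Results (id, val)
SELECT id, val FROM Scratch.dbo.WorkTable;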
HTH