April 22, 2007 at 10:38 am
I am wondering how to set up replication. Say you have a very slow connection to a subscriber, so you want to minimize the amount of data piped across the network, and the subscribers only need the data updated once a day. The publisher and distributor are on the same server, but the subscriber is at a remote location. If the articles being published were small, it seems that snapshot replication run once a night would be a good option. But what if the amount of data associated with my article were quite large? Would transactional replication, with the subscriptions scheduled to run once a night, be better than snapshot replication? I assume it would be quicker and require less network traffic if the number of updates was small. Would that assumption be correct? And if the number of updates was significant, at what point do you think snapshot replication would be better? Let me know your thoughts and advice here. Thanks.
Gregory A. Larsen, MVP
April 23, 2007 at 6:58 am
I faced similar questions a couple of months ago. The decision you make depends on the bandwidth and your databases. We know that you have a slow connection, so if you have a small or mostly static database, I would do snapshot replication and schedule the snapshot at the intervals you need. If you have a large database with heavy updates, I would set up transactional replication and schedule the transactions to be sent to subscribers twice a day, most likely. One thing you should be aware of: once you set up transactional replication, your tlog file won't be truncated when you take a tlog backup while there are transactions that haven't been sent on yet. So how big is your data, how many updates are happening, and how large a tlog file you can live with are some of the questions you will have to answer. Did I just answer your questions with more questions? 🙂
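For example, assuming SQL 2005 and made-up names, a rough sketch of setting up a push subscription whose Distribution Agent runs on a schedule instead of continuously (see BOL for sp_addpushsubscription_agent; for twice a day look at the @frequency_subday parameters):

-- Run at the Publisher, in the published database.
EXEC sp_addsubscription
    @publication = N'MyPub',
    @subscriber = N'REMOTESRV',
    @destination_db = N'MySubDb',
    @subscription_type = N'push';

-- Schedule the Distribution Agent daily instead of continuously
-- (frequency_type 4 = daily; the time is HHMMSS).
EXEC sp_addpushsubscription_agent
    @publication = N'MyPub',
    @subscriber = N'REMOTESRV',
    @subscriber_db = N'MySubDb',
    @frequency_type = 4,
    @frequency_interval = 1,
    @active_start_time_of_day = 010000;  -- 01:00:00

-- And a quick way to see what is pinning the tlog:
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'MyPubDb';  -- REPLICATION means replication is holding the log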
Good day,
Bulent
April 23, 2007 at 7:15 am
With heavy updates you might end up better off with snapshot replication, because the volume of proc calls to apply the changes might exceed the cost of just shipping over a clean copy. Do they only want it updated once a day, or does it have to be updated at least once a day? With transactional you can just let it run during the day, and it will probably keep up pretty well even over a slow link.
It wouldn't take that long to benchmark in the real environment, but without that I'd tend to go with transactional - for no scientific reason!
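If you do want a quick number first, running this at the publisher gives a feel for the change rate (SQL 2005; see BOL for sp_replcounters):

-- Rough throughput and latency per published database:
EXEC sp_replcounters;
-- Returns replicated transactions, replication rate (trans/sec)
-- and replication latency - enough to ballpark the daily volume.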
May 3, 2007 at 4:46 pm
The Log Reader Agent [on the Distributor] reads changes from your user db's tranlog and posts them into the distribution db [and should keep up with your change rate]. The user db's tranlog will NOT stay large if the Distribution Agent (DA) to a sub is slow/stopped, but the distribution db may get large [hence the 72-hr default retention, to discourage getting silted up].
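If you throttle delivery to once a night you might want some headroom on that retention - a sketch, run at the Distributor (see BOL for sp_changedistributiondb):

-- Raise max retention from the 72-hr default to 96 hrs so one
-- missed nightly run doesn't get the subscription deactivated.
EXEC sp_changedistributiondb
    @database = N'distribution',
    @property = N'max_distretention',
    @value = N'96';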
As other posts have mentioned, the DA will use sprocs to replay changes at the sub. This may not be efficient [depends on indexes, as ever, and on the command format - search BOL for SCALL etc]. Contrast this with the fast BULK INSERT used to deliver the initial snapshot.
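You can see (or change) the command format per article with something like this - publication/article names invented, and the generated proc name will be whatever replication created for your article:

-- At the Publisher: show the ins/upd/del commands for an article.
EXEC sp_helparticle @publication = N'MyPub', @article = N'MyTable';

-- Switch the update delivery format to SCALL (may need
-- @force_invalidate_snapshot = 1 on an active publication).
EXEC sp_changearticle
    @publication = N'MyPub',
    @article = N'MyTable',
    @property = N'upd_cmd',
    @value = N'SCALL sp_MSupd_dboMyTable';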
I concur that site-specific tests should be done (change/churn rate, WAN bandwidth, etc.) - your mileage may vary!
FWIW many shops run an overnight batch process, and these are unlikely to be written to be "replication-friendly". Supposing each row has delete(1), insert(1) and update(8) done to it, a transactional pub will have to ship all 10 changes to each sub, whereas a snapshot taken after the batch has completed needs just one.
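One way to see that churn directly is to count what the Log Reader has queued per article - a rough sketch against the distribution db (strictly you would also match on the publisher database):

USE distribution;
-- Commands queued/retained per article:
SELECT a.article, COUNT(*) AS command_count
FROM dbo.MSrepl_commands AS c
JOIN dbo.MSarticles AS a ON a.article_id = c.article_id
GROUP BY a.article
ORDER BY command_count DESC;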
Consider doing all your heavy work-in-progress in another db which has Recovery=Simple [to reduce tranlog pressure and local Log-Shipping traffic to D/R], and then put finished results in the published db.
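E.g. something along these lines (names invented):

-- One-time setup: a scratch db that keeps tranlog pressure low.
CREATE DATABASE Scratch;
ALTER DATABASE Scratch SET RECOVERY SIMPLE;
GO
-- Nightly: do the heavy delete/insert/update churn in Scratch,
-- then move only the finished rows into the published db, so
-- replication ships one change per row instead of ten.
INSERT INTO PublishedDb.dbo.Results (id, val)
SELECT id, val FROM Scratch.dbo.WorkTable;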
HTH