Standby Servers: Log-Shipping vs
Replication
Many previous articles on log-shipping and replication have concentrated on solving set-up and configuration issues. This article outlines the differences between them, for the purposes of maintaining a standby server. To put this in context,log-shipping and replication are two methods which are often compared to clustering, however clustering is really a technology created to solve a different business requirement - that of automatic fail-over. In solving this requirement it has some intrinsic restrictions: the distance between nodes is extremely limited meaning the entire system must be in the same geographic location, and the data itself physically resides in one place. This latter point sounds obvious, but consequently there is no possibility of using the topology to distribute query load e.g. to have a reporting server. On the more prosaic side it also has a reputation as being relatively difficult to set up and maintain and the licenses required are not cheap. So, many DBAs take advantage of their existing skill set to implement a methodology which maintains a standby server, accept that fail-over will be a manual process but hope to gain the bonus of having standalone reporting capability. Commonly the choice is between log-shipping and replication, but one needs a clear understanding of which methodology is the more appropriate. The table below outlines some of the key differences, which are then explained further.
Log-Shipping | Replication | |
What is the latency? | >1min | Potentially as low as a few seconds |
Is the schema altered at the publisher? | No | Snapshot - no Transactional - no Updating subscribers or merge - yes |
Is the schema altered at the subscriber? | No | Possibly (see text) |
Are there schema requirements? | None | Primary keys required for transactional table articles |
Can we select individual articles? | No | Yes |
Is the subscriber/standby database 'protected'? | Yes | No |
Is system data transferred? | Mostly | No |
Is the subscriber/standby server useable as reporting server? | Unlikely (see text) | Yes |
What is the latency?
Log-shipping can backup once every minute and the copy and load frequency can also be every minute. If you use transactional replication or merge replication, the latency can be as low as a few seconds, assuming the relevant -POLLINGINTERVAL agent parameters are minimized. Snapshot replication will have a much higher latency as it requires an exclusive lock on the publisher's replicated tables - consequently it is often considered an offline solution.
Is the schema altered at the publisher?
Log-shipping and snapshot replication do not alter the publisher's schema. Updating subscribers (transactional and snapshot) and merge replication will add a guid column if there isn't one there already with the rowguid property. This may cause some queries on the publisher to fail, e.g. if you had TSQL inside a stored procedure which did the following:
INSERT INTO ExistingTable SELECT * FROM ReplicatedTable
Is the schema altered at the subscriber?
Log-shipping makes no alterations to the schema. Snapshot and transactional replication may make subtle schema changes; standard transactional and snapshot will not transfer identity attributes - they become normal numerical columns (int, smallint, numeric...) on the subscriber. Some DBAs try to get round this by doing a nosync initialization and ensuring the table on the subscriber has the Identity 'Yes (Not for Replication)' attribute, which essentially allows the replication process to do identity inserts. However, on use of the fail-over server this methodology fails as the internal identity counter has not been incremented, and use of DBCC CHECKIDENT to reseed may not work on columns created with this attribute (see BOL). This problem is not apparent if merge or queued updating subscribers are selected.
Are there schema requirements?
There are no requirements for log-shipping whereas all forms of transactional replication require primary keys on table articles.
Can we select individual articles?
Log-shipping takes the whole set of tables, stored procedures, triggers
etc existing in the published database, while
replication has the granularity to select articles individually.
Is the Standby database
'protected'?
Log-shipping
restores the logs to the standby server with NoRecovery or with Standby - both options
disallow
edits to be made to the data. Replication doesn't enforce such a restriction so explicit permissions would be required to prevent changes
to subscriber data.
Is system data transferred?
A front-end application usually requires the presence of some system data. What I am referring to here is data in the system catalog or the database catalog: logins, messages, users, permissions etc. In the case of log-shipping, the database catalog is transferred but not the system catalog, while replication transfers neither.
Is the subscriber/standby server useable as reporting server?
Log-shipping can't usually allow the standby server to be used for reporting as it requires an exclusive lock on the database to restore the log i.e. during the restore users will be forcibly removed, or the log-shipping job will fail. Replication can allow reporting queries to be run, and as typically concurrent replication is not a big resource drain, such queries usually run quickly.
Conclusions
Hopefully this article provides enough info to make a more informative choice. Although this is a little simplistic, generally I recommend log-shipping for fail-over, followed by transactional replication with queued updating subscribers. The order is reversed if there is a possibility of needing to go back to the production server once it is repaired (in which case one can simply run the queue reader agent) or if there is a requirement for a standby server to function additionally as a reporting server.
Paul Ibison, http://ssisblog.replicationanswers.com/
June 2004