One of the biggest hurdles to using replication on large databases is doing
the initial snapshot. If you're new to replication (and we're focusing on
transactional replication in this article) a snapshot is the process of putting
the tables and related data on the subscriber. A vanilla snapshot consists of
scripting out the tables and indexes, BCP'ing out the data, then applying the
scripts to the subscriber, and finally BCP'ing the data in. It's often faster
than expected based on the database size because it's only the data being moved,
not the indexes themselves. Speed is a function of available disk IO on both
machines as well as the network. Doing a snapshot of a terabyte database across
a WAN can be an exercise in slow.
Prior to SQL 2005 there weren't many great alternatives if you were set on
using replication. One way was to quiesce the system; deny users the ability to
change anything, then run a full backup, copy it to a portable drive, drive or
overnight it to the subscriber location, load it, and then start replication by
configuring it to know that the subscriber already had the data and finally let
the users resume work. Worked okay if you could do this overnight or weekend.
SQL 2000 introduced the concept of a concurrent snapshot - basically a non
blocking snapshot. This let us run the snapshot with minimal impact on users. It
didn't change the time required to post the data to the subscriber though. SQL
2005 now gives us the option to initialize a subscriber from backup without much
pain. It's not supported via the replication dialogs in SSMS, but it's
relatively easy to do and I'm going to demonstrate that today. This article
assumes you already know how to set up replication and build a publication.
After creating a publication, right click it and select properties. Change
the allow initialization from backup files setting to True, then close the
dialog.
The next step is to run a full backup. I'm using Adventureworks and I'm
replicating just the Person.Address table for this exercise.
backup database AdventureWorks to disk='c:\aworks.bak'
Then, restore the file to the subscriber server (in this case I'm going to restore to the same machine, my laptop):
restore database AdventureWorksSubscriber from disk='c:\aworks.bak' with move 'AdventureWorks_data' to 'c:\aworkssub.mdf', move 'AdventureWorks_log' to 'c:\aworkssub.ldf'
Now we need to create the subscription on the publisher and for it to use the
restored copy, we have to do this with TSQL:
use adventureworks go sp_addsubscription @publication ='Test', --your pub name here @subscriber='earlgrey', --subscriber server name @destination_db='AdventureWorksSubscriber', @sync_type = 'initialize with backup', @backupdevicetype = 'disk', @backupdevicename = 'c:\aworks.bak' go
We can see the subscription has been added in SSMS:
To prove it's working we'll update one row on the publisher, wait a few
seconds, and then verify on the subscriber:
update person.address set addressline2='TEST UPDATE' where addressid=1 use adventureworkssubscriber go select * from person.address where addressid=1
The only downside to this technique is that the subscriber has a complete
copy of the database. In our case we only wanted to replicate one table. If
we're concerned about security, or the amount of space used on the subscriber,
we can just start dropping tables and other objects after we get everything
going. It's easy enough to use and maybe they'll support it better in a service
pack or SQL 2008.
I blog once a week or so at
http://blogs.sqlservercentral.com/blogs/andy_warren/default.aspx about
SQLServer, SQL user groups, and related topics. I hope you'll visit and comment
occasionally!