Creating a "cloned" subscriber with minimal impact to the source database users?

  • Good afternoon everyone,

    The situation I have here seems simple enough but its a pain to convey. I'll do my best to describe the problem and eagerly await any suggestions you all have. If anything seems muddled please let me know, I will gladly attempt to clarify further.

    Long story short, has anyone had success with creating an exact copy of an existing subscriber without having to take the source database offline, and without having to reinitialize the old subscriber when the new subscriber is added?

    We have two database servers which I will refer to as the SOURCE and REPLICA1. The source database is a "pruned" copy of live data which only goes back two years. We have replication filters in place to only replicate changes in data that falls within this window and a system of procedures which come in on a daily basis to trim off anything older than two years (anything outside the window). REPLICA1 contains all the data for as far back as history goes.

    My current goal is to create a copy of REPLICA1 which I will call REPLICA2 for simplicity, in such a way (if possible) that the users writing transactions to the SOURCE aren't impacted at all. I'd like to avoid having to ask them to leave the system.

    The plan I originally put together was:

    1) Copy the most recent full backup of REPLICA1 over to the machine that will house REPLICA2

    2) Create the new REPLICA2 database on this secondary machine, restoring it from the full backup we copied over with the "NORECOVERY" option. This leaves the database in a loading state.

    3) I will then stop the log reader on the SOURCE

    4) Once the Distribution Agent is caught up (all transactions have been pushed to REPLICA1) I would take a transaction log backup that I would also copy over to the secondary machine now housing REPLICA2.

    5) I finish restoring REPLICA2 with the log backup. Now I have 3 databases SOURCE, REPLICA1, REPLICA2. The source is still active, and available to the users. The two replicas are identical, the first set up as a subscriber, the second is not.

    6) Set up REPLICA2 as a subscriber to SOURCE using the option to NOT generate schema and data or start the snapshot agent since this new subscriber already has everything.

    7) Restart the Log Reader, all transactions accumulated during prior steps being pushed to the old subscriber AND the new subscriber as well.

    The problem that I've run into is that apparently the log reader only logs transactions for subscribers that its aware of when you stop it. My testing showed that any transactions that came in after the log was stopped and before REPLICA2 was added as a subscriber were not pushed to REPLICA2 when the log reader restarts. They were successfully pushed to the initial subscriber as I expected and any additional transactions after the restart were pushed to both... but there is a window of time for things to be missed for the new subscriber.

    To further frustrate the situation an alternate method of adding the new subscriber while its database is still in the loading state, then stopping the log reader, then doing the log backup copy over & restore before finally restarting the log reader again resulted in data duplication (PK viloations) and replication failure for REPLICA2. This was due to the fact that transactions written to the log after the REPLICA2 was added as a subscriber but before the log reader was stopped were replicated to REPLICA1 and subsequently restored into REPLICA2 already courtesy to the restore of the copied transaction log. Keeping in mind that the log reader logs items for subscribers it is aware of this is the source of the problem. These records were entered after REPLICA2 was added as a subscriber but they're present due to the restore of the transaction log from REPLICA1.

  • This was removed by the editor as SPAM

  • I do almost the same most of the time when creating copy of database for reporting or adhoc query or redunduncy for high availability.What usually i do is I setup two parallel replication to have overlap to overcome dataloss and modify subscriber's stored proc to handle duplicate records based on primary key and then keep only the replication that is required.

    All you have to do is modify sp_msins_xx stored proc on subscriber database to insert the record only if it does not exists on subscriber based on primary key.

     

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply