December 3, 2009 at 5:14 am
I'm really just getting started planning a replication solution. I have a 100 GB SQL 2005 production database that I want to replicate to another SQL 2005 host for reporting. The reporting database needs to be updated nightly from production for now, and it would contain almost all of the production data. Production holds a lot of historical data, and the daily changes are minimal. I would really like to keep the impact on the production server to a minimum, especially during business hours, because it supports many other databases.
I think this should be a common and standard setup. I'm thinking transactional replication between a local distributor (on the production server) and the subscriber (the reporting database)? Do I need a separate server for the Distribution Agent? Should this be a pull or a push subscription? Can I just schedule replication to the subscriber nightly? If anyone has any thoughts or recommendations, I would really appreciate it. Thanks!
December 3, 2009 at 8:44 pm
Any decision to use a separate server for the distribution database should be based on the expected load on each of the servers (publisher, distributor, subscriber). If you are concerned about the load on your publisher, then the distributor probably shouldn't be on that server. It might be appropriate to use the subscriber as the distributor.
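For what it's worth, here is a minimal sketch of pointing the publisher at a remote distributor, assuming the reporting server is named REPORTSRV (a made-up name). sp_adddistributor and sp_adddistributiondb have to be run on the distributor itself first; then, at the publisher:

-- Run at the publisher, after the distributor box has been configured.
-- The server name and password below are placeholders, not real values.
EXEC sp_adddistributor
    @distributor = N'REPORTSRV',
    @password = N'distributor_admin_password_here';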
In my experience, push subscriptions have been easier to manage. However, since you want to keep the load on the publisher as low as possible, you will probably end up using a pull subscription, since with pull the Distribution Agent runs at the subscriber rather than on the publisher/distributor side.
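If you do go with pull, a rough sketch looks like this (all server, database, and publication names here are made up). Run it at the subscriber, in the reporting database; the agent job then lives on the subscriber's SQL Server Agent:

-- Create the pull subscription and its Distribution Agent job at the subscriber.
EXEC sp_addpullsubscription
    @publisher = N'PRODSRV',
    @publisher_db = N'ProductionDB',
    @publication = N'ReportingPublication';
EXEC sp_addpullsubscription_agent
    @publisher = N'PRODSRV',
    @publisher_db = N'ProductionDB',
    @publication = N'ReportingPublication',
    @distributor = N'REPORTSRV';

(You also have to register the subscription at the publisher with sp_addsubscription; the above is just the subscriber side.)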
Underneath the wrapper, all of the agents used by replication are implemented as SQL Server Agent jobs, so your scheduling options for the replication agents do include running them nightly.
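For example, here is a minimal sketch of putting an agent job on a nightly schedule. The job name below is made up, so look up the real one in msdb.dbo.sysjobs first:

-- Attach a nightly schedule to the (hypothetically named) agent job.
EXEC msdb.dbo.sp_add_jobschedule
    @job_name = N'PRODSRV-ProductionDB-ReportingPublication-REPORTSRV-1', -- hypothetical
    @name = N'Nightly replication window',
    @freq_type = 4,              -- daily
    @freq_interval = 1,          -- every day
    @active_start_time = 010000; -- 01:00:00

One caveat: if the agent was set up to run continuously, you would also need to change it to run on a schedule rather than continuously, or the new schedule won't have any practical effect.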
December 4, 2009 at 8:07 am
Just use transactional replication. If your daily updates are minimal, then the load will be minimal, because transactional replication only pushes over the most recent changes. Don't know if that answers your questions...
The only issue for you, by the way, is what happens if your replication breaks. Then you have to reinitialize, which means pushing over the entire 100 GB unless you break your tables up. We do that with views whose underlying tables are split by date (see the sketch below), so when we have to reinitialize, we only push over the most recent table(s). We have some views where the underlying tables add up to 500+ GB, so this is critical for us.
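To illustrate that kind of layout (a hypothetical sketch, not our actual schema; table and column names are made up): one physical table per year behind a UNION ALL view, with each year's table published as its own article, so a reinitialization only has to re-snapshot the current table.

CREATE TABLE dbo.Orders_2008
(
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL,
    Amount    money    NOT NULL,
    CONSTRAINT PK_Orders_2008 PRIMARY KEY (OrderID),
    CONSTRAINT CK_Orders_2008 CHECK (OrderDate >= '20080101' AND OrderDate < '20090101')
);
CREATE TABLE dbo.Orders_2009
(
    OrderID   int      NOT NULL,
    OrderDate datetime NOT NULL,
    Amount    money    NOT NULL,
    CONSTRAINT PK_Orders_2009 PRIMARY KEY (OrderID),
    CONSTRAINT CK_Orders_2009 CHECK (OrderDate >= '20090101' AND OrderDate < '20100101')
);
GO
-- Readers query the view; the CHECK constraints let the optimizer skip
-- tables whose date range can't match the query's filter.
CREATE VIEW dbo.Orders
AS
SELECT OrderID, OrderDate, Amount FROM dbo.Orders_2008
UNION ALL
SELECT OrderID, OrderDate, Amount FROM dbo.Orders_2009;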
December 5, 2009 at 3:35 am
I would suggest that you strongly consider using Log Shipping as a possible solution for your business case.
It will adhere nicely to the requirements you have listed and will be more straightforward to administer and manage.
Replication is a more complex implementation and has more administrative overhead associated with it. However, it does have significant advantages, such as allowing you to replicate a subset of the data from the publication database (i.e. you could replicate solely the tables you actually require for reporting) and the ability to apply an alternative indexing strategy at the subscriber that specifically addresses your reporting queries.
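As a hedged sketch of the subset-publishing point (all names hypothetical, and the Snapshot Agent setup is omitted for brevity): at the publisher you enable the database for publishing, create the publication, and add one article per table you actually need.

EXEC sp_replicationdboption
    @dbname = N'ProductionDB',
    @optname = N'publish',
    @value = N'true';
EXEC sp_addpublication
    @publication = N'ReportingPublication',
    @repl_freq = N'continuous',
    @status = N'active';
-- One sp_addarticle per reporting table; everything else stays unpublished.
EXEC sp_addarticle
    @publication = N'ReportingPublication',
    @article = N'Orders',
    @source_owner = N'dbo',
    @source_object = N'Orders';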
Once you are at this level of custom implementation, however, it is often time to start considering the creation of a Data Warehouse; at least that has been the case in my experience.
December 5, 2009 at 8:06 am
John.Sansom (12/5/2009)
I would suggest that you strongly consider using Log Shipping as a possible solution for your business case.It will adhere nicely to the requirements you have listed and will be more straightforward to administer and manage.
Replication is a more complex implementation and has more administrative overhead associated with it however, it does have significant advantages such as allowing you to replicate a subset of the data from the Publication database (i.e. you could replicate solely the tables you actually require for reporting) and the ability to apply an alternative indexing strategy that can specifically address your Reporting queries at the Subscriber.
Once you are at this level of custom implementation however it is often time to start considering the creation of a Data Warehouse, at least that has been the case in my experience.
December 7, 2009 at 4:37 am
This all sounds good. I have been testing transactional replication with a small database. Another question I have: when I start transactional replication with an initial snapshot, can I just delete the snapshot once replication is started? Also, is there a way to estimate the size of the snapshot? In testing it seems smaller than the total size of the database. I'm just trying to get an idea of how big my disk request should be.
Thanks again for all the replies!
December 7, 2009 at 4:56 am
Replication includes a few agents collectively called the Miscellaneous Agents (real imaginative, huh). One of these is responsible for deleting snapshots that are no longer needed.
As for estimating the size of a snapshot, I have never really given it much thought. In essence, it is a BCP of each table in native format, so any estimate of the size of a BCP file would apply.
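If you want a rough upper bound (this is my assumption, not an exact formula): native-mode BCP files hold the table data but not the indexes, so the data figure that sp_spaceused reports for each published table should be in the right ballpark.

-- Run once per published table; the table name here is hypothetical.
EXEC sp_spaceused @objname = N'dbo.Orders';  -- use the "data" column, ignore index_size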