September 28, 2007 at 1:44 pm
Hi,
We have an ETL process set up that extracts mainframe transaction data and populates a local SQL 2K staging server. The local SQL Server is updated once a night around 3am with incremental changes from the previous day. We use a proprietary middleware product that runs on an SNA Server, which transfers and applies the incremental changes to MS SQL. We need to replicate the daily updates from the local SQL server to a remote SQL 2005 server that resides at a client site.
We prefer not to place an SNA Server at the client site due to issues with routing SNA over the WAN. We also cannot use the local SNA server to populate the remote server directly, because our proprietary s/w performs a push and the client requires that the replication be pulled. The client has suggested that we just FTP them the source flat files and they would write a script to apply the changes themselves. I'd prefer to have SQL perform the replication. Other clients have asked about SQL replication and I'd like to have a consistent process in place for all our clients.
The total DB size is about 40GB. The incremental changes are small, maybe 500-600MB. Is Transactional Replication the best way to go? We will probably do a snapshot or DTS copy for the initial data load. I'm ruling out Log Shipping because the remote server is a reporting server and we don't want to replicate triggers, stored procs, accounts, etc. We're only interested in moving the data once a day during off-line hours. I'm not sure what the bandwidth is, but we'd prefer not to perform a full replication daily. Any input would be appreciated.
Thanks,
Dave
September 29, 2007 at 6:46 am
With that volume of changes, transactional replication sounds like a good choice. You have to make sure that the changes from the mainframe don't actually do more updating than they need to. If you update a field to the value it already had, replication will still send the update to the subscriber.
Replication from SQL 2000 to SQL 2005 works fine (we have been doing it here for more than a year now).
You will need every table being replicated to have a primary key of some sort. Schema changes can be made, but make them carefully if you do not want to have to reinitialize.
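If you want to confirm that up front, a quick check like this (just a sketch against INFORMATION_SCHEMA, adjust for your schemas) will list any user tables that have no primary key:
-- lists user tables with no PRIMARY KEY constraint (candidates that will block transactional replication)
SELECT t.TABLE_SCHEMA, t.TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES t
LEFT JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
    ON tc.TABLE_SCHEMA = t.TABLE_SCHEMA
   AND tc.TABLE_NAME = t.TABLE_NAME
   AND tc.CONSTRAINT_TYPE = 'PRIMARY KEY'
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND tc.CONSTRAINT_NAME IS NULL
ORDER BY t.TABLE_SCHEMA, t.TABLE_NAME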
Replication has a lot of options, my suggestion would be to try to keep it as close to the defaults as you can. As soon as you do something abnormal, it gets complicated to manage.
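For reference, a near-default publisher-side setup only takes a handful of system procs. Something along these lines is roughly what the wizard scripts out for a transactional publication that allows pull subscribers (untested sketch; StagingDB, DailyChanges, and dbo.Trans are placeholder names):
USE StagingDB
GO
-- enable the database for publishing
EXEC sp_replicationdboption @dbname = N'StagingDB', @optname = N'publish', @value = N'true'
-- create a transactional publication, leaving most options at their defaults,
-- and allow only pull subscriptions since the client requires a pull
EXEC sp_addpublication @publication = N'DailyChanges',
     @repl_freq = N'continuous', @status = N'active',
     @allow_pull = N'true', @allow_push = N'false'
EXEC sp_addpublication_snapshot @publication = N'DailyChanges'
-- add each table (article) that needs to go to the reporting server
EXEC sp_addarticle @publication = N'DailyChanges', @article = N'Trans',
     @source_owner = N'dbo', @source_object = N'Trans'
The subscriber side would then be set up with sp_addpullsubscription and sp_addpullsubscription_agent, since your client wants the data pulled.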
September 29, 2007 at 9:00 am
Michael Earl (9/29/2007)
You have to make sure that the changes from the mainframe don't actually do more updating than they need to.
That is one of my concerns. We have a few extracts designed and coded by different mainframe programmers that pull data from different systems. Some handle a change by doing an update while others perform a delete and replace of the whole record. I've seen rows in the flat file where the same record is deleted and added multiple times. It seems that if a user changes 3 fields in the OLTP system and happens to hit enter 3 times, it gets processed as 3 delete/adds. Fortunately the files still aren't that large and we will have plenty of time during the night to get the data replicated. I believe Primary Keys are defined on most if not all of the tables. Thanks for the reply!
Dave
September 29, 2007 at 11:23 am
Deleting and inserting the entire record is not as bad for replication as you think. If you update any field that has a unique index, replication calls it a deferred update and deletes and inserts anyway. In addition, the update procedures at the subscriber take the entire set of fields, not just the ones that actually changed.
Where you get an overhead problem is if you are updating records (or deleting and inserting them) that have not had anything changed. For example:
UPDATE MyTable SET MyField1 = MyField1
This update will hit every record, and replication will issue an update command (one at a time, in fact) for every record in the table.
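If you have any control over how the nightly apply is written, the usual fix is to only touch rows that actually differ from the incoming data, something like this (table and column names are made up for illustration):
-- only update rows where the incoming staging value really differs
UPDATE t
SET    t.MyField1 = s.MyField1
FROM   MyTable t
JOIN   MyStaging s ON s.MyKey = t.MyKey
WHERE  t.MyField1 <> s.MyField1
   OR (t.MyField1 IS NULL AND s.MyField1 IS NOT NULL)
   OR (t.MyField1 IS NOT NULL AND s.MyField1 IS NULL)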
If the updates are really updates, you will usually see acceptable performance from transactional replication. The only way to be sure for your situation is to try it. For testing, it is pretty low risk to create a publication to another server right on your network and simply look at the rate the transaction log grows.
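While the test publication is running you can simply watch the log with:
DBCC SQLPERF(LOGSPACE)
which reports the log size and percent used for every database on the instance.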
September 29, 2007 at 12:12 pm
Is there still someone who doesn't know that it's pointless to update a row without changing anything (like in your example)??
That should be taught in DB 101.1!
September 29, 2007 at 2:27 pm
I'm afraid bad programming is still alive and kicking.
I guess I cannot say much, I ran a query just like that the other day because someone updated a bunch of records in a table in one of my transactional replication subscribers and I did not feel like spending the time to figure out which ones were changed.
A good programmer can write code for almost anything. A great programmer is a really lazy good programmer that only writes what he has to.
September 29, 2007 at 2:54 pm
I was not pointing fingers at anyone... I was just wondering if that knowledge was as widespread as I thought.
September 29, 2007 at 6:07 pm
The "common" in "common sense" is apparently relative.
September 29, 2007 at 9:26 pm
Sorry if I offended anyone with my comment. It was not my intention.