June 9, 2008 at 2:55 pm
Hello All - I have a question about replication and triggers on the subscriber machine. I have two production servers on the same LAN - one holds live data, and the second is dedicated as a report server, a subscriber to the first server. I have not yet set up replication, but the subscriber is to be configured to receive data transactionally 24x7 - several thousand short records per day. I'd like to set up a key table on the subscriber and triggers on the important tables that will write the key values whenever any parent or child record is added or updated. A later process will reference this key table to handle offline processing for another application, i.e. everything that has been added or changed since the last time it looked at the key table.
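Something along these lines is what I'm picturing for the key table (all of the names here are just placeholders for illustration):

CREATE TABLE dbo.ChangedKeys
(
    ChangeID   int IDENTITY(1,1) PRIMARY KEY,
    TableName  sysname  NOT NULL,                  -- which replicated table changed
    KeyValue   int      NOT NULL,                  -- key of the changed parent/child row
    ChangeType char(1)  NOT NULL,                  -- 'I' = insert, 'U' = update
    ChangedAt  datetime NOT NULL DEFAULT GETDATE()
);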
My question is a) can triggers be placed on subscriber tables and fire for each and every record that the subscriber receives (new or updated), and b) am I taking the proper approach to this? It seems this has been the solution elsewhere, but I would like a second (or third, or fourth) opinion. If anyone can point me at additional resources or examples, I would be eternally thankful! This is my first run at replication in a 'large' production environment and I'm on pins and needles.
Thank you!
June 9, 2008 at 10:34 pm
You can certainly create triggers in the subscribing database. You will need to make sure that the triggers are created correctly - either manually by you after the distribution agent has delivered the initial snapshot, or by coding the triggers on the publisher with suitable logic to stop them from doing anything on the publisher. Personally, I would use the first method, since it places less code on the main server and the code in the triggers would be much simpler.
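As a rough sketch of the second option (a trigger that exists at the publisher but only does its work at the subscriber) - the table, key-table, and server names below are all invented, so treat this as an illustration only:

CREATE TRIGGER dbo.trg_Incident_TrackChanges
ON dbo.Incident
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Do nothing on the publisher; only record keys on the report server.
    IF @@SERVERNAME <> 'REPORTSRV'
        RETURN;

    -- Note: do not mark the trigger NOT FOR REPLICATION, or it will not fire
    -- for rows delivered by the distribution agent.
    INSERT INTO dbo.ChangedKeys (TableName, KeyValue, ChangeType, ChangedAt)
    SELECT 'Incident',
           i.IncidentID,
           CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END,
           GETDATE()
    FROM inserted AS i;
END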
June 10, 2008 at 5:22 am
I would not use triggers.
Your plan is somewhat sound, but has a couple of holes. First, replication will send inserts, updates, and deletes one at a time to your subscriber - this is pretty good in your case I think since you want to be dealing with them in small pieces.
However, an UPDATE on the publisher is not always an UPDATE on the subscriber. If you update a column that has a unique constraint, the change becomes a deferred update and is applied as a DELETE followed by an INSERT. So you cannot always be sure a delete is a delete and an update is an update. That may break your entire plan. It is also a big reason not to have FK relationships on the subscriber end of transactionally replicated tables.
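To put that in concrete terms, here is roughly what it can look like (the table, column, and procedure names here are invented, and the delivered commands are only an approximation):

-- On the publisher, where IncidentNumber has a unique constraint:
UPDATE dbo.Incident
SET IncidentNumber = 'A-1002'
WHERE IncidentID = 42;

-- Because the updated column is uniquely constrained, replication can treat
-- this as a deferred update, and the subscriber sees it applied roughly as:
--   EXEC sp_MSdel_dboIncident 42                  -- delete the old row
--   EXEC sp_MSins_dboIncident 42, ..., 'A-1002'   -- insert the new version
-- so anything watching on the subscriber sees a DELETE and an INSERT,
-- not an UPDATE.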
If that issue does not stop you, then triggers are still not what you should use. On your subscriber you will find (if you use the defaults for transactional replication) a bunch of stored procedures created for replication. The inserts, updates, and deletes are actually handled by these stored procedures. Rather than using a trigger, modify the stored procedures to write to an additional table. This will be safer and more stable. Remember to script these procedures out and save them in case you need to reinitialize - they may be overwritten (there are options for this).
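For example, this is the kind of change I mean (the real generated procedure has one parameter per replicated column and more logic than this, and all of the names below are invented):

ALTER PROCEDURE dbo.sp_MSins_dboIncident
    @c1 int,            -- IncidentID
    @c2 varchar(20)     -- IncidentNumber (the real proc carries every column)
AS
BEGIN
    -- Original replication work: apply the insert on the subscriber.
    INSERT INTO dbo.Incident (IncidentID, IncidentNumber)
    VALUES (@c1, @c2);

    -- Extra statement added by hand: record the key for the later process.
    INSERT INTO dbo.ChangedKeys (TableName, KeyValue, ChangeType, ChangedAt)
    VALUES ('Incident', @c1, 'I', GETDATE());
END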
Finally, you should look at Service Broker for capturing the keys of inserted, updated, and deleted records. It is asynchronous, so it does not become part of your original transaction, and it is designed to queue items up in the order they came in. If you have control over the application that will be processing later, you could simply write it to pop items off the queue, thus ensuring you would never process the same thing twice. There is no GUI and very little documentation for Service Broker, but it works really well. It is worth reading up on for something like this.
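To give you an idea of the shape of it, here is a bare-bones sketch (all of the object names are made up, and it assumes the subscriber database already has the broker enabled):

-- One-time setup in the subscriber database:
CREATE MESSAGE TYPE [//ChangedKey] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//ChangedKeyContract] ([//ChangedKey] SENT BY INITIATOR);
CREATE QUEUE dbo.ChangedKeySendQueue;
CREATE QUEUE dbo.ChangedKeyQueue;
CREATE SERVICE [//ChangedKeySender] ON QUEUE dbo.ChangedKeySendQueue;
CREATE SERVICE [//ChangedKeyReader] ON QUEUE dbo.ChangedKeyQueue ([//ChangedKeyContract]);

-- Wherever you capture a change, send the key as a message:
DECLARE @h uniqueidentifier;
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE [//ChangedKeySender]
    TO SERVICE '//ChangedKeyReader'
    ON CONTRACT [//ChangedKeyContract]
    WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @h
    MESSAGE TYPE [//ChangedKey] (N'<key table="Incident" id="42" op="U"/>');

-- The later process pops items off the queue in arrival order:
RECEIVE TOP (1) CAST(message_body AS xml) AS ChangedKey
FROM dbo.ChangedKeyQueue;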
June 11, 2008 at 4:25 pm
Michael Earl (6/10/2008)
I would not use triggers. If that issue does not stop you, then using triggers is not what you should do anyway. On your subscriber, you will find (if you use the defaults for transactional replication) a bunch of stored procedures created for replication. The inserts, updates, and deletes are actually handled by stored procedures in replication. Rather than using a trigger, modify the stored procedures to write to an additional table. This will be safer and more stable. Remember to script these procedures out and save them in case you need to reinitialize - they may be overwritten (there are options for this).
Finally, you should look at Service Broker for capturing the keys of inserted, updated, and deleted records. It is asynchronous, so it does not get into your original transaction and it is designed to queue items up in the order they came in. If you have control over the application that will be processing later, you could simply write it to pop items off the queue thus ensuring you would never process the same thing twice. There is no GUI and very little documentation for service broker, but it works really well. It is worth reading up on for something like this.
Thank you Michael - this is definitely the type of information I was looking for. There will never be deleted data, so I'm not worried about that, but there will be updated data, and the delete-then-insert behavior of deferred updates is exactly what I was afraid might happen. This is police data, so the retention period is 'forever'. I'm just trying to get the legacy data into a more modern database, and since the legacy system will remain in use for several more years, I have to continue to check for records that are updated.
Two questions:
Are the default stored procedures created when the transactional replication job is set up? Or are there some templates I could look at?
Also could you point me toward ~any~ resources on the Service Broker approach? To be honest, I'm a coder, not a DBA, and this has fallen in my lap. The code approach really appeals to me.
Again, thank you! This has been an eye-opener.
Matt
September 16, 2008 at 6:35 pm
Michael - if you are still getting forum notifications, I am very interested in how you got your information about deferred updates. I need to know when updates are applied as insert/delete pairs on remote replication nodes. If you can steer me toward white papers or whatnot, that would be fantastic.
Cheers,
David
September 17, 2008 at 4:13 am
As well-noted already, transactional replication will normally invoke sprocs that are scripted for you at the distributor and applied at the subscriber (as part of the initial snapshot delivery). But you can elect to roll your own instead if you prefer [specify pre and post scripts - see BOL for sp_addpublication].
One would normally have identical metadata (e.g. 50 columns in table T1 at the publisher means 50 columns in table T1 at each subscriber), but you could add columns at the subscriber if you wished [e.g. WhenChanged datetime default getdate(), or IsDirty bit default 1], or have your sprocs explicitly set these columns, or write to a common T2 table.
Your later processes can then query by WhenChanged or IsDirty (and update set IsDirty = 0 before actioning the rows, so anything changed while you are processing gets flagged again), e.g. as sketched below.
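For example, a rough sketch of what that could look like (all names invented, and assuming you are free to alter the subscriber-side table):

-- Subscriber-only columns on the replicated table:
ALTER TABLE dbo.T1 ADD
    WhenChanged datetime NOT NULL DEFAULT GETDATE(),
    IsDirty     bit      NOT NULL DEFAULT 1;

-- Your custom sprocs would set these explicitly on every insert/update, e.g.
--   ... SET WhenChanged = GETDATE(), IsDirty = 1 ...

-- The later process:
SELECT T1Key, WhenChanged      -- T1Key is a stand-in for the real key column
FROM dbo.T1
WHERE IsDirty = 1;

UPDATE dbo.T1
SET IsDirty = 0
WHERE IsDirty = 1;             -- reset before actioning, so changes made
                               -- while you process get flagged again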
If you had 100 table articles, that means:
100 x sp_MSdel_Tn
100 x sp_MSins_Tn
100 x sp_MSupd_Tn
So do you really want the hassle of yet more triggers to manage on top of that?
100 x Tn_trigIUD
Anyway, it feels wrong to have a sproc plus a trigger for each table when you can easily encapsulate your functionality within the T-SQL of the sproc.
I guess if you are dead-set on coding you could go for a CLR trigger instead, but most folk would be happier with straight T-SQL (well known, so easier for staff to use).
HTH
Dick
September 17, 2008 at 9:15 am
That's interesting information, but unless I am missing something about your point, it does not address the question of deferred updates. I am not asking about triggers, I have identical schemas, and I don't wish to add columns outside of replication.
I'm looking for information about exactly when (and maybe even why) I can expect replication to process an update on the publisher as a delete/insert pair (yes, the sp_MSdel_ procedures are executed if the update was made to a uniquely constrained column). I just want to know why, and whether there are other scenarios that spawn deferred updates.
Cheers,
David