December 12, 2006 at 10:06 am
Guys,
We have a replication set up with a single publisher and a single subscriber.
Recently, replication failed due to the following reason:
Data, erroneously, was deleted from subscriber. Then, to make the publisher and subscriber synch, the same data was deleted from publisher.
At that point, the publisher, as it should, attempted to delete the data in the subscriber. But since the data is not there, the attempt failed, and replication failed.
I tried to restart the agent, but replication kept failing. Since nothing worked, I just deleted the subscription and recreated it. The replication then started, but since it had to go through all the steps again (i.e. Snapshot ...), it took us 2 hours to get the data back in synch.
Luckily this happened in the testing environment. If it would be production, we would be in trouble, as availability is our #1 concern.
Would anyone know of any other (i.e. more elegant) way I could've solved this problem?
The only other idea I had was the following: It is possible that the data which the publisher plans to delete is temporarily stored in some file and if that file is located and deleted, and then the agent is restarted, it could possibly work - but, I could not locate any such files - and this is a completely hypothetical idea.
Any suggestions would be appreciated.
Thanks
December 12, 2006 at 2:51 pm
Was this intentional? If so from BOL
When replicated data needs to be updated at the Subscribers, you can use snapshot replication or transactional replication with updatable subscription options or you can use merge replication. The method you choose depends on your replication topology and the needs of your application and its users.
Use merge replication when . . . | Use snapshot replication or transactional replication with immediate updating or queued updating when . . . |
---|---|
|
|
December 12, 2006 at 3:36 pm
I believe you were using one way transaction replication...
Without snapshot you can handle this situation by add the -SkipErrors swith in your distribution job...and later you run the snapshot if neccessary...
Use the "-SkipErrors" parameter in Distribution Agent cautiously
http://support.microsoft.com/default.aspx/kb/327817
MohammedU
Microsoft SQL Server MVP
December 13, 2006 at 7:40 am
This is a quite useful suggestion.
Thank you very much
December 13, 2006 at 8:06 am
Antares,
I implemented "updatable subscriptions" solution.
However, when I delete something from the subscriber, I get the following error message:
Login failed for user 'sa'.
Would you know what could be causing it?
Thank you
December 13, 2006 at 9:17 am
Guys,
To solve the problem stated above, I am currently considering to implement the following replication model:
No user, other than the replication user, will be allowed to perform any insert/delete/update to the tables in the subscriber.
That way, we can avoid the problem that happened this time, which resulted in a replication failure.
In order to implement this, I am assuming that I should create 'replication' user and then give execution priviliges to this user for all the replication stored procedures.
The issues I need help with are the following:
1. There are too many stored procedures related to replication (and I don't know if I can find all of them)
2. I am not sure whether it would be sufficient just to do that (i.e. is there anything more related with replication in addition to the sps, that I should be concerned about?)
Can anyone suggest a more elegant solution for implementing this model or any other suggestions?
Thank you very much
December 13, 2006 at 2:37 pm
We just went through this issue. I right clicked on the distributer agent, selected the profile and made a new profile setting the skiperrors to the proper code (btw-there should be a builtin profile that skips errors, but it skips three, I made one to just skip the error we were getting).
That let the replication continue. Then I looked at the order the tables replicate. I compared the number of rows in each (original/copy) until I found a mismatch. We also use RedGate Compare to do this. Then I fix the incorrect table.
The second time it happened (two days in a row), we shut off all stored procedures that we didn't know what they did (meaning someone didn't go through us to create them). Then we waited for the screaming - which hasn't happened yet (two days later).
Document whatever you do, including the reasons so you can undo as needed.
-SQLBill
December 15, 2006 at 8:12 am
Bill,
Where can I find information about what error value corresponds to which error (i.e. I want to know what types of errors I will be skipping when I set a certain code value)?
Thank you
December 15, 2006 at 9:08 am
Ah ...
- Replication Monitor - Agents - Distribution Agents - "double click on publication" opened up the dialog box to show the error number.
Thanks Bill ... the info is quite useful - it can definitely save us in the future!
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply