Transactional replication has 4 jobs and is failing

  • Hi guys,

    I have a transactional replication setup that is now failing. After many retries it is asking for an initial snapshot. On the distributor server I see 4 jobs for this publication: one is not running (the snapshot job), one is the Log Reader Agent job (running), and there are two other jobs that are also running. I think one of those belongs to the old subscriber, as we recently recreated the same subscriber.

    For a different publisher I see 3 jobs on the distributor, rather than the 4 I see for this one.

    Can anyone tell me how many jobs are created when transactional replication with one subscriber is set up?

    How do I proceed from here, now that the replication has stopped synchronizing? Do I delete the subscriber and the two jobs, or delete the publication and all of its jobs on the distributor server, and then recreate it?

    Please suggest.

    Thanks.

  • And this is what the agent job history on the distributor server shows. This is SQL 2005 to SQL 2005 replication, and the distributor is a separate box that also runs SQL 2005.

    Date: 10/26/2008 4:04:41 AM

    Log: Job History (PublisherServer-Database-database_pub-Subscriberserver-77)

    Step ID: 2

    Server: Distributorserver

    Job Name: PublisherServer-Database-database_pub-Subscriberserver-77

    Step Name: Run agent.

    Duration: 3.06:13:13

    Sql Severity: 0

    Sql Message ID: 0

    Operator Emailed:

    Operator Net sent:

    Operator Paged:

    Retries Attempted: 0

    Message:

    -XJOBNAME PublisherServer-Database-database_pub-Subscriberserver-77

    -XSTEPID 2

    -XSUBSYSTEM Distribution

    -XSERVER DE1SMSDB02

    -XCMDLINE 0

    -XCancelEventHandle 000006D4

    -XParentProcessHandle 00000448

    2008-10-29 14:17:45.346 Startup Delay: 9096 (msecs)

    2008-10-29 14:17:54.455 Connecting to Distributor 'Distributor server'

    2008-10-29 14:17:54.549 Parameter values obtained from agent profile:

    -bcpbatchsize 2147473647

    -commitbatchsize 100

    -commitbatchthreshold 1000

    -historyverboselevel 1

    -keepalivemessageinterval 300

    -logintimeout 15

    -maxbcpthreads 1

    -maxdeliveredtransactions 0

    -pollinginterval 5000

    -querytimeout 1800

    -skiperrors

    -transactionsperhistory 100

    2008-10-29 14:17:54.549 Connecting to Subscriber 'Subscriberserver'

    2008-10-29 14:17:54.643 Agent message code 21036. Another distribution agent for the subscription or subscriptions is running, or the server is working on a previous request by the same agent.

  • Taking the simple case of one publication and one subscription, you should have three jobs: one Snapshot Agent, one Log Reader Agent and one Distribution Agent. So you have somehow managed to leave a rogue job lying around, which isn't entirely unheard of.

    The simplest way to get things working, hopefully, is to stop and disable the job corresponding to the incorrect Distribution Agent. Stop and restart the correct one and all should be well.

    To tidy things up you will probably need to manually delete rows from a couple of tables in the distribution database, and delete the rogue Distribution Agent job as you would delete any other scheduled job.

    The two tables in the distribution database from which you need to delete rows are MSdistribution_agents and MSdistribution_history. The job name ends with the agent id (77 in the job name posted above), so you can match it against the id column in MSdistribution_agents and the agent_id column in MSdistribution_history.
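
As a rough illustration of pulling the agent id out of a job name before looking it up in those tables, here is a minimal Python sketch. The helper name `agent_id_from_job_name` is hypothetical, and the code assumes the id is always the trailing `-<number>` part of the name, as in the job name posted in this thread. Check the result against the rows in the distribution database before deleting anything.

```python
import re


def agent_id_from_job_name(job_name: str) -> int:
    """Return the trailing numeric agent id of a Distribution Agent job name.

    Hypothetical helper: it assumes the job name ends in "-<digits>",
    as in the job name shown in this thread.
    """
    match = re.search(r"-(\d+)$", job_name.strip())
    if match is None:
        raise ValueError(f"no trailing agent id in job name: {job_name!r}")
    return int(match.group(1))


if __name__ == "__main__":
    # Job name copied from the job history posted above.
    job = "PublisherServer-Database-database_pub-Subscriberserver-77"
    print(agent_id_from_job_name(job))
```

Running it against each of the two Distribution Agent job names gives the two ids to compare, which should make it easier to tell the rogue agent from the current one.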

    As always when manually maintaining such tables, be careful!

    Hope that helps.

    Mike
