Replication history

Question

Replication history

TRACEY-320982

SSChampion

Points: 13653
More actions
February 10, 2009 at 7:48 am

#201058

Yesterday i received a count of two tables and the count were out by 1000 and the time was an hour out.
Is there a way to see a history of the transactions that were sent and how long it took to move from primary to the replicated database.
The replication is running fine now but the time of the two counts ran were definately out.
Cheers

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 1

You can use tracer tokens for some of this analysis but I'm not sure that will provide you with exactly what you were looking for.

This script is from Hilary Cotter that will provide you with all the commands in the commands queue on the distributor;

SELECT publisher_database_id, article_id,COUNT(*) FROM (

select Publisher_Database_id, article_id, XACT_SEQNO,

MIN_COMMAND_id=MIN(COMMAND_ID) From msrepl_commands

where article_id>0

GROUP BY Publisher_Database_id, article_id, XACT_SEQNO

) AS K

GROUP BY publisher_database_id, article_id

The one thing to consider with latency is that it is always downstream that the lag comes into play. So, start by checking at the distributor. See if the Distribution Clean Up job is blocking activity. Check at the subscribers to see if there is any blocking there that is keeping things from coming across.

Another thought too is that the replication related system tables in the subscriber and distribution databases can have their indexes / stats get a bit off if you are doing a good bit of work so, periodic rebuilds of those indexes are necessary.

In most cases these are the items that clear up latency in my experience unless there has been a large data load, etc.

Hope this is helpful.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

TRACEY-320982 SSChampion Points: 13653 More actions · Answer 2

Just added the tracer for today there does seems to be a lag and a lot of undistributed subscriptions waiting.

More to investigate now.

TRACEY-320982 SSChampion Points: 13653 More actions · Answer 3

Today it is happening and it appears bi impromptu report blocked the record and the replication inserts etc had to wait.

Usually they set an isolation mode in their connections.

Is there a way to get an alert set up so that if the undistributed commands get high i.e 4000 today that an email could be fired.

Still would like a history log of the wait time of how long replication is having to wait that be very useful.

Cheers

David Benoit SSC-Dedicated Points: 34562 More actions · Answer 4

I'm sure someone can come up with something better but for those articles that are critical for us we log the datetime that they were inserted into the publishing database. We have a job that runs at 5 minute intervals and queries the publishing DB and the subscribers and compares the times. If they are off by more than N minutes we send an alert. Cheesy maybe but it works. Typically we will keep an eye on things through Replication Monitor but that is not really an alert.

Chad Boyd posted a solution on his BLOG that looks pretty cool, I just haven't had time to review it and / or implement it. Might be worth a consideration.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

noeld SSC Guru Points: 96590 More actions · Answer 5

TRACEY (2/10/2009)
Today it is happening and it appears bi impromptu report blocked the record and the replication inserts etc had to wait.
Usually they set an isolation mode in their connections.
Is there a way to get an alert set up so that if the undistributed commands get high i.e 4000 today that an email could be fired.
Still would like a history log of the wait time of how long replication is having to wait that be very useful.
Cheers

here are a couple of queries I have found to be useful:

SELECT da.publisher_db,

da.subscriber_db,

SUM(ds.DelivCmdsInDistDB) AS total_delivered_cmds_in_dist,

SUM(ds.UndelivCmdsInDistDB) AS total_undelivered_cmds_in_dist

FROM distribution.dbo.MSdistribution_status ds WITH (NOLOCK)

INNER JOIN

distribution.dbo.MSdistribution_agents da

ON ds.agent_id = da.id

GROUP BY da.publisher_db, da.subscriber_db

ORDER BY da.publisher_db, da.subscriber_db

* Noel

noeld SSC Guru Points: 96590 More actions · Answer 6

TRACEY (2/10/2009)
Today it is happening and it appears bi impromptu report blocked the record and the replication inserts etc had to wait.
Usually they set an isolation mode in their connections.
Is there a way to get an alert set up so that if the undistributed commands get high i.e 4000 today that an email could be fired.
Still would like a history log of the wait time of how long replication is having to wait that be very useful.
Cheers

;WITH status_cte

AS

(

SELECT da.publisher_db,

da.subscriber_db,

CASE dh.runstatus

WHEN 1 THEN 'Start'

WHEN 2 THEN 'Succeed'

WHEN 3 THEN 'Running'

WHEN 4 THEN 'Idle'

WHEN 5 THEN 'Retry'

WHEN 6 THEN 'Fail'

END AS run_status,

dh.time AS message_time,

dh.current_delivery_rate,

dh.current_delivery_latency,

dh.delivery_rate,

dh.delivery_latency,

dh.comments,

ROW_NUMBER() OVER(PARTITION BY dh.agent_id ORDER BY dh.timestamp DESC) rn

FROM distribution.dbo.MSdistribution_history dh

INNER JOIN

distribution.dbo.MSdistribution_agents da

ON dh.agent_id = da.id

)

SELECT

publisher_db,

subscriber_db,

message_time,

current_delivery_rate

current_delivery_latency,

delivery_rate,

delivery_latency,

comments

FROM status_cte

WHERE rn = 1

ORDER BY publisher_db, subscriber_db

* Noel

TRACEY-320982 SSChampion Points: 13653 More actions · Answer 7

Thanks guys for sharing i be looking at them today.

Yesterday i would have got notice there was a block on table but some person deleted the mail setting from DNS ...so i got no notice and of course i watch SQL every minute of the day lol.;)