Troubleshooting Replication: Get servers/subscribers from MSREPL_ERRORS

  • Hello,

    Is there a way to match replication errors in the MSREPL_ERRORS table to its specific subscribers?

    Scenario:

    - 1 publishing server

    - 100 subscribing servers

    If I connect to the publishing server and SELECT * FROM MSREPL_ERRORS, I can see a list of errors.

    However, I do not know which of the 100 servers encountered this error.

    For example, seeing a Disk Space Error does not really help because I do not know which server had this error.

    Any ideas/suggestions?

    Thanks in advance!

    Rafferty

  • Hello,

    Merge replication:

    USE distribution

    go

    SELECT MSmerge_agents.subscriber_name ,

    *

    FROM MSrepl_errors

    JOIN MSmerge_history ON MSrepl_errors.id = MSmerge_history.error_id

    JOIN MSmerge_agents ON MSmerge_agents.id = MSmerge_history.agent_id

    ORDER BY MSrepl_errors.time DESC

    Don't have transactional replication to test, should be

    USE distribution

    go

    SELECT MSdistribution_agents.subscriber_name ,

    *

    FROM MSrepl_errors

    JOIN MSdistribution_history ON MSrepl_errors.id = MSdistribution_history.error_id

    JOIN MSdistribution_agents ON MSdistribution_agents.id = MSdistribution_history.agent_id

    ORDER BY MSrepl_errors.time DESC

    If the subscriber_name returns null, use the MSdistribution_agents.Name or MSmerge_agents.Name column for analisys

    Best Regards,

    Vladimir

  • Hi,

    Thank you for your reply.

    Our msdistribution_agents and msdistribution_history tables are empty.

    So what I did was to join msrepl_errors with msmerge_history according to the session_id, and then join again with msmerge_agents

    Can I assume that joining on the session_id of these msmerge_history and msrepl_errors is okay?

    I'm having doubts because not all session_ids exist in msmerge_history, thus I end up doing a LEFT JOIN.

  • Hi,

    msdistribution_agents and msdistribution_history tables are empty, because I think these tables are used for the transactional replication only.

    You are right, left join to MSmerge_agents is needed, but I am not sure about the session_id. I think the right key is the error_id

    Let's compare MSmerge_history with error>0 and MSrepl_errors tables.

    SELECT * FROM MSmerge_history WHERE error_id>0 ORDER BY time desc

    SELECT * FROM MSrepl_errors ORDER BY time desc

    MSmerge_history table has unique error_id, table MSrepl_errors not. It looks like the history information 1st comes to the MSmerge_history table, see the time columns in both tables and then more descriptive data to the MSrepl_errors table. Maybe it's even better to concatenate the string with error codes and texts from MSrepl_errors by error_id and use it for automation analysis.

    I am personally not using the history table directly, because of the history retention policy. I have created a table in the local DBA_DB database and scheduled the job to run every 5 minutes. My monitoring system reads this table and sends customized alerts with the status in (5,6) or IsError >0. I can also keep the replication history much longer.

    There is a query which I am using for automation:

    SET nocount ON;

    exec distribution.dbo.sp_replmonitorrefreshjob @iterations=1;

    SELECT MSreplication_monitordata.Publisher ,

    MSreplication_monitordata.publisher_db ,

    MSreplication_monitordata.publication ,

    MSreplication_monitordata.agent_type ,

    CASE WHEN MSreplication_monitordata.agent_type = 4 THEN 'Merge Agent'

    ELSE ''

    END [agent_type] ,

    ISNULL(( SELECT Name

    FROM sys.servers

    WHERE server_id = MSmerge_agents.Subscriber_ID

    ), '') [Subscriber] ,

    MSmerge_agents.Subscriber_DB ,

    status ,

    CASE WHEN status IN ( 5, 6 )

    OR comments = 'Retrieving subscription information.' THEN 1

    ELSE 0

    END [IsError] ,

    comments [LastInformationMessage] ,

    MSreplication_monitordata.last_distsync [LastSync] ,

    ISNULL(MSreplication_monitordata.cur_latency,0) [latency]

    FROM distribution.dbo.MSreplication_monitordata

    LEFT JOIN distribution.dbo.MSmerge_agents ON MSmerge_agents.id = MSreplication_monitordata.agent_id

    LEFT JOIN distribution.dbo.MSmerge_history ON MSreplication_monitordata.agent_id = MSmerge_history.agent_id

    JOIN ( SELECT agent_id ,

    MAX(ISNULL(timestamp, 0)) timestamp

    FROM distribution.dbo.MSmerge_history

    GROUP BY agent_id

    ) a ON MSmerge_history.agent_id = a.agent_id

    AND a.timestamp = MSmerge_history.timestamp

    WHERE MSreplication_monitordata.agent_type IN ( 4 )

    Best Regards,

    Vladimir

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply