Deadlocks between application sql queries and replication

  • We recently upgraded to sql server 2012. We have xxx-D-011 as OLTP server and yyy-D-011 as distributor server.

    The log is showing deadlocks every day between application queries/updates and replication jobs.

    A fragment of the log about the deadlock is included below. Any comments/ideas that can help me with the next steps will be much appreciated. Thank you!

    2014-07-10 15:31:05.94 spid13s deadlock-list

    2014-07-10 15:31:05.94 spid13s deadlock victim=process37ced3498

    2014-07-10 15:31:05.94 spid13s process-list

    2014-07-10 15:31:05.94 spid13s process id=process37ced3498 taskpriority=0 logused=0 waitresource=OBJECT: 8:532249001:0 waittime=357 ownerId=860304057 transactionname=SELECT lasttranstarted=2014-07-10T15:31:05.090 XDES=0x4674fbca0 lockMode=Sch-S schedulerid=2 kpid=10128 status=suspended spid=111 sbid=0 ecid=0 priority=0 trancount=0 lastbatchstarted=2014-07-10T15:31:05.090 lastbatchcompleted=2014-07-10T15:31:05.090 lastattention=1900-01-01T00:00:00.090 clientapp=.Net SqlClient Data Provider hostname=DIT3330W7DT hostpid=1092 loginname=xuser isolationlevel=read uncommitted (1) xactid=860304057 currentdb=8 lockTimeout=4294967295 clientoption1=673316896 clientoption2=128056

    2014-07-10 15:31:05.94 spid13s executionStack

    2014-07-10 15:31:05.94 spid13s frame procname= dbo.UpdatePurchaseOrderWorkQueueStatus line=39 stmtstart=2232 stmtend=2756 sqlhandle=0x0300080013baae0b4974c5004aa3000001000000000000000000000000000000000000000000000000000000

    2014-07-10 15:31:05.94 spid13s SELECT @XDate= dbo.fn_getxdate(C.ClinicID)

    2014-07-10 15:31:05.94 spid13s FROM dbo.Station S

    2014-07-10 15:31:05.94 spid13s INNER JOIN dbo.Clinic C ON C.ClinicID = S.ClinicID

    2014-07-10 15:31:05.94 spid13s INNER JOIN dbo.PurchaseOrder PO ON PO.StationID = S.StationID

    2014-07-10 15:31:05.94 spid13s WHERE S.Active = 1

    2014-07-10 15:31:05.94 spid13s AND PO.PurchaseOrderID = @PurchaseOrderID

    2014-07-10 15:31:05.94 spid13s inputbuf

    2014-07-10 15:31:05.94 spid13s Proc [Database Id = 8 Object Id = 196000275]

    2014-07-10 15:31:05.94 spid13s process id=process37ced3c38 taskpriority=0 logused=60504 waitresource=OBJECT: 8:644249400:0 waittime=313 ownerId=860303537 transactionname=user_transaction lasttranstarted=2014-07-10T15:31:04.830 XDES=0x1da8ead28 lockMode=Sch-M schedulerid=2 kpid=5544 status=suspended spid=119 sbid=0 ecid=0 priority=0 trancount=5 lastbatchstarted=2014-07-10T15:31:04.880 lastbatchcompleted=2014-07-10T15:31:04.880 lastattention=1900-01-01T00:00:00.880 clientapp=xxx-D-011-x-x_OrgHierarchy_Publication-13 hostname=yyy-D-011 hostpid=3360 loginname=svcadminsqldev isolationlevel=read committed (2) xactid=860303537 currentdb=8 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128024

    2014-07-10 15:31:05.94 spid13s executionStack

    2014-07-10 15:31:05.94 spid13s frame procname=mssqlsystemresource.sys.sp_MSsetfilteredstatus line=30 stmtstart=1570 stmtend=1796 sqlhandle=0x0300ff7f66f82acd71e2fc00efa0000001000000000000000000000000000000000000000000000000000000

    2014-07-10 15:31:05.94 spid13s EXEC %%Object(MultiName = @qualified_name).LockMatchID(ID = @object_id, Exclusive = 1, BindInternal = 0)

    2014-07-10 15:31:05.94 spid13s frame procname=mssqlsystemresource.sys.sp_MSrepl_changesubstatus line=909 stmtstart=66056 stmtend=66312 sqlhandle=0x0300ff7f48dddfc2bfe2fc00efa0000001000000000000000000000000000000000000000000000000000000

    2014-07-10 15:31:05.94 spid13s exec sys.sp_MSsetfilteredstatus @tabid

    2014-07-10 15:31:05.94 spid13s -- clear nonsqlsub status for this article.

    2014-07-10 15:31:05.94 spid13s frame procname=mssqlsystemresource.sys.sp_changesubstatus line=52 stmtstart=3496 stmtend=5162 sqlhandle=0x0300ff7f5eea5ef792d8fa00efa0000001000000000000000000000000000000000000000000000000000000

    2014-07-10 15:31:05.94 spid13s EXEC @retcode = @cmd

    2014-07-10 15:31:05.94 spid13s @publication,

    2014-07-10 15:31:05.94 spid13s @article,

    2014-07-10 15:31:05.94 spid13s @subscriber,

    2014-07-10 15:31:05.94 spid13s @status,

    2014-07-10 15:31:05.94 spid13s @previous_status,

    2014-07-10 15:31:05.94 spid13s @destination_db,

    2014-07-10 15:31:05.94 spid13s @frequency_type,

    2014-07-10 15:31:05.94 spid13s @frequency_interval,

    2014-07-10 15:31:05.94 spid13s @frequency_relative_interval,

    2014-07-10 15:31:05.94 spid13s @frequency_recurrence_factor,

    2014-07-10 15:31:05.94 spid13s @frequency_subday,

    2014-07-10 15:31:05.94 spid13s @frequency_subday_interval,

    2014-07-10 15:31:05.94 spid13s @active_start_time_of_day,

    2014-07-10 15:31:05.94 spid13s @active_end_time_of_day,

    2014-07-10 15:31:05.94 spid13s @active_start_date,

    2014-07-10 15:31:05.94 spid13s @active_end_date,

    2014-07-10 15:31:05.94 spid13s @optional_command_line,

    2014-07-10 15:31:05.94 spid13s @distribution_jobid OUTPUT,

    2014-07-10 15:31:05.95 spid13s @from_auto_sync,

    2014-07-10 15:31:05.95 spid13s @ignore_distributor,

    2014-07-10 15:31:05.95 spid13s -- Agent offload

    2014-07-10 15:31:05.95 spid13s @offloadagent,

    2014-07-10 15:31:05.95 spid13s @offloadserver,

    2014-07-10 15:31:05.95 spid13s @dts_package_name,

    2014-07-10 15:31:05.95 spid13s @dts_package_password,

    2014-07-10 15:31:05.95 spid13s @dts_package_location,

    2014-07-10 15:31:05.95 spid13s @skipobjectactivation,

    2014-07-10 15:31:05.95 spid13s @distribution_job_name,

    2014-07-10 15:31:05.95 spid13s @publisher,

    2014-07-10 15:31:05.95 spid13s @publisher_type

    2014-07-10 15:31:05.95 spid13s ,@ignore_distributor_failure

    2014-07-10 15:31:05.95 spid13s frame procname=mssqlsystemresource.sys.sp_MSactivate_auto_sub line=92 stmtstart=4236 stmtend=4768 sqlhandle=0x0300ff7f6366c5c4d8e2fc00efa0000001000000000000000000000000000000000000000000000000000000

    2014-07-10 15:31:05.95 spid13s *sp_changesubstatus------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    2014-07-10 15:31:05.95 spid13s inputbuf

    2014-07-10 15:31:05.95 spid13s Proc [Database Id = 32767 Object Id = -993696157]

    2014-07-10 15:31:05.95 spid13s resource-list

    2014-07-10 15:31:05.95 spid13s objectlock lockPartition=0 objid=532249001 subresource=FULL dbid=8 objectname=X.dbo.Station id=lock389908f00 mode=Sch-M associatedObjectId=532249001

    2014-07-10 15:31:05.95 spid13s owner-list

    2014-07-10 15:31:05.95 spid13s owner id=process37ced3c38 mode=Sch-M

    2014-07-10 15:31:05.95 spid13s waiter-list

    2014-07-10 15:31:05.95 spid13s waiter id=process37ced3498 mode=Sch-S requestType=wait

    2014-07-10 15:31:05.95 spid13s objectlock lockPartition=0 objid=644249400 subresource=FULL dbid=8 objectname=X.dbo.Clinic id=lock3f2864b00 mode=S associatedObjectId=644249400

    2014-07-10 15:31:05.95 spid13s owner-list

    2014-07-10 15:31:05.95 spid13s owner id=process37ced3498 mode=Sch-S

    2014-07-10 15:31:05.95 spid13s waiter-list

    2014-07-10 15:31:05.95 spid13s waiter id=process37ced3c38 mode=Sch-M requestType=convert

  • Saw your SO post here (for anyone that wants a more readable log): http://stackoverflow.com/questions/24686476/sql-server-replication-and-application-queries-updates-conflicts-resulting-in-de

  • The two queries seem to be taking out locks in different orders. On a guess, you have some type of SELECT statement that's part of the procedure UpdatePurchaseOrderWorkQueueStatus that validates records exist or something like that before you run the UPDATE statements. That's probably the cause of the deadlock. Validate that you need that SELECT statement. If not, remove it. If you have to have it, I'd put UPDLOCK hints on it so that it takes out the appropriate locks before you get to the UPDATE statements.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (7/11/2014)


    That's probably the cause of the deadlock.

    Isn't the cause something between the Sch-S and Sch-M locks? How does an update factor into that?

  • Cody K (7/11/2014)


    Grant Fritchey (7/11/2014)


    That's probably the cause of the deadlock.

    Isn't the cause something between the Sch-S and Sch-M locks? How does an update factor into that?

    I'm just referencing the proc name. I didn't look at the specific locks being taken out. Maybe they're running a truncate during the procedure. Nice catch.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you for the comments. The deadlocks don't just happen with that one sproc. They happen everywhere on select queries sprocs, on adhoc queries as well. Replication seems to always be involved.

    Here is another fragment of such deadlock:

    2014-07-10 15:05:24.49 spid27s deadlock-list

    2014-07-10 15:05:24.49 spid27s deadlock victim=process1b7c9e558

    2014-07-10 15:05:24.49 spid27s process-list

    2014-07-10 15:05:24.49 spid27s process id=process1b7c9e558 taskpriority=0 logused=0 waitresource=OBJECT: 8:178151730:0 waittime=8686 ownerId=859130607 transactionname=SELECT lasttranstarted=2014-07-10T15:05:08.540 XDES=0x25d1cfd10 lockMode=IS schedulerid=3 kpid=10916 status=suspended spid=125 sbid=0 ecid=0 priority=0 trancount=0 lastbatchstarted=2014-07-10T15:05:06.233 lastbatchcompleted=2014-07-10T15:05:06.233 lastattention=1900-01-01T00:00:00.233 clientapp=.Net SqlClient Data Provider hostname=ION2938W530 hostpid=13468 loginname= ION2938 isolationlevel=read committed (2) xactid=859130607 currentdb=8 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056

    2014-07-10 15:05:24.49 spid27s executionStack

    2014-07-10 15:05:24.49 spid27s frame procname=adhoc line=1 stmtstart=32 sqlhandle=0x020000007cbba0057e6d79b2b5b9ef093b94148495b8713c0000000000000000000000000000000000000000

    2014-07-10 15:05:24.49 spid27s SELECT DISTINCT

    2014-07-10 15:05:24.49 spid27s pr.BusinessPartnerId AccountId, p.BUnitSize BillingUnit, u.Display BillingUnitOfMeasure,

    2014-07-10 15:05:24.49 spid27s pv.RetailPrice CurrentPrice, p.Name [Description], null Form, p.ItemId, m.Name Manufacturer, Inv.NDC,

    2014-07-10 15:05:24.49 spid27s null UnitDescription, s.StationName, s.StationId, null AvailabilityIndicator,

    2014-07-10 15:05:24.49 spid27s null BackorderReasonDescription, null BackorderReleaseNotes, null CanAddToCart

    2014-07-10 15:05:24.49 spid27s FROM dbo.Position pos

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.AccessPoint a ON a.accesspointid = pos.accesspointid

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Cabinet c ON c.cabinetid = a.cabinetid

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Station s ON s.stationid = c.stationid

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Inventory INV ON INV.PositionID = pos.PositionID

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Product P ON

    2014-07-10 15:05:24.49 spid27s frame procname=unknown line=1 sqlhandle=0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    2014-07-10 15:05:24.49 spid27s unknown

    2014-07-10 15:05:24.49 spid27s inputbuf

    2014-07-10 15:05:24.49 spid27s (@stationId int)SELECT DISTINCT

    2014-07-10 15:05:24.49 spid27s pr.BusinessPartnerId AccountId, p.BUnitSize BillingUnit, u.Display BillingUnitOfMeasure,

    2014-07-10 15:05:24.49 spid27s pv.RetailPrice CurrentPrice, p.Name [Description], null Form, p.ItemId, m.Name Manufacturer, Inv.NDC,

    2014-07-10 15:05:24.49 spid27s null UnitDescription, s.StationName, s.StationId, null AvailabilityIndicator,

    2014-07-10 15:05:24.49 spid27s null BackorderReasonDescription, null BackorderReleaseNotes, null CanAddToCart

    2014-07-10 15:05:24.49 spid27s FROM dbo.Position pos

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.AccessPoint a ON a.accesspointid = pos.accesspointid

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Cabinet c ON c.cabinetid = a.cabinetid

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Station s ON s.stationid = c.stationid

    2014-07-10 15:05:24.49 spid27s INNER JOIN dbo.Inventory INV ON INV.PositionID = pos.PositionID

    2014-07-10 15:05:24.49 spid27s INNER JOIN

    2014-07-10 15:05:24.49 spid27s process id=process144999498 taskpriority=0 logused=612 waitresource=OBJECT: 8:617105289:0 waittime=8684 ownerId=859127269 transactionname=user_transaction lasttranstarted=2014-07-10T15:05:05.610 XDES=0x1a5e383a8 lockMode=Sch-M schedulerid=3 kpid=9536 status=suspended spid=114 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2014-07-10T15:05:08.543 lastbatchcompleted=2014-07-10T15:05:08.540 lastattention=1900-01-01T00:00:00.540 clientapp=xxx-D-011-x_OrgHierarchy_Publication-13 hostname=yyy-D-011 hostpid=9368 loginname= svcadminsqldev isolationlevel=read committed (2) xactid=859127269 currentdb=8 lockTimeout=4294967295 clientoption1=536870944 clientoption2=128024

    2014-07-10 15:05:24.49 spid27s executionStack

    2014-07-10 15:05:24.49 spid27s frame procname=mssqlsystemresource.sys.sp_replupdateschema line=1 stmtstart=-1 sqlhandle=0x0400ff7ff863e5c2010000000000000000000000000000000000000000000000000000000000000000000000

    2014-07-10 15:05:24.49 spid27s sp_replupdateschema

    2014-07-10 15:05:24.49 spid27s frame procname=mssqlsystemresource.sys.sp_MSreplupdateschema line=16 stmtstart=748 stmtend=840 sqlhandle=0x0300ff7fbcdda2d521d9fa00efa0000001000000000000000000000000000000000000000000000000000000

    2014-07-10 15:05:24.49 spid27s exec sys.sp_replupdateschema @object_name

    2014-07-10 15:05:24.49 spid27s inputbuf

    2014-07-10 15:05:24.49 spid27s Proc [Database Id = 32767 Object Id = -710746692]

    2014-07-10 15:05:24.49 spid27s resource-list

    2014-07-10 15:05:24.49 spid27s objectlock lockPartition=0 objid=178151730 subresource=FULL dbid=8 objectname=.dbo.AccessPoint id=lock462701400 mode=Sch-M associatedObjectId=178151730

    2014-07-10 15:05:24.49 spid27s owner-list

    2014-07-10 15:05:24.49 spid27s owner id=process144999498 mode=Sch-M

    2014-07-10 15:05:24.49 spid27s waiter-list

    2014-07-10 15:05:24.49 spid27s waiter id=process1b7c9e558 mode=IS requestType=wait

    2014-07-10 15:05:24.49 spid27s objectlock lockPartition=0 objid=617105289 subresource=FULL dbid=8 objectname=.dbo.Cabinet id=lock3f4f92480 mode=IS associatedObjectId=617105289

    2014-07-10 15:05:24.49 spid27s owner-list

    2014-07-10 15:05:24.49 spid27s owner id=process1b7c9e558 mode=IS

    2014-07-10 15:05:24.49 spid27s waiter-list

    2014-07-10 15:05:24.49 spid27s waiter id=process144999498 mode=Sch-M requestType=wait

  • Deadlocks are fundamentally a performance problem. What's the situation with the server? How are the waits and queues? Are you running out of memory, I/O, or CPU? You may need to tune the server setup itself.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you everyone for your time and the good comments. They are all valid.

    At this writing, I have 2 environments DEV and QA. In QA Replication snapshot jobs were disabled. In Dev, Replication snapshot jobs run every 5 minutes. In QA there is no deadlocks. So the suspect is the replication snapshot jobs running too frequently.

    I will disable the DEV replication snapshots and watch the log for the next 3-5 days to see if that helps minimizing the deadlocks.

    Again, many thanks!

  • hoa.nguyen 64423 (7/11/2014)


    Thank you everyone for your time and the good comments. They are all valid.

    At this writing, I have 2 environments DEV and QA. In QA Replication snapshot jobs were disabled. In Dev, Replication snapshot jobs run every 5 minutes. In QA there is no deadlocks. So the suspect is the replication snapshot jobs running too frequently.

    I will disable the DEV replication snapshots and watch the log for the next 3-5 days to see if that helps minimizing the deadlocks.

    Again, many thanks!

    Snapshots are very cost intensive. Why would you run so many? They're meant to start a transactions log or merge replication process, not be run on a continuous basis.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Each publication and subscription end up with a snapshot job and a dist job on the distributor box.

    This is what is coded to create the snapshot job:

    EXEC sp_addpublication_snapshot @publication = @Publication

    ,@frequency_type = 4

    ,@frequency_interval = 1

    ,@frequency_relative_interval = 1

    ,@frequency_recurrence_factor = 0

    ,@frequency_subday = 4

    ,@frequency_subday_interval = 1

    ,@active_start_time_of_day = 0

    ,@active_end_time_of_day = 235959

    ,@active_start_date = 0

    ,@active_end_date = 0

    ,@job_login = NULL

    ,@job_password = NULL

    ,@publisher_security_mode = 1

    Frequency_type should be set to 1 to run it once.

    Thanks!

  • Haha. What's hilarious about this is that snapshots had crossed my mind but I had immediately dismissed it with, "Nobody would run then that frequently!" 😛

  • Why run snapshots over and over. What purpose is it serving?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • It was a user error. Typically if one sets up a publication and subscription under SSMS and not scrutinize the parms, the setup will end up with defaults. The default is to run daily. One has to specifically specify the frequency type.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply