SQL jobs Dead Locks Issue

  • GilaMonster (4/4/2012)


    Mac1986 (4/4/2012)


    It looks like SPID 76,85 and 91 are suspended. Not sure what process is responsible for it. Is it SPID 25?

    No. It's a deadlock, processes 76, 85 and 91 are blocking each other. 25 is just the deadlock detector. No other processes are involved.

    I'll leave this one for Kraig when he wakes up.

    *facepalm* Thanks Gail. I'll just blame that on having a very long day yesterday and trying to do something immediately after getting home before a brain reboot. I'll take a look through this.

    Edit: Upside down... gyeah, pain to read. I need more caffeine and then to shove this into something to reverse it to readibility... Not gone, just... I'll be back. Excel and notepad are your friends...


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Alright, these statements are deadlocking:

    UPDATE dbo.WorkQueue

    and the first select that uses WorkQueue which drops to a table variable.

    Since the second statement could only lock up WorkQueue, I won't have you digging around in system tables trying to locate what table that waitresource belongs to. FYI, since you asked earlier, a Page is the smallest system of storage for the database. All rows drop onto a page and either rows, pages, or tables are locked for usage.

    Easy fix: find the statement that Updates WorkQueue and change its locking to use TABLOCKX so it'll stop pounding on equivalent data for other selects. So change it to this:

    UPDATE [dbo].[WorkQueue] WITH (TABLOCKX)

    SET WorkQueueStatus = 'DataLoaded'

    WorkQueueStatusDate = GETDATE()

    WorkQueueStatusUTCDate = GETUTCDATE()

    WorkQueueEndWorkDate = GETDATE()

    WorkQueueProcessingMSG = 'DataLoaded: ' + WorkQueueProcessingMSG

    LastUpdatedBY = SYSTEM_USER

    LastUpdatedDate = GETDATE()

    WHERE WorkQueueID = @WorkQueueID;

    That will grab an exclusive lock on the table for the duration of the update, a necessary precaution for your system, since it's tripping on other selects in the system. We can work to help you make that update faster if you like, but we'll need to see the schema on Workqueue, but basically you want an index on WorkQueueID to find the necessary records the fastest, so you get in and out cleanly.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks a lot Gail.

    Analysis:

    So, SPID 76, 91 and 85 are involved in Deadlock occurrence and are supending each other.

    Three jobs are potentially running at the same time.

    Question: How would I know what are those 3 jobs with help of their SPIDs 76,91 & 85. (Any query or anything?)

    Information for Moderators: Actually there are 17 jobs that run 24X7 and gets kicked off every 5 mins.

  • Thanks a lot Evil: I will try to locate the that particular Update statement on Workqueue table.

    But we have 17 jobs getting kicked of every 5 mins and all jobs update the work queue table.

    I guess I should only update that particular job where deadlock got encountered. How would I find that particular job name out of 17 jobs. If I could get the job name I can locate the update statement in that job.(It just calls the Store Proc which has the script).

  • Mac1986 (4/4/2012)


    Thanks a lot Evil: I will try to locate the that particular Update statement on Workqueue table.

    But we have 17 jobs getting kicked of every 5 mins and all jobs update the work queue table.

    I guess I should only update that particular job where deadlock got encountered. How would I find that particular job name out of 17 jobs. If I could get the job name I can locate the update statement in that job.(It just calls the Store Proc which has the script).

    Any/all of those could deadlock, dependent on how the data is stored on the page level. Even updating non-clustered indexes could theoretically deadlock. I'd update the stored proc itself that all of them call. You're pounding a table in large groups concurrently and repeatedly. Deadlocks will happen or you have to over-insulate the calls to avoid it. What's WorkQueue's schema look like?

    Also, you might try staggering your 5 minute job cycle so they're not all kicking off simultaneously.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks a lot for your input Evil:

    I just encountered another deadlock. I saw the logs and it said that SPID 77 is suspended. Then I ran sp_who2 to find out which job is SPID 77. Under the column 'ProgramName', it said that "SQLAgent - TSQL JobStep (Job 0xD78FB44BE950344EB5F48F457FD1227B : Step 4)". Now how can I find out which job is it.

  • Select * from msdb..sysjobs where CONVERT(BINARY(16), job_id) = 0xD78FB44BE950344EB5F48F457FD1227B

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • thanks a lot for input Gail:

    So that was the suspended Job. Can I know who suspended it? Please find the Log below.

    Date,Source,Severity,Message

    04/04/2012 10:41:55,spid20s,Unknown,waiter id=process5863048 mode=IX requestType=wait

    04/04/2012 10:41:55,spid20s,Unknown,waiter-list

    04/04/2012 10:41:55,spid20s,Unknown,owner id=process184bc8508 mode=S

    04/04/2012 10:41:55,spid20s,Unknown,owner-list

    04/04/2012 10:41:55,spid20s,Unknown,pagelock fileid=1 pageid=3535848 dbid=8 objectname=dbVelocityMetadata.dbo.WorkQueue id=lock53984ab00 mode=S associatedObjectId=72057594108837888

    04/04/2012 10:41:55,spid20s,Unknown,waiter id=process184bc8508 mode=S requestType=wait

    04/04/2012 10:41:55,spid20s,Unknown,waiter-list

    04/04/2012 10:41:55,spid20s,Unknown,owner id=process5863048 mode=IX

    04/04/2012 10:41:55,spid20s,Unknown,owner-list

    04/04/2012 10:41:55,spid20s,Unknown,pagelock fileid=1 pageid=1688513 dbid=8 objectname=dbVelocityMetadata.dbo.WorkQueue id=lock3ec97fd00 mode=IX associatedObjectId=72057594108837888

    04/04/2012 10:41:55,spid20s,Unknown,resource-list

    04/04/2012 10:41:55,spid20s,Unknown,EXEC WorkQueueProcessRun @DataSetInstanceRunGroup='RunGroup03'

    04/04/2012 10:41:55,spid20s,Unknown,inputbuf

    04/04/2012 10:41:55,spid20s,Unknown,EXEC WorkQueueProcessRun @DataSetInstanceRunGroup='RunGroup03'

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=adhoc line=1 sqlhandle=0x010008007109a424300ece6a020000000000000000000000

    04/04/2012 10:41:55,spid20s,Unknown,<c/>@WorkQueueID= @WorkQueueID OUTPUT;

    04/04/2012 10:41:55,spid20s,Unknown,EXEC [WorkQueueToProcessGet] @DataSetInstanceRunGroup = @DataSetInstanceRunGroup

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=dbVelocityMetadata.dbo.WorkQueueProcessRun line=250 stmtstart=18080 stmtend=18382 sqlhandle=0x03000800cd33bc7ae794dd00a29f00000100000000000000

    04/04/2012 10:41:55,spid20s,Unknown,@InfaRunInstanceName = @InfaRunInstanceName;

    04/04/2012 10:41:55,spid20s,Unknown,EXEC [dbo].[InfaWorkFLowCompletionSet]

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=dbVelocityMetadata.dbo.WorkQueueToProcessGet line=253 stmtstart=20744 stmtend=20966 sqlhandle=0x03000800cba791697928ac00f19f00000100000000000000

    04/04/2012 10:41:55,spid20s,Unknown,WHERE WorkQueueID = @WorkQueueID;

    04/04/2012 10:41:55,spid20s,Unknown,<c/>LastUpdatedDate = GETDATE()

    04/04/2012 10:41:55,spid20s,Unknown,<c/>LastUpdatedBY = SYSTEM_USER

    04/04/2012 10:41:55,spid20s,Unknown,<c/>WorkQueueProcessingMSG = 'DataLoaded: ' + WorkQueueProcessingMSG

    04/04/2012 10:41:55,spid20s,Unknown,<c/>WorkQueueEndWorkDate = GETDATE()

    04/04/2012 10:41:55,spid20s,Unknown,<c/>WorkQueueStatusUTCDate = GETUTCDATE()

    04/04/2012 10:41:55,spid20s,Unknown,<c/> WorkQueueStatusDate = GETDATE()

    04/04/2012 10:41:55,spid20s,Unknown,SET WorkQueueStatus = 'DataLoaded'

    04/04/2012 10:41:55,spid20s,Unknown,UPDATE [dbo].[WorkQueue]

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=dbVelocityMetadata.dbo.InfaWorkFLowCompletionSet line=175 stmtstart=11768 stmtend=12510 sqlhandle=0x03000800e58a96545994dd00a29f00000100000000000000

    04/04/2012 10:41:55,spid20s,Unknown,executionStack

    04/04/2012 10:41:55,spid20s,Unknown,process id=process5863048 taskpriority=0 logused=576 waitresource=PAGE: 8:1:3535848 waittime=2905 ownerId=1929881466 transactionname=user_transaction lasttranstarted=2012-04-04T10:41:52.417 XDES=0x1bf6db950 lockMode=IX schedulerid=14 kpid=7076 status=suspended spid=100 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2012-04-04T10:40:01.460 lastbatchcompleted=2012-04-04T10:40:01.460 clientapp=SQLAgent - TSQL JobStep (Job 0xA39B81E4B6075D42819CF70EEDDA9A5E : Step 4) hostname=TK5BIVELMDS hostpid=5092 loginname=REDMOND\EDWSQLP isolationlevel=read committed (2) xactid=1929881466 currentdb=8 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056

    04/04/2012 10:41:55,spid20s,Unknown,EXEC WatchDogProcessRun

    04/04/2012 10:41:55,spid20s,Unknown,inputbuf

    04/04/2012 10:41:55,spid20s,Unknown,EXEC WatchDogProcessRun

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=adhoc line=1 sqlhandle=0x01000800a6ec490100e35165040000000000000000000000

    04/04/2012 10:41:55,spid20s,Unknown,<c/>@Debug = @Debug

    04/04/2012 10:41:55,spid20s,Unknown,EXECUTE dbo.WorkQueueStatusSyncSet @SystemSystemObjectName = @SystemSystemObjectName

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=dbVelocityMetadata.dbo.WatchDogProcessRun line=109 stmtstart=7988 stmtend=8226 sqlhandle=0x03000800b07ef775ccaed2000ba000000100000000000000

    04/04/2012 10:41:55,spid20s,Unknown,ON rw

    04/04/2012 10:41:55,spid20s,Unknown,JOIN InfaRepositoryOne.[PowerCenter02].dbo.REP_WFLOW_RUN rwr

    04/04/2012 10:41:55,spid20s,Unknown,ON dgt.DataGroupID = ftg.TableDataGroupID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.DataGroupTable dgt

    04/04/2012 10:41:55,spid20s,Unknown,ON ftg.FileTemplateGroupID = ft.FileTemplateGroupID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.FileTemplateGroup ftg

    04/04/2012 10:41:55,spid20s,Unknown,ON ft.FileTemplateID = fa.FileTemplateID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.FileTemplate ft

    04/04/2012 10:41:55,spid20s,Unknown,ON fa.FileArrivalID = lf.FileArrivalID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.FileArrival fa

    04/04/2012 10:41:55,spid20s,Unknown,ON lf.LoadFileID = lfr.LoadFileID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.CDCDataExtractor_LoadFile lf

    04/04/2012 10:41:55,spid20s,Unknown,ON lfr.WorkQueueID = wq.WorkQueueID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.LoadFileRun lfr

    04/04/2012 10:41:55,spid20s,Unknown,FROM dbo.WorkQueue wq

    04/04/2012 10:41:55,spid20s,Unknown,<c/>IsProcessed = 0

    04/04/2012 10:41:55,spid20s,Unknown,<c/>dgt.TableName

    04/04/2012 10:41:55,spid20s,Unknown,<c/>fa.FullUNCFileName

    04/04/2012 10:41:55,spid20s,Unknown,<c/>RunStatusCode = rwr.RUN_STATUS_CODE

    04/04/2012 10:41:55,spid20s,Unknown,SELECT wq.WorkQueueID

    04/04/2012 10:41:55,spid20s,Unknown,EXEC('

    04/04/2012 10:41:55,spid20s,Unknown,<c/>IsProcessed)

    04/04/2012 10:41:55,spid20s,Unknown,<c/>TableName

    04/04/2012 10:41:55,spid20s,Unknown,<c/>FullUNCFileName

    04/04/2012 10:41:55,spid20s,Unknown,<c/>RunStatusCode

    04/04/2012 10:41:55,spid20s,Unknown,INSERT INTO @StatusMismatch (WorkQueueID

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=dbVelocityMetadata.dbo.WorkQueueStatusSyncSet line=120 stmtstart=11164 stmtend=13782 sqlhandle=0x03000800595fa967b7aed2000ba000000100000000000000

    04/04/2012 10:41:55,spid20s,Unknown,AND rwr.RUN_STATUS_CODE NOT IN (6 --'Running'

    04/04/2012 10:41:55,spid20s,Unknown,AND (DATEDIFF(second<c/> wq.LastUpdatedDate<c/>GETDATE()) > 60)

    04/04/2012 10:41:55,spid20s,Unknown,WHERE wq.WorkQueueStatus = 'Running'

    04/04/2012 10:41:55,spid20s,Unknown,ON rwr.WORKFLOW_RUN_ID = lfr.InfaWorkFlowRunID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN InfaRepositoryOne.[PowerCenter02].dbo.REP_WFLOW_RUN rwr

    04/04/2012 10:41:55,spid20s,Unknown,ON dgt.DataGroupID = ftg.TableDataGroupID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.DataGroupTable dgt

    04/04/2012 10:41:55,spid20s,Unknown,ON ftg.FileTemplateGroupID = ft.FileTemplateGroupID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.FileTemplateGroup ftg

    04/04/2012 10:41:55,spid20s,Unknown,ON ft.FileTemplateID = fa.FileTemplateID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.FileTemplate ft

    04/04/2012 10:41:55,spid20s,Unknown,ON fa.FileArrivalID = lf.FileArrivalID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.FileArrival fa

    04/04/2012 10:41:55,spid20s,Unknown,ON lf.LoadFileID = lfr.LoadFileID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.CDCDataExtractor_LoadFile lf

    04/04/2012 10:41:55,spid20s,Unknown,ON lfr.WorkQueueID = wq.WorkQueueID

    04/04/2012 10:41:55,spid20s,Unknown,JOIN dbo.LoadFileRun lfr

    04/04/2012 10:41:55,spid20s,Unknown,FROM dbo.WorkQueue wq

    04/04/2012 10:41:55,spid20s,Unknown,<c/>IsProcessed = 0

    04/04/2012 10:41:55,spid20s,Unknown,<c/>dgt.TableName

    04/04/2012 10:41:55,spid20s,Unknown,<c/>fa.FullUNCFileName

    04/04/2012 10:41:55,spid20s,Unknown,<c/>RunStatusCode = rwr.RUN_STATUS_CODE

    04/04/2012 10:41:55,spid20s,Unknown,SELECT wq.WorkQueueID

    04/04/2012 10:41:55,spid20s,Unknown,frame procname=adhoc line=2 stmtstart=12 sqlhandle=0x020000006ea03b27edbc086977f52c8b8c15e611c82ded29

    04/04/2012 10:41:55,spid20s,Unknown,executionStack

    04/04/2012 10:41:55,spid20s,Unknown,process id=process184bc8508 taskpriority=0 logused=0 waitresource=PAGE: 8:1:1688513 waittime=2610 ownerId=1929881397 transactionname=INSERT EXEC lasttranstarted=2012-04-04T10:41:52.380 XDES=0x45b949950 lockMode=S schedulerid=9 kpid=15576 status=suspended spid=77 sbid=0 ecid=0 priority=0 trancount=1 lastbatchstarted=2012-04-04T10:41:52.380 lastbatchcompleted=2012-04-04T10:41:52.380 clientapp=SQLAgent - TSQL JobStep (Job 0x7C518879A7C1324594014B9F1855FFB2 : Step 4) hostname=TK5BIVELMDS hostpid=5092 loginname=REDMOND\EDWSQLP isolationlevel=read committed (2) xactid=1929881397 currentdb=8 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128056

    04/04/2012 10:41:55,spid20s,Unknown,process-list

    04/04/2012 10:41:55,spid20s,Unknown,deadlock victim=process184bc8508

    04/04/2012 10:41:55,spid20s,Unknown,deadlock-list

    04/04/2012 10:36:52,spid21s,Unknown,waiter id=process879a7dc8 mode=IX requestType=wait

    04/04/2012 10:36:52,spid21s,Unknown,waiter-list

    04/04/2012 10:36:52,spid21s,Unknown,owner id=process5648988 mode=S

    04/04/2012 10:36:52,spid21s,Unknown,owner-list

    04/04/2012 10:36:52,spid21s,Unknown,pagelock fileid=1 pageid=3535848 dbid=8 objectname=dbVelocityMetadata.dbo.WorkQueue id=lockb1b95f80 mode=S associatedObjectId=72057594108837888

    04/04/2012 10:36:52,spid21s,Unknown,waiter id=process5648988 mode=S requestType=wait

    04/04/2012 10:36:52,spid21s,Unknown,waiter-list

    04/04/2012 10:36:52,spid21s,Unknown,owner id=process879a7dc8 mode=IX

    04/04/2012 10:36:52,spid21s,Unknown,owner-list

    04/04/2012 10:36:52,spid21s,Unknown,pagelock fileid=1 pageid=1688513 dbid=8 objectname=dbVelocityMetadata.dbo.WorkQueue id=lock3dec8f380 mode=IX associatedObjectId=72057594108837888

    04/04/2012 10:36:52,spid21s,Unknown,resource-list

    04/04/2012 10:36:52,spid21s,Unknown,EXEC WorkQueueProcessRun @DataSetInstanceRunGroup='RunGroup02'

    04/04/2012 10:36:52,spid21s,Unknown,inputbuf

    04/04/2012 10:36:52,spid21s,Unknown,EXEC WorkQueueProcessRun @DataSetInstanceRunGroup='RunGroup02'

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=adhoc line=1 sqlhandle=0x010008008fe56d08d0787a74050000000000000000000000

    04/04/2012 10:36:52,spid21s,Unknown,<c/>@WorkQueueID= @WorkQueueID OUTPUT;

    04/04/2012 10:36:52,spid21s,Unknown,EXEC [WorkQueueToProcessGet] @DataSetInstanceRunGroup = @DataSetInstanceRunGroup

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=dbVelocityMetadata.dbo.WorkQueueProcessRun line=250 stmtstart=18080 stmtend=18382 sqlhandle=0x03000800cd33bc7ae794dd00a29f00000100000000000000

    04/04/2012 10:36:52,spid21s,Unknown,@InfaRunInstanceName = @InfaRunInstanceName;

    04/04/2012 10:36:52,spid21s,Unknown,EXEC [dbo].[InfaWorkFLowCompletionSet]

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=dbVelocityMetadata.dbo.WorkQueueToProcessGet line=253 stmtstart=20744 stmtend=20966 sqlhandle=0x03000800cba791697928ac00f19f00000100000000000000

    04/04/2012 10:36:52,spid21s,Unknown,WHERE WorkQueueID = @WorkQueueID;

    04/04/2012 10:36:52,spid21s,Unknown,<c/>LastUpdatedDate = GETDATE()

    04/04/2012 10:36:52,spid21s,Unknown,<c/>LastUpdatedBY = SYSTEM_USER

    04/04/2012 10:36:52,spid21s,Unknown,<c/>WorkQueueProcessingMSG = 'DataLoaded: ' + WorkQueueProcessingMSG

    04/04/2012 10:36:52,spid21s,Unknown,<c/>WorkQueueEndWorkDate = GETDATE()

    04/04/2012 10:36:52,spid21s,Unknown,<c/>WorkQueueStatusUTCDate = GETUTCDATE()

    04/04/2012 10:36:52,spid21s,Unknown,<c/> WorkQueueStatusDate = GETDATE()

    04/04/2012 10:36:52,spid21s,Unknown,SET WorkQueueStatus = 'DataLoaded'

    04/04/2012 10:36:52,spid21s,Unknown,UPDATE [dbo].[WorkQueue]

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=dbVelocityMetadata.dbo.InfaWorkFLowCompletionSet line=175 stmtstart=11768 stmtend=12510 sqlhandle=0x03000800e58a96545994dd00a29f00000100000000000000

    04/04/2012 10:36:52,spid21s,Unknown,executionStack

    04/04/2012 10:36:52,spid21s,Unknown,process id=process879a7dc8 taskpriority=0 logused=592 waitresource=PAGE: 8:1:3535848 waittime=2864 ownerId=1929270789 transactionname=user_transaction lasttranstarted=2012-04-04T10:36:49.450 XDES=0x644ea9950 lockMode=IX schedulerid=10 kpid=12740 status=suspended spid=70 sbid=0 ecid=0 priority=0 trancount=2 lastbatchstarted=2012-04-04T10:34:01.273 lastbatchcompleted=2012-04-04T10:34:01.273 clientapp=SQLAgent - TSQL JobStep (Job 0xA88DC1ACBEF44F4E8641464C2A2B035E : Step 4) hostname=TK5BIVELMDS hostpid=5092 loginname=REDMOND\EDWSQLP isolationlevel=read committed (2) xactid=1929270789 currentdb=8 lockTimeout=4294967295 clientoption1=673185824 clientoption2=128056

    04/04/2012 10:36:52,spid21s,Unknown,EXEC WatchDogProcessRun

    04/04/2012 10:36:52,spid21s,Unknown,inputbuf

    04/04/2012 10:36:52,spid21s,Unknown,EXEC WatchDogProcessRun

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=adhoc line=1 sqlhandle=0x01000800a6ec49018020a6ab020000000000000000000000

    04/04/2012 10:36:52,spid21s,Unknown,<c/>@Debug = @Debug

    04/04/2012 10:36:52,spid21s,Unknown,EXECUTE dbo.WorkQueueStatusSyncSet @SystemSystemObjectName = @SystemSystemObjectName

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=dbVelocityMetadata.dbo.WatchDogProcessRun line=109 stmtstart=7988 stmtend=8226 sqlhandle=0x03000800b07ef775ccaed2000ba000000100000000000000

    04/04/2012 10:36:52,spid21s,Unknown,ON rw

    04/04/2012 10:36:52,spid21s,Unknown,JOIN InfaRepositoryOne.[PowerCenter02].dbo.REP_WFLOW_RUN rwr

    04/04/2012 10:36:52,spid21s,Unknown,ON dgt.DataGroupID = ftg.TableDataGroupID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.DataGroupTable dgt

    04/04/2012 10:36:52,spid21s,Unknown,ON ftg.FileTemplateGroupID = ft.FileTemplateGroupID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.FileTemplateGroup ftg

    04/04/2012 10:36:52,spid21s,Unknown,ON ft.FileTemplateID = fa.FileTemplateID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.FileTemplate ft

    04/04/2012 10:36:52,spid21s,Unknown,ON fa.FileArrivalID = lf.FileArrivalID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.FileArrival fa

    04/04/2012 10:36:52,spid21s,Unknown,ON lf.LoadFileID = lfr.LoadFileID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.CDCDataExtractor_LoadFile lf

    04/04/2012 10:36:52,spid21s,Unknown,ON lfr.WorkQueueID = wq.WorkQueueID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.LoadFileRun lfr

    04/04/2012 10:36:52,spid21s,Unknown,FROM dbo.WorkQueue wq

    04/04/2012 10:36:52,spid21s,Unknown,<c/>IsProcessed = 0

    04/04/2012 10:36:52,spid21s,Unknown,<c/>dgt.TableName

    04/04/2012 10:36:52,spid21s,Unknown,<c/>fa.FullUNCFileName

    04/04/2012 10:36:52,spid21s,Unknown,<c/>RunStatusCode = rwr.RUN_STATUS_CODE

    04/04/2012 10:36:52,spid21s,Unknown,SELECT wq.WorkQueueID

    04/04/2012 10:36:52,spid21s,Unknown,EXEC('

    04/04/2012 10:36:52,spid21s,Unknown,<c/>IsProcessed)

    04/04/2012 10:36:52,spid21s,Unknown,<c/>TableName

    04/04/2012 10:36:52,spid21s,Unknown,<c/>FullUNCFileName

    04/04/2012 10:36:52,spid21s,Unknown,<c/>RunStatusCode

    04/04/2012 10:36:52,spid21s,Unknown,INSERT INTO @StatusMismatch (WorkQueueID

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=dbVelocityMetadata.dbo.WorkQueueStatusSyncSet line=120 stmtstart=11164 stmtend=13782 sqlhandle=0x03000800595fa967b7aed2000ba000000100000000000000

    04/04/2012 10:36:52,spid21s,Unknown,AND rwr.RUN_STATUS_CODE NOT IN (6 --'Running'

    04/04/2012 10:36:52,spid21s,Unknown,AND (DATEDIFF(second<c/> wq.LastUpdatedDate<c/>GETDATE()) > 60)

    04/04/2012 10:36:52,spid21s,Unknown,WHERE wq.WorkQueueStatus = 'Running'

    04/04/2012 10:36:52,spid21s,Unknown,ON rwr.WORKFLOW_RUN_ID = lfr.InfaWorkFlowRunID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN InfaRepositoryOne.[PowerCenter02].dbo.REP_WFLOW_RUN rwr

    04/04/2012 10:36:52,spid21s,Unknown,ON dgt.DataGroupID = ftg.TableDataGroupID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.DataGroupTable dgt

    04/04/2012 10:36:52,spid21s,Unknown,ON ftg.FileTemplateGroupID = ft.FileTemplateGroupID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.FileTemplateGroup ftg

    04/04/2012 10:36:52,spid21s,Unknown,ON ft.FileTemplateID = fa.FileTemplateID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.FileTemplate ft

    04/04/2012 10:36:52,spid21s,Unknown,ON fa.FileArrivalID = lf.FileArrivalID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.FileArrival fa

    04/04/2012 10:36:52,spid21s,Unknown,ON lf.LoadFileID = lfr.LoadFileID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.CDCDataExtractor_LoadFile lf

    04/04/2012 10:36:52,spid21s,Unknown,ON lfr.WorkQueueID = wq.WorkQueueID

    04/04/2012 10:36:52,spid21s,Unknown,JOIN dbo.LoadFileRun lfr

    04/04/2012 10:36:52,spid21s,Unknown,FROM dbo.WorkQueue wq

    04/04/2012 10:36:52,spid21s,Unknown,<c/>IsProcessed = 0

    04/04/2012 10:36:52,spid21s,Unknown,<c/>dgt.TableName

    04/04/2012 10:36:52,spid21s,Unknown,<c/>fa.FullUNCFileName

    04/04/2012 10:36:52,spid21s,Unknown,<c/>RunStatusCode = rwr.RUN_STATUS_CODE

    04/04/2012 10:36:52,spid21s,Unknown,SELECT wq.WorkQueueID

    04/04/2012 10:36:52,spid21s,Unknown,frame procname=adhoc line=2 stmtstart=12 sqlhandle=0x020000006ea03b27edbc086977f52c8b8c15e611c82ded29

    04/04/2012 10:36:52,spid21s,Unknown,executionStack

    04/04/2012 10:36:52,spid21s,Unknown,process id=process5648988 taskpriority=0 logused=0 waitresource=PAGE: 8:1:1688513 waittime=2539 ownerId=1929270181 transactionname=INSERT EXEC lasttranstarted=2012-04-04T10:36:49.197 XDES=0x2a60ff790 lockMode=S schedulerid=13 kpid=6484 status=suspended spid=71 sbid=0 ecid=0 priority=0 trancount=1 lastbatchstarted=2012-04-03T16:22:25.993 lastbatchcompleted=2012-04-03T16:22:25.993 clientapp=SQLAgent - TSQL JobStep (Job 0x7C518879A7C1324594014B9F1855FFB2 : Step 4) hostname=TK5BIVELMDS hostpid=5092 loginname=REDMOND\EDWSQLP isolationlevel=read committed (2) xactid=1929270181 currentdb=8 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128056

    04/04/2012 10:36:52,spid21s,Unknown,process-list

    04/04/2012 10:36:52,spid21s,Unknown,deadlock victim=process5648988

    04/04/2012 10:36:52,spid21s,Unknown,deadlock-list

    04/04/2012 10:21:40,spid86,Unknown,SQL Trace stopped. Trace ID = '2'. Login Name = 'REDMOND\v-jagade'.

    04/04/2012 10:20:00,spid61,Unknown,The client was unable to reuse a session with SPID 61<c/> which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

    04/04/2012 10:20:00,spid61,Unknown,Error: 18056<c/> Severity: 20<c/> State: 29.

    04/04/2012 10:16:28,spid65,Unknown,The client was unable to reuse a session with SPID 65<c/> which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

    04/04/2012 10:16:28,spid65,Unknown,Error: 18056<c/> Severity: 20<c/> State: 29.

    04/04/2012 10:12:44,spid53,Unknown,The client was unable to reuse a session with SPID 53<c/> which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

    04/04/2012 10:12:44,spid53,Unknown,Error: 18056<c/> Severity: 20<c/> State: 29.

    04/04/2012 10:12:44,spid123,Unknown,Failed to run resource governor classifier user-defined function. See previous errors in SQL Server error log from session ID 123 for details. Classifier elapsed time: 0 ms.

    04/04/2012 10:12:44,spid123,Unknown,Error: 10982<c/> Severity: 16<c/> State: 1.

    04/04/2012 10:03:03,spid62,Unknown,SQL Trace ID 2 was started by login "REDMOND\v-jagade".

    04/04/2012 02:07:20,Backup,Unknown,Database backed up. Database: SQLOPSDB<c/> creation date(time): 2012/02/13(18:46:53)<c/> pages dumped: 650<c/> first LSN: 79:9218:44<c/> last LSN: 79:9236:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\SQLOPSDB\SQLOPSDB_backup_2012_04_04_020002_5148314.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:07:19,Backup,Unknown,Database backed up. Database: dbVelocityMetadata<c/> creation date(time): 2011/06/20(15:06:11)<c/> pages dumped: 3724590<c/> first LSN: 5997:239311:163<c/> last LSN: 5997:243974:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\dbVelocityMetadata\dbVelocityMetadata_backup_2012_04_04_020002_4992086.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:00:33,Backup,Unknown,Database backed up. Database: dbVelocityFileMonitor<c/> creation date(time): 2011/06/20(15:05:08)<c/> pages dumped: 77835<c/> first LSN: 575:186143:153<c/> last LSN: 575:186204:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\dbVelocityFileMonitor\dbVelocityFileMonitor_backup_2012_04_04_020002_4992086.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:00:24,Backup,Unknown,Database backed up. Database: DSLGetDataClient<c/> creation date(time): 2011/10/05(09:55:05)<c/> pages dumped: 3603<c/> first LSN: 137:7433:37<c/> last LSN: 137:7449:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\DSLGetDataClient\DSLGetDataClient_backup_2012_04_04_020002_4992086.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:00:22,Backup,Unknown,Database backed up. Database: Operations<c/> creation date(time): 2011/06/20(14:59:42)<c/> pages dumped: 83862<c/> first LSN: 37504:42472:37<c/> last LSN: 37504:42488:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\Operations\Operations_backup_2012_04_04_020002_4835858.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:00:10,Backup,Unknown,Database backed up. Database: msdb<c/> creation date(time): 2010/04/02(17:35:08)<c/> pages dumped: 13582<c/> first LSN: 419180:9974:288<c/> last LSN: 419180:10091:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\msdb\msdb_backup_2012_04_04_020002_4835858.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:00:07,Backup,Unknown,Database backed up. Database: model<c/> creation date(time): 2003/04/08(09:13:36)<c/> pages dumped: 185<c/> first LSN: 82:68:37<c/> last LSN: 82:84:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\model\model_backup_2012_04_04_020002_4835858.bak'}). This is an informational message only. No user action is required.

    04/04/2012 02:00:05,Backup,Unknown,Database backed up. Database: master<c/> creation date(time): 2012/03/23(17:33:27)<c/> pages dumped: 386<c/> first LSN: 536:483:72<c/> last LSN: 537:19:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=DISK: {'E:\BIHVDSSQLMDS\Backup\MSSQL\Bak\Dailybackup\master\master_backup_2012_04_04_020002_4835858.bak'}). This is an informational message only. No user action is required.

    04/04/2012 00:00:52,spid18s,Unknown,This instance of SQL Server has been using a process ID of 11368 since 3/23/2012 5:33:39 PM (local) 3/24/2012 12:33:39 AM (UTC). This is an informational message only; no user action is required.

    04/03/2012 21:41:17,Backup,Unknown,Database backed up. Database: Operations<c/> creation date(time): 2011/06/20(14:59:42)<c/> pages dumped: 83803<c/> first LSN: 37504:42412:117<c/> last LSN: 37504:42460:1<c/> number of dump devices: 1<c/> device information: (FILE=1<c/> TYPE=VIRTUAL_DEVICE: {'{1CC1103A-112D-4217-B172-0894759DB036}1'}). This is an informational message only. No user action is required.

  • I just wanted to figure out. which SPID is killing/Suspending the 100,77,70,71.

    I see the query UPDATE [dbo].[WorkQueue]

    SET WorkQueueStatus = 'DataLoaded'

    WorkQueueStatusDate = GETDATE()

    WorkQueueStatusUTCDate = GETUTCDATE()

    WorkQueueEndWorkDate = GETDATE()

    WorkQueueProcessingMSG = 'DataLoaded: ' + WorkQueueProcessingMSG

    LastUpdatedBY = SYSTEM_USER

    LastUpdatedDate = GETDATE()

    WHERE WorkQueueID = @WorkQueueID;

    is causing the issue, Please correct me if I'm wrong. And i found the the store Proc in which this update statement is used.

    I also want to know which SPID is using this update statement. Which means that this SPID is suspending/killing rest of the jobs or SPIDs.

  • Can anyone please update me on the above issue, Please.

    I need to give a detailed update before the day close

  • Mac1986 (4/4/2012)


    I just wanted to figure out. which SPID is killing/Suspending the 100,77,70,71.

    I see the query UPDATE [dbo].[WorkQueue]

    SET WorkQueueStatus = 'DataLoaded'

    WorkQueueStatusDate = GETDATE()

    WorkQueueStatusUTCDate = GETUTCDATE()

    WorkQueueEndWorkDate = GETDATE()

    WorkQueueProcessingMSG = 'DataLoaded: ' + WorkQueueProcessingMSG

    LastUpdatedBY = SYSTEM_USER

    LastUpdatedDate = GETDATE()

    WHERE WorkQueueID = @WorkQueueID;

    is causing the issue, Please correct me if I'm wrong. And i found the the store Proc in which this update statement is used.

    I also want to know which SPID is using this update statement. Which means that this SPID is suspending/killing rest of the jobs or SPIDs.

    SPIDs change. You have to look at the spid at the time of the failure to know what's causing it. They get re-used. SQL Agent, which runs the jobs, will close/open connections for each run, you won't be able to trace it back that way.

    That seems like the correct proc, set it to use WITH (TABLOCKX). Note: It will take longer for it to run because it will always wait until it has complete access to the table without any other connections sharing it at the time, but you need that to avoid deadlocking.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I did a little investigation back on my server. Please find the details below and suggestions are open.

    Query1:

    UPDATE [dbo].[WorkQueue]

    SET WorkQueueStatus = 'DataLoaded'

    , WorkQueueStatusDate = GETDATE()

    ,WorkQueueStatusUTCDate = GETUTCDATE()

    ,WorkQueueEndWorkDate = GETDATE()

    ,WorkQueueProcessingMSG = 'DataLoaded: ' + WorkQueueProcessingMSG

    ,LastUpdatedBY = SYSTEM_USER

    ,LastUpdatedDate = GETDATE()

    WHERE WorkQueueID = @WorkQueueID;

    ---------------------------------------------------------------------------------------------------------------------------

    Query2

    INSERT INTO @StatusNULLRunInstance (WorkQueueID ,RunStatusCode ,FullUNCFileName ,TableName ,IsProcessed)

    EXEC(SELECT wq.WorkQueueID ,RunStatusCode = rwr.RUN_STATUS_CODE ,fa.FullUNCFileName ,dgt.TableName ,IsProcessed = 0

    FROM dbo.WorkQueue wq

    JOIN dbo.LoadFileRun lfr ON lfr.WorkQueueID = wq.WorkQueueID

    JOIN dbo.CDCDataExtractor_LoadFile lf ON lf.LoadFileID = lfr.LoadFileID

    JOIN dbo.FileArrival fa ON fa.FileArrivalID = lf.FileArrivalID

    JOIN dbo.FileTemplate ft ON ft.FileTemplateID = fa.FileTemplateID

    JOIN dbo.FileTemplateGroup ftg ON ftg.FileTemplateGroupID = ft.FileTemplateGroupID

    JOIN dbo.DataGroupTable dgt ON dgt.DataGroupID = ftg.TableDataGroupID

    LEFT JOIN (SELECT RUNINST_NAME, RID=ROW_NUMBER() OVER (PARTITION BY RUNINST_NAME ORDER BY WORKFLOW_RUN_ID DESC), RUN_STATUS_CODE

    FROM InfaRepositoryOne.[PowerCenter02].dbo.OPB_WFLOW_RUN

    WHERE WORKFLOW_NAME IN ('wfFileLoadInitialization','wfPreLoadInitialization')

    AND RUN_STATUS_CODE NOT IN (6 --''Running''

    ,7 --''Suspending''

    ,9 --''Stopping''

    ,10 --''Aborting''

    ,11 --''Waiting''

    )

    )rwr ON rwr.RUNINST_NAME = fa.InfaRunInstanceName AND rwr.RID=1

    WHERE wq.WorkQueueStatus = 'Running'

    AND wq.[highlight=#ffff11]WorkQueueProcessingID[/highlight] IS NULL

    AND rwr.RUNINST_NAME IS NULL

    AND (DATEDIFF(second, wq.LastUpdatedDate,GETDATE()) > 300)--5 MINUTES

    Query 1 is killing/suspending Query 2.

    What I suggest is including a column[highlight=#ffff11](WorkQueueProcessingID[/highlight]) on existing Non clustered index. This will make the query not go back to the clustered index to search for that column.

    Or

    Use hint MAXDOP 1 for that particular query to avoid parellel threads. So that query is executed one after another.

  • Thanks Evil for your input.

    Yes we can use TablockX but this might increase the granularity level of the lock. The other query has to wait until the current query releases the lock on the whole table.

    We need to update only one couple of columns in that table where queries are clashing at the same time. So if the lock the entire table this might increase the execution time of the job.

    On a generalized view, we use 'Locking Hints' in the case where do not know where is the exact issue of locking.

    I thought creating non-clustered index or adding the un-included columns to the existing index might be the option.

    If you have any ideas please feel free to share.

  • Mac1986 (4/4/2012)


    Thanks Evil for your input.

    Yes we can use TablockX but this might increase the granularity level of the lock.

    Exact opposite, actually, but that's the point.

    The other query has to wait until the current query releases the lock on the whole table.

    Precisely. You are familiar with the actual cause of a deadlock, I presume?

    We need to update only one couple of columns in that table where queries are clashing at the same time. So if the lock the entire table this might increase the execution time of the job.

    columns don't matter, rows do. And yes, it will increase the execution time on average.

    I thought creating non-clustered index or adding the un-included columns to the existing index might be the option.

    More indexes increase the chance of a deadlock, although minutely.

    A deadlock is caused by two concurrent operations attempting to access data the other one is holding in a transaction simultaneously where one is trying to get exclusive access to all the rows. So Op 1 gets rows 1,2,3 and tries to get 4. Op 2, doing an update and needing exclusive access and for some reason known only to the optimizer has grabbed rows 4, 8, and 10, and now wants 1. These conflict. One has to lose.

    To avoid this, you need full control of all rows by the exclusive locking statement. So, it takes an extra second. What is the actual price to you if that occurs? Are deadlocks and re-runs preferable, causing you to enforce code-tests and everything that regularly causes it wrapped in Try/Catches, or is it better to just take an exclusive table lock for a moment to get your work done?

    I can't answer that, only you and your environment can. The simplest and most common solution is to make sure the primary cause of deadlocks doesn't cause them anymore by grabbing a tablockx. Other solutions get more intricate, and require a detailed understanding of your data usage, not just design, and is something I'd personally have to come in and discuss/examine with your business staff. I wouldn't try to troubleshoot more intricate solutions over a forum.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 14 posts - 16 through 28 (of 28 total)

You must be logged in to reply to this topic. Login to reply