February 24, 2011 at 4:50 am
glennhill (2/24/2011)
I do have a snapshot of dm_os_waiting_tasks at the time of the problem. The wait_type is LOCK for all 21 rows (LOCK_M_S, LOCK_M_U or LOCK_M_X). The wait durations of all (8) requests that are directly waiting on the the head of the blocking chain were between 8 and 15 hours. Not sure what the 'exec_context_id' is meant to tell us (BOL is not helpful) but the value is 0 in six cases, 10 and 14 in the other two.
Non-zero execution contexts indicate that the query is using parallelism. Was that the lead blocker? I have limited time available, so if you have more information that would be useful, I would encourage you to share it now, rather than waiting for me to guess the right questions to ask.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 24, 2011 at 5:21 am
Initially I was trying to make my question as succinct as possible. But sorry if you think I was withholding useful information.
The only DMVs for which I have a full snapshot are dm_tran_locks and dm_os_waiting_tasks. I have results of some other queries that I ran at the time on sysprocesses and dm_exec_requests but not sure that I would be thanked for pasting all these queries and their results into this post.
There were no records in dm_os_waiting_tasks for the "head of the blocking machine" (is that what you mean by the "lead blocker"?). The two tasks that had non-zero exec_context_id were for the same session_id and this session was one of seven sessions/requests waiting on the "head of the blocking chain" - the others six had 0 exec_context_id.
February 24, 2011 at 5:23 am
Sorry about the typo. Of course I meant "head of the blocking chain" not "head of the blocking machine".
February 24, 2011 at 5:28 am
glennhill (2/24/2011)
Initially I was trying to make my question as succinct as possible. But sorry if you think I was withholding useful information.
I don't think you were doing it deliberately, no :laugh:
The only DMVs for which I have a full snapshot are dm_tran_locks and dm_os_waiting_tasks. I have results of some other queries that I ran at the time on sysprocesses and dm_exec_requests but not sure that I would be thanked for pasting all these queries and their results into this post.
The forum features the ability to attach files. If you are able to upload all the collected information (perhaps as a zipped file if it is large) I'm sure we would find that useful.
There were no records in dm_os_waiting_tasks for the "head of the blocking machine" (is that what you mean by the "lead blocker"?). The two tasks that had non-zero exec_context_id were for the same session_id and this session was one of seven sessions/requests waiting on the "head of the blocking chain" - the others six had 0 exec_context_id.
Yes, lead blocker = head of the chain. The information showing what the waiting tasks were waiting on may provide some clues as to what was going on. It is unfortunate that the snapshot did not collect anything useful in tasks or locks for the lead blocker.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 24, 2011 at 8:53 am
Update ...
I started to put a spreadsheet together with my analysis - and in the process of doing that discovered some results I had forgotten that might help me. I will look into this further and let you know later.
Many thanks for the suggestion. Looks like it might have had a desirable side effect.
February 24, 2011 at 9:02 am
glennhill (2/24/2011)
I started to put a spreadsheet together with my analysis - and in the process of doing that discovered some results I had forgotten that might help me. I will look into this further and let you know later.
Awesome, thanks for the update!
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
February 25, 2011 at 12:48 am
If you want to find out what is causing the lock escalation (which query) this link below has a good query that will provide needed information.
I work as DBA and have some 1000 instanses around, what I often see is that #temptables makes lock chains, and even self blocking(Create table # and update table #..fu**ing developers):).
Even on servers where we have several tempdb files (1 pr CPU), that is good in most situation but consider this: Heavy load, total free memory on server is below 200 MB and file 1 (tempdb) is blocked, session 2 never reach file 2 in tempdb..., it try to write to disk and you will se LATCH in your sysprocess table...
KR
March 1, 2011 at 7:19 am
Sorry about the delay.
I was out of the office yesterday and had a bit of catching up to do.
What I discovered when I tried to put my analysis together in a spreadsheet (now attached)
was that I had saved results of another query on dm_tran_locks.
What is odd is that this query yielded both grants and waits and was much more logical.
What I cannot explain is that no Grants were shown when I did a complete dump of dm_tran_locks.
Thank you all for you input but it seems that my results are inconsistent so perhaps cannot be trusted.
I have devised a more complete analysis plan for next this occurs.
I will feedback on this thread if I experience the same again.
Viewing 8 posts - 16 through 22 (of 22 total)
You must be logged in to reply to this topic. Login to reply