December 18, 2019 at 6:20 pm
In the application logs we are seeing deadlock victim errors. However, I am not seeing any blocking, locks, or deadlocks in my monitoring tool. I have the default blocked process threshold, which is approx 30 seconds, and I am not seeing any deadlocks. How would I capture them? Would I need to decrease the threshold, even though I am not seeing any blocking? This DB server also hosts several other user databases. Please advise.
December 18, 2019 at 7:54 pm
Deadlocks are not the same as blocking, and are typically resolved in a few milliseconds, so you won't see them in your blocking alert, and lowering your blocking threshold won't change anything. To get deadlock information on the SQL Server you need to look in the SQL Server error log, and probably have the relevant trace flags enabled: -T1204 at least, possibly also -T1222.
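For what it's worth, those flags can also be turned on globally without a restart; a quick sketch:

```sql
-- Enable deadlock reporting to the error log for all sessions (-1 = global).
DBCC TRACEON (1204, -1);  -- resource/lock oriented deadlock output
DBCC TRACEON (1222, -1);  -- deadlock graph in an XML-like format
-- To persist across restarts, add -T1204 and -T1222 as startup parameters instead.
```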
While blocking is an aspect of deadlocks, the real issue is processes competing for conflicting resources in such a way that neither of the processes can complete (check here https://www.sqlshack.com/what-is-a-sql-server-deadlock/)
The app may be handling the deadlock "gracefully" by retrying the failed process (try/catch) without the user even knowing, but ideally the app should be fixed to reduce the deadlocks. Using Try/Catch and not actually fixing the cause is just lazy, although there are occasionally places where fixing a deadlock is really difficult.
One of the most common deadlock causes is within a table/index structure. One process is inserting/updating a record and needs to modify the related index(s), while another process is reading the index and needs to get to the data pages. I find this is often because of poor index design, like an index on a status column, or the left most column of an index being regularly updated. This can often be fixed by redesigning the index to use an INCLUDE clause on the changing portion of the index or the index used by the failing READ operation. This gives other performance benefits as well.
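As a sketch of that kind of redesign (all table, column, and index names here are hypothetical): move the volatile column out of the index key and into the INCLUDE list, so the read is still covered but updates no longer reshuffle the key pages.

```sql
-- Instead of keying on the frequently updated Status column...
--   CREATE INDEX IX_Orders_Status ON dbo.Orders (Status);
-- ...key on a stable column and carry Status along as an included column.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId)
    INCLUDE (Status);  -- covers the read without keying on the volatile column
```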
To find the issue you need to enable the trace flags and capture the deadlock data to see what processes and objects are involved, then work out a solution. If you paste the deadlock data you may get some help with a solution.
Leo
Nothing in life is ever so complicated that with a little work it can't be made more complicated.
December 18, 2019 at 10:27 pm
I have enabled the trace flag and Extended Events to capture the deadlock, and I can now capture and see the deadlock graph using Extended Events. However, how can I send the deadlock report to someone? I don't see an option to save it as .xml and send it.
December 18, 2019 at 10:54 pm
I also found the table name and the index name, which is a primary key, and it's a simple update statement with an update lock (U). So what are the options to avoid causing a deadlock in this situation? I know there is no simple answer, but what options exist?
December 19, 2019 at 9:38 am
I use this query to capture recent deadlocks. If you run it in SSMS and click on the DeadlockGraph field, the XML will open in a separate window and you'll be able to save it to disk. Beware, though - the system_health files roll over and, depending on how busy your server is, the events may not remain there for long.
SELECT CONVERT(XML, event_data) AS DeadlockGraph
FROM sys.fn_xe_file_target_read_file(N'system_health*.xel', NULL, NULL, NULL)
WHERE event_data like '%xml_deadlock_report%';
I also found the table name and the index name, which is a primary key, and it's a simple update statement with an update lock (U). So what are the options to avoid causing a deadlock in this situation? I know there is no simple answer, but what options exist?
Not enough information there to help you, I'm afraid. Find the two (or more) statements involved in the deadlock and tune either the queries themselves, the indexing, or the settings (maxdop etc).
John
December 19, 2019 at 1:20 pm
Thanks John, I was just starting to type a long screed.
Let me reinforce what John says. In SQL Server 2008 and better, you no longer need to implement trace flags and pollute your error logs with deadlock graphs. The Extended Events system_health session captures all deadlocks, by default. Further, unless you explicitly stop it, system_health is running on all your systems, again, by default. John's query is an excellent method of plucking that data out.
And yeah, blocking and deadlocks are related, but not the same thing, at all.
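As a footnote: besides the .xel file target John's query reads, system_health also keeps recent events in an in-memory ring buffer target. A sketch of pulling that out (the session and target names are the defaults shipped with SQL Server 2008 and later):

```sql
-- Read the system_health ring buffer target; the returned XML contains
-- <event name="xml_deadlock_report"> nodes for any recent deadlocks.
SELECT CAST(xet.target_data AS XML) AS TargetData
FROM sys.dm_xe_session_targets AS xet
JOIN sys.dm_xe_sessions AS xes
    ON xes.address = xet.event_session_address
WHERE xes.name = N'system_health'
  AND xet.target_name = N'ring_buffer';
```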
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 19, 2019 at 5:11 pm
Thanks for the info. Sorry about not providing enough information. I captured the deadlock and found that the two statements are the same - an UPDATE statement. The victim process has the same update statement that participated in the deadlock. Is that possible? Also, the resource list shows the same objects. All I can see is a page lock with mode "U". Let me know if you need more info.
December 19, 2019 at 6:02 pm
Statements can certainly be the same. The question is, what are the SPIDs of the two processes? If they're the same, you're experiencing a parallelism deadlock. If they're different, it's just two different queries hitting an issue.
Deadlocks are all about the embrace. I have a resource you need and I want a resource you have. You have a resource I need and want a resource I have. Neither of us can let go. So, the engine picks a winner & a loser. The winner gets the resource it needs. The loser rolls back. This is very different from a traditional blocking situation where I have a resource and I'm making you wait on it.
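A minimal two-session sketch of that embrace (table and column names are hypothetical; the two sessions are separate connections):

```sql
-- Session 1:
BEGIN TRAN;
UPDATE dbo.TableA SET Col1 = 1 WHERE Id = 1;  -- X lock on a TableA row
-- ...meanwhile Session 2 runs:
--   BEGIN TRAN;
--   UPDATE dbo.TableB SET Col1 = 1 WHERE Id = 1;  -- X lock on a TableB row
UPDATE dbo.TableB SET Col1 = 2 WHERE Id = 1;  -- now waits on Session 2
-- Session 2 then tries:
--   UPDATE dbo.TableA SET Col1 = 2 WHERE Id = 1;  -- waits on Session 1: deadlock
-- The engine picks a victim, raises error 1205 to it, and rolls it back;
-- the survivor's UPDATE completes and it can COMMIT.
COMMIT;
```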
Here's an article on how to deal with deadlocks. It should help.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 19, 2019 at 7:14 pm
They are both different process IDs. However, I do see that the updates are coming from two different app servers: app server1 runs the victim query and app server2 runs the one which completed successfully. The reason we have two app servers is load balancing. Could that be related to this? How would MAXDOP help with a deadlock situation? Thanks in advance!
December 19, 2019 at 8:06 pm
This is not a parallelism issue. Don't go messing with MAXDOP.
No, you're hitting a classic deadlock problem. You need to follow the link to the article I provided. What's very likely going on is that you've got two pieces of code that access things in different orders: one goes after TableA, then TableB; the other goes after TableB, then TableA. Classic deadlock. The answer is to have everything access the tables in the same order.

Also, remember that deadlocks are primarily a performance problem. If the updates happen fast enough, even if you have multiple access paths (which you shouldn't have), you won't notice a problem because they'll clear. So another option always available for deadlocks is query tuning: make the darned things run faster. However, get all your code to go after TableA then TableB (or vice versa) - one path through the data. Then you may see blocking as the queries wait on one another, but you won't see deadlocks.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 19, 2019 at 9:27 pm
Thanks Grant, I really appreciate your help with this. Yes, it is a classic deadlock problem. Actually we have only one table and just one simple update. So do you think adding resources would be one option? I'm not sure there is any room for query tuning; it's a simple ad hoc update query coming from the application.
December 19, 2019 at 10:35 pm
The app may be handling the deadlock "gracefully" by retrying the failed process (try/catch) without the user even knowing, but ideally the app should be fixed to reduce the deadlocks. Using Try/Catch and not actually fixing the cause is just lazy, although there are occasionally places where fixing a deadlock is really difficult.
Now this is interesting. Are we to say it's safe to write apps that do not retry or recover after a deadlock? I know that oftentimes app developers will just ignore deadlocks or concurrency errors like doomed transactions, but I thought OMITTING the try/catch was the lazy path, not USING it.
Once the deadlock occurs, the transaction is bailed out of, and the app continues with other work - isn't this a loss-of-data situation?
I've been of the opinion to try/catch, because with query plans that get recompiled and changed, I thought it was a relatively tall order to predict every instance of deadlocks ahead of time. Am I wrong here?
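For context, the usual shape of that try/catch is a bounded retry that reacts only to error 1205 (the deadlock-victim error) and re-raises everything else - a sketch with hypothetical table and column names:

```sql
DECLARE @OrderId INT = 42, @retries INT = 0;
WHILE @retries < 3
BEGIN
    BEGIN TRY
        BEGIN TRAN;
        UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = @OrderId;
        COMMIT;
        BREAK;  -- success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK;
        IF ERROR_NUMBER() = 1205  -- chosen as deadlock victim
        BEGIN
            SET @retries += 1;
            WAITFOR DELAY '00:00:00.100';  -- brief back-off before retrying
        END
        ELSE
            THROW;  -- any other error is re-raised, not swallowed
    END CATCH
END
```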
December 19, 2019 at 10:43 pm
Does SQL Server take a page lock, and is that normal behavior if it is getting a large number of inserts/updates and the fill factor is set above 80%?
December 20, 2019 at 7:38 am
I don't have a problem with Try/Catch brackets, and I would always encourage developers to use them. I have a problem with this being the "solution". A Try/Catch is a workaround for where I absolutely can't find an effective/economical solution, or where there is a risk of a problem and, rather than have the app fail, we allow for it. The worst case I saw was a multi-page Internet Banking application that ALWAYS failed when the users applied their changes, and this was handled with a Try/Catch. It turned out page 1 data updates were deadlocking with page 3 data updates for every client, but the developers had "solved" it! Once I showed them where the execution sequences needed to be resolved, their error log suddenly dropped in size by orders of magnitude, and they could actually find and start working on real bugs.
The moral of the story: The right tool for the right job!
Leo
Nothing in life is ever so complicated that with a little work it can't be made more complicated.
December 20, 2019 at 8:00 am
Admingod, you have a number of questions and comments that haven't really been answered, and you may need to supply some more info to get some answers.
You mention finding the queries/updates - A single table and the Primary key. This sounds like one query is doing an update with the sequence: 1: Update the Data Pages 2: Update the Primary Key
Then there should be a 2nd query (probably not an update) with the sequence 1: PK Seek or Scan 2: Key Lookup (to fetch row data)
Probably the 2nd query is the Deadlock Victim. It sounds as if the PK is poorly designed, so an update is requiring a lot of locks. You would need to supply at least the key structure and some code for further analysis.
Then you say the update has an "update lock", but this seems odd. By default, updates take out exclusive locks to modify data. The update lock you seem to be referring to is usually used on SELECT statements (SELECT ... WITH (UPDLOCK) in T-SQL) so the data is locked by that thread and can't be modified by other users until it's released. This is actually a way of reducing deadlocks. Again, we need more information.
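For reference, the UPDLOCK hint described above looks like this in T-SQL - a sketch of the read-then-modify pattern with a hypothetical table:

```sql
BEGIN TRAN;
-- Take a U lock up front; it is held until commit, so no other session
-- can slip in between the read and the write and cause an embrace.
SELECT Balance
FROM dbo.Accounts WITH (UPDLOCK)
WHERE AccountId = 7;
-- ...application logic decides the new value...
UPDATE dbo.Accounts
SET Balance = Balance - 100
WHERE AccountId = 7;
COMMIT;
```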
SQL Server uses lock escalation, and yes, row or page locks can escalate all the way to a table lock. Default locking is per row, and escalation kicks in once a statement holds a large number of locks. This is to conserve memory, because every lock (regardless of what's locked) consumes a piece of RAM. If you are doing a lot of updates in a single batch on one table, and a lot of pages are affected, then yes, the whole table can end up being locked. You can suppress lock escalation with a trace flag, but I'd be careful how you use it.
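If escalation does turn out to be the problem, it can also be controlled per table from SQL Server 2008 onward, which is usually safer than an instance-wide trace flag - a sketch with a hypothetical table name:

```sql
-- Stop this table's row/page locks from escalating to a table lock.
ALTER TABLE dbo.BigTable SET (LOCK_ESCALATION = DISABLE);

-- Check the current setting:
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = N'BigTable';
```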
I recommend looking at what your app is doing, reducing batch sizes, and checking the PK design to see what actually happens during an update. For instance, is the PK, or part of the PK, possibly being modified by the update? That would almost certainly cause the sort of deadlock you are seeing.
Leo
Nothing in life is ever so complicated that with a little work it can't be made more complicated.