tracing deadlock issue

  • I have a small app used internally. From time to time it reads records from a table and then updates/inserts records in other tables in the same database. It also marks the records in the source table to show that they have already been processed. I can point the app either at this table or at a flat file. When I use the table I get constant deadlocks - almost 100 per minute. They are not on the table the records come from; the deadlocks are on the tables that get updated as a result of the records in the initial table. When reading from the file I get no deadlocks at all.

    I made several traces to find out why I get deadlocks when using the table, but they do not show anything wrong.

    Then I tried copying the initial table to a different database and, surprisingly, that cut the deadlocks by more than 80%.

    I need help in troubleshooting this issue. Any idea is appreciated.

    Thanks, mj

  • Is your app OLTP or OLAP?

    How does the app access the data - through stored procedures/views or plain T-SQL?

     

    ------------
    Prakash Sawant

    http://psawant.blogspot.com

  • I'm sort of confused by your claim of 100 deadlocks/min -- for a lightly used database that's very high. How are you measuring your deadlocks?

    To get information on deadlocks, enable trace flags 1204 and 1205. Note that you'll get a large error log, but you will be able to see exactly what is causing them.
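    For example, a minimal sketch, assuming sysadmin rights on a SQL Server 2000/2005 instance (on 2005, flag 1222 is the newer alternative with more detail):

    ```sql
    -- Turn the deadlock trace flags on server-wide (-1) so every deadlock
    -- is written to the SQL Server error log.
    DBCC TRACEON (1204, -1);  -- deadlock chains: resources and lock modes
    DBCC TRACEON (1205, -1);  -- extra detail on the deadlock search
    -- Verify which flags are active
    DBCC TRACESTATUS (-1);
    ```

    The flags do not survive a restart; add them as -T startup parameters if you need them permanently.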

     

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • My app is heavy OLTP. It's a C++ application with a RogueWave database layer. All it does is read the files provided in a certain format and then decide whether to insert new data or update existing data in order to prevent duplication. This is not just a simple load process - it's very complicated logic that manipulates records according to specific custom rules.

    I used the flags you are talking about and I know exactly which statement causes the deadlocks, but it has nothing to do with the load table. The deadlocks occur on one of the other tables that gets updated as a result of the records selected from the load table.

    I guess what I'm trying to figure out is: Why, if the app reads a file to get the data, are there no deadlocks at all? Why, if it uses the load table in the same database, do I get a lot of deadlocks? And why, if I move the table to a different database, do I get far fewer deadlocks?

    This is the same application running in all 3 cases - just the source of the data is different.

    The users require the same performance for the two data sources (file and table), and I'm wondering what I can do to troubleshoot this situation.

    Even to raise this issue with the development team for this application, I'll need to have some explanation...

    The major question for me at this point is: How can I find out why the deadlocks on the updated table are affected by the use of the loading table versus the flat file?

    Thanks a lot for the help,

    mj

  • Ah well .. two ways to decrease deadlocks ..

    make reads dirty

    increase spindles

    The basis for reducing deadlocks, assuming the code isn't awful, is to speed up transactions. If the disk subsystem runs slow (or is RAID 5 with lots of writes), then transactions/disk access take longer to complete, which increases the chance of a deadlock. Make sure your transaction log is on RAID 1, as any slowness in completing writes to the t-log will also slow the whole process and increase the chance of deadlocks.
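    To illustrate the first option ("make reads dirty") - a sketch only; the table and column names are invented, and dirty reads can return uncommitted or inconsistent rows, so this only suits reads that can tolerate that:

    ```sql
    -- Option A: dirty reads for the whole session
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT col1, col2 FROM dbo.SomeTable WHERE processed = 0;

    -- Option B: per-table hint, equivalent to READ UNCOMMITTED for that table only
    SELECT col1, col2 FROM dbo.SomeTable WITH (NOLOCK) WHERE processed = 0;
    ```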

    From your description I'd say latency in the disk subsystem is your problem. It only takes a couple of milliseconds to make a difference. Track the disk I/O completion time for reads and writes on your drives (physical and logical); if these rise during your problems, then that's your issue. It's difficult to give exact figures for I/O completion time (it depends on the disks), but lower is better and I'd be looking at under 6 ms ideally.
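    On SQL Server 2005 the per-file stall times can also be read from a DMV, alongside the PerfMon counters (Avg. Disk sec/Read, Avg. Disk sec/Write); a sketch:

    ```sql
    -- Average I/O completion time per database file since instance start.
    -- The numbers are cumulative, so sample twice and diff for a live picture.
    SELECT  DB_NAME(vfs.database_id)                             AS database_name,
            vfs.file_id,
            vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
            vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
    FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    ORDER BY avg_write_ms DESC;
    ```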

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • "Assuming the code isn't awful" ... well, I do not know about that...

    Actually the code is not bad - everything is in transactions, small and fast, retrieving only the data needed from well-normalized tables… But the way the processes are run is badly designed.

    The way I see the problem, there are several instances of the same application running at the same time, and this is how I get the deadlocks. Instead of the application running on multiple threads, we use multiple instances of the same app.

    The problem is that all these instances run against the same data tables, and very often I can see them going after the same data block. On top of this, all running application instances select and insert/update the same data tables and their indexes… This is the logic I'm not able to change (3rd-party product), but we are waiting for the new release, which will be multithreaded… Then it looks to me like I'm going to have even more problems…

    Are the deadlocks in SQL Server reported by PID or by connection?

    And, my question again is – why, if the table is in a different database, are the deadlocks so rare?
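    For what it's worth, the 1204 trace output reports deadlock participants by SPID (the server-side session), not by client PID; on SQL Server 2000 you can map a SPID back to a client process via sysprocesses (a sketch):

    ```sql
    -- Map server SPIDs to the client host, program and client-side process id
    SELECT spid, hostname, program_name, hostprocess, loginame
    FROM   master.dbo.sysprocesses
    WHERE  spid > 50;  -- skip system sessions
    ```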

  • Have you tried setting the trace flags per Colin's 6/7/2006 post? That will tell you exactly what is deadlocking.

    As for why the rate of deadlocking changes, a contributor besides disk bandwidth is the locality of the records being updated, i.e. if a lock gets escalated, what other updates are blocked. Without detailed information on the indexes and transactions, it is not really possible to answer definitively. Your best bet is to set the flags and read some good articles on interpreting deadlock traces.
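    One way to watch that locality effect while the instances are running is to look at the locks they actually hold (a sketch; this DMV is SQL Server 2005 - on 2000 use sp_lock for the same picture):

    ```sql
    -- What locks are held or waited on, and at what granularity.
    -- A session's rows jumping from KEY/PAGE to OBJECT level suggests escalation.
    SELECT  request_session_id,
            resource_type,          -- KEY, PAGE, OBJECT, ...
            resource_database_id,
            request_mode,           -- S, U, X, IX, ...
            request_status          -- GRANT or WAIT
    FROM    sys.dm_tran_locks
    ORDER BY request_session_id;
    ```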


    Have Fun!
    Ronzo

  • Yes, I did the trace and I know exactly where the deadlocks appear, but I cannot explain why, as the application behaves the same. The trace showed me that the deadlock is not connected to the table from which the application pulls the records (the loading table). The logic of the processes is the same no matter whether they pull the data from the table (worst deadlocks), from the file (normal performance), or from the same loading table in another database (far fewer deadlocks).

    Here is the logic:

    There are 5 processes running at the same time. Each of them mainly goes the same way. Very rarely the logic will change depending on the data processed, but for this experiment I set the data to be the same, so no exception will occur.

    Each process goes like this:

    It selects 5 records from the loading table and updates them in the loading table to mark them as processed. Then it parses them and throws them onto the BUS to be pulled by the second process, which checks one table, called account, for existing information. If the information is already there, it updates the records in this table and finishes the processing. That simple! If it is new information, then a lot of work begins… but I do not even get to that point.

    The problem piece is the updating of this account table. This is where all the deadlocks happen.
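    If the development team can be persuaded to touch the claim step, the usual pattern for letting several readers take work from one loading table without colliding is to claim the rows atomically with UPDLOCK/READPAST hints - a sketch only; the table and column names are invented:

    ```sql
    -- Each instance claims its own batch of 5 unprocessed rows in one short
    -- transaction. READPAST skips rows another instance has already locked,
    -- so the five processes never fight over the same block of rows.
    BEGIN TRAN;

    SET ROWCOUNT 5;              -- SQL 2000 style; on 2005 use UPDATE TOP (5)
    UPDATE dbo.LoadingTable WITH (UPDLOCK, READPAST)
    SET    processed = 1
    WHERE  processed = 0;
    SET ROWCOUNT 0;

    COMMIT TRAN;
    ```

    On 2005 an OUTPUT clause can return the claimed keys; on 2000 you would stamp a per-instance batch id instead of a plain flag, so each instance can select its own rows back.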

    So, my question is: The application is the same; the records are exactly the same.

    When the app selects the records from the file – no deadlocks.

    When it selects them from the table in the same database – a huge number of deadlocks;

    When it selects them from the same table moved to another database on the same data files/server – far fewer deadlocks.

    I checked the application code – the part that selects/updates the load table is very simple, and the trace showed there are no problems there.

    I cannot explain why that is… Even if I have to go to development and ask them to look at it, I need to have some explanation – the way it is right now, they say it's a SQL Server problem, not the application…
