Deadlock Problem on a Heap Table

Question

Deadlock Problem on a Heap Table

Viewing 15 posts - 16 through 30 (of 44 total)

You must be logged in to reply to this topic. Login to reply

lmarkum SSCertifiable Points: 7989 More actions · Answer 1

Jeff Moden (12/30/2016)

lmarkum (12/30/2016)
like are the TrackingNumber values always going to be ever-increasing
You should be able to easily figure that out on your own because you have an IDENTITY column. Sort by the TrackingNumber and see if the values in the IDENTITY column are always increasing in value or if it has "out of place" values, which would indicate that the TrackingNumber is not inserted in an ever-increasing manner.
If the later is the case, the Clustered Index needs to be on the IDENTITY column to prevent page splits and you need a non-clustered index on the TrackingNumber. Of course, that won't fix the problem with deadlocks all by itself because you need to address the issue of row expansion causing page splits during an UPDATE by changing the datatypes and the other things I suggested.

I have confirmed that the TrackingNumber is not ever-increasing. However, I have already used the IDENTITY column as the clustered PK and that did nothing to stop the deadlocks. I don't see the page splits as contributing much to the issue because the page splits are minimal. When I query Ola's CommandLog table there have been only two reorganize commands executed on the Clustered PK of (TrackingNumber, ID) in the last 30 days and the fragmentation was just barely over 6% both times. What I need to do is something that will change the way the locks are taken.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 2

lmarkum (12/30/2016)

Jeff Moden (12/30/2016)
lmarkum (12/30/2016)
like are the TrackingNumber values always going to be ever-increasing
You should be able to easily figure that out on your own because you have an IDENTITY column. Sort by the TrackingNumber and see if the values in the IDENTITY column are always increasing in value or if it has "out of place" values, which would indicate that the TrackingNumber is not inserted in an ever-increasing manner.
If the later is the case, the Clustered Index needs to be on the IDENTITY column to prevent page splits and you need a non-clustered index on the TrackingNumber. Of course, that won't fix the problem with deadlocks all by itself because you need to address the issue of row expansion causing page splits during an UPDATE by changing the datatypes and the other things I suggested.
I have confirmed that the TrackingNumber is not ever-increasing. However, I have already used the IDENTITY column as the clustered PK and that did nothing to stop the deadlocks. I don't see the page splits as contributing much to the issue because the page splits are minimal. When I query Ola's CommandLog table there have been only two reorganize commands executed on the Clustered PK of (TrackingNumber, ID) in the last 30 days and the fragmentation was just barely over 6% both times. What I need to do is something that will change the way the locks are taken.

If the rows were only half the size (345 bytes) of their max possible size (690 bytes) as defined by your table, that would mean that you could fit 23 rows per page. That means that the 425,852 rows in the table would take up about 18,515 pages. 6% of that is about 1,110. And that since your bi-monthly defragmentation. That seems like a lot of possible splits to me.

Of course, since you've moved the CI to the IDENTITY column, the INSERTs will be virtually split-free now. It won't help the splits caused by UPDATEs if you used a 100% Fill Factor. If you're not going to change the column datatypes to all be fixed width (wouldn't require any index maintenance then because fragmentation means nothing to singleton updates like you're doing), I'd change the Fill Factor for the CI to be 80% as an initial test. Did you also put an NCI on the TrackingNumber? If so, did you happen to give it a Fill Factor of 90% or less since it's not ever increasing so that you avoid some of the page splitting there?

And considering that when it was a heap, you had almost 650 times the number of forwarded rows as there are rows in the table, I'm thinking that you did and still do have a page splitting problem during the updates for the reasons I previously stated. If you want to try to patch this temporarily, establish the Fill Factors I've already recommended.

Any chance of getting an Actual Execution Plan out of the procedure when it does the UPDATE portion?

And is there anything else that's updating this table other than this one stored procedure? It would be interesting to see the text behind the deadlock chart to find out what the SQL causing the deadlocks is.

Also, is the stored procedure you posted part of a larger explicit transaction?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 3

Jeff Moden (12/30/2016)
TheSQLGuru (12/30/2016)
Jeff: "Another potentially very serious problem with all of this is the idea that a lot of the rows will suffer a lot of updates. "
That actually isn't a problem on this system. The OP already verified that there are in fact no duplicates on TransactionNumber.
The datatypes used are the problem with the UPDATEs. Even updating one of those bloody VARCHAR(1) columns from NULL to some value will cause the given row to expand and expansion of rows will take a heavy toll in the form of page splits no matter what the CI is. Whether the TrackingNumber has duplicates, is unqiue, or whatever, is a comparatively minor part of the problem.

1) Should page splits be a problem (covered in a subsequent post as not really) that can be solved by a fill factor adjustment or by simply not worrying about it. Over time it will stabilize out and you will be fine. Just like you have been espousing for what, a year now?:-)

2) Page splits are probably not a part of the main need of the OP - namely help with a deadlocking issue.

3) I am virtually certain that the TrackingNumber not being declared as unique (and also being the clustered PK) is far more than "a comparatively minor part of the problem". To me it is THE root cause of the problem. Logically the silly string date conversion comparison mess (currently causing a range scan) will be moot (although the datatypes of the parameters should absolutely be fixed)

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Sergiy SSC Guru Points: 110208 More actions · Answer 4

lmarkum (12/23/2016)
I also dropped the non-clustered index IX_UPSTracking on just TrackingNumber since that seemed redundant to the new clustered PK and another existing index IX_UPSTracking on TrackingNumber, PackageActivityDateTime. I’m still seeing the blocking. Now the deadlocks are on the IX_UPSTracking index of TrackingNumber,PackageActivityDateTime.
Please see attached data on the deadlock itself, the indexes and the DDL for the table.
Any suggestions would be appreciated.

The attached deadlock diagram shows the deadlock on IX_UPSTracking.

You say you've dropped it. So, the diagram is irrelevant now.

Can you post the current state of the table and the deadlock diagram as it happens now?

_____________
Code for TallyGenerator

lmarkum SSCertifiable Points: 7989 More actions · Answer 5

Sergiy (12/30/2016)
lmarkum (12/23/2016)
I also dropped the non-clustered index IX_UPSTracking on just TrackingNumber since that seemed redundant to the new clustered PK and another existing index IX_UPSTracking on TrackingNumber, PackageActivityDateTime. I’m still seeing the blocking. Now the deadlocks are on the IX_UPSTracking index of TrackingNumber,PackageActivityDateTime.
Please see attached data on the deadlock itself, the indexes and the DDL for the table.
Any suggestions would be appreciated.
The attached deadlock diagram shows the deadlock on IX_UPSTracking.
You say you've dropped it. So, the diagram is irrelevant now.
Can you post the current state of the table and the deadlock diagram as it happens now?

It is the same deadlock pattern after the change. The deadlock just moved to the only remaining NCI of (Tracking number, PackageActivityDateTime).

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 6

lmarkum (12/30/2016)
Sergiy (12/30/2016)
lmarkum (12/23/2016)
I also dropped the non-clustered index IX_UPSTracking on just TrackingNumber since that seemed redundant to the new clustered PK and another existing index IX_UPSTracking on TrackingNumber, PackageActivityDateTime. I’m still seeing the blocking. Now the deadlocks are on the IX_UPSTracking index of TrackingNumber,PackageActivityDateTime.
Please see attached data on the deadlock itself, the indexes and the DDL for the table.
Any suggestions would be appreciated.
The attached deadlock diagram shows the deadlock on IX_UPSTracking.
You say you've dropped it. So, the diagram is irrelevant now.
Can you post the current state of the table and the deadlock diagram as it happens now?
It is the same deadlock pattern after the change. The deadlock just moved to the only remaining NCI of (Tracking number, PackageActivityDateTime).

If the current clustered index's leading column is TrackingNumber that NCI is completely useless (unless it is used by some other query as a very fast existence check, covered index or something). If you clustered index is still ID, TrackingNumber swap them back around and drop the NCI. See if that makes the deadlocks go away.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Sergiy SSC Guru Points: 110208 More actions · Answer 7

TheSQLGuru (12/30/2016)
lmarkum (12/30/2016)
Sergiy (12/30/2016)
lmarkum (12/23/2016)
I also dropped the non-clustered index IX_UPSTracking on just TrackingNumber since that seemed redundant to the new clustered PK and another existing index IX_UPSTracking on TrackingNumber, PackageActivityDateTime. I’m still seeing the blocking. Now the deadlocks are on the IX_UPSTracking index of TrackingNumber,PackageActivityDateTime.
Please see attached data on the deadlock itself, the indexes and the DDL for the table.
Any suggestions would be appreciated.
The attached deadlock diagram shows the deadlock on IX_UPSTracking.
You say you've dropped it. So, the diagram is irrelevant now.
Can you post the current state of the table and the deadlock diagram as it happens now?
It is the same deadlock pattern after the change. The deadlock just moved to the only remaining NCI of (Tracking number, PackageActivityDateTime).
If the current clustered index's leading column is TrackingNumber that NCI is completely useless (unless it is used by some other query as a very fast existence check, covered index or something). If you clustered index is still ID, TrackingNumber swap them back around and drop the NCI. See if that makes the deadlocks go away.

The number of "if"s used by Kevin indicates a lot of uncertainty.

Which proves my point - please post the current state of the table (DDL) including all indexes and constraints.

Don't forget to add a fresh deadlock diagram.

_____________
Code for TallyGenerator

lmarkum SSCertifiable Points: 7989 More actions · Answer 8

TheSQLGuru (12/30/2016)
lmarkum (12/30/2016)
Sergiy (12/30/2016)
lmarkum (12/23/2016)
I also dropped the non-clustered index IX_UPSTracking on just TrackingNumber since that seemed redundant to the new clustered PK and another existing index IX_UPSTracking on TrackingNumber, PackageActivityDateTime. I’m still seeing the blocking. Now the deadlocks are on the IX_UPSTracking index of TrackingNumber,PackageActivityDateTime.
Please see attached data on the deadlock itself, the indexes and the DDL for the table.
Any suggestions would be appreciated.
The attached deadlock diagram shows the deadlock on IX_UPSTracking.
You say you've dropped it. So, the diagram is irrelevant now.
Can you post the current state of the table and the deadlock diagram as it happens now?
It is the same deadlock pattern after the change. The deadlock just moved to the only remaining NCI of (Tracking number, PackageActivityDateTime).
If the current clustered index's leading column is TrackingNumber that NCI is completely useless (unless it is used by some other query as a very fast existence check, covered index or something). If you clustered index is still ID, TrackingNumber swap them back around and drop the NCI. See if that makes the deadlocks go away.

Kevin. Thanks for staying with this post and following the path I'm on. Much appreciated.

The results from an index usage tool shows me the NCI is being extensively used. Way more reads than writes. However, I need to solve the deadlock issue, so I may drop the TrackingNumber column from the NCI and just leave PackageActivityDateTime since that is a column that is commonly searched.

Anyway, I can figure out covering indexes after the deadlock problem is solved.

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 9

TheSQLGuru (12/30/2016)
Jeff Moden (12/30/2016)
TheSQLGuru (12/30/2016)
Jeff: "Another potentially very serious problem with all of this is the idea that a lot of the rows will suffer a lot of updates. "
That actually isn't a problem on this system. The OP already verified that there are in fact no duplicates on TransactionNumber.
The datatypes used are the problem with the UPDATEs. Even updating one of those bloody VARCHAR(1) columns from NULL to some value will cause the given row to expand and expansion of rows will take a heavy toll in the form of page splits no matter what the CI is. Whether the TrackingNumber has duplicates, is unqiue, or whatever, is a comparatively minor part of the problem.
1) Should page splits be a problem (covered in a subsequent post as not really) that can be solved by a fill factor adjustment or by simply not worrying about it. Over time it will stabilize out and you will be fine. Just like you have been espousing for what, a year now?:-)
2) Page splits are probably not a part of the main need of the OP - namely help with a deadlocking issue.
3) I am virtually certain that the TrackingNumber not being declared as unique (and also being the clustered PK) is far more than "a comparatively minor part of the problem". To me it is THE root cause of the problem. Logically the silly string date conversion comparison mess (currently causing a range scan) will be moot (although the datatypes of the parameters should absolutely be fixed)

With regards to #3, I made that comment with the understanding that the CI had been moved to the IDENTITY column. I absolutely agree that a UNIQUE NCI should be assigned to the TrackingNumber column.

On the other two, I'm convinced that the horrible table design is causing a large part of the problem for deadlocks due to both page splits and "on page" data movement caused by the constant expansion of individual rows due to the horrible table design.

[font="Arial Black"]But... [/font]let's assume that you're correct and all of that plays little or no part in the deadlock problem. The OP had moved the CI to the IDENTITY column and, though it wasn't unique, had also placed an NCI on the TrackingNumber column (or at least I think that's what happened... too many changes have been made to quickly without the correct analysis... the op is even now suggesting to put an NCI on the bloody date column for an inequality comparison instead of the equality comparison on TrackingNumber). It still didn't help at all with the deadlocks and the deadlocks moved with the indexes.

That suggests that something else is wrong and that's why I want to see the text from the deadlock graph, which will identify the code that's taken part in the deadlocks. We still haven't seen what the code from the text of the deadlock graph is. We haven't even established that the given stored procedure is the only thing doing updates. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

lmarkum SSCertifiable Points: 7989 More actions · Answer 10

Jeff Moden (12/30/2016)
TheSQLGuru (12/30/2016)
Jeff Moden (12/30/2016)
TheSQLGuru (12/30/2016)
Jeff: "Another potentially very serious problem with all of this is the idea that a lot of the rows will suffer a lot of updates. "
That actually isn't a problem on this system. The OP already verified that there are in fact no duplicates on TransactionNumber.
The datatypes used are the problem with the UPDATEs. Even updating one of those bloody VARCHAR(1) columns from NULL to some value will cause the given row to expand and expansion of rows will take a heavy toll in the form of page splits no matter what the CI is. Whether the TrackingNumber has duplicates, is unqiue, or whatever, is a comparatively minor part of the problem.
1) Should page splits be a problem (covered in a subsequent post as not really) that can be solved by a fill factor adjustment or by simply not worrying about it. Over time it will stabilize out and you will be fine. Just like you have been espousing for what, a year now?:-)
2) Page splits are probably not a part of the main need of the OP - namely help with a deadlocking issue.
3) I am virtually certain that the TrackingNumber not being declared as unique (and also being the clustered PK) is far more than "a comparatively minor part of the problem". To me it is THE root cause of the problem. Logically the silly string date conversion comparison mess (currently causing a range scan) will be moot (although the datatypes of the parameters should absolutely be fixed)
With regards to #3, I made that comment with the understanding that the CI had been moved to the IDENTITY column. I absolutely agree that a UNIQUE NCI should be assigned to the TrackingNumber column.
On the other two, I'm convinced that the horrible table design is causing a large part of the problem for deadlocks due to both page splits and "on page" data movement caused by the constant expansion of individual rows due to the horrible table design.
[font="Arial Black"]But... [/font]let's assume that you're correct and all of that plays little or no part in the deadlock problem. The OP had moved the CI to the IDENTITY column and, though it wasn't unique, had also placed an NCI on the TrackingNumber column (or at least I think that's what happened... too many changes have been made to quickly without the correct analysis... the op is even now suggesting to put an NCI on the bloody date column for an inequality comparison instead of the equality comparison on TrackingNumber). It still didn't help at all with the deadlocks and the deadlocks moved with the indexes.
That suggests that something else is wrong and that's why I want to see the text from the deadlock graph, which will identify the code that's taken part in the deadlocks. We still haven't seen what the code from the text of the deadlock graph is. We haven't even established that the given stored procedure is the only thing doing updates. 😉

Jeff, actually, you should have seen the text that is involved in the deadlock. It was one of the things I attached to the post early on. There is also a deadlock graph attached to the post. I also used a third party object search tool and determined that this stored procedure is the only thing modifying the UPSTracking table.

I have stated a couple of times in the post what changes I have made and Kevin is tracking with that info just fine.

You have misunderstood if you thought that the current CI was onlying on the ID column that is defined as an IDENTITY. I will reiterate that the current CI is also the PK and is defined as (Tracking number, ID). There already was a NCI on (TrackingNumber, PackageActivityDate). That NCI was the most recent source of the deadlocks. I actually disabled that NCI at 8:10 PM central. Since then there was one deadlock on the Clustered PK. Subsequent activity has not resulted in a deadlock.

The number of seeks on the current Clustered PK is going up rapidly, while scans are going up very, very slowly. Also, there are now only row locks taking place and a tiny amount of them for very short durations as compared row and page locks which were quite long.

Sergiy SSC Guru Points: 110208 More actions · Answer 11

lmarkum (12/30/2016)
However, I need to solve the deadlock issue,

Then you need to fix the code.

Look what people told you:

Jeffrey Williams 3188 (12/29/2016)
You need to fix this issue:
CONVERT(datetime,PackageActivityDateTime, 120) < CONVERT(datetime,@PackageActivityDateTime, 120)
The column should already be a datetime data type so it should not need to be converted.

Yes, the column is a datetime data type indeed:

[PackageActivityDateTime] [datetime] NULL,

, so converting it to datetime is plain stupid.

Change it to this:

PackageActivityDateTime<CONVERT(datetime,@PackageActivityDateTime, 120)

After that you drop every index and PK/unique constraint you have on the table and create clustered primary key on (TrackingNumber,PackageActivityDateTime)

But you must have the code fixed first.

_____________
Code for TallyGenerator

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 12

What you included in the original post was a picture of the deadlock graph, not the text, which would have shown the actual code involved.

But, from what you say, it's all a bit of a moot point now. According to what you've stated, there have been no more deadlocks since your last change. Glad you found a fix. Unlike Kevin, I lost track of the index changes.

Unfortunately, the fix probably means the structure of the table won't be fixed.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

lmarkum SSCertifiable Points: 7989 More actions · Answer 13

Jeff Moden (12/30/2016)
What you included in the original post was a picture of the deadlock graph, not the text, which would have shown the actual code involved.
But, from what you say, it's all a bit of a moot point now. According to what you've stated, there have been no more deadlocks since your last change, correct?

You're correct is was a picture of the deadlock graph and not the text. I cleaned up the picture a bit to try to hide some of the info. What I can export from SQL Diagnostic Manager is what translates into the deadlock graph in SSMS.

And yes, you understand correctly. As of now I had one deadlock on the current Clustered PK and after that things have been fine. That was just over 3 hours ago. I will check in the morning to see if any other deadlocks have occurred overnight.

I agree that the table structure needs to be addresseducated and will do that. Several hours ago I started that conversation with the Devs via email and we can discuss it when we're all back in the office after the holiday. I do appreciate your input and observations about the design problems.

lmarkum SSCertifiable Points: 7989 More actions · Answer 14

Sergiy (12/30/2016)
lmarkum (12/30/2016)
However, I need to solve the deadlock issue,
Then you need to fix the code.
Look what people told you:
Jeffrey Williams 3188 (12/29/2016)
You need to fix this issue:
CONVERT(datetime,PackageActivityDateTime, 120) < CONVERT(datetime,@PackageActivityDateTime, 120)
The column should already be a datetime data type so it should not need to be converted.
Yes, the column is a datetime data type indeed:
[PackageActivityDateTime] [datetime] NULL,
, so converting it to datetime is plain stupid.
Change it to this:
PackageActivityDateTime<CONVERT(datetime,@PackageActivityDateTime, 120)
After that you drop every index and PK/unique constraint you have on the table and create clustered primary key on (TrackingNumber,PackageActivityDateTime)
But you must have the code fixed first.

No. I don't have to fix the code first. The code has nothing to do with why the deadlock is happening. Read the post from Jonathan K. that I listed when I posted the question.

Why would I use (Tracking number, PackageActivityDateTime) as a Clustered PK? I can'take even guarantee that would be unique.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 15

lmarkum (12/30/2016)
TheSQLGuru (12/30/2016)
lmarkum (12/30/2016)
Sergiy (12/30/2016)
lmarkum (12/23/2016)
I also dropped the non-clustered index IX_UPSTracking on just TrackingNumber since that seemed redundant to the new clustered PK and another existing index IX_UPSTracking on TrackingNumber, PackageActivityDateTime. I’m still seeing the blocking. Now the deadlocks are on the IX_UPSTracking index of TrackingNumber,PackageActivityDateTime.
Please see attached data on the deadlock itself, the indexes and the DDL for the table.
Any suggestions would be appreciated.
The attached deadlock diagram shows the deadlock on IX_UPSTracking.
You say you've dropped it. So, the diagram is irrelevant now.
Can you post the current state of the table and the deadlock diagram as it happens now?
It is the same deadlock pattern after the change. The deadlock just moved to the only remaining NCI of (Tracking number, PackageActivityDateTime).
If the current clustered index's leading column is TrackingNumber that NCI is completely useless (unless it is used by some other query as a very fast existence check, covered index or something). If you clustered index is still ID, TrackingNumber swap them back around and drop the NCI. See if that makes the deadlocks go away.
Kevin. Thanks for staying with this post and following the path I'm on. Much appreciated.
The results from an index usage tool shows me the NCI is being extensively used. Way more reads than writes. However, I need to solve the deadlock issue, so I may drop the TrackingNumber column from the NCI and just leave PackageActivityDateTime since that is a column that is commonly searched.
Anyway, I can figure out covering indexes after the deadlock problem is solved.

Of course the TrackingNumber/PackageActivityDateTime index is being used extensively. From my recollection of the last state of the index you had ID, TN as the clustered PK. That would certainly force the use of the NCI to find a TN. And if that isn't the case then stating that the PADT is used extensively for searching means you would be SCANNING that NCI to find PADT values that match your predicate. The first use goes away when you switch the CPK to be just TN. The second switches to a SEEK as soon as you switch the NCI to being just that column.

BTW, the likely SCANs of the TN/PADT NCI could be a signficant contributor to the deadlock issue. Get those to seeks and that could be a big help (both performance and concurrency).

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service