Deadlock Problem on a Heap Table

  • Jeff Moden (12/30/2016)


    TheSQLGuru (12/30/2016)


    Jeff Moden (12/30/2016)


    TheSQLGuru (12/30/2016)


    Jeff: "Another potentially very serious problem with all of this is the idea that a lot of the rows will suffer a lot of updates. "

    That actually isn't a problem on this system. The OP already verified that there are in fact no duplicates on TransactionNumber.

    The datatypes used are the problem with the UPDATEs. Even updating one of those bloody VARCHAR(1) columns from NULL to some value will cause the given row to expand and expansion of rows will take a heavy toll in the form of page splits no matter what the CI is. Whether the TrackingNumber has duplicates, is unqiue, or whatever, is a comparatively minor part of the problem.

    1) Should page splits be a problem (covered in a subsequent post as not really) that can be solved by a fill factor adjustment or by simply not worrying about it. Over time it will stabilize out and you will be fine. Just like you have been espousing for what, a year now?:-)

    2) Page splits are probably not a part of the main need of the OP - namely help with a deadlocking issue.

    3) I am virtually certain that the TrackingNumber not being declared as unique (and also being the clustered PK) is far more than "a comparatively minor part of the problem". To me it is THE root cause of the problem. Logically the silly string date conversion comparison mess (currently causing a range scan) will be moot (although the datatypes of the parameters should absolutely be fixed)

    With regards to #3, I made that comment with the understanding that the CI had been moved to the IDENTITY column. I absolutely agree that a UNIQUE NCI should be assigned to the TrackingNumber column.

    On the other two, I'm convinced that the horrible table design is causing a large part of the problem for deadlocks due to both page splits and "on page" data movement caused by the constant expansion of individual rows due to the horrible table design.

    [font="Arial Black"]But... [/font]let's assume that you're correct and all of that plays little or no part in the deadlock problem. The OP had moved the CI to the IDENTITY column and, though it wasn't unique, had also placed an NCI on the TrackingNumber column (or at least I think that's what happened... too many changes have been made to quickly without the correct analysis... the op is even now suggesting to put an NCI on the bloody date column for an inequality comparison instead of the equality comparison on TrackingNumber). It still didn't help at all with the deadlocks and the deadlocks moved with the indexes.

    That suggests that something else is wrong and that's why I want to see the text from the deadlock graph, which will identify the code that's taken part in the deadlocks. We still haven't seen what the code from the text of the deadlock graph is. We haven't even established that the given stored procedure is the only thing doing updates. 😉

    The key point you are missing is that the TN NCI he placed was NOT UNIQUE and thus always required a range scan to resolve the date predicate (as well as a key lookup to the base table) to resolve the query.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/31/2016)


    Jeff Moden (12/30/2016)


    TheSQLGuru (12/30/2016)


    Jeff Moden (12/30/2016)


    TheSQLGuru (12/30/2016)


    Jeff: "Another potentially very serious problem with all of this is the idea that a lot of the rows will suffer a lot of updates. "

    That actually isn't a problem on this system. The OP already verified that there are in fact no duplicates on TransactionNumber.

    The datatypes used are the problem with the UPDATEs. Even updating one of those bloody VARCHAR(1) columns from NULL to some value will cause the given row to expand and expansion of rows will take a heavy toll in the form of page splits no matter what the CI is. Whether the TrackingNumber has duplicates, is unqiue, or whatever, is a comparatively minor part of the problem.

    1) Should page splits be a problem (covered in a subsequent post as not really) that can be solved by a fill factor adjustment or by simply not worrying about it. Over time it will stabilize out and you will be fine. Just like you have been espousing for what, a year now?:-)

    2) Page splits are probably not a part of the main need of the OP - namely help with a deadlocking issue.

    3) I am virtually certain that the TrackingNumber not being declared as unique (and also being the clustered PK) is far more than "a comparatively minor part of the problem". To me it is THE root cause of the problem. Logically the silly string date conversion comparison mess (currently causing a range scan) will be moot (although the datatypes of the parameters should absolutely be fixed)

    With regards to #3, I made that comment with the understanding that the CI had been moved to the IDENTITY column. I absolutely agree that a UNIQUE NCI should be assigned to the TrackingNumber column.

    On the other two, I'm convinced that the horrible table design is causing a large part of the problem for deadlocks due to both page splits and "on page" data movement caused by the constant expansion of individual rows due to the horrible table design.

    [font="Arial Black"]But... [/font]let's assume that you're correct and all of that plays little or no part in the deadlock problem. The OP had moved the CI to the IDENTITY column and, though it wasn't unique, had also placed an NCI on the TrackingNumber column (or at least I think that's what happened... too many changes have been made to quickly without the correct analysis... the op is even now suggesting to put an NCI on the bloody date column for an inequality comparison instead of the equality comparison on TrackingNumber). It still didn't help at all with the deadlocks and the deadlocks moved with the indexes.

    That suggests that something else is wrong and that's why I want to see the text from the deadlock graph, which will identify the code that's taken part in the deadlocks. We still haven't seen what the code from the text of the deadlock graph is. We haven't even established that the given stored procedure is the only thing doing updates. 😉

    The key point you are missing is that the TN NCI he placed was NOT UNIQUE and thus always required a range scan to resolve the date predicate (as well as a key lookup to the base table) to resolve the query.

    No more deadlocks appearing in SQL Diagnostic Manager since that first deadlock after the last change last night. That deadlock happened around 8 PM Central. So, what I have now is just a Clustered PK on (TrackingNumber, ID). In the estimated execution plan I just have seeks on that index to get the tracking number. I will watch this over the next few days and see what happens.

  • lmarkum (12/31/2016)


    TheSQLGuru (12/31/2016)


    Jeff Moden (12/30/2016)


    TheSQLGuru (12/30/2016)


    Jeff Moden (12/30/2016)


    TheSQLGuru (12/30/2016)


    Jeff: "Another potentially very serious problem with all of this is the idea that a lot of the rows will suffer a lot of updates. "

    That actually isn't a problem on this system. The OP already verified that there are in fact no duplicates on TransactionNumber.

    The datatypes used are the problem with the UPDATEs. Even updating one of those bloody VARCHAR(1) columns from NULL to some value will cause the given row to expand and expansion of rows will take a heavy toll in the form of page splits no matter what the CI is. Whether the TrackingNumber has duplicates, is unqiue, or whatever, is a comparatively minor part of the problem.

    1) Should page splits be a problem (covered in a subsequent post as not really) that can be solved by a fill factor adjustment or by simply not worrying about it. Over time it will stabilize out and you will be fine. Just like you have been espousing for what, a year now?:-)

    2) Page splits are probably not a part of the main need of the OP - namely help with a deadlocking issue.

    3) I am virtually certain that the TrackingNumber not being declared as unique (and also being the clustered PK) is far more than "a comparatively minor part of the problem". To me it is THE root cause of the problem. Logically the silly string date conversion comparison mess (currently causing a range scan) will be moot (although the datatypes of the parameters should absolutely be fixed)

    With regards to #3, I made that comment with the understanding that the CI had been moved to the IDENTITY column. I absolutely agree that a UNIQUE NCI should be assigned to the TrackingNumber column.

    On the other two, I'm convinced that the horrible table design is causing a large part of the problem for deadlocks due to both page splits and "on page" data movement caused by the constant expansion of individual rows due to the horrible table design.

    [font="Arial Black"]But... [/font]let's assume that you're correct and all of that plays little or no part in the deadlock problem. The OP had moved the CI to the IDENTITY column and, though it wasn't unique, had also placed an NCI on the TrackingNumber column (or at least I think that's what happened... too many changes have been made to quickly without the correct analysis... the op is even now suggesting to put an NCI on the bloody date column for an inequality comparison instead of the equality comparison on TrackingNumber). It still didn't help at all with the deadlocks and the deadlocks moved with the indexes.

    That suggests that something else is wrong and that's why I want to see the text from the deadlock graph, which will identify the code that's taken part in the deadlocks. We still haven't seen what the code from the text of the deadlock graph is. We haven't even established that the given stored procedure is the only thing doing updates. 😉

    The key point you are missing is that the TN NCI he placed was NOT UNIQUE and thus always required a range scan to resolve the date predicate (as well as a key lookup to the base table) to resolve the query.

    No more deadlocks appearing in SQL Diagnostic Manager since that first deadlock after the last change last night. That deadlock happened around 8 PM Central. So, what I have now is just a Clustered PK on (TrackingNumber, ID). In the estimated execution plan I just have seeks on that index to get the tracking number. I will watch this over the next few days and see what happens.

    Understood but isn't the TrackingNumber supposed to be UNIQUE? That would be the data quality enforcement side of the index.

    Also, when you do your bi-monthly reorganize, watch for a possible rise in blocking (not necessarily deadlocking) if the CI has a 0 or 100% Fill Factor. For the expansion in size of updated rows and the non-sequential inserts of new rows, you're still going to see page splits until the splits afford the affected pages with enough room to grow the rows or insert them in the "middle" of the table.

    @kevin,

    If the NCI had a leading column of the TrackingNumber and there's only one of each tracking number in the table, how do you figure that would result in a RangeScan of more than the one row?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff:

    "If the NCI had a leading column of the TrackingNumber and there's only one of each tracking number in the table, how do you figure that would result in a RangeScan of more than the one row?"

    Because the index was not declared as unique and there was another filter on the query.

    I am thinking more and more that the secondary set of queries that hit the date column really could be an issue with the deadlocks. Having to do a scan because that NCI the OP had with the date as the second column.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/31/2016)


    Jeff:

    "If the NCI had a leading column of the TrackingNumber and there's only one of each tracking number in the table, how do you figure that would result in a RangeScan of more than the one row?"

    Because the index was not declared as unique and there was another filter on the query.

    That would not be a problem if not the code.

    Every index has statistics on it, and statistics could suggest optimiser that the date selection is ultimately selective.

    But because of the function applied to the datetime column in the query the statistics were not applicable.

    If the OP would fix the code straight away there would not be a need to play with indexes.

    _____________
    Code for TallyGenerator

  • Sergiy (1/1/2017)


    Every index has statistics on it, and statistics could suggest optimiser that the date selection is ultimately selective.

    I'm sorry Sergiy, but that is absolutely not the case. The optimizer CANNOT be allowed to presume based on statistics that there will only be one row for a given predicate unless that given predicate is declaratively unique on the involved table. There are several reasons for this but the most obvious is that statistics are not maintained to be transactionally consistent so any additional row added that the statistics aren't refreshed to accurately reflect breaks any assumption on uniqueness you would have the optimizer base it's decisions on.

    Even without a function around a column the existing predicate on the UPDATE will require creating a query plan that evaluates all matching rows to apply the date predicate. Without that bad data could result.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Overnight I do still see some deadlocks happening on the INSERT to UPSTRacking and on the IF NOT EXISTS(SELECT * FROM UPSTracking WHERE TrackingNumber = @TrackingNumber).

    I am working on writing up the changes that have been suggested for the proc and will attach that in a bit. I will also upload the latest DDL for the table.

  • TheSQLGuru (1/1/2017)


    Sergiy (1/1/2017)


    Every index has statistics on it, and statistics could suggest optimiser that the date selection is ultimately selective.

    I'm sorry Sergiy, but that is absolutely not the case. The optimizer CANNOT be allowed to presume based on statistics that there will only be one row for a given predicate unless that given predicate is declaratively unique on the involved table. There are several reasons for this but the most obvious is that statistics are not maintained to be transactionally consistent so any additional row added that the statistics aren't refreshed to accurately reflect breaks any assumption on uniqueness you would have the optimizer base it's decisions on.

    Even without a function around a column the existing predicate on the UPDATE will require creating a query plan that evaluates all matching rows to apply the date predicate. Without that bad data could result.

    Kevin, I'm gonna say only one thing.

    Parameter sniffing.

    For an SQL guru (:-)) it must be enough to prove my case.

    _____________
    Code for TallyGenerator

  • I had a really awesome reply almost finished last night, with sample scripts and data and plan analysis. I couldn't finish it before going to bed though and when I woke up this morning SSMS had hung (I forgot it has been doing that on resuming from sleep lately). And naturally file auto-recover didn't work either (horribly implemented in SSMS). So I lost it all. 🙁

    Anyway, the really short answer now is that I will just be more specific in my initial statement to say that the optimizer cannot assume NO MORE THAN ONE row will come out of a predicate UNLESS there is some form of uniqueness defined that will GUARANTEE that. So a column and any other column as the PK is very different from a single-column PK as far as optimization goes when you have as predicate on that one column. You can see this in the query plans generated for the two different scenarios (which I had done dang it, but don't have time to redo today).

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I have attached the proposed changes to the table and to the InsertUpdate proc. Now that the Devs are in, I am hoping to confirm today that the TrackingNumber will always be unique. I have learned that these numbers could potentially come from different shipping sources, so it might be possible to get a duplicate that way. If always unique then I will take the additional step of making that the clustered PK. I think that final step may ultimately resolve this because it would guarantee that only one row would meet the TrackingNumber criteria. However, I still wonder about the use of the RangeS-S and RangeI-N locks. In other words, is the serializable isolation level contributing to the issue? In the article I cited in the initial post, Jonathan indicated that the person had tried changing the isolation level but that did not resolve the issue.

    All current and proposed indexes are in the DDL text file. There are no foreign keys.

  • 1) You have extra BEGIN TRAN's in the sproc. You should just have one at the very top and the very bottom.

    2) You should have some explicit error checking in the sproc, but this isn't mandatory.

    3) If TrackingNumber is guaranteed to be unique then the identity is useless bloat and should be completely removed from the table. 4 bytes for absolutely no benefit (since there aren't any child tables).

    4) I note that this sproc actually won't create a duplicate TN row.

    5) You mentioned other queries doing searches just by PackageActivityDateTime. If that is the case then you MUST create a nonclustered index on that field alone. I would also bet that field should be declared NOT NULL based on simple logic that every package has a value for it. There may be other similar fields. NULLable fields are inefficient from a storage engine perspective.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (1/3/2017)


    1) You have extra BEGIN TRAN's in the sproc. You should just have one at the very top and the very bottom.

    2) You should have some explicit error checking in the sproc, but this isn't mandatory.

    3) If TrackingNumber is guaranteed to be unique then the identity is useless bloat and should be completely removed from the table. 4 bytes for absolutely no benefit (since there aren't any child tables).

    4) I note that this sproc actually won't create a duplicate TN row.

    5) You mentioned other queries doing searches just by PackageActivityDateTime. If that is the case then you MUST create a nonclustered index on that field alone. I would also bet that field should be declared NOT NULL based on simple logic that every package has a value for it. There may be other similar fields. NULLable fields are inefficient from a storage engine perspective.

    Thanks for the feedback. I have a hit a stone wall with the developers. None of the recommended changes I sent over will be implemented any time soon, which means they probably never will. Deadlocks are still there. I think I've done about all I can do. After all of this I was advised that the killed transactions from the deadlocks get handled by some manual process in our BizTalk server and it only takes 15 seconds to run the process. As a result, there really doesn't seem to be a big importance placed on fixing the proc.

  • lmarkum (1/5/2017)


    TheSQLGuru (1/3/2017)


    1) You have extra BEGIN TRAN's in the sproc. You should just have one at the very top and the very bottom.

    2) You should have some explicit error checking in the sproc, but this isn't mandatory.

    3) If TrackingNumber is guaranteed to be unique then the identity is useless bloat and should be completely removed from the table. 4 bytes for absolutely no benefit (since there aren't any child tables).

    4) I note that this sproc actually won't create a duplicate TN row.

    5) You mentioned other queries doing searches just by PackageActivityDateTime. If that is the case then you MUST create a nonclustered index on that field alone. I would also bet that field should be declared NOT NULL based on simple logic that every package has a value for it. There may be other similar fields. NULLable fields are inefficient from a storage engine perspective.

    Thanks for the feedback. I have a hit a stone wall with the developers. None of the recommended changes I sent over will be implemented any time soon, which means they probably never will. Deadlocks are still there. I think I've done about all I can do. After all of this I was advised that the killed transactions from the deadlocks get handled by some manual process in our BizTalk server and it only takes 15 seconds to run the process. As a result, there really doesn't seem to be a big importance placed on fixing the proc.

    Bummer. If you have TN, ID as the clustered PK and no nonclustered index on TN at all and also create a nonclustered index on PackageActivityDateTime (assuming it is seeked by just itself) you can hopefully avoid the deadlocks without a code change. You are still exposed to bad data though.

    Since the dates are all obviously castable to datetime can't you get just THAT change though? That one is a no-brainer. They would fail on casting just as much as they will fail when they aren't passed in as a parameter that can be made a datetime. I suppose it isn't worth it to fight that fight though. Given what I have seen your company will have a TREMENDOUS amount of other issues going on with their SQL Server app(s). 🙁

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Or, do the BSOFH thing. 😉 Set up an alert to email the devs and their boss every time a deadlock occurs. Make sure it also get's sent to their cell phones as text messages.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/5/2017)


    Or, do the BSOFH thing. 😉 Set up an alert to email the devs and their boss every time a deadlock occurs. Make sure it also get's sent to their cell phones as text messages.

    Something tells me that would motivate them to take action - probably more motivated than they are now.

Viewing 15 posts - 31 through 44 (of 44 total)

You must be logged in to reply to this topic. Login to reply