Huge regression after upgrading from 2014 to 2016

  • About two weeks ago, I upgraded to SQL Server 2016 and was looking forward to the benefits and added features of SP1. Our business is Mon-Fri 8a to 7p eastern time. The upgrade was done on a Sunday afternoon and immediately Monday morning there were issues, with blocking and deadlocks. After some digging around I found there were two different processes that were causing deadlocks and were also the main lead for all the blocking. After analyzing the deadlocks, I finally realized the queries involved were run under serializable isolation. It turns out some older code being run from table adapters were using the default isolation which is serializable. Ok, that's a problem that I maybe should have already figured out, however I went from 4 or 5 deadlocks a week to 50 a day. Lock waits jumped from around 5% of total wait time to 80% of total wait time. We have the code fixed, but can't release it until late next week. Today, I changed the compatibility level to 120 and everything went back to how it was before I upgraded. Does anyone know if 2016 handles serializable queries differently? At this point, I am completely at a loss and wondering if once the serializable connections are gone if performance will be normal at compatibility 130. Once the code is release, I will try to go back to 2016 and will post the results in this tread. In the mean time if anyone has any comments, please feel free to provide them. Also, I am happy to answer anything about any other database/server settings that are currently in place.

  • Can't speak to your issue directly, but a number of different sessions at PASS Summit this year pointed out that changing the compat level from 120 to 130 (or the other way around) has the potential to fix a BUNCH of issues. Glad you found that without too much pain 🙂

    ------------------------------------------------------------------------------------------------Standing in the gap between Consultant and ContractorKevin3NFDallasDBAs.com/BlogWhy is my SQL Log File HUGE?!?![/url]The future of the DBA role...[/url]SQL Security Model in Plain English[/url]

  • Going back to compatibility level 120 to fix such issues seems just wrong to me. Glad you found a fix.

    Speaking of the fix, what was it? I ask because we're getting ready to make the leap from 2012 to 2016 in the very near future.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    As I said, I am really hoping this is all around serializable isolation. The code has not changed, in fact the code causing the problem is over 5 years old. I have been in many environments, and when I see a ton of complicated deadlocks, I normally look at isolation pretty quick. In this case although the process has been a bit nagging, causing most of the very few deadlocks and most of the long blocking, it didn't point me to isolation. I am thinking in compatibility 130 for some reason the locking has changed, causing wider locking than with 120. I am just guessing at this point, and will try and record some stats to try and prove it. My biggest problem is trying to reproduce the exact type of load in a test environment. As I said in my original post, we will release code within two weeks that will get rid of the serializble isolation as we have no need for it. I will switch back to 130 at that point and see what happens. I plan to set up a scenario when I get some time and profile locks during a complicated serializable process in both 120 and 130 and see what the difference is. I will post back to this thread with results and anything else I find.

    By the way, we ran 2016 in a test environment for a couple of months with no issues, but we just don't have anywhere close to the load and can not reproduce the load we have in production. Other than this, I have heard of nothing more than a few bad plans being produced by upgrading to 2016, however I now am very glad that compatibility level exists and the ability to change it.

  • Thanks for taking the time for the feedback.

    Yeah... the "fix" I was talking about was in the code changes that you said you couldn't release for a couple of weeks. I missed that in the first post. Thank you for your patience with that miss.

    Understood on the testing in a test environment. We always do a "load test" in the test environment with 3rd party software but it's never quite the same as in the prod environment. We have, on more than one occasion, tested everything as being good performance-wise only to have it be less than adequate and in need of a fix for performance once it gets to production. Can't always rollback in production, either. It depends on if there were any accompanying DDL changes required.

    Thank you again for your time. I have to admit, I'm looking forward to finding out if the code changes for isolation levels make the difference you hope for. Still, I think it's terrible that you had to do such a thing.

    There's also a breaking change (read that *** "a change that breaks code") for certain apps concerning conversions from DATETIME to DATETIME2 in 2016. See the following.

    https://msdn.microsoft.com/en-us/library/ms143179.aspx

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I experienced performance regression after a migration project due to differing collations between the source database and tempdb. As I prepared the Server for the Migration I forgot to check.

    Research online showed that there were operations in SQL2008 that occured in the database itself, but were done in tempdb on the new instance.

    This might help:

    select name, collation_name from sys.databases;

    Could this be a part of your problem?

  • kevaburg (12/31/2016)


    I experienced performance regression after a migration project due to differing collations between the source database and tempdb. As I prepared the Server for the Migration I forgot to check.

    Research online showed that there were operations in SQL2008 that occured in the database itself, but were done in tempdb on the new instance.

    This might help:

    select name, collation_name from sys.databases;

    Could this be a part of your problem?

    Oooooo... Excellent point. That's on my checklist when we stand up new servers or do in-place upgrades. The reason why it's on my checklist is because I would otherwise forget to check for it. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/31/2016)


    kevaburg (12/31/2016)


    I experienced performance regression after a migration project due to differing collations between the source database and tempdb. As I prepared the Server for the Migration I forgot to check.

    Research online showed that there were operations in SQL2008 that occured in the database itself, but were done in tempdb on the new instance.

    This might help:

    select name, collation_name from sys.databases;

    Could this be a part of your problem?

    Oooooo... Excellent point. That's on my checklist when we stand up new servers or do in-place upgrades. The reason why it's on my checklist is because I would otherwise forget to check for it. 🙂

    Makes me wonder what's not on the checklist ... 🙂

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil Parkin (12/31/2016)


    Jeff Moden (12/31/2016)


    kevaburg (12/31/2016)


    I experienced performance regression after a migration project due to differing collations between the source database and tempdb. As I prepared the Server for the Migration I forgot to check.

    Research online showed that there were operations in SQL2008 that occured in the database itself, but were done in tempdb on the new instance.

    This might help:

    select name, collation_name from sys.databases;

    Could this be a part of your problem?

    Oooooo... Excellent point. That's on my checklist when we stand up new servers or do in-place upgrades. The reason why it's on my checklist is because I would otherwise forget to check for it. 🙂

    Makes me wonder what's not on the checklist ... 🙂

    lol! Just looked through my own, just to be sure... 🙂

  • Phil Parkin (12/31/2016)


    Jeff Moden (12/31/2016)


    kevaburg (12/31/2016)


    I experienced performance regression after a migration project due to differing collations between the source database and tempdb. As I prepared the Server for the Migration I forgot to check.

    Research online showed that there were operations in SQL2008 that occured in the database itself, but were done in tempdb on the new instance.

    This might help:

    select name, collation_name from sys.databases;

    Could this be a part of your problem?

    Oooooo... Excellent point. That's on my checklist when we stand up new servers or do in-place upgrades. The reason why it's on my checklist is because I would otherwise forget to check for it. 🙂

    Makes me wonder what's not on the checklist ... 🙂

    Heh... only the stuff that's not in my "stand it up" code. I even make calls through xp_CmdShell to make sure that things like CPU frequency are set to 100% at all times.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply