High CPU & blocking

  • Hello SQL Experts,

    We have SQL2017 enterprise edition with about 60 databases. Recently we are seeing very high cpu spikes. We have DB maintenance job running twice/wk (Hallenger Script). Manually updated the statistics but it didn't helpful. I don't have baseline about the no of transactions or queries that ran previously.  Followed the guidance provided by Microsoft. Seems the queries that are showing with high cpu are running for 2-3 years with out any modifications. The missing indexes are several. I can't able to identify which one can be benefit. We have replication configured on this server (Subscriber). The replication is having latency from distributor to subscriber due to high CPU/blocking. Looking for your advice on this issue.

    https://learn.microsoft.com/en-us/troubleshoot/sql/database-engine/performance/troubleshoot-high-cpu-usage-issues

  • His name is "Hallengren". 😉

     

    ramana3327 wrote:

    Hello SQL Experts,

    Followed the guidance provided by Microsoft.

    What guidance???  Be specific because, if they have to do with indexes, MS finally figured out what a lot of us have been saying for years... the old 5/30 methods are not a "Best Practice" and were never meant to be a "Best Practice" and have been changing then on a regular basis since 20 April 2021.

    And when, in relationship to the CPU spikes, did you start to follow them?

    And a CPU "spike" normally looks like a short spike.  What do you mean by CPU spikes and when do they occur?  All the time or only after your index maintenance?

    And, have you looked at the "ACTUAL" execution plans of the procs causing the "spikes"?

    Please see the article at the second link in my signature line below for how to post performance issues.

    And my immediate advice would be that you're probably doing your index maintenance incorrectly because you're probably using REORGANIZE as a part of it and you're probably using Logical Fragmentation as an indication of what needs to be maintained.  They can both cause a whole lot of damage and, depending on the type of fragmentation, REORGANIZE can perpetuate fragmentation and also make things fragment much more quickly (lot more page splits than necessary).

    And I'll just bet that you haven't actually proven that the index maintenance is actually helping performance.  If you look at all the BBFAAT (Books, Blogs, Forums, Articles, AI, and Talks), every one says it CAN but no one ever proves it.. and you probably haven't either.

    I went almost 4 years with no index maintenance on my production box and performance actually got better over the first 3 months because I was no longer causing damage by using the supposed but totally incorrect 5/30 recommendations that more than 98 % of the world is using.

    With that, I'll recommend that you simply stop doing index maintenance until you can actually prove which indexes will improve performance when they're defragged and only do regular statistics updates until then.

    I also strongly urge you to watch all 82 minutes of the 'tube at the following link.  Despite the name of the presentation, it's NOT just about GUIDs.  It's proof that the old "Best Practices" are actually worst practices and proves the damage that REORGANIZE does (not just on GUIDs) and the extreme amount of log file space (and related IO) it takes in many cases.  Be sure to watch the outtake after the Q'n'A, as well.

    https://www.youtube.com/watch?v=rvZwMNJxqVo

    And, if you're listening with a headset, be aware that they added 3 advertisements at the 15:00, 30:15, and 45:25. The advertisements are important to help support this event but I wanted to let you know that they're sudden and they're loud! They WILL lift your headset!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you.

    Not after the Index maintenance job. I check the queries that are consuming for years. They are not new o nes. I tried  updating the stats & cleared the proc cache but it didn't help. Restarting the SQL Server resolved the issues.

    • This reply was modified 1 year, 4 months ago by  ramana3327.
  • ramana3327 wrote:

    Thank you.

    Not after the Index maintenance job. I check the queries that are consuming for years. They are not new o nes. I tried  updating the stats & cleared the proc cache but it didn't help. Restarting the SQL Server resolved the issues.

    Just to say what you already know... that's not a great plan.  You really need to figure out what the issue is.  Could it be a connection leak?  Memory leak?  Those are just a couple of the many possibilities.  If you don't have Adam Machanic's "sp_WhoIsActive", look that up, get a copy, and learn how to use it for troubleshooting such things while they're happening.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah... almost forgot about another scenario that I ran into.  Are the connections to the server setup to explicitly turn off/disable M.A.R.S.  It they're not, they need to be.  The documentation says that they're disabled by default but the incident I suffered proved that they not.  I also had several other people in the user group I belong to check... sure enough, M.A.R.S. was enabled for them, as well.

    It doesn't always create such a problem but, when it does, all hell breaks loose like it did for you and, yes, only bouncing the SQL Server service or a reboot will fix it... and then it may only be temporary.  It crushed us for about a week until we figured out what the issue was.

    One of the main symptoms for us was a ton of rollbacks being done in the background.  I don't know if that's a guarantee but that's what we looked up to find out about it being a problem when M.A.R.S. is enabled by default or not.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes, it happened in the past. Not able to find the root cause. We are restarting the server for every 4 months otherwise we’re seeing the same issue like high cpu utilization.

    The No of missing indexes are about same from performance monitor dashboard before & after the restart. But after restart we don’t see even half of the cpu utilization until next 3 to 4 months.

    • This reply was modified 1 year, 4 months ago by  ramana3327.
  • The only thing that I can guess is either connection leaks or memory leaks cause by the application(s).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply