High CPU and plan cache instability

Question

High CPU and plan cache instability

garryha

SSC-Addicted

Points: 435
More actions
January 14, 2021 at 11:06 am

#3832327
In the past couple of months my SQL Server instance has been experiencing higher than usual CPU usage than before, at times hitting 100% and affecting an application.
There have been no recent changes to any SQL Server configurations or options in that time, the OS may have in terms of Patching but that is another teams area.
When we have high sustained period of CPU use, I see mostly wait stats of CXPACKET, LCK_M_X, THREADPOOL, LATCH_EX, RESOURCE_SEMEPHOR_QUERY_COMPILE. CXPACKET for parallelism being by far the largest.
I like to run Brent Ozar’s sp_blitzcache and the one thing that stands out is that the plan cache seems to be cleared by something and 99% of the plans are created in the last hour. This says this could be memory pressure or plan cache instability.
I’m not sure what to check of where to look. I have thought of enabling LPIM which isn’t now but besides this not sure what to try and I would be grateful for any help.
- This topic was modified 3 years, 11 months ago by garryha.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Mr. Brian Gale SSC-Insane Points: 24800 More actions · Answer 1

Have you checked for memory pressure?

Alternately, is this happening on a regular basis? Like, based on what you are seeing, could it be a scheduled task? I am wondering if something is freeing the procedure cache at a regular basis?

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Lou SSCrazy Points: 2929 More actions · Answer 2

Garryha,

That's the life in a day for a dba.

There are many factors that could be behind that cpu presure scenario.

SQL Server Performance Dashboard could give you a quickly view of what is happening, if you suspect about query usage

Extended Events could help you check it.

I could check if memory is correctly configured and there are enough.

Check SQL Server Error log for any message that could give you ideas of what is damaged.

Finally, but not least, your maintenance plans, statistics, index plans you should keep an eye on it, are they working without inconvenience ?

Try to look at that

tmisganaw Old Hand Points: 340 More actions · Answer 3

Take a look at Adam Machanic sP-whosisactive will show who is doing what (save it to table)

http://whoisactive.com/downloads/

garryha SSC-Addicted Points: 435 More actions · Answer 4

I had started to log sp_whoisactive into a table.

I found that there was a long running process of the application that was running a very intensive denormalisation process on the database, that gradually built and built until it finally starte a stage when it rebuild indexes.

Obviously this is what killed server performance. We have made sure this does not run again.

However the business has become very twitchy about the appplication/server. On friday there was a period when again the CPU maxed out to 100% for about 10 mins, unlike before when it lasted longer, this time it stopped after the 15mins.

From looking at the results of sp_whoisactive I can see before the CPU got maxed I started to see a high number of RESOURCE_SEMAPHORE_QUERY_COMPILE wait types, these carried on for about 10 mins.

I kind of understand what this wait type is, but did this cause the high CPU for the period, as some kind of cumulative effect of threads waiting?

If it was, how can I fix this so that it won't happen again?

I don't have any scope to tune queries/IXs, would adding more memory help?

This reply was modified 3 years, 11 months ago by garryha.

Jeff Moden SSC Guru Points: 1004425 More actions · Answer 5

garryha wrote:

From looking at the results of sp_whoisactive I can see before the CPU got maxed I started to see a high number of RESOURCE_SEMAPHORE_QUERY_COMPILE wait types, these carried on for about 10 mins.
I kind of understand what this wait type is, but did this cause the high CPU for the period, as some kind of cumulative effect of threads waiting?
If it was, how can I fix this so that it won't happen again?
I don't have any scope to tune queries/IXs, would adding more memory help?

It's a shame that you "don't have any scope to tune queries/IXs" because that's probably just what the doctor ordered for this wait type. Please see the "Other Information" section of the article at the following URL...

https://www.sqlskills.com/help/waits/resource_semaphore_query_compile/

You also brought up the subject of the code getting to the point where it "gradually built and built until it finally starte a stage when it rebuild indexes". So, let me ask... is the "rebuild" process actually using REBUILD or is it doing a REORGANIZE?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Johan Bijnens SSC Guru Points: 135377 More actions · Answer 6

Just my 2ct: Check this value.

Maybe activating this setting can take some pressure of your plan cache because it only has to calculate the stub and not actually put it into the plan cache, unless it is executed a second time.

exec sp_configure 'optimize for ad hoc workloads'

result = 1 row

/* on this instance, the setting is active ! */
name                          minimum maximum config_value run_value
optimize for ad hoc workloads 0       1       1            1

To activate this setting:

exec sp_configure 'optimize for ad hoc workloads', 1;
reconfigure

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

garryha SSC-Addicted Points: 435 More actions · Answer 7

****Update on this****

It seems that the inital issue was another team who started to run a Denormalisation process on the database that was really intensive and just killed the server.

After they were stopped doing this I recieved very few RESOURCE_SEMAPHORE_QUERY_COMPILE wait types unlike before

The situation was much better than before, however CPU was still hitting 100% at times for longer than I wanted and I was still experiencing plan cache instability, it was continually being trimmed. After coming across this webpage -

https://social.msdn.microsoft.com/Forums/sqlserver/en-US/6b9118f3-b6c1-4f39-ae80-d149392656f9/sql-server-2016-memory-pressure-leads-to-the-plan-cache-clearing?forum=sqldatabaseengine

I realised that something at the os level had changed and must have been needing more memory, causing memory pressure and then it taking memory resource away from sql server and the plan cache.

I lowered my instance max server memory setting and things have settled down much more, CPU overall is significantly lower and rarely hitting 100% and the plan cache is a nice size and never being significantly trimmed down.