January 10, 2020 at 9:49 am
We have something that every so often is causing the CPU to spike to 90-100% on our SQL Server for a second, every 10-20 secs. This seems to happen for anywhere between 15-90 mins, and then stops.
So far I haven't been able to find what is causing this, there are no jobs running that run that frequently and at such random intervals. There are over 40 DBs on this instance so trying to narrow it down feels like a needle in a haystack. I've tried running sp_whoisactive and sp_blitzwho, but nothing is flagging up as out of the ordinary.
Does anyone have any ideas how I can begin to narrow down what this is?
January 10, 2020 at 10:16 am
Yes, I'm sure that it's SQL Server as I've watched the individual process and it's that going up to 100%.
Thanks, I'll try that the next time it happens
January 10, 2020 at 10:25 am
You don't have to wait until next time it happens - it's worth running it now since the queries may still be in memory, provided you don't have a small plan cache, a high volume of ad hoc queries, or queries that are run with the RECOMPILE option. Other events can cause plans to be lost from the cache, such as server restarts, clearing the cache, eviction of individual plans and statistics updates.
John
January 10, 2020 at 1:54 pm
If it's SQL Server, then it's caused by queries. So, go and look at query behavior. There are three ways to do this, in order to accuracy and detail: DMVs against the cache, Query Store, Extended Events. The last two, Query Store and Extended Events, you will need to configure and have running in order to capture the behavior. Through any of these, you can aggregate based on CPU and get an idea of which queries are running the most CPU. Now, if you want to specify a time frame, the DMVs against the cache aren't going to help you. They're aggregate only. Query Store aggregates based on the hour (by default, you can change this), so it will let you get to a specific time frame. Extended Events, depending on how you configure them, filters, etc., capture everything. This is the most detailed view. Be prepared for a lot of data though.
This is how I'd figure it out. What queries are running when we see the CPU spikes?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 10, 2020 at 2:31 pm
I can't see anything out of the ordinary running when we see these spikes, when I run sp_whoisactive there doesn't appear to be anything there that's not running at other times.
I think I'm going to to set up an Extended Events trace to run next time I see it happening, it's not happening at the same times each time, e.g. yesterday it was doing it for nearly 90 mins between 11:30am and 1pm, today it was doing it from 9:30-9:45 and nothing since, so I can't set up a scheduled trace as chances are it won't pick anything up as we won't be having the issue at that time.
January 10, 2020 at 2:45 pm
That may work, although by the time you know it's happening, you may have missed a lot of the interesting stuff. That's why I suggested querying the plan cache now. Grant is right - it's only aggregate information, but it'll tell you when it was first run (or rather when the execution plan was created), when it was last run, how many times it ran in between, and how much CPU it used. Now, if any of the caveats that I mentioned before apply, then it's not going to be useful at all... but you won't have lost anything by trying.
John
January 10, 2020 at 2:55 pm
If it's SQL Server, it has to be a query. Something has to be running there. Otherwise, it might not be SQL Server. However, I'd go with the Extended Events approach to monitor it. Also, what about wait stats? Any information there?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 10, 2020 at 3:52 pm
The main one was CXCONSUMER when I ran sp_whoisactive at the time it was spiking
January 10, 2020 at 4:13 pm
In 2016 Sp2(?) or greater, that just means parallelism is occurring. What else?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 13, 2020 at 7:29 pm
I would be tempted to run perfmon alongside the event session, to see if the number of Batch Requests/sec does not also jump at the same time. It may be a lot of little queries suddenly being run all at once.
January 14, 2020 at 12:20 am
Sometime resource monitor also helps. Did you check that?
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply