December 6, 2019 at 11:59 am
I recently had an experience where I identified that there was a significant hole in my SQL Server troubleshooting toolset.
I had a server that was consuming ~60% more CPU than it typically does. I have developed some fairly nice processes that capture completed queries with all the CPU/IO metrics. This has been extremely helpful for identifying issues "postmortem" or after they complete.
In my recent experience I encountered a process that ended up running for many hours. It was creating so much noise that it was difficult to identify in the active running SQL. I fell back on my completed SQL analysis and ended up chasing ghosts.
At this point, I believe I am left with looking simply at execution time for actively running processes and using that as a strong indicator of something that needs to be reviewed for high resource consumption.
My question to the group is, am I missing some wonderful blog or MSDN article that will more efficiently fill my "running procedure" troubleshooting gap? Everything I have searched so far leads me to "completed procedure" analysis. This will not help if the process is still running.
Thanks for your consideration and help.
Steve
December 6, 2019 at 12:26 pm
There is a neat little trick you can do with Profiler traces - make sure you capture the start event (RPC, statement, whatever) as well as the end event.
If you can't find the end event, then it either errored or hasn't finished, so store your trace details in a table and query it - if you see a start event but no end event, stick the start event details in an email.
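The "start with no matching end" check can be sketched like this - assuming a hypothetical table, TraceCapture, that you've loaded from the trace file (the table name and column list are mine, not a standard object; the event class IDs are the documented Trace values):

```sql
-- Find starting events (RPC:Starting = 11, SQL:BatchStarting = 13) on a SPID
-- with no later completed event (RPC:Completed = 10, SQL:BatchCompleted = 12).
SELECT s.SPID, s.StartTime, s.TextData
FROM   TraceCapture AS s
WHERE  s.EventClass IN (11, 13)
AND NOT EXISTS (SELECT 1
                FROM   TraceCapture AS e
                WHERE  e.EventClass IN (10, 12)
                AND    e.SPID       =  s.SPID
                AND    e.StartTime  >= s.StartTime);
```

Anything this returns is a request that started but, as of the last trace load, never finished - the candidates worth emailing about.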
Aside from that - look at the reports in SQL Server. The Performance Dashboard helps me be proactive, and the Query Store helps me a lot.
MVDBA
December 6, 2019 at 12:34 pm
Right, I get what you are saying, but the point of my question would be that there is no "end" event because the process is still running. The SQL reports leverage "completed" tasks. So unless I am missing something, your suggestion does not appear to be any different from what I believe I have now, which is looking for execution time on "actively running" processes.
Steve
December 6, 2019 at 12:57 pm
Only trying to offer up a different box of tricks.
I'd love to hear if you find a generic solution. Other than that, there's "Activity Monitor" lol, but that requires you to be on it 24x7.
Are there any system views that might help? Have a look through Brent Ozar's library (on his website); I bet he has a trick or 2 in this area.
MVDBA
December 6, 2019 at 2:55 pm
So, for actively running queries, I recommend the DMVs; sys.dm_exec_requests is a good starting point. You can join it to other DMVs to get the execution plan, the SQL statement, and more. Also, look into lightweight query profiling as a way to get a live execution plan on actively running queries. An easy way to cheat on using the DMVs is to get sp_WhoIsActive. Otherwise, write your own queries against the DMVs, and you can see the currently active & ugly queries.
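A minimal sketch of that DMV join (the column choices are illustrative, not a canonical script) might look like this:

```sql
-- Currently executing requests with their statement text and estimated plan.
SELECT r.session_id,
       r.status,
       r.cpu_time,                 -- ms of CPU consumed so far
       r.total_elapsed_time,       -- ms of wall-clock time so far
       t.text       AS sql_text,
       p.query_plan                -- plan XML, clickable in SSMS results
FROM sys.dm_exec_requests r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle)     AS t
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle)  AS p
WHERE r.session_id <> @@SPID       -- don't report on this query itself
ORDER BY r.total_elapsed_time DESC;
```

Sorting by elapsed time puts the longest-running requests at the top, which is exactly the "still running, not yet completed" view the original question asks for.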
Mike is right. You can get starting events as well as stopping/ending events. I'd use Extended Events instead of Trace Events (Profiler) though, especially because you can easily add in causality tracking as a way to marry start & stop events (as well as events in the middle). Not even mentioning the fact that ExEvents have lower overhead. Sorry Mike, #TeamExEvents for life man.
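A sketch of such a session (the session and file names are made up for illustration) would be:

```sql
-- Pair start and end events; TRACK_CAUSALITY stamps every event with an
-- activity GUID + sequence number so starts and completions can be matched.
CREATE EVENT SESSION LongRunnerCapture ON SERVER
ADD EVENT sqlserver.rpc_starting,
ADD EVENT sqlserver.rpc_completed,
ADD EVENT sqlserver.sql_batch_starting,
ADD EVENT sqlserver.sql_batch_completed
ADD TARGET package0.event_file (SET filename = N'LongRunnerCapture.xel')
WITH (TRACK_CAUSALITY = ON);

ALTER EVENT SESSION LongRunnerCapture ON SERVER STATE = START;
```

A starting event whose activity ID never shows up on a completed event is a request that is still running (or died) - the same trick Mike described for Trace, without needing Profiler attached.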
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 6, 2019 at 3:11 pm
Thanks Mike and Grant.
I was looking for tricks for actively running queries when I am troubleshooting. I agree a 24/7 eyeballs-on-monitor process is rather inefficient.
Grant, I TOTALLY forgot about sp_whoisactive. I have that and use it frequently. I just had not incorporated it into my regular set of tools when doing emergency troubleshooting. duh...
I do use Extended Events. That is what my original capture was modeled around. What got me into this post was that my Extended Events session did not capture the long-running event while it was running. My guess is that it would have had to end while I was running my capture.
I think the DMVs and sp_WhoIsActive are probably the best areas of research to give me a better identification tool set to complement my current "completed execution" tool set.
Thanks Guys, I appreciate the time you gave me to explore my question
Steve
December 6, 2019 at 3:20 pm
Grant wrote: So, for actively running queries, I recommend the DMVs; sys.dm_exec_requests is a good starting point. You can join it to other DMVs to get the execution plan, the SQL statement, and more. Also, look into lightweight query profiling as a way to get a live execution plan on actively running queries. An easy way to cheat on using the DMVs is to get sp_WhoIsActive. Otherwise, write your own queries against the DMVs, and you can see the currently active & ugly queries.
Mike is right. You can get starting events as well as stopping/ending events. I'd use Extended Events instead of Trace Events (Profiler) though, especially because you can easily add in causality tracking as a way to marry start & stop events (as well as events in the middle). Not even mentioning the fact that ExEvents have lower overhead. Sorry Mike, #TeamExEvents for life man.
I just don't like the user interface or scripting for extended events - that's the only reason I go back to profiler traces #ineedtogetoutofmycomfortzone
MVDBA
December 6, 2019 at 3:25 pm
MVDBA (Mike Vessey) wrote: I just don't like the user interface or scripting for extended events - that's the only reason I go back to profiler traces #ineedtogetoutofmycomfortzone
The scripts are better in ExEvents. That one's a no-brainer. Actually, I'd argue that the interface is better too. It does more than you can ever do in Trace. The exception to that is matching queries to perfmon counters. They still can't do that in ExEvents.
However, I've said it before and I'll say it again: I actually don't care that much if people switch. I do, however, kicking and screaming all the way, demand that when talking to new people working on newer versions of SQL Server, we don't suggest that they start with Trace. That's a hill I'm prepared to die on.
Find me at an event some time. I'll do a quick class on ExEvents.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 6, 2019 at 4:05 pm
MVDBA (Mike Vessey) wrote: I just don't like the user interface or scripting for extended events - that's the only reason I go back to profiler traces #ineedtogetoutofmycomfortzone
Grant wrote: The scripts are better in ExEvents. That one's a no-brainer. Actually, I'd argue that the interface is better too. It does more than you can ever do in Trace. The exception to that is matching queries to perfmon counters. They still can't do that in ExEvents.
However, I've said it before and I'll say it again. I actually don't care that much if people switch. I do however, screaming and kicking all the way, demand that when talking to new people working on newer versions of SQL Server, we don't suggest that they start with Trace. That's a hill I'm prepared to die on.
Find me at an event some time. I'll do a quick class on ExEvents.
Fair point, well made - I concede, and I'll take you up on that offer if we cross paths at another event, preferably a Summit event with free beers at the end.
MVDBA
December 6, 2019 at 4:17 pm
Grant wrote: The exception to that is matching queries to perfmon counters. They still can't do that in ExEvents.
One of my favorite bosses in times past had a wonderful saying that can be applied here.
"Instead of sifting through a pile of manure {he used a different word there} to try to figure out what the horse is thinking, go 'round to the other end of the horse and ask him".
Whether you use Traces or Extended Events (which DOES have the nasty problem you stated above), I find that way too many people waste way too much time tracking counters and wait times and a raft of other mostly useless stuff. I call it "useless" because of exactly what you've pointed out. It doesn't normally lead you directly to the offending code.
Long before I became aware of Adam's sp_WhoIsActive, I wrote a couple of little ditties. One is called "sp_WhatsRunning" and it takes no parameters, which greatly simplifies things in "urgent times". The other is called "sp_ShowWorst". It takes two parameters; the first is what you want to check by (CPU, Reads {which are logical reads}, Physical {which are physical reads including read aheads}, Writes, Duration, and Count {number of executions}). The second is the number of "top" items you want to return and isn't necessary to include if you like the default of "10". You can add the word "RECENT" to the beginning of the first parameter if you only want to consider those things that have executed in the past hour. Again, I tried to KISS it (Keep It Super Simple) so even a newbie could use it quickly during "urgencies".
No matter what the first parameter is (which only controls the ORDER BY for the TOP), it always returns the same information. There isn't a wait stat to be found in that mix. It returns several columns (Min, Max, Last, Average, Total) for the various items I listed for the first parameter, when the code was first "compiled", when it was last executed, and those are what actually count. It also captures the code in a clickable column so you can see what the code actually is (even if it seriously outstrips the normal 8K limit) and, provided that the code doesn't have an unresolvable Temp Table in it, will also provide the "intended" (close to "estimated" but not actually) execution plan... again, in a clickable column.
The only thing that I've not done with it is the research on how to capture the passed parameters for sp_WhatsRunning.
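The "show worst" idea Jeff describes can be approximated - this is my own sketch against the plan cache, not Jeff's actual proc - with sys.dm_exec_query_stats, ordering by whichever metric the first parameter names (total CPU here):

```sql
-- Top offenders from the plan cache, ordered by total CPU consumed.
-- Swap the ORDER BY column for total_logical_reads, total_elapsed_time,
-- execution_count, etc. to mimic the other parameter choices.
SELECT TOP (10)
       qs.execution_count,
       qs.total_worker_time,       -- CPU, microseconds
       qs.total_logical_reads,
       qs.total_elapsed_time,
       qs.creation_time,           -- when the plan was "compiled"
       qs.last_execution_time,
       st.text       AS sql_text,  -- clickable in SSMS grid results
       qp.query_plan               -- estimated plan, also clickable
FROM   sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;
```

Like Jeff's proc, this goes "straight to the horse's mouth": it reports what each cached query actually cost, with the text and plan one click away, rather than asking you to infer the culprit from counters and wait stats.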
The bottom line here is that I don't actually care about things like wait stats and other "manure" to sift through. The only thing that users care about is "Duration", and the things I care about are mostly CPU time, Reads, and Duration, and both procs go straight to the horse's mouth. sp_ShowWorst is used a whole lot in our "Continuous Improvement Program" and, over the years, has made our code a whole lot faster and more efficient. When I started, our CPUs were averaging about 40% across the board and there were frequent 10-15 minute "outages" thanks to batch jobs running. Our largest database was only 65GB.
Today, that same database is 1TB in size and we have another that has grown to 2TB. Except for one year where we made the serious mistake of following what have become "Best Practices" for index maintenance, where the CPUs averaged 22%, we've been able to keep our CPUs down around 8% for the last 6 years. We've also made some remarkable improvements to our multiple multi-million row batch jobs.
They were all caught by sp_WhatsRunning, sp_ShowWorst, and a remarkable bit of code written by Jonathan Kehayias to help us find things that had reasonably fast execution times but took 2-22 seconds to recompile each of the thousands of times per hour that they were executed.
To wit, I won't introduce a newbie to SQL Performance Monitor, never mind Extended Events. Instead, I teach them how to "ask the horse". And, as you say, "That's a hill I'm prepared to die on". 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
December 6, 2019 at 4:21 pm
I use this to get active SQL statements.
SELECT sqltext.text AS sql_text,
       req.session_id,
       req.status,
       req.command,
       req.cpu_time,                -- CPU consumed so far, in ms
       req.total_elapsed_time,      -- wall-clock time so far, in ms
       CONCAT(req.total_elapsed_time / 1000, ' secs')      AS elapsed_secs,
       CONCAT(req.total_elapsed_time / 1000 / 60, ' mins') AS elapsed_mins
FROM sys.dm_exec_requests req
CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS sqltext
WHERE req.session_id <> @@SPID;    -- exclude this query itself
It also gives the amount of CPU each statement has consumed and the elapsed time.