September 16, 2016 at 1:19 am
Hello,
I'm sure that there should be a similar topic around, but with these general keyword to be searched (performance, monitoring etc) is hard to locate it, so I open a new one with the hope of some help.
I have two case, in two different installations, both of them Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64) (Build 7601: Service Pack 1). In both cases, from time to time, random and without any pattern the SQL server performance is degrading, it starts to be slow, lots of "Query timeout expired" messages are received and anything goes wrong, sometimes we receive unexpected deadlocks and so on. Even in sp_who2 I don't see any clear process suffocating the server, something it does. Now I want to find out what can be that "something".
Is there any know tool for monitoring the SQL Server (any profile trace template or any external tool) to see post-factum what was the server activity when the slowness occurred? We are normally learn about these slowness cases after it passed (the low performance disappear after a while like it appeared, without any visible explanation), when the users complains about them. When I connect to server to verify, all looks OK... So I need to see what happened at a certain moment in the past on the server.
I'm trying to understand these strange cases because (for example) I can't explain how comes that the exact same stored procedure which one thousand times is executed in 12 milliseconds, suddenly and without any (visible for me) explanation takes ~21 seconds to be executed... These kind of "hiccups" of the server are making me crazy and I can't understand what generates them. I'm sure it is a logical explanation somewhere, but I can't see it.
Any ideas are welcome.
Thank you,
RC
___________________________________________________________________
Is better shut up and look stupid than open your mouth and eliminate any doubt
September 16, 2016 at 4:56 am
At least for the stored procedure variable performance, I would suggest to check the statistics of the underlying tables used.
Issue may be related to parameter sniffing, the cached execution plan becoming "bad" for certain parameter values.
https://blogs.technet.microsoft.com/mdegre/2011/11/06/what-is-parameter-sniffing/
An easy check / workaround is to alter the stored procedure and use the with recompile hint to the SQL statements inside it, especially to the ones you may consider as being heavy and which are using the input parameters.
September 16, 2016 at 5:48 am
Before getting into lots of complicated checks on SQL Server I would check some basics first.
1. Is SQL Server the only program on the server?
2. Have you set the Maximum Server Memory, in SQL Server, to allow room for windows to run?
(Without Maximum Server Memory being set, SQL Server will try and use all the memory which means Windows will keep using it's swap file slowing the whole machine down.)
3. Do any other programs on the server, eg SSAS, also have enough memory to run?
etc.
If the basics look okay, I would then run Brent Ozar's sp_Blitz and work from there.
September 16, 2016 at 5:48 am
If you don't have monitoring in place, then I would start with capturing the wait statistics over time. You need to be able to compare them, today to yesterday, this week to last week, in order to identify what is changing, what the top waits are, etc. That will give you any indications around if the problem is related to the system, memory, disk, etc.
After that, you need to capture the performance metrics on the queries in order to understand when and how they're running slowly. Since you're in 2008R2, the best solution is to create a server-side trace. That will capture a lot of information, so be prepared to deal with it. However, you'll be able to pinpoint when the performance is slow on the queries. Once you know which queries are slow you can focus on the queries to determine what the root cause may be. Usually it's poorly written T-SQL. However, it can also be bad or missing indexes, out of date or incorrect statistics, poor database structures, or any number of other issues. However, you can't start troubleshooting any of this until you identify specifically where the problems are.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 16, 2016 at 6:17 am
Further to Grant's post, even some basic monitoring of Latency, Waits and PLE once a month will show the way instances are heading:
http://www.sqlskills.com/blogs/paul/how-to-examine-io-subsystem-latencies-from-within-sql-server/
http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/
http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/
September 16, 2016 at 8:00 am
Ken McKelvey (9/16/2016)
Further to Grant's post, even some basic monitoring of Latency, Waits and PLE once a month will show the way instances are heading:http://www.sqlskills.com/blogs/paul/how-to-examine-io-subsystem-latencies-from-within-sql-server/
http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/
http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/
I'd shy away from PLE. By itself, it's such a poor indicator of problems that most people don't even look at it any more.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 16, 2016 at 8:14 am
Grant Fritchey (9/16/2016)
Ken McKelvey (9/16/2016)
Further to Grant's post, even some basic monitoring of Latency, Waits and PLE once a month will show the way instances are heading:http://www.sqlskills.com/blogs/paul/how-to-examine-io-subsystem-latencies-from-within-sql-server/
http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/
http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/
I'd shy away from PLE. By itself, it's such a poor indicator of problems that most people don't even look at it any more.
I admit I ignore BCHR, but if I see the PLE graph having a sustained dip against my baseline I start get a little interested.. or am I barking down the wrong tree here?
September 16, 2016 at 10:13 am
BLOB_EATER (9/16/2016)
Grant Fritchey (9/16/2016)
Ken McKelvey (9/16/2016)
Further to Grant's post, even some basic monitoring of Latency, Waits and PLE once a month will show the way instances are heading:http://www.sqlskills.com/blogs/paul/how-to-examine-io-subsystem-latencies-from-within-sql-server/
http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/
http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/
I'd shy away from PLE. By itself, it's such a poor indicator of problems that most people don't even look at it any more.
I admit I ignore BCHR, but if I see the PLE graph having a sustained dip against my baseline I start get a little interested.. or am I barking down the wrong tree here?
It does mean that there's been a flush in memory, yeah. Is that a bad thing? It depends. Ah, well what does it depend on? That's the stuff that tells you if there's a problem, not the PLE dip. That's the thing. It just doesn't say a whole lot beyond that there's been stuff flushed out of cache. If it's consistently low (below the magic 300 number), but all other indicators are fine, performance is fine, is it bad that PLE is low? Not at all I'd say. That's the thing, it tells you stuff, but it doesn't tell you enough.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 16, 2016 at 10:59 am
Grant Fritchey (9/16/2016)
BLOB_EATER (9/16/2016)
Grant Fritchey (9/16/2016)
Ken McKelvey (9/16/2016)
Further to Grant's post, even some basic monitoring of Latency, Waits and PLE once a month will show the way instances are heading:http://www.sqlskills.com/blogs/paul/how-to-examine-io-subsystem-latencies-from-within-sql-server/
http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/
http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/
I'd shy away from PLE. By itself, it's such a poor indicator of problems that most people don't even look at it any more.
I admit I ignore BCHR, but if I see the PLE graph having a sustained dip against my baseline I start get a little interested.. or am I barking down the wrong tree here?
It does mean that there's been a flush in memory, yeah. Is that a bad thing? It depends. Ah, well what does it depend on? That's the stuff that tells you if there's a problem, not the PLE dip. That's the thing. It just doesn't say a whole lot beyond that there's been stuff flushed out of cache. If it's consistently low (below the magic 300 number), but all other indicators are fine, performance is fine, is it bad that PLE is low? Not at all I'd say. That's the thing, it tells you stuff, but it doesn't tell you enough.
Absolutely thanks for the insight.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply