October 28, 2012 at 9:42 pm
Hello Everyone,
Our production server experiences lot of blocking and locking from application.
I have attached the top waits from server. Can anyone help to understand or guide me what should be our first goal.
We cannot stop the traces for few days since we are in the midst of troubleshooting a application issue.
Other than trace wait I see a lot of waits on the OLEDB. We have Peer to peer replication setup done across networks.
wait_type waiting_tasks_countwait_time_ms
TRACEWRITE10376534 14220260175
OLEDB 300522122 14205053714
WRITELOG 547624686 7070550328
Does this sounds too much on a OLTP systems. Or is this a normal figure.
Please let me know if I need to submit any other information.
Thank you.
Babu
October 29, 2012 at 12:50 pm
baabhu (10/28/2012)
Hello Everyone,Our production server experiences lot of blocking and locking from application.
I have attached the top waits from server. Can anyone help to understand or guide me what should be our first goal.
We cannot stop the traces for few days since we are in the midst of troubleshooting a application issue.
Other than trace wait I see a lot of waits on the OLEDB. We have Peer to peer replication setup done across networks.
wait_type waiting_tasks_countwait_time_ms
TRACEWRITE10376534 14220260175
OLEDB 300522122 14205053714
WRITELOG 547624686 7070550328
Does this sounds too much on a OLTP systems. Or is this a normal figure.
Please let me know if I need to submit any other information.
Thank you.
Babu
one thing with sys.dm_os_wait_stats (which given the format is where i think you are looking) is that they are cumulative since last server restart (or stats reset). so with out knowing how much uptime your server has had these numbers could be really low, really high, or somewhere in the middle. also if you dont have any thing to compare to these numbers may be what is normal for your server and since you are looking at them while there is a problem have no idea if its even where you should be looking.
give that you ask if its to much on a oltp system i dont think you have a server base line (which is different for every server based on what its work load is.) with out alot more information it will be hard to pinpoint what is going on.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
October 30, 2012 at 11:31 am
capnhector (10/29/2012)
baabhu (10/28/2012)
Hello Everyone,Our production server experiences lot of blocking and locking from application.
I have attached the top waits from server. Can anyone help to understand or guide me what should be our first goal.
We cannot stop the traces for few days since we are in the midst of troubleshooting a application issue.
Other than trace wait I see a lot of waits on the OLEDB. We have Peer to peer replication setup done across networks.
wait_type waiting_tasks_countwait_time_ms
TRACEWRITE10376534 14220260175
OLEDB 300522122 14205053714
WRITELOG 547624686 7070550328
Does this sounds too much on a OLTP systems. Or is this a normal figure.
Please let me know if I need to submit any other information.
Thank you.
Babu
one thing with sys.dm_os_wait_stats (which given the format is where i think you are looking) is that they are cumulative since last server restart (or stats reset). so with out knowing how much uptime your server has had these numbers could be really low, really high, or somewhere in the middle. also if you dont have any thing to compare to these numbers may be what is normal for your server and since you are looking at them while there is a problem have no idea if its even where you should be looking.
give that you ask if its to much on a oltp system i dont think you have a server base line (which is different for every server based on what its work load is.) with out alot more information it will be hard to pinpoint what is going on.
Thank you very much for your suggestion. How do we define a baseline. I know it's hard to tell without much of information.
Can you share how do you do for your PROD servers. Based on that I will define our prod baseline.
October 30, 2012 at 11:32 am
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 31, 2012 at 12:13 pm
This page from Paul Randal's blog may assist you:
http://www.sqlskills.com/blogs/paul/post/Wait-statistics-or-please-tell-me-where-it-hurts.aspx
October 31, 2012 at 2:00 pm
Lee Crain (10/31/2012)
This page from Paul Randal's blog may assist you:http://www.sqlskills.com/blogs/paul/post/Wait-statistics-or-please-tell-me-where-it-hurts.aspx
Cool article, thanks for posting it!
One of the main servers I deal with has CXPACKET waits of about 60%. I've told the developers which SPs have a bunch, but they have yet to change the maxdop on them. It will be interesting to see how much that helps performance.
October 31, 2012 at 2:22 pm
scogeb (10/31/2012)
Lee Crain (10/31/2012)
This page from Paul Randal's blog may assist you:http://www.sqlskills.com/blogs/paul/post/Wait-statistics-or-please-tell-me-where-it-hurts.aspx
Cool article, thanks for posting it!
One of the main servers I deal with has CXPACKET waits of about 60%. I've told the developers which SPs have a bunch, but they have yet to change the maxdop on them. It will be interesting to see how much that helps performance.
CXPACKET waits are not necessarily a bad thing, sometimes they occur when one parallel thread takes longer to execute a porttion of a task than other threads. That can easily occur when the distribution of data is uneven and more work is handed off to one scheduler than another. Some potential sources that you can look for are table scans that are occurring because of missing indexes or out of date distribution statistics that lead to poor query plans being created.
One thing that you might consider is setting the cost threshold for paralellism higher so that fewer queries run in parallel but you would want to test that out thoroughly in a test environment first because it could actually make overall performance worse.
Or as you suggest, you can get the developers to override MAXDOP inside of the offending stored procedures.
November 1, 2012 at 2:04 am
Thank you very much everyone.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply