April 29, 2009 at 6:27 pm
Hello all, I have a SQL 2005 server that I know is under CPU pressure at times (dm_os_schedulers shows runnable_tasks_count around 18 across the 4 CPU's for the hour plus span of high CPU waits). What has me and my client wondering is how come when this bottleneck happens we end up with normally fast queries (1-2 seconds) getting stuck in a sos_scheduler_yield wait during this hour period. The query is a pretty simple select, why would it continue to yield the CPU to other threads. At some point shouldn't this query eventually get to the front of the line and get the signal wait it needs to move forward? Other queries appear to eventually make it through, why would others seem to get stuck?
This query I'm picking on is a lookup type query that runs all day long. Other queries also get stuck, but today this query showed up most often. Also, there were no other queries that had a stranglehold on any of the CPU's. Statistics are up to date (full scan 3 days ago).
Sorry for the general question, just wondering if anyone else has seen similar behavior and may have an opinion.
Thanks.
April 29, 2009 at 6:57 pm
We really need to query to attempt help. It is likely a bad join causing a table scan....nothing a rewrite of the query or an index can't fix.....let's see a query to get started.
April 29, 2009 at 10:57 pm
Here is the query. It's an accounting application. Transactions are in the trans table and details about those transactions are in the detail table. The query is pulling up the batch header of open batches of transactions of a certain transaction type for a certain property.
SELECT bh.uRef + ' (#' + rtrim(convert(char, bh.hMy - 1300000000)) + ')' FROM trans bh With (Index(I_TRANS_10)) WHERE bh.hPerson = 0 AND bh.iType = 13 AND bh.bOpen = -1 AND ( EXISTS ( SELECT t.hParent2 FROM trans t INNER JOIN detail d on d.hInvOrRec = t.hMy WHERE d.hProp in (select hProperty FROM listprop2 WHERE hPropList = 1811) AND t.hParent2 = bh.hMy ) OR bh.iType2 = 0) and bh.sModuleCreatedBy 'CM' order by uRef
April 30, 2009 at 7:52 am
Is this a new query? What was the last change to the database before this behavior started happening?
As for the query itself, I made one small modification relocating the bh.sModuleCreatedBy 'CM' to above the EXISTS statement. Just looking at the query, I would suggest really focusing on the inner join to look for additional or different columns to join on.
SELECT bh.uRef + ' (#' + rtrim(convert(char, bh.hMy - 1300000000)) + ')'
FROM trans bh With (Index(I_TRANS_10))
WHERE bh.hPerson = 0 AND bh.iType = 13 AND bh.bOpen = -1 AND
bh.sModuleCreatedBy 'CM' and
(EXISTS ( SELECT t.hParent2 FROM trans t
INNER JOIN detail d on d.hInvOrRec = t.hMy WHERE d.hProp in
(select hProperty FROM listprop2 WHERE hPropList = 1811)
AND t.hParent2 = bh.hMy )
OR bh.iType2 = 0
)
order by uRef
I would not make any query changes though if something else has recently changed with the server. Start with the last change and work backwards. If no changes, the table may have just grown to the point that a query inefficiency has started rearing it's head....
April 30, 2009 at 9:33 am
The query, database and server haven't gone through any recent changes, but the user load has been steadily increasing. Changing the query or schema (adding indexes) is possible, but more of a medium term solution, plus it needs development buy in before it would get started. Their first question would be, why does the same query run just great on the database for this client all day long but only periodically get stuck in an infinite CPU wait. And this isn't the only query that seemingly gets stuck.
I'm sure we are seeing CPU pressure, but at this poing I'm wondering why queries would be in a sos_scheduler_yield wait for over an hour. Queries that normally run without a problem when the CPU isn't strained. Shouldn't the query eventually stop yielding and start running? Other queries are able to even under CPU pressure.
We are looking to add more CPU's but I'm concerned that I may end up with 8 CPU's with high runnable tasks counts.
April 30, 2009 at 10:09 am
Adding hardware is always an option and it can't hurt. Easier to justify right now too.
It is also easy to justify some query changes if they help. Do timings on the original query versus the changes. This query is ripe for improvement with the inner joins and subselect. Lot's of production code, including mine, can be made more efficient....Unfortunately we don't have infinite time to make perfect queries so we have to wait on events like these to go in and tweak things....
Thanks for the question!
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply