July 4, 2017 at 7:25 pm
OK, so we have an application that has gone through an upgrade which also saw the database move from a VMware virtual server with SQL 2008 R2 SP3 to a VMware virtual server (Windows Server 2016) with SQL 2016 SP1. The performance of the application is a lot worse, with heavy load on the CPU of the SQL server. Things that used to take 3 seconds now take 1 minute.
I can see that the application uses cursors and, by tracing, I can link the long durations and high CPU to a lot of the cursor fetches. I have also had a look at the query in the cursor definition: it executes very quickly when run in SSMS using a clustered index seek, but through the application the cursor fetches are taking quite some time and adding load to the CPU. We can replicate the issue in preprod and dev. Dev is standalone; prod and preprod are set up with synchronous availability groups (a primary and a secondary replica, not distributed).
Looking at Activity Monitor I can see some waits of PAGEIOLATCH_SH and WRITELOG at times. The page life expectancy seems healthy, in the 5000s. I suspect the way the cursor is fetching the rows is not optimal.
The biggest-impact query via a cursor is along the lines of: select id, c2, c3 ... c20 from view where id in (@p1, @p2 ... @p100)
Yes, that's 100 parameters. The view is along the lines of: select * from table where type = 0
There is already an index on the type column, and the PK and clustered index are on the id column.
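For concreteness, a minimal sketch of that shape; every object and column name below is an assumption, not the vendor's actual schema:
-- Hypothetical stand-in for the vendor schema described above
CREATE TABLE dbo.BaseTable
(
    id   int         NOT NULL,
    type tinyint     NOT NULL,
    c2   varchar(50) NULL,  -- c3 through c20 elided
    CONSTRAINT PK_BaseTable PRIMARY KEY CLUSTERED (id)
);
CREATE INDEX IX_BaseTable_type ON dbo.BaseTable (type);
GO
CREATE VIEW dbo.TheView AS
SELECT * FROM dbo.BaseTable WHERE type = 0;
GO
-- Shape of the cursor's SELECT (3 of the 100 parameters shown)
DECLARE @p1 int = 1, @p2 int = 2, @p100 int = 100;
SELECT id, c2  -- c3 through c20 elided
FROM dbo.TheView
WHERE id IN (@p1, @p2, @p100);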
We have tried the following troubleshooting:
- New indexes
- Filtered index
- Clearing proc cache
- Using the legacy cardinality estimation (see the configuration sketch after this list)
- Turning off parameter sniffing
- Setting the compatibility level of the database back to 100
- Applied SQL 2016 SP1 CU3
- Restarted the instance
- Turned off SCOM monitoring
- Turned off the data collector
- Checked error logs and event logs
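For reference, the legacy CE, parameter sniffing, and compatibility level items above can all be set per database on 2016 SP1; a sketch of the commands, in case they are useful to anyone (the database name is a placeholder):
-- Run in the user database (SQL Server 2016+)
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SNIFFING = OFF;
-- Compatibility level 100 = SQL Server 2008/2008 R2 optimizer behavior
ALTER DATABASE [VendorDb] SET COMPATIBILITY_LEVEL = 100;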
How can I better troubleshoot cursor fetches that have long durations and high CPU?
July 5, 2017 at 2:37 am
When you upgraded, did you update stats, refresh views, etc.?
There is a handy post about what to do after an upgrade:
https://thomaslarock-com.cdn.ampproject.org/c/s/thomaslarock.com/2017/04/upgrading-sql-server-2016-post-upgrade-tasks/amp/
If you want to look further into cursor stats then try looking at the sys.dm_exec_cursors DMV.
Or you could explore the plan cache for queries with high run time or compile time; there are a few good posts from Jonathan Kehayias on sqlskills.com about this.
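As a starting point for both of those, a sketch (nothing here is app-specific):
-- Active cursors across all sessions (argument 0 = all), with their statements
SELECT c.session_id, c.cursor_id, c.properties, c.creation_time, c.is_open,
       t.text AS cursor_statement
FROM sys.dm_exec_cursors(0) AS c
OUTER APPLY sys.dm_exec_sql_text(c.sql_handle) AS t;

-- Top plan-cache entries by total CPU; order by total_elapsed_time for duration instead
SELECT TOP (20)
       qs.total_worker_time, qs.execution_count,
       qs.total_worker_time / qs.execution_count AS avg_worker_time,
       t.text AS query_text, qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;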
July 5, 2017 at 2:48 am
Have run update stats on the database. Have run update stats with fullscan on the offending table. I can see the fetch is definitely using a bad plan with a costly Filter operation. Maybe I need to force a good plan.
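One way to force behavior without touching the vendor's SQL would be a plan guide that attaches a hint to the matching statement; a rough sketch only, where the statement text and parameter list are placeholders that must match the app's exactly (matching can be finicky for API-cursor statements):
EXEC sp_create_plan_guide
    @name            = N'PG_CursorSelectLegacyCE',   -- name is arbitrary
    @stmt            = N'select id, c2 from view where id in (@p1, @p2)',
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = N'@p1 int, @p2 int',
    @hints           = N'OPTION (USE HINT(''FORCE_LEGACY_CARDINALITY_ESTIMATION''))';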
July 5, 2017 at 4:13 am
burfos - Wednesday, July 5, 2017 2:48 AM:
Have run update stats on the database. Have run update stats with fullscan on the offending table. I can see the fetch is definitely using a bad plan with a costly Filter operation. Maybe I need to force a good plan.
Different cardinality estimator introduced in 2014?
There is a trace flag (9481) which turns it off.
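For a single query, a sketch of two ways to get the old estimator back on 2016 SP1 (stand-in table and parameters, not the vendor's):
DECLARE @p1 int = 1, @p2 int = 2;
-- Trace flag 9481 reverts this query to the pre-2014 estimator (needs sysadmin)
SELECT id, c2 FROM dbo.BaseTable WHERE id IN (@p1, @p2)
OPTION (QUERYTRACEON 9481);
-- 2016 SP1+: same effect via a documented hint, no sysadmin required
SELECT id, c2 FROM dbo.BaseTable WHERE id IN (@p1, @p2)
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));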
_____________
Code for TallyGenerator
July 5, 2017 at 11:46 am
For that one query, try putting the variables into a (keyed) temp table (not a table variable), then joining to that table:
-- a keyed temp table (unlike a table variable) gives the optimizer real statistics
CREATE TABLE #ids ( id int NULL );
CREATE CLUSTERED INDEX ids__CL ON #ids ( id ) WITH ( FILLFACTOR = 100 );
INSERT INTO #ids VALUES (@p1), (@p2), ..., (@p100);
... FROM view INNER JOIN #ids i ON i.id = view.id
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
July 5, 2017 at 6:07 pm
Yep, have tried legacy cardinality estimation.
Unable to modify the query as it's a vendor app.
I have turned on the Query Store and can see that it's definitely a bad plan: it pulls back all 8 million rows and then does a costly Filter on those rows with the id parameters.
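For anyone following along, the offending plan can be pulled straight out of Query Store; a sketch, with the LIKE filter as a placeholder:
SELECT q.query_id, p.plan_id, qt.query_sql_text,
       TRY_CAST(p.query_plan AS xml) AS query_plan
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
WHERE qt.query_sql_text LIKE N'%where id in (@p1%';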
July 6, 2017 at 7:41 am
Probably too late to do this now, but it'll prove handy for any new upgrades you do.
Moving to 2016, the Query Store is your best friend. Move the database(s) in question, but leave the compatibility level as it was on the old server. Run your system for a few weeks with the Query Store on. Then switch the compatibility level. If you get regressions, you can use plan forcing from Query Store to fix them immediately. No need to change the code.
For the problem immediately in front of you, do you have the old plans? Are they different? It's highly likely that it's the new cardinality estimation, but it could also be changes within the optimizer (these are enabled by default in 2016, a change from previous versions). Without the old plans to compare, troubleshooting to spot the problem will be harder, as will identifying a solution to communicate to the vendor.
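A sketch of that workflow; the database name, sizes, and the query/plan ids below are placeholders:
-- Turn Query Store on before flipping the compatibility level
ALTER DATABASE [VendorDb] SET QUERY_STORE = ON;
ALTER DATABASE [VendorDb] SET QUERY_STORE (OPERATION_MODE = READ_WRITE, MAX_STORAGE_SIZE_MB = 2048);
-- After the switch, identify regressions (SSMS's built-in Query Store reports,
-- or sys.query_store_query / sys.query_store_plan), then pin the known-good plan:
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;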
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 26, 2017 at 2:42 am
Thanks for all the replies. The problem with Query Store is that it fills up in minutes... (2GB Query Store). Anyway, the vendor has provided a patch for their application after I fed the info through to them. The problem was that the old version used sp_cursoropen with values hard-coded in the cursor definition, while the new version used sp_cursorprepexec with parameters. The new approach performed better for most statements, except those of the form select c1, c2 from t1 where c1 in (@p1, @p2); those produced scans and a bad execution plan. For those cursors, they switched back to the old method in the patch they released.
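For what it's worth, the rapid fill-up can usually be tamed with capture-mode and cleanup settings; the values below are illustrative only:
ALTER DATABASE [VendorDb] SET QUERY_STORE
(
    QUERY_CAPTURE_MODE = AUTO,      -- skip one-off ad hoc statements
    MAX_STORAGE_SIZE_MB = 4096,
    SIZE_BASED_CLEANUP_MODE = AUTO,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 7)
);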
July 26, 2017 at 5:13 pm
burfos - Wednesday, July 26, 2017 2:42 AM:
... the vendor has provided a patch for their application ... For those cursors, they switched back to the old method in the patch they released.
Sounds to me like it's still broken. It's still using a cursor and, it sounds like, a poorly designed table.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 26, 2017 at 6:46 pm
I don't disagree but... vendor app. If I could rewrite their tables\queries and get rid of cursors, I would.
July 29, 2017 at 12:07 pm
burfos - Wednesday, July 26, 2017 6:46 PM:
I don't disagree but... vendor app. If I could rewrite their tables\queries and get rid of cursors, I would.
Yeah... been there, done that, going through it again right now. It's never a fun thing, especially if they have this thing in their throat called "portability", which makes them not want to use stored procedures, etc. It's a real shame.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 31, 2017 at 3:45 pm
burfos - Wednesday, July 26, 2017 6:46 PM:
I don't disagree but... vendor app. If I could rewrite their tables\queries and get rid of cursors, I would.
Have you talked to the vendor about it? If they claim compatibility with SQL Server 2016 (Compatibility Level 130), then I would think a significant drop in performance should be covered by your support contract. Odds are you are not the only customer experiencing the problem with their application.
Wes
(A solid design is always preferable to a creative workaround)
July 31, 2017 at 5:49 pm
whenriksen - Monday, July 31, 2017 3:45 PM:
Have you talked to the vendor about it? If they claim compatibility with SQL Server 2016 (Compatibility Level 130), then I would think a significant drop in performance should be covered by your support contract. Odds are you are not the only customer experiencing the problem with their application.
I've found that to be the case once in a while. I've updated COTS software before to perform better. It turned out the authors were quite interested in the improvements.
The answer is always "no" unless you ask, so you might want to give it a try.
August 2, 2017 at 9:05 am
Ed Wagner - Monday, July 31, 2017 5:49 PM:
I've found that to be the case once in a while. I've updated COTS software before to perform better. It turned out the authors were quite interested in the improvements. The answer is always "no" unless you ask, so you might want to give it a try.
Heh... been there, done that, hate every minute of it because the whole thing, womb to tomb, ends up being a stupid exercise.
1. Spend a lot of time determining what the problem actually is.
2. Contact the S/W vendor to advise them of the problem and that it needs to be fixed.
3. Listen to the vendor say stupid things like "it operates as designed" or "we can't duplicate the problem". That's when they finally get to looking at the problem, which may also be accompanied by multiple calls asking them when the hell they're going to get to the problem.
4. Do a deep dive to find out exactly where the problem is.
5. Do a deeper dive to figure out how to fix it without breaking the interface, etc.
6. Design and test the fix on expensive software that shouldn't have had the problem to begin with if they had actually done some reasonable testing.
7. Provide the fix to the vendor.
8. Go back and forth for another couple of months.
9. Finally get word that they were able to duplicate the problem and that the fix is scheduled to go in the next release.
10. Wait more months for the release to come out.
11. Find out that they changed the fix you gave them and the problem is actually worse.
12. Wash, rinse and repeat the above. They finally come up with a fix that works.
13. Get no cost breaks from the vendor on supposed "support", which costs dearly every bloody month.
Considering that this has happened on software with an initial purchase price sometimes in the hundreds of thousands of dollars range and monthly support costs larger than any of our salaries, I've grown quite weary of fixing "OPS" (Other People's S**T). We spend a whole lot of time on their over-expensive product to make their junk better and they won't even throw us a bone.
Then again, sometimes (actually, frequently) the answer is "No" even if you do ask. :crazy:
Sorry about the rant... it's been one of those decades. :sick:
--Jeff Moden
Change is inevitable... Change for the better is not.
August 2, 2017 at 9:40 am
Jeff Moden - Wednesday, August 2, 2017 9:05 AM:
Heh... been there, done that, hate every minute of it because the whole thing, womb to tomb, ends up being a stupid exercise. ... Sorry about the rant... it's been one of those decades. :sick:
Heh - I know what you mean about software vendors being obtuse and thinking they know everything there is to know. My experience went along a much abbreviated path on software that was much cheaper than what you describe.
1. Spend time determining what the problem actually is.
2. Contact the S/W vendor to advise them of the problem and that it needs to be fixed.
3. Listen to the vendor say stupid things like "it operates as designed" or "that isn't a problem in our test environment". They also offer a bunch of excuses that really mean nothing and bury the whole thing in bureaucracy to avoid doing any work.
4. Fix the problem that never should have existed in the first place. Take it through normal testing. Release it to production.
5. Provide the fix to the vendor along with a description of why the problem is a problem. If they have a support forum, post the problem, root cause description and fix for the world to see. This helps others who encounter the same thing and encounter the same bureaucratic nonsense.
6. Let it go because I don't really care if they want to remain blissfully ignorant. If they decide to fix their own software, so be it. Any "upgrades" we install will be patched with the fix after installation and before going through testing and release to production.