August 28, 2018 at 7:37 am
I have an issue with a stored proc that uses a cursor (yes, I KNOW THIS ISN'T GOOD), however it is client app code and I don't have scope to change it. The stored proc runs on-premises on SQL 2008 SP2 in 30 mins. In Azure IaaS on SQL 2016 SP1 it takes 200 mins. I would have expected it to be faster than on-prem given it has been upgraded three versions newer, not six times worse in execution time.
OS: Win2008 Ent / SQL 2008 SP2 x64 32GB RAM / 16 cpu
OS: Win2012 R2 DC / SQL 2016 SP1 x64 56GB RAM / 16 cpu (Azure IaaS with managed Premium SSDs)
Anyone experienced this before?
qh
August 28, 2018 at 8:50 am
quackhandle1975 - Tuesday, August 28, 2018 7:37 AM
Did you compare the query plans between the two?
It could be the new CE introduced in 2014. There are some suggestions on ways to handle CE issues in this blog:
SQL Server 2016 new features to deal with the new CE
Sue
August 28, 2018 at 9:09 am
Thanks Sue, I never considered the CE, so I will have a look at that. I have looked at the Actual Execution Plan for the query on SQL 2016 in Azure, and most of the work (65%) is due to the cursor: it uses a Clustered Index Insert on [CWT_PrimaryKey]. I don't have access to the on-premises environment as it is currently production, but I need to get the plan to compare.
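In case it helps anyone following along: on SQL 2016 SP1 you can test the legacy CE without touching the compatibility level. A quick sketch (the table and column names below are placeholders, not from the actual proc):

```sql
-- Hypothetical slow statement, run under the pre-2014 estimator for this query only
SELECT SomeColumn
FROM dbo.SomeTable              -- placeholder names
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

-- Or flip the whole database back to the legacy CE (SQL 2016+),
-- without changing the compatibility level:
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;
```

If the hinted run matches on-prem timings, the new CE is the culprit; if not, you can rule it out quickly.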
Cheers
qh
August 31, 2018 at 2:50 am
Just a further update on this for anyone with similar issues. Firstly we looked at the entire on-premise SQL environment to see what was happening:
1. MAXDOP was set to 8
2. There is no HA (e.g. Mirroring) on-premises
3. Tempdb is set to multiple data files 8 x 5GB / 8 x 2GB
So, to keep everything consistent, we applied these settings to the Azure SQL Servers. Aaaaaand… it was still the same poor performance. We even took the db out of the AG, and this gave only a negligible perf gain. Damn! :crazy: Then one thing we did look at was this:
https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sql/virtual-machines-windows-sql-performance
And specifically this one:
"Enable read caching on the disk(s) hosting the data files and TempDB data files."
Thus we shut down the server(s) and applied this, and wow, it jumped performance up by a few notches. Not as good as on-premises, but still an order of magnitude better than we were getting. :satisfied: However, I am still left with the issue where specific stored procs (using a cursor) complete in around 3.5 hours on-premises, and the Azure environment takes double that. I am now looking at increasing the disks for more IOPS on the SQL trans log drives, as I ran sp_whoisactive (a brilliant tool if you haven't used it before) and I am getting heavy WRITELOG waits when these stored procs run.
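For anyone wanting to confirm the WRITELOG picture independently of sp_whoisactive, one rough approach is to snapshot the aggregate wait stats around a run (note: clearing the counters resets them server-wide, so only do this on a dev/test box):

```sql
-- Reset the aggregate wait stats (dev/test only - affects the whole instance)
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

-- ... run the slow stored proc here ...

-- Then check whether log-flush (and AG sync-commit) waits dominate
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('WRITELOG', 'HADR_SYNC_COMMIT')
ORDER BY wait_time_ms DESC;
```

Including HADR_SYNC_COMMIT matters here because with a synchronous AG replica, commit waits show up there as well as in WRITELOG.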
Any further recommendations gratefully received as this has almost got me stumped. :pinch:
qh
August 31, 2018 at 5:28 am
quackhandle1975 - Friday, August 31, 2018 2:50 AM
It's very likely to be the new CE. Have you been down that avenue yet?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
August 31, 2018 at 8:36 am
Phil Parkin - Friday, August 31, 2018 5:28 AM
It's very likely to be the new CE. Have you been down that avenue yet?
You're not the first person to ask that, however I didn't get any noticeable gain when I turned the legacy CE on for that particular database. Unless I was missing something.
qh
September 2, 2018 at 4:06 pm
Another update (for anyone that cares) 😀 I did some tests, running the process and using Resource Monitor to see what was going on while it ran. CPU: zip. RAM: zip. TempDB: you guessed it, zip. They were all doing nothing. The only workload was massive writes to the db log drive. So, By The Power Of Greyskull Azure, we increased the disk size to 2TB to get an IOPS increase from 5,000 to 7,500.
Yeah, nothing happened. Still tons of writes to the log drive. I bloody hate computers sometimes. The next experiment is to try a G-series VM in Azure, which apparently gives you 64,000 IOPS. These babies cost $$$$ so the client ain't gonna pay for it; we just want to see if it improves log write performance.
Stay tuned! :Wow:
qh
September 2, 2018 at 4:41 pm
Have you tried setting Delayed Durability? Just to see the impact - and as a permanent setting if it improves things and the DB can live with it on (or at least the particular cursor query that has the issue).
https://www.mssqltips.com/sqlservertip/5121/reduce-sql-server-writelog-waits-using-delayed-durability/
And cloud vCPUs are always slower than on-prem - those 16 vCPUs of that VM equate to a loss of 30 to 45% of the power of a 16-core (32-thread) server on-prem.
Also, the log drive should not have write cache enabled.
Out of curiosity, did you try disk striping? I would be curious to see the impact.
TempDB - did you put it on the local temp storage or on a premium disk? Local is normally faster, even if it doesn't seem to make a difference here.
Just to see the impact, I would also try:
- running it with the db set to the simple recovery model
- increasing the frequency of log backups
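For reference, the delayed durability experiment from the tip above boils down to something like this ("MyDatabase" is a placeholder). Bear in mind the trade-off: with delayed durability, a crash can lose the last few committed transactions, so treat this as a diagnostic rather than a default fix:

```sql
-- Force delayed (lazy) log flushes for every transaction in the database
ALTER DATABASE MyDatabase SET DELAYED_DURABILITY = FORCED;   -- placeholder DB name

-- ... re-run the slow stored proc and compare timings ...

-- Revert once the test is done
ALTER DATABASE MyDatabase SET DELAYED_DURABILITY = DISABLED;
```

If the run time drops dramatically with delayed durability forced, that is strong evidence the bottleneck is per-commit log flushes rather than raw disk throughput.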
September 2, 2018 at 7:50 pm
quackhandle1975 - Sunday, September 2, 2018 4:06 PM
What I really hate isn't the hardware... it's the hype. The good part about it all is that good code is even more important now. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
September 3, 2018 at 9:25 am
Right, we have a solution! Two things. First, one of the DBAs here decided to rework the offending code to use a temp table. Hey presto - the process runs in just over a minute!!!
Also we found an issue with the following:
https://blogs.msdn.microsoft.com/alwaysonpro/2018/02/06/analyze-synchronous-commit-impact-on-high-commit-rate-workloads/
Which makes sense, as the only major setup difference between the on-premises and Azure environments was that in Azure we are using Always On AGs. However, we clearly didn't test that properly, because when we took the database out of the AG (or stopped the secondary server) and ran the code, there was no performance gain. We simply wrapped the cursor in a transaction and hey presto, it runs in 4 minutes as opposed to 200! And it's easier to implement than the temp table fix.
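For anyone wanting the shape of the transaction fix: wrapping the whole cursor loop in one explicit transaction means the session waits on a durable log flush (and the synchronous AG commit acknowledgement) once at the end, instead of once per row. A minimal sketch - the cursor, table, and column names below are made up, not the client's actual code:

```sql
BEGIN TRANSACTION;   -- one commit wait for the whole loop instead of one per row

DECLARE @id INT;
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT id FROM dbo.SourceTable;          -- hypothetical source

OPEN cur;
FETCH NEXT FROM cur INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- hypothetical per-row work; each UPDATE no longer pays its own commit wait
    UPDATE dbo.TargetTable SET processed = 1 WHERE id = @id;
    FETCH NEXT FROM cur INTO @id;
END;
CLOSE cur;
DEALLOCATE cur;

COMMIT TRANSACTION;  -- single log-flush / single synchronous AG commit wait
```

The trade-off is that the transaction (and its locks) is held open for the whole loop, so for very long runs it may be worth committing in batches of a few thousand rows instead.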
As Jeff pointed out above, you can get lost in the hype: we were convinced the poor performance was the fault of Azure, however it was down to the (poor) code. Nice one, Jeff!
qh
September 3, 2018 at 9:44 am
quackhandle1975 - Monday, September 3, 2018 9:25 AM
Heh... thanks for the feedback. Like I tell people, "Performance is in the code... or not." 😀
That notwithstanding, I still prefer on-premise to anything in the cloud. There's no need for trickery with on-premise... just good code.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 4, 2018 at 3:48 am
I am just pleased we found a solution. When you are dealing with 10+ year old code and moving it to Azure/Cloud, you can run into issues: the primary task is just to migrate it *as is*, and we don't have scope (due to budget/politics/time constraints, etc.) to re-write the code. I am sure in the next few years people are going to run into these types of issues as more systems are migrated to the Cloud.
In my case, everyone was focusing on the cursor, which we all know is bad for SQL Server, but it still initially ran faster on-premises.
qh
September 4, 2018 at 6:03 am
quackhandle1975 - Tuesday, September 4, 2018 3:48 AM
I think it's awesome that you've documented what you ran into and the fixes that you ended up with. It's going to help a lot of folks with similar issues.
If you don't mind a suggestion, what you went through would make an awesome article and would help even more folks, because it would be headlined instead of being present "only" in a forum post that comparatively few will read.
As for me, I've bookmarked this thread and added it to my briefcase because of your documentation on the subject. I really appreciate it because I know that, someday, I'm going to be in the same boat. Thank you again.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 4, 2018 at 3:37 pm
Thank you Jeff, my pleasure. This is part of a huge migration to Azure (we go live this weekend), but once I get my life back, yeah, I think that's a great idea. I've only ever done editorials for SSC, but provided Mr Jones approves, I think I could replicate the issue in Azure and show the solution.
Hold that thought. 😀
qh
May 7, 2019 at 9:39 pm
Just a note I remembered when reading this: I did see a video SOMEWHERE that pretty much said you always want to wrap just about everything in transactions in the cloud.