September 8, 2019 at 4:33 am
Here is the base fact set as of right now. Maybe someone else will run into this some day.
VMware 6.7U3, HW level 15. Last week, a well-behaved server with well-known, established queries (and performance) suddenly went sideways.
The issue appears to be with processor pressure.
I have a query that produces 90K rows in 3 seconds IF MaxDop is 1.
Run that query forcing MaxDop 2, and Activity Monitor will show the query, but you'll see ZERO disk I/O, memory, or processor pressure. What you will see, however, is TempDB growing like mad: 15 GB in 2 minutes. If I let the query run for 5 hours, TempDB will hit 50 GB, and I still won't see data.
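For anyone trying to reproduce the comparison, it helps to run the exact same statement twice with only the MAXDOP hint changed. Here is a minimal Python sketch that builds the two variants. It assumes the MaxDop was forced with a query-level OPTION (MAXDOP n) hint (the post does not say how it was forced), and dbo.BigReport and with_maxdop are hypothetical names, since the real query is not shown.

```python
def with_maxdop(query: str, dop: int) -> str:
    """Return `query` with an OPTION (MAXDOP n) query hint appended."""
    if dop < 0:
        raise ValueError("MAXDOP must be 0 or greater")
    # Drop trailing whitespace and a terminating semicolon so the hint
    # ends up inside the statement rather than after it.
    body = query.rstrip().rstrip(";").rstrip()
    return f"{body} OPTION (MAXDOP {dop});"


# Hypothetical stand-in for the real 90K-row query, which the post does not show.
QUERY = "SELECT * FROM dbo.BigReport;"

serial = with_maxdop(QUERY, 1)    # the variant reported to finish in 3 seconds
parallel = with_maxdop(QUERY, 2)  # the variant reported to fill TempDB

print(serial)
print(parallel)
```

This only handles statements that do not already carry an OPTION clause. Paste the two printed statements into separate query windows so the only difference between the runs is the degree of parallelism.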
To reiterate, this is not a new server, this is not a new query... clearly "something" has changed...but technically, nothing we can isolate has changed in the environment.
September 8, 2019 at 7:40 am
Rolling back U3 is possibly the best option at the moment, especially as other people are complaining about possible issues with this update.
Then get a copy of this VM onto another server with the same spec and load and with U2 installed. Test and ensure the queries run as expected.
Then update that server to U3 and see whether performance stays as expected or regresses to the problematic behavior.
Other things to look at/try
September 8, 2019 at 7:53 am
Thank you for the suggestions Frederico. We've done ALL of those already.
We just stood up a new server, pristine install on everything, same results.
I just upgraded the test environment to SQL 2017... same results.
September 8, 2019 at 8:33 am
That being the case, roll back U3 and contact VMware support with the issue. The fact that it can be reproduced on a brand-new environment will help them.
September 8, 2019 at 4:48 pm
That one is next on the list now as of this morning.
The Virtual guys are now discussing Cisco firmware updates. Apparently (in the environment where nothing changed on 8/3), we have now identified that within 7 days of Black Friday, Cisco firmware was updated from 4.0(2d) to 4.0(4d), VMware ESXi was upgraded from 6.7U2 to 6.7U3, and the hardware level went from 14 to 15 (done after things went south).
We have backed out every single change except this morning's addition, the Cisco firmware, and even started over with a pristine environment at what we think was running before 8/30 (it appears that there are way too many cooks, too many moving parts, and not nearly enough documentation). On the pristine environment, I even upgraded to SQL 2017 to see if a newer SQL would help. No change.
This is more Virtual Education than a DBA like me cares to have, and my weakness in knowing all the moving parts in someone else's environment puts me behind the eight ball when it comes to steering the systems folks in the right direction (we are still questioning "queries"). We are still trying to work through it, but I can prove it is a processor problem: any query that puts heavy processor pressure on the box now slides sideways, and if I allow big queries to go parallel, they get lost forever, filling TempDB like a landfill.
Here is where we have been on versions of the various bits.
Edit: Cisco firmware was rolled back to 4.0(2d), and the server still goes sideways if we allow queries to spread across more than one processor.
September 9, 2019 at 10:52 am
You've moved out of my core knowledge. I agree that you've proven it's related to the CPU and therefore to the VM's management of it. Good luck getting the people who will be tasked with actually doing the work to fix it to believe you.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning