April 20, 2009 at 11:46 am
Hi,
I've encountered the following strange behaviour on some legacy DB environment. After adding extra processors (from 2 to 4) the performance for getting a query resultset dropped.
I've tested the foillwing situations:
--1 Processor(s): After 2 minutes, the resultset had amn average rowcount of 10 Million records
--2 Processor(s): After 2 minutes, the resultset had amn average rowcount of 7 Million records
--4 Processor(s): After 2 minutes, the resultset had amn average rowcount of 4 Million records
I used PAL to monitor for hardware related issues, but besides a bit slow VMDK, I found nothing wrong.
Environment I used consists of the following:
Windows 2000 SP4 and SQL2000 SP4
Configuration:
OS on VMWARE VMDK (including OS paging file)
APPS, LOG and DATA on LUN
Any ideas?
Kind regards,
Dave
April 20, 2009 at 11:54 am
Not sure what your "rowcounts after 2 minutes" is supposed to measure. What are you using to measure performance and what metrics are you keeping?
One possibility for reduced performance is if the queries are designed in such a way that parallelism hurts them. You can test this by running a benchmark query with and without MaxDOP 0.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 20, 2009 at 11:56 am
Depending on the query, parallelism may not be best. More processors may help by allowing additional queries to be processed at the same time.
Test your query again but with the following: OPTION (MAXDOP 1).
Try it again changing the MAXDOP value to 2, 3, and 4 as well.
You may want to investigate the Actual Execution Plan and see what it is doing differently as well with each change in Max Degree of Parallelism.
April 20, 2009 at 12:34 pm
The query takes a long time to execute. So after two minutes I stop the query to see how many records are retrieved. The more records retrieved in those two minutes, the better the query performance.
The issue in the execution plan is focusing on SORT (100%). Adding indexes would be the best in order to accommodate the GROUP BY the SORT in the execution plan is referring to. However I am not allowed to change the database design 🙁
The funny thing is that two weeks ago I had no issues. Just after change local storage to LUNs and adding more processors, issues starting to appear.
April 20, 2009 at 12:42 pm
Dave Hoogendoorn (4/20/2009)
The query takes a long time to execute. So after two minutes I stop the query to see how many records are retrieved. The more records retrieved in those two minutes, the better the query performance.The issue in the execution plan is focusing on SORT (100%). Adding indexes would be the best in order to accommodate the GROUP BY the SORT in the execution plan is referring to. However I am not allowed to change the database design 🙁
The funny thing is that two weeks ago I had no issues. Just after change local storage to LUNs and adding more processors, issues starting to appear.
Well, now you have changed the scenario. This isn't just one change, but multiple changes. How are the LUNs configured on the SAN? Are multiple servers actually sharing the same physical disks? This would have an impact on performance as well.
Have you tried the OPTION (MAXDOP 1) as I suggested?
April 20, 2009 at 1:03 pm
Dave Hoogendoorn (4/20/2009)
The query takes a long time to execute. So after two minutes I stop the query to see how many records are retrieved. The more records retrieved in those two minutes, the better the query performance.
That's not really a valid test. If most of the query isn't returnable yet, because it's being processed in parallel, the final result might be faster, but the partial result might be less. You'd need to test it all the way through to see which number of CPUs is faster.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 20, 2009 at 1:28 pm
Maxdop unfortunately did not give any major improvements.
Regarding the validity of the test;I agree. However. Before the issues started, the queries were much, much faster then those 2 minutes....
So stopping that early (within 2 minutes) is actually admitting that adding processor power is not a valid solution.
Thanks for the replies. I hope I will be allowed to change design somewhere this week
April 20, 2009 at 1:38 pm
Dave Hoogendoorn (4/20/2009)
Maxdop unfortunately did not give any major improvements.Regarding the validity of the test;I agree. However. Before the issues started, the queries were much, much faster then those 2 minutes....
So stopping that early (within 2 minutes) is actually admitting that adding processor power is not a valid solution.
Thanks for the replies. I hope I will be allowed to change design somewhere this week
Why are you focussing only on the number of processors? You also indicated that you switched from local storage to LUN's (indicating that you moved to a SAN or NAS storage system). This could also be part of your problem depending on how the LUN's are configured.
April 20, 2009 at 1:48 pm
Regarding I/O:
I already used PAL to get an impression of LAN (SAN) performance. After tweaking a little, discovered issues regarding SAN performance were solved. As for as I know throughput, access times etc are no problem.
Going back to the processor issue; I am already convinced that adding hardware is not the solution. What I'm especially interested in, is an explanation for this strange behavior (more processor power, less performance).
Just as a test case I will execute the test again only this time waiting for the final resultset.
April 20, 2009 at 1:51 pm
Can you also post the DDL for the tables (including indexes) and the query that you are running?
April 20, 2009 at 2:15 pm
Dave Hoogendoorn (4/20/2009)
Maxdop unfortunately did not give any major improvements.Regarding the validity of the test;I agree. However. Before the issues started, the queries were much, much faster then those 2 minutes....
So stopping that early (within 2 minutes) is actually admitting that adding processor power is not a valid solution.
Thanks for the replies. I hope I will be allowed to change design somewhere this week
I don't understand. You listed it as taking more than 2 minutes with 1 processor in your test. Is that just by setting MaxDOP? Or was it actually on just one processor?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply