More processors, less performance...

  • Hi,

    I've encountered the following strange behaviour on some legacy DB environment. After adding extra processors (from 2 to 4) the performance for getting a query resultset dropped.

    I've tested the foillwing situations:

    --1 Processor(s): After 2 minutes, the resultset had amn average rowcount of 10 Million records

    --2 Processor(s): After 2 minutes, the resultset had amn average rowcount of 7 Million records

    --4 Processor(s): After 2 minutes, the resultset had amn average rowcount of 4 Million records

    I used PAL to monitor for hardware related issues, but besides a bit slow VMDK, I found nothing wrong.

    Environment I used consists of the following:

    Windows 2000 SP4 and SQL2000 SP4

    Configuration:

    OS on VMWARE VMDK (including OS paging file)

    APPS, LOG and DATA on LUN

    Any ideas?

    Kind regards,

    Dave

  • Not sure what your "rowcounts after 2 minutes" is supposed to measure. What are you using to measure performance and what metrics are you keeping?

    One possibility for reduced performance is if the queries are designed in such a way that parallelism hurts them. You can test this by running a benchmark query with and without MaxDOP 0.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Depending on the query, parallelism may not be best. More processors may help by allowing additional queries to be processed at the same time.

    Test your query again but with the following: OPTION (MAXDOP 1).

    Try it again changing the MAXDOP value to 2, 3, and 4 as well.

    You may want to investigate the Actual Execution Plan and see what it is doing differently as well with each change in Max Degree of Parallelism.

  • The query takes a long time to execute. So after two minutes I stop the query to see how many records are retrieved. The more records retrieved in those two minutes, the better the query performance.

    The issue in the execution plan is focusing on SORT (100%). Adding indexes would be the best in order to accommodate the GROUP BY the SORT in the execution plan is referring to. However I am not allowed to change the database design 🙁

    The funny thing is that two weeks ago I had no issues. Just after change local storage to LUNs and adding more processors, issues starting to appear.

  • Dave Hoogendoorn (4/20/2009)


    The query takes a long time to execute. So after two minutes I stop the query to see how many records are retrieved. The more records retrieved in those two minutes, the better the query performance.

    The issue in the execution plan is focusing on SORT (100%). Adding indexes would be the best in order to accommodate the GROUP BY the SORT in the execution plan is referring to. However I am not allowed to change the database design 🙁

    The funny thing is that two weeks ago I had no issues. Just after change local storage to LUNs and adding more processors, issues starting to appear.

    Well, now you have changed the scenario. This isn't just one change, but multiple changes. How are the LUNs configured on the SAN? Are multiple servers actually sharing the same physical disks? This would have an impact on performance as well.

    Have you tried the OPTION (MAXDOP 1) as I suggested?

  • Dave Hoogendoorn (4/20/2009)


    The query takes a long time to execute. So after two minutes I stop the query to see how many records are retrieved. The more records retrieved in those two minutes, the better the query performance.

    That's not really a valid test. If most of the query isn't returnable yet, because it's being processed in parallel, the final result might be faster, but the partial result might be less. You'd need to test it all the way through to see which number of CPUs is faster.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Maxdop unfortunately did not give any major improvements.

    Regarding the validity of the test;I agree. However. Before the issues started, the queries were much, much faster then those 2 minutes....

    So stopping that early (within 2 minutes) is actually admitting that adding processor power is not a valid solution.

    Thanks for the replies. I hope I will be allowed to change design somewhere this week

  • Dave Hoogendoorn (4/20/2009)


    Maxdop unfortunately did not give any major improvements.

    Regarding the validity of the test;I agree. However. Before the issues started, the queries were much, much faster then those 2 minutes....

    So stopping that early (within 2 minutes) is actually admitting that adding processor power is not a valid solution.

    Thanks for the replies. I hope I will be allowed to change design somewhere this week

    Why are you focussing only on the number of processors? You also indicated that you switched from local storage to LUN's (indicating that you moved to a SAN or NAS storage system). This could also be part of your problem depending on how the LUN's are configured.

  • Regarding I/O:

    I already used PAL to get an impression of LAN (SAN) performance. After tweaking a little, discovered issues regarding SAN performance were solved. As for as I know throughput, access times etc are no problem.

    Going back to the processor issue; I am already convinced that adding hardware is not the solution. What I'm especially interested in, is an explanation for this strange behavior (more processor power, less performance).

    Just as a test case I will execute the test again only this time waiting for the final resultset.

  • Can you also post the DDL for the tables (including indexes) and the query that you are running?

  • Dave Hoogendoorn (4/20/2009)


    Maxdop unfortunately did not give any major improvements.

    Regarding the validity of the test;I agree. However. Before the issues started, the queries were much, much faster then those 2 minutes....

    So stopping that early (within 2 minutes) is actually admitting that adding processor power is not a valid solution.

    Thanks for the replies. I hope I will be allowed to change design somewhere this week

    I don't understand. You listed it as taking more than 2 minutes with 1 processor in your test. Is that just by setting MaxDOP? Or was it actually on just one processor?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply