Ideal MAXDOP & CPUs - how to trace for parallel queries?

  • There is a discussion going on in our company on what the ideal CPU count and Max Degree of Parallelism are for a 3rd party database server.

    The server has 12 CPUs, 32GB RAM and all database sizes add up to < 30GB so they can all fit in memory (I tried to force this by doing a select * from every table). On certain payroll days, the CPU gets maxed out to 100% for a few seconds.

    MAXDOP was originally set to the default 0. We later changed it to 8 based on several 'best-practices' articles. However the vendor suggests to change it to 1 (no parallelism), while others suggest changing it to 4, so that one run-away query doesn't hog most of the CPUs.

    I'd like to find out how many CPUs are actually being used by queries. There is a Degree of Parallelism event in [Profiler](https://msdn.microsoft.com/en-us/library/ms187943.aspx). The BinaryData column says :

    0x00000000, indicates a serial plan running in serial.

    0x01000000, indicates a parallel plan running in serial.

    >= 0x02000000 indicates a parallel plan running in parallel.

    - What does "parallel plan running in serial" mean ?

    I see a lot of 0x01000000, and a few 0x08000000's in my trace.

    - How can i determine whether one query is hogging CPUs and if reducing it to 4 will help ?

  • In general, you want something like the number of physical cores in one NUMA node, 1.5* that or 2* that. The idea is that you don't want queries paralleling across multiple NUMA nodes if at all possible.

    Don't base it on what queries are doing, that'll be affected by resources and types of query and other load, etc.

    Make sure cost threshold has been increased, the default is stupidly low.

    If CPU usage is a problem, identify and tune heavy CPU queries.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • number of physical cores in one NUMA node

    I forgot to mention, the servers are virtual (VMware). Does usual NUMA/core setting on physical servers still apply to VMs ?

    Make sure cost threshold has been increased, the default is stupidly low.

    Oh.. what do y'all recommend ? 10, 50, 100.. ?

    If CPU usage is a problem, identify and tune heavy CPU queries.

    unfortunately this is vendor code, so we're kinda restricted to indexes/configurations.. the turnaround on code tuning into their next release is very slow.

  • Sure. Speak with your VM admin about how the physcal NUMA nodes are mapped. The later versions of VMWare handle NUMA config.

    As for cost threshold, 5 (the default) is stupidly low. Higher. If you still see small queries paralleling, then higher still.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • What does "parallel plan running in serial" mean ?

    When the cost of a plan exceeds the cost threshold for parallelism, a parallel plan should be chosen (and cached). At run time, SQL Server determines whether there are sufficient available worker threads to support the use of parallelism. If there are insufficient available worker threads, the cached parallel plan will be run without using parallelism (i.e. the plan will be run serially).

    The number of worker threads needed depends upon the max degree of parallelism setting, and the number of parallelism operators (in a plan) that can be run concurrently. If the plan has 4 parallelism operations that can run concurrently and if max degree of parallelism is set to 8, SQL Server will consider whether 32 worker threads are available, and use parallelism if they are.

    The danger of going massively parallel (at run time) is that all available worker threads can be consumed, which prevents new sessions from being able to immediately gain use of a worker thread. Usually, a lack of worker threads becomes a problem when a plan consumes 100 or more worker threads. In the worst scenarios, all schedulers (CPUs) become bogged down with handling one or two statements, and SQL Server asserts (in its errorlog) that all schedulers are "deadlocked" [sic].

    A max degree of parallelism setting of 1 is generally useful for development environments, because it makes the developers feel the pain ;-). However, parallelism does not behave like a synchronized swimming team. Instead, parallelism behaves more like a cricket team. Which means twice as many threads <> twice as fast. It also means some threads (like the bowler) might be waiting (AKA the CXPACKET wait_type) for teammates to complete their chases after the ball...

    Hopefully, your developer should be load testing SQL Server (unless they are a "on size fits all" SQL development shop). They seem willing to suggest a max degree of parallelism setting of 1 (which suggests you bought an OLTP system). Now that you have established what they are, you need to haggle over their price: Can they offer you a load test environment, so that you can determine what works best when their design runs upon your hardware? If the price is too high, I think being concerned about parallelism is a minor quibble, as long as you stick to best practices.

  • I found this query that checks how many plans in cache actually used parallelism in order to compare their cost with COP (cost of parallelism):

    SELECT

    CAST(n.value('(@StatementSubTreeCost)[1]', 'VARCHAR(128)') AS DECIMAL(18,2))

    FROM sys.dm_exec_cached_plans AS cp

    CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp

    CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS qn(n)

    WHERE n.query('.').exist('//RelOp[@PhysicalOp="Parallelism"]') = 1

    It only returned 6 plans that used parallelism, out of a total 35K in the DMV !

    And their use counts were 1.

    Does that mean I don't really need to care about MAXDOP or COP values on this server?

  • No. It means that of the current plans in cache, only a few used parallelism. Depending how stable the plan cache is and when you looked at it, the plans could be a good sample of what runs on the server, or a terrible one.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply