MaxDop setting

  • Pretty straightforward question.  We want to modify the MaxDop setting on our SQL Server instance.  The only question I have: is it necessary to request a system outage, or is this something we can do whenever we want to implement it?  Right now it's set to the default, zero.  Everything I've read says to just do it; it takes effect immediately.

    Any comments / URLs are appreciated - thanks.

  • Hi,

    You can change this setting without an outage or restart.

    Kind regards,

    Andreas

  • If you run the following query, you will see that the option is dynamic (see the is_dynamic column):

    select * from sys.configurations where name = 'max degree of parallelism'

    It means that the change takes effect when the RECONFIGURE statement is executed.
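
    A minimal sketch (the MAXDOP value of 8 below is just a placeholder, not a recommendation): enable advanced options if needed, change the setting, and run RECONFIGURE; no service restart is required because the option is dynamic.

    EXEC sys.sp_configure N'show advanced options', 1;
    RECONFIGURE;
    EXEC sys.sp_configure N'max degree of parallelism', 8;
    RECONFIGURE;

    -- confirm that the running value picked up the change
    SELECT name, value, value_in_use, is_dynamic
    FROM sys.configurations
    WHERE name = N'max degree of parallelism';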

    https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-configurations-transact-sql


  • If you're looking at a default setting for MaxDOP, there's a good chance you also have a default setting for Cost Threshold for Parallelism. In addition to changing your MaxDOP setting, which you should do, you also need to change the Cost Threshold setting. 5 is stupidly low. Here's a way to get a good value.
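
    One hypothetical sketch of the general idea - not necessarily the exact method referenced above - is to pull the estimated subtree costs of cached plans out of the plan cache and see where your expensive statements actually sit before you pick a value. Note that scanning the plan cache can be expensive on a busy server.

    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT TOP (50)
           qp.query_plan.value('(//StmtSimple/@StatementSubTreeCost)[1]', 'float') AS estimated_cost,
           qp.query_plan.value('(//StmtSimple/@StatementText)[1]', 'nvarchar(500)') AS statement_text,
           cp.usecounts
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
    WHERE qp.query_plan IS NOT NULL
    ORDER BY estimated_cost DESC;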

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • My plan was to change the MaxDop first, then the CTFP after a few days.  Any issue with that?

    Tried your query without success.  It's going to take me a bit to get it right; I believe that will require looking at your other web page and queries.  Not exactly surprised, but frankly these are new concepts and I'm looking forward to the challenges (this is the good "stuff").

    I had read a document from VMWare:

    https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/sql-server-on-vmware-best-practices-guide.pdf

    It pretty much suggested setting CTFP to 50.  I'm not quite satisfied with that.  I found queries written by Jonathan Kehayias and Jared Karney, which I tried in my test environment.  Both yielded very similar results, which was encouraging.

    Comments are appreciated - thank you all.

  • If your CTFP is still at the default value of 5, there is very little use in changing max DOP until the CTFP has been adjusted.

    At a CTFP of 5, basically any query with any meaningful cost is going to go parallel.

    I personally wouldn't start at a cost threshold of 50. In a virtual environment you can run into performance problems from excessive parallelism more readily than you will on a physical server, so I think they just recommended something a bit higher. It really depends on your workload. 15 is probably about the absolute minimum you would want to have it at. I have had only one case where I set it lower, and that was because of a poorly written third-party query we couldn't modify that ran many times per second with a total runtime greater than (number of cores * one second), which caused it to run at very high CPU all the time. Allowing it to go parallel meant that all the queries could complete in much less than one second and leave more CPU resources for other apps.

    I generally start (somewhat arbitrarily) at 20 and adjust up and down from there, typically going up to about 35 for mixed workloads. I don't think I've ever needed to go beyond that other than on data warehouse and reporting servers, where I have set it as high as 65.

    As long as your virtual host is not subscribed at more than one virtual processor per logical processor, you will usually benefit from a max degree of parallelism lower than 50 on non-DW/reporting servers.

    In any case, if you change the max DOP to less than the number of cores in the server while your CTFP is still set to 5, you aren't going to get very good data, and later you could see a huge swing in performance in the wrong direction when you change CTFP, or could end up leaving performance on the table while figuring out which max DOP is best for that CTFP. 5 is really too low for any sort of workload on modern SQL Server and modern hardware.
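
    To make that concrete, a sketch of changing both settings together (the values 20 and 8 are placeholders pulled from this discussion, not recommendations for your hardware):

    EXEC sys.sp_configure N'show advanced options', 1;
    RECONFIGURE;
    -- raise the cost threshold first so trivially cheap queries stop going parallel
    EXEC sys.sp_configure N'cost threshold for parallelism', 20;
    -- and cap the degree of parallelism below the core count at the same time
    EXEC sys.sp_configure N'max degree of parallelism', 8;
    RECONFIGURE;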

  • Browncoat42: credit to you for at least envisioning data warehouses with a definite need for more than what looks to me like the rule of thumb you seem to apply.


    [Screenshot: dwh_CTFP]

    This screenshot has been taken just prior to this post and represents an unspecified MSSQL Data Warehouse.

    And this is an unspecified MSSQL OLTP Server:

    [Screenshot: oltp_CTFP]

    So anything from 5 … [65] is a clear fail here. Of course these values depend on the hardware in use as well, which leaves me with this point of view on those 5 - 65 suggestions.

    First of all, it's fine to suggest something like 50 for initial analysis and testing, but it was then adopted broadly as some sort of go-to default, which is utterly crap. It's fine for testing, evaluation and analysis, and that's about it.

    How do you feel about providing a general formula based on that assumption?

    I'll give you some starting points and hints:

    50 = x (unknown hardware) * y (a miraculous, non-existent factor for the unmeasured cost of whatever queries might be running; if that were measured in a way that could represent mass usage, it would have to come through CEIP, and I am pretty sure Microsoft would deny capturing execution plan data without consent).

    And, just like in school: please show how you arrived at your solution.

    Except it never was - not by VMware (who, to put it mildly, do not know jack s**t about MSSQL because it's not one of their products), not by Microsoft, and not by the numerous people I've seen blogging and writing about this magic CTFP number that supposedly solves every challenge related to it.


    Don't get me wrong: do whatever you like with your SQL Servers, and believe whatever you want, whether it's that a CTFP of 5 - 65 as a default for any server without review is appropriate, or Scientology; it doesn't matter, because both seem equally legit to me (on the surface).


    Oh, and what VMware actually recommends with regard to MSSQL is mainly this: instead of a 10:1 ratio for CPU resources, load hosts serving SQL databases only up to a 4:1 ratio, and the next major recommendation is not to exceed the physical socket configuration for the VM (so if you have 2 physical sockets, 4 sockets with 1 core each is bad, mmkay? It's bad anyway, but that's really, really bad).


    Grant provided a way to better estimate an appropriate CTFP value, and there are a few T-SQL scripts out there that do the same. Even when you can see the subtree costs (like a few of them in my screenshot) and have some precalculated values, you still have to decide, based on the output, whether a CTFP of e.g. 118 is good on my OLTP box or whether other values might be better - because things like excessive plan generation from a few statements can skew such calculations, too.
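
    For illustration only, a rough sketch of the kind of distribution check I mean; the bucket boundaries below are arbitrary placeholders, not suggested thresholds.

    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
    costs AS
    (
        SELECT qp.query_plan.value('(//StmtSimple/@StatementSubTreeCost)[1]', 'float') AS est_cost
        FROM sys.dm_exec_cached_plans AS cp
        CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
        WHERE qp.query_plan IS NOT NULL
    )
    SELECT CASE
               WHEN est_cost < 5   THEN '< 5'
               WHEN est_cost < 20  THEN '5 - 20'
               WHEN est_cost < 50  THEN '20 - 50'
               WHEN est_cost < 120 THEN '50 - 120'
               ELSE '>= 120'
           END AS cost_bucket,
           COUNT(*) AS cached_plans
    FROM costs
    WHERE est_cost IS NOT NULL
    GROUP BY CASE
               WHEN est_cost < 5   THEN '< 5'
               WHEN est_cost < 20  THEN '5 - 20'
               WHEN est_cost < 50  THEN '20 - 50'
               WHEN est_cost < 120 THEN '50 - 120'
               ELSE '>= 120'
           END
    ORDER BY MIN(est_cost);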


    No hard feelings; I just think CTFP does not receive the attention it deserves while MAXDOP gets nearly all of it (yet many still manage to get MAXDOP wrong, but that's another discussion). The deterministic switch between single-threaded and parallel execution is CTFP, so having to use something like a MAXDOP(x) query hint means either that one query really does - despite all other efforts - have to be forced to run single-threaded, or, which I think is more likely, that people have not understood CTFP and how to work with it.

  • I didn't say that 65 should be the maximum; I just said that was the highest I have set it. I don't care what my data warehouses do in the middle of the night as long as their daily jobs and maintenance plans complete with a decent margin before people start coming in to use them. On this particular box, 65 was just a good spot where the larger third-party reports weren't constantly eating up all the parallelism and before it started making some other reports annoyingly slow. Nothing more, nothing less. When it was at Max DOP 0 and CTFP 5 initially, Idera was regularly logging CPU resource waits per query in the tens of thousands of milliseconds, and reports that now run in 20-30 seconds were running for 5-7 minutes. (The reports run on absolutely horrible third-party views that create execution plans with more than 128 steps.)

    I provided exactly how I got to my numbers: start somewhere and play with it. We have typically only evaluated a few of our highest-impact applications and don't test reports that the finance department runs once every 4.2 years. I would advocate that you at least choose a starting point that subjectively seems to be in the same galaxy as what you need, so you don't torture your users for a few hours if it ends up being not even close.

    I don't really agree with the notion either that the mean cost of the queries running on a box should be the sole driver of your default cost threshold. It absolutely 100% impacts it, but if you push it up far enough, you are going to reach a point where you have plans running single-threaded for an unacceptable amount of time just to keep the box from having to wait on CPU resources. A perfect situation is unlikely ever to happen and you will almost always have to make some compromise, but if you get to the point where your waits are good but whatever you are running is slow, you need more processors, not a higher cost threshold for parallelism.

    I had totally forgotten about the 10:1/4:1 recommendation from VMware, by the way. The old shop I worked at followed that back when it was the only way to make a VMware virtual environment with a NetApp back end cost-competitive with physical servers, but it caused major problems in SQL, SharePoint, Exchange and terminal servers, and major problems backing up anything that had a lot of data. We typically only go with 1:1 now if the processors don't have hyper-threading or if the virtual cluster contains a high variation of workload types, and 2:1 if they have hyper-threading AND it is a workload that responds well to hyper-threading, like modern versions of SQL Server, Exchange, or mainly .NET web servers.
