Parallelism and MAXDOP
The pros and cons of parallelism have always been with us in SQL Server and I blogged about this a couple of years ago. This is an updated version of that post to include details of the new wait stat related to parallelism that was added in 2017 (CXCONSUMER), as well as to discuss the options available for cloud based SQL Server solutions.
There’s no doubt that parallelism in SQL is a great thing. It enables large queries to share the load across multiple processors and get the job done quicker.
However it’s important to understand that it has an overhead. There is extra effort involved in managing the separate streams of work and synchronising them back together to – for instance – present the results.
That can mean in some cases that adding more threads to a process doesn’t actually benefit us and in some cases it can slow down the overall execution.
We refer to the number of threads used in a query as the DOP (Degree of Parallelism) and in SQL Server we have the setting MAXDOP (Maximum Degree of Parallelism) which is the maximum DOP that will be used in executing a single query.
Microsoft generally recommend caution setting MAXDOP above 8:
Here’s a nice post from Kendra Little talking about how having higher settings can actually slow down your query execution time:
https://www.brentozar.com/archive/2013/12/q-can-high-maxdop-make-a-query-slower/
Out of the box, MAXDOP is set to 0, which means there is no limit to the DOP for an individual query. It is almost always worth changing this to a more optimal setting for your workload.
Cost Threshold for Parallelism
This is another setting available to us in SQL Server and defines the cost level at which SQL will consider a parallel execution for a query. Out of the box this is set to 5 which is actually a pretty low number. Query costing is based on Algorithm’s from “Nick’s machine” the box used by the original developer who benchmarked queries for Microsoft.
(Nick’s Machine)
Compared to modern servers Nick’s machine was pretty slow and as the Cost Threshold hasn’t changed for many years, it’s now generally considered too low for modern workloads/hardware. In reality we don’t want all our tiny queries to go parallel as the benefit is negligible and can even be negative, so it’s worth upping this number. Advice varies but generally recommendations say to set this somewhere in the range from 30 to 50 (and then tuning up and down based on your production workload).
There are many articles in the SQL Server community about how the out of the box setting is too low, and asking Microsoft to change it. Here’s a recent one:
http://www.scarydba.com/2017/03/13/change-the-cost-threshold-for-parallelism/
CXPACKET and CXCONSUMER waits
Often in tuning a SQL Server instance we will look at wait stats – which tell us what queries have been waiting for when they run. CXPACKET waits are usually associated with parallelism and particularly the case where multi-threaded queries have been stuck waiting for one or more of the threads to complete – i.e. the threads are taking different lengths of time because the load hasn’t been split evenly. Brent Ozar talks about that here:
https://www.brentozar.com/archive/2013/08/what-is-the-cxpacket-wait-type-and-how-do-you-reduce-it/
High CXPACKET waits can be – but aren’t necessarily – a problem. You can cure CXPACKET waits by simply setting MAXDOP to 1 at a server level (thus preventing parallelism) – but this isn’t necessarily the right solution. Though in some cases in can be, SharePoint for instance is best run with MAXDOP set to 1.
What you can definitely deduce from high CXPACKET waits however is that there is a lot of parallelism going on and that it is worth looking at your settings.
To make it easier to identify issues with parallelism, with SQL Server 2017 CU3 Microsoft added a second wait type related to parallelism – CXCONSUMER. This wait type was also added to SQL Server 2016 in SP2.
Waits related to parallelism are now split between CXPACKET and CXCONSUMER.
Here’s the original announcement from Microsoft regarding the change and giving more details:
https://blogs.msdn.microsoft.com/sql_server_team/making-parallelism-waits-actionable/
In brief, moving forward CXPACKET waits are the ones you might want to worry about, and CXCONSUMER waits are generally benign, encountered as a normal part of parallel execution.
Tuning Parallelism
In tuning parallelism we need to think about how we want different sized queries to act on our server.
Small Queries
In general we don’t want these to go parallel so we up the Cost Threshold to an appropriate number to avoid this. As discussed above 30 is a good number to start with. You can also query your plan cache and look at the actual costs of queries that have been executed on your SQL Instance to get a more accurate idea of where you want to set this. Grant Fritchey has an example of how to do that here:
http://www.scarydba.com/2017/02/20/estimated-costs-queries/
As he mentions in the post, this assessment can be quite expensive to run – so do it when things are quiet.
Medium to Large Queries
This is where we want to take advantage of parallelism, and do so by setting MAXDOP to an appropriate level. Follow the guidelines from the Microsoft article referenced above. Here it is again:
Often the answer is going to be simply to set it to 8 – but then experiment with tuning it up and down slightly to see whether that makes things better or worse.
Very Large Queries
If we have a mixed workload on our server which includes some very expensive queries – possibly for reporting purposes – then we may want to look at upping the MAXDOP for these queries to allow them to take advantage of more processors. One thing to consider though is – do we really want these queries running during the day when things are busy? Ideally they should run in quieter times. If they must run during the day, then do we want to avoid them taking over all the server power and blocking our production workload? In which case we might just let them run at the MAXDOP defined above.
If we decide we want to let them have the extra power then we can override the server MAXDOP setting with a query hint OPTION(MAXDOP n):
https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query
You will want to experiment to find the “best” value for the given query. As discussed above and as shown in Kendra Little’s article, just setting it to the maximum number of cores available isn’t necessarily going to be the fastest option.
Exceptions to the Rule
Regardless of the size, there are some queries that just don’t benefit from parallelism so you may need to assess them on an individual basis to find the right degree of parallelism to use.
With SQL server you can specify the MAXDOP at the server level, but also override it at the database level using a SCOPED CONFIGURATION or for individual queries using a query hint. There are even other ways you can control this:
https://www.brentozar.com/archive/2016/12/ten-ways-set-maxdop/
Options in the Cloud
If your SQL Server is hosted in the cloud, then most of this still applies. You still need to think about tuning parallelism – it isn’t done for you, and the defaults are the same – so probably not optimal for most workloads.
There are in general two flavours of cloud implementation. The first is Infrastructure as a Service (IaaS) where you simply have a VM provided by your cloud provider and run an OS with SQL server on top of it in that VM. Regardless of your cloud provider (e.g. Azure, AWS etc.), if you’re using IaaS for SQL Server then the same rules apply, and you go about tuning parallelism in exactly the same way.
The other type of cloud approach is Platform as a Service (PaaS). This is where you use a managed service for SQL Server. This would include Azure SQL Database, Azure SQL Database Managed Instance, and Amazon RDS for SQL Server. In these cases, the rules still apply, but how you manage these settings may differ. Let’s look at that for the three PaaS options mentioned above.
Azure SQL Database
This is a single SQL Server database hosted in Azure. You don’t have access to server level settings, so you can’t change MAXDOP or the cost threshold. You can however specify MAXDOP at the database level e.g.
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 4;
Cost threshold for Parallelism however is unavailable to change in Azure SQL Database.
Azure SQL Database Managed Instance
This presents you with something that looks very much like the SQL Server you are used to, you just can’t access the box behind it. And similar to your regular SQL instance, you can set MAXDOP and the Cost threshold as normal.
Amazon RDS for SQL Server
This is similar to managed instance. It looks and acts like SQL Server but you can’t access the machine or OS. You access your RDS instance through an account that has permissions that are more limited than your usual sa account or sysadmin role allows. And one of the things you can’t do with your limited permissions is to change the parallelism settings.
Amazon have provided a way around this though and you can change both settings using something called a parameter group:
Closing Thoughts
Parallelism is a powerful tool at our disposal, but like all tools it should be used wisely and not thrown at every query to its maximum – and this is often what happens with the out of the box settings on SQL Server. Tuning parallelism is not a knee-jerk reaction to high CXPACKET waits, but something we should be considering carefully in all our SQL Server implementations.
Acknowledgements
I wanted to update my original article to include the cloud options noted above, but didn’t have access to an Azure SQL Database Managed Instance to check the state of play. Thanks to TravisGarland via Twitter (@RockyTopDBA) and Chrissy LeMaire via the SQL community slack (@cl) for checking this and letting me know!