Series Overview
This series of articles looks at how the optimizer builds up an executable query plan using rules. To illustrate the process performed by the optimizer, we'll configure it to produce incrementally better plans by progressively applying its internal exploration rules. You can read Part 1, Part 2, and Part 3 to understand how the plan is corrected.
Part 4 - All Is Revealed
Documented Rule-Affecting Options
It might surprise you to learn that most T-SQL users have written queries that disable one or more of the optimizer rules already.
Join hints { LOOP | HASH | MERGE | REMOTE } and the query hints shown below are all implemented by disabling one or more optimizer rules:
For example, a logical GROUP BY operation (including DISTINCT) can be physically implemented as either a Hash Aggregate or a Stream Aggregate. The two implementation rules involved are GbAggToHS (Group By Aggregate to Hash) and GbAggToStrm (Group By Aggregate to Stream). Normally both of these implementation options are available to the optimizer, and it might consider one or both when optimizing a logical operator tree.
When the query hint OPTION (HASH GROUP) appears in a query, the GbAggToStrm implementation rule is disabled. This means that all GROUP BY operations are implemented by the GbAggToHS rule, and so appear as Hash Aggregates.
The join hint "INNER MERGE JOIN" works by disabling the implementation rules for nested loops and hash join. (An interesting side-effect is that using a join hint makes the query act as if OPTION (FORCE ORDER) had also been specified.)
If we want the optimizer to only consider using merge and hash join strategies for a query, we could use the query hint OPTION (MERGE JOIN, HASH JOIN). This works by disabling the JNtoNL implementation rule (JOIN to Nested Loops), leaving the optimizer the choice of sort-merge or hash joins (rules JNtoSM and JNtoHS). Hinting joins using a query hint also implies OPTION (FORCE ORDER).
Undocumented Features
Warning: This information is provided AS IS and for educational purposes only. Never use these tools on production systems. No documentation or support is available.
Join hints and query hints don't expose options to selectively enable or disable the full range of rules available to the optimizer. We can, however, use a couple of undocumented DBCC commands and the undocumented dynamic management view sys.dm_exec_query_transformation_stats (covered back in Part 3) to explore the way the optimizer uses rules.
The techniques shown below work best in SQL Server 2008, but will also function in SQL Server 2005 (with a number of important caveats). The specific versions used in writing this article were x86 Developer Editions 10.0.2775 (2008 SP1 CU8) and 9.0.4294 (2005 SP3 CU9).
Trace Flags and DBCC commands
As with other undocumented DBCC options, we need to enable trace flag 3604 for the current session, so that any output is returned to the client (Management Studio, for example).
DBCC TRACEON (3604);
We can disable one or more optimizer rules using DBCC RULEOFF. This command takes one or more rule names (or numbers) as its parameters. To disable the rules that implement a logical JOIN as a sort-merge or hash, we would execute:
DBCC RULEOFF('JNtoSM', 'JNtoHS');
To enable the rules again, we can use DBCC RULEON, with the same syntax:
DBCC RULEON('JNtoSM', 'JNtoHS');
Both RULEON and RULEOFF return confirmation messages (with trace flag 3604 on). The direct effects on the optimizer only apply to the current session, but the (sub-optimal) plans produced will be cached as normal - another great reason to only play with this stuff on a personal test system.
To reset to normal operation, enable any disabled rules, or simply disconnect and reconnect to the server. It is also a good idea to run DBCC FREEPROCCACHE to remove any sub-optimal plans from cache.
To see which rules are currently enabled or disabled, use the DBCC SHOWONRULES and DBCC SHOWOFFRULES commands. Neither of these commands take any parameters.
SQL Server 2005 Bugs
In SQL Server 2005, SHOWOFFRULES displays a list of rules that are ON, and SHOWONRULES displays rules that are OFF, which is actually quite funny.
DBCC SHOWONRULES also does not return any output unless you call DBCC SHOWOFFRULES immediately afterward. The list of disabled rules will be prefixed to the output of SHOWOFFRULES (which shows enabled rules, remember).
The list of disabled rules is also not formatted correctly in 2005: all rule names are concatenated without any separator.
Both commands work perfectly in SQL Server 2008.
Putting it all together
We now have all the tools we need to produce the partially-optimized plans seen in previous parts in this series. We can use the sys.dm_exec_query_transformation_stats DMV to identify rules invoked by the optimizer, and the new DBCC commands to selectively disable them to see the effect on the final plan.
Here's our sample AdventureWorks query's logical relational tree representation (from Part 1) again:
...and the very basic plan originally shown in part 1 of the series:
That query plan has an estimated cost of 3.60 - compare that to the fully-optimised plan's cost of 0.025:
Producing Partially-Optimized Plans
Using the DMV and query from Part 3, we saw that the optimizer used more than twenty different rules in finding the fully-optimized plan. To create the very basic plan, just four core implementation rules are needed (five in SQL Server 2005):
We need to disable the other rules normally considered:
We can now get the basic query plan by executing the AdventureWorks query on its own (or by requesting an estimated plan):
We have disabled so many important rules that we can no longer run the DMV batch code from Part 3. The optimizer is unable to produce any correct plan with the restricted range of rules now available to it, so the following error is returned:
.Net SqlClient Data Provider: Msg 8624, Level 16, State 1, Line 1
Internal Query Processor Error: The query processor could not produce a query plan.
For more information, contact Customer Support Services.
If you want to see the DMV statistics, you'll need to clear the plan cache using DBCC FREEPROCCACHE before running the following modified code:
You'll see results similar to these:
Refer back to part 3 for further details about that output.
A Spool Rule
Let's explore one more alternative plan by enabling the rule 'BuildSpool'. This is one of the many rules that can introduce a Table Spool operator into the plan to improve efficiency:
DBCC RULEON ('BuildSpool');
The estimated plan for our test query now looks like this:
It's still a terrible overall plan of course, but the introduction of a Lazy Spool on the inner side of the Nested Loops join has reduced the estimated cost of the plan from 3.60 to 3.12 - a worthwhile improvement.
Clean-Up
Don't forget to reset your session by re-enabling the disabled rules:
You can check that all rules are now enabled again by running DBCC SHOWOFFRULES (but see my previous remarks concerning the bugs in SQL Server 2005).
Final Thoughts
You might be wondering whether all this has any practical application. For me, a better understanding of optimizer internals enables me to write better queries, and more quickly debug poor plans.
There are also very rare (and advanced) uses where we can capture a 'customised' plan for use in a USE PLAN hint.
Paul White
Twitter: @PaulWhiteNZ
Blog: SQLblog.com