Join order changes query plan

Question

Join order changes query plan

pdanes

SSCrazy Eights

Points: 8497
More actions
January 13, 2020 at 3:19 pm

#3714923

I have a query that I've been trying to tune, and this has just come up.
Version 1:
select P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, P.LokalitaAutoID, tL.Lokalita, P.TaxonAutoID, T.CompleteTaxonText
from Podrobnosti P
inner join Taxonomy T
on P.TaxonAutoID = T.TaxonAutoID
left join TableOfLokalitas tL
on P.LokalitaAutoID = tL.LokalitaAutoID
where T.CompleteTaxonText like '%nov%'
and P.TaxonAutoID is not null
Version 2:
select P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, P.LokalitaAutoID, tL.Lokalita, P.TaxonAutoID, T.CompleteTaxonText
from Podrobnosti P
left join TableOfLokalitas tL
on P.LokalitaAutoID = tL.LokalitaAutoID
inner join Taxonomy T
on P.TaxonAutoID = T.TaxonAutoID
where T.CompleteTaxonText like '%nov%'
and P.TaxonAutoID is not null
The only difference is the order of the two join clauses, but the second one gives a much better estimate of the number of returned rows from table Podrobnosti. It's still wrong, but it's closer (est. - 80,477, actual - 78,320, total records in table Podrobnosti that have non-null in the join field - 78,320. In the first version, the estimated number of rows is 124,720, which is not a number I see anywhere in any counts of anything in the query.

Why would simply swapping the order of the joins make such a big difference in the estimate, and why would it be so wrong?

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply

Jonathan AC Roberts SSCoach Points: 18363 More actions · Answer 1

I don't think it should make any difference, as the optimiser will/should work out the best order to execute the joins. So both queries should have the same execution plan whichever order you write the joins.

Have you looked at the execution plans for each query?

crow1969 Hall of Fame Points: 3001 More actions · Answer 2

I think the order of operations regarding the left join is coming into play here. The optimizer is leaving the door open to the join between P and T filtering out some records that would not be considered in the left join to tL. It may be that if a foreign key constraint existed between P and T, then the two plans may become the same again.

pdanes SSCrazy Eights Points: 8497 More actions · Answer 3

Jonathan AC Roberts wrote:

I don't think it should make any difference, as the optimiser will/should work out the best order to execute the joins. So both queries should have the same execution plan whichever order you write the joins.
Have you looked at the execution plans for each query?

Yes, I have. That's where I got the estimates from. Both versions return the correct number of records in the end (mercifully), but the number of records being shoveled around while assembling the result is different. If one version correctly gathers 78K records, while another gathers 124K and subsequently throws away 46K, I think it's worth exploring why that is happening.

The graphical layout of the plans is the same, just the intermediate number of estimated and actual returned records from P differs.

This reply was modified 4 years, 11 months ago by pdanes.
This reply was modified 4 years, 11 months ago by pdanes.
This reply was modified 4 years, 11 months ago by pdanes.

pdanes SSCrazy Eights Points: 8497 More actions · Answer 4

crow1969 wrote:

I think the order of operations regarding the left join is coming into play here. The optimizer is leaving the door open to the join between P and T filtering out some records that would not be considered in the left join to tL. It may be that if a foreign key constraint existed between P and T, then the two plans may become the same again.

Clearly, the order is having some effect. What I don't understand is why, hence this post. I do have a relationship between P and T - TaxonAutoID. That is the field joined on, and is primary key in T and a foreign key in P.

Jonathan AC Roberts SSCoach Points: 18363 More actions · Answer 5

If you run the queries with

SET STATISTICS IO, TIME ON

Is there any difference in the results? If so can you paste them in here?

pdanes SSCrazy Eights Points: 8497 More actions · Answer 6

Jonathan AC Roberts wrote:

If you run the queries with
SET STATISTICS IO, TIME ON
Is there any difference in the results? If so can you paste them in here?

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 47 ms, elapsed time = 56 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

(255 row(s) affected)

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Podrobnosti'. Scan count 1, logical reads 681, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Taxonomy'. Scan count 1, logical reads 153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'TableOfLokalitas'. Scan count 1, logical reads 139, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:

CPU time = 187 ms, elapsed time = 184 ms.

(255 row(s) affected)

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Podrobnosti'. Scan count 1, logical reads 681, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Taxonomy'. Scan count 1, logical reads 153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'TableOfLokalitas'. Scan count 1, logical reads 139, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:

CPU time = 156 ms, elapsed time = 151 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

pdanes SSCrazy Eights Points: 8497 More actions · Answer 7

I also swapped the order of the two queries and ran them again. I both cases, the query first in execution order took more time than the second. Here is the result with nothing different except the order of the two queries:

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 56 ms, elapsed time = 56 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

(255 row(s) affected)

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Podrobnosti'. Scan count 1, logical reads 681, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Taxonomy'. Scan count 1, logical reads 153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'TableOfLokalitas'. Scan count 1, logical reads 139, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:

CPU time = 203 ms, elapsed time = 196 ms.

(255 row(s) affected)

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Podrobnosti'. Scan count 1, logical reads 681, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Taxonomy'. Scan count 1, logical reads 153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'TableOfLokalitas'. Scan count 1, logical reads 139, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

SQL Server Execution Times:

CPU time = 141 ms, elapsed time = 154 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

Jonathan AC Roberts SSCoach Points: 18363 More actions · Answer 8

There is no difference in the number of reads, so maybe (as the optimiser is a bit of a black box), in that case, the optimiser doesn't change the order of the tables from how they are written.

Is there any difference in the execution plans? If so can you paste those in?

pdanes SSCrazy Eights Points: 8497 More actions · Answer 9

pdanes

SSCrazy Eights

Points: 8497

January 13, 2020 at 5:58 pm

#3715011

Here they are.

pdanes SSCrazy Eights Points: 8497 More actions · Answer 10

Okay, that didn't take. Here is a zipped version.

Attachments:

You must be logged in to view attached files.

crow1969 Hall of Fame Points: 3001 More actions · Answer 11

Think of "Left Join" as a different operator from "Inner Join". That is why SQL Server is needing to generate a different plan. Unless you put in parentheses to force the order of these operations, the optimizer will (logically) do them in the order given.

crow1969 Hall of Fame Points: 3001 More actions · Answer 12

Actually, my first thought is probably a little off.

The difference is in the predicate on the table Podrobnosti (P). Since the (first) inner join is on the TaxonAutoID, it can be taken as read, so the where clause P.TaxonAutoID is discarded. (Where clause conditions are applied after the FROM clause conditions).

In the second, query, the inner join is (logically at least) not to the Podrobnosti table itself, but to a join of Podrobnosti and TableOfLokalitas. In this case, the optimizer may be making sure that the TaxonAutoID is not null, and applying the condition from the WHERE clause.

pdanes SSCrazy Eights Points: 8497 More actions · Answer 13

crow1969 wrote:

Think of "Left Join" as a different operator from "Inner Join". That is why SQL Server is needing to generate a different plan. Unless you put in parentheses to force the order of these operations, the optimizer will (logically) do them in the order given.

But the join between P and T is in the lower right corner of the graphic plan presentation in both cases, indication that it is done BEFORE the join to tL, and separately - only the results of the hash match (255) are then joined. The initial joining of P and T look like they should be the same in both cases. Is there something in the XML of the plan that leads you to your conclusion?

Jonathan AC Roberts SSCoach Points: 18363 More actions · Answer 14

There is no difference in the execution plan. Just the number of rows estimated.