Outer join with condition

Question

Post reply

Outer join with condition

shaqqy

Grasshopper

Points: 17
More actions
December 26, 2012 at 8:34 pm

#272386

Comments posted to this topic are about the item Outer join with condition
December 27, 2012 at 1:28 am

This was removed by the editor as SPAM

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

PChiragS SSCarpal Tunnel Points: 4965 More actions · Answer 1

PChiragS

SSCarpal Tunnel

Points: 4965

December 26, 2012 at 8:35 pm

#1571728

Nice QotD and nice explanation..:-)

Thanks..

Lokesh Vij SSChampion Points: 10836 More actions · Answer 2

Very good question. Thanks!

~ Lokesh Vij

Guidelines for quicker answers on T-SQL question[/url]
Guidelines for answers on Performance questions

Link to my Blog Post --> www.SQLPathy.com[/url]

Follow me @Twitter

Koen Verbeeck SSC Guru Points: 259115 More actions · Answer 3

Nice question, thanks.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Dineshbabu Hall of Fame Points: 3220 More actions · Answer 4

Known concept. But took some time understand the full qstn.. nice one..

--
Dineshbabu
Desire to learn new things..

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 5

Yeah, I'm gonna need someone to draw me a picture for this one. 😉

ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 6

ronmoses (12/27/2012)
Yeah, I'm gonna need someone to draw me a picture for this one. 😉
ron

I'm bad at pictures, so how about a step-by-step description of the query evaluation?

(Note that this describes the LOGICAL queury evaluation - the physicial processing order may be completely different, as long as the same end result is achieved)

Query evaluation always starts with the FROM clause, and unless there are parentheses or nested joins, always works left to right. So the first step is the join of service and doc_serv. That's a standard inner join, with this internal table as intermediate result:

service_id name price doctor_id service_id

1 serv1 1000 2 1

2 serv2 500 3 2

2 serv2 500 1 2

3 serv3 700 1 3

3 serv3 700 3 3

4 serv4 1200 2 4

The next step is the join with doctors. This is a right outer join, so the right table (doctors) is preserved. If any row from the doctors table would not survive the result of an inner join on the same condition, it will still be there, but with a NULL in all the columns from the left table (the intermediate result above). Let's first look at how an inner join with the same conditions would look:

service_id name price doctor_id service_id doctor_id name

2 serv2 500 3 2 3 doc3

2 serv2 500 1 2 1 doc1

3 serv3 700 1 3 1 doc1

3 serv3 700 3 3 3 doc3

All rows in the intermediate table have a matching row in the doctors table, so no rows are eliminated because of that requirement. But rows in the intermediate table with a price of 1000 or more are lost. And as a result, we only have rows for two of the four doctors. The outer join says that all doctors have to be retained, so rows for the missing doctors are added back in:

service_id name price doctor_id service_id doctor_id name

2 serv2 500 3 2 3 doc3

2 serv2 500 1 2 1 doc1

3 serv3 700 1 3 1 doc1

3 serv3 700 3 3 3 doc3

NULL NULL NULL NULL NULL 2 doc2

NULL NULL NULL NULL NULL 4 doc4

This would be the result if the query would ask for SELECT *. There are no WHERE, GROUP BY, or HAVING clauses, so the next step is the SELECT list. This asks for an aggregate. Without a GROUP BY, that means aggregate over the whole result set. The rows are counted and the result (6) is returned.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

ronmoses@gmail.com SSCarpal Tunnel Points: 4480 More actions · Answer 7

Hugo Kornelis (12/27/2012)
I'm bad at pictures, so how about a step-by-step description of the query evaluation?

Ahh... you know, I actually had the mechanics straight, it was the fine details I screwed up. Somehow in my head I managed to lump the one excluded left record in with the duplicate 1s and 3s, instead of the 2s and 4s where it belonged, if that makes sense. So I subtracted one of the duplicates rather than NULLing the appropriate record. At least I understood the concept. Forest for the trees, etc.

thanks!

ron

-----
a haiku...

NULL is not zero
NULL is not an empty string
NULL is the unknown

Revenant SSC-Forever Points: 42467 More actions · Answer 8

Revenant

SSC-Forever

Points: 42467

December 27, 2012 at 9:31 am

#1571919

A good one -- thanks!

SQLRNNR SSC Guru Points: 281334 More actions · Answer 9

Thanks for the question.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

johnf_amic Mr or Mrs. 500 Points: 542 More actions · Answer 10

This example shows that the query isn't logically the same as if it would contain filtering in "where" which is executed after joins are made.

This example shows how difficult it is to work out a join on a piece of paper, instead of firing up SSMS and using copy and paste.

TomThomson SSC Guru Points: 104773 More actions · Answer 11

A very good question.

Personally I would avoid writing a condition on one of the joined table expressions in the ON clause, but instead write it as an explicit subquery (ie modify the table-expression concerned). That way there is no scope for anyone mistaking the meaning. I'm tempted to say that it would have been better language design to preclude such conditions in the ON clause, but the time for that argument is long past. I'm sure it would have been better language design to exclude such conditions from the ON clause of the MERGE statement; BoL has a nice big caveat that makes it pretty clear that they should be avoided there.

Tom

john.arnott SSChampion Points: 11882 More actions · Answer 12

I'm with you on that, Tom. I get confused enough by RIGHT joins, never mind a join with conditions in the ON clause that don't directly define the relationship. Here's the query re-structured, first to use a LEFT join, and then the simple move of the price<1000 condition to the sub-query's WHERE clause.--All queries return * instead of just count(*) so we can see what's counted.

-- Original join layout

select *--quantity = count(*)

from @service s

inner join @doc_serv ds on s.service_id = ds.service_id

right outer join @doctor d on ds.doctor_id = d.doctor_id and s.price < 1000

order by d.doctor_id,ds.service_id

-- Turn it over to use LEFT join to a subquery

select *--quantity = count(*)

from @doctor d

left outer join

(Select s.service_id, s.price, ds.doctor_id

from @doc_serv ds

inner join @service s

on s.service_id = ds.service_id

and s.price < 1000

) x

on x.doctor_id = d.doctor_id

order by d.doctor_id,x.service_id

-- Move the condition (price < 1000) to WHERE clause of subquerey

select *--quantity = count(*)

from @doctor d

left outer join

(Select s.service_id, s.price, ds.doctor_id

from @doc_serv ds

inner join @service s

on s.service_id = ds.service_id

where s.price < 1000

) x

on x.doctor_id = d.doctor_id

order by d.doctor_id,x.service_id

I'd prefer the longer code as shown here to the compact right join as it's much easier (at least for me) to understand.

sipas Hall of Fame Points: 3828 More actions · Answer 13

Hugo Kornelis (12/27/2012)
I'm bad at pictures, so how about a step-by-step description of the query evaluation?

Thanks for the explanation, I should have gone through it step by step just like that instead of trying to work it out in my head.

john.arnott (12/27/2012)
I'm with you on that, Tom. I get confused enough by RIGHT joins, never mind a join with conditions in the ON clause that don't directly define the relationship. Here's the query re-structured, first to use a LEFT join, and then the simple move of the price<1000 condition to the sub-query's WHERE clause.

Much prefer the version with the price condition in a WHERE clause, seems much clearer to me.