October 19, 2014 at 9:19 pm
KoldCoffee (10/19/2014)
Hi Lynn, thanks for bearing with me.the persontype comes from Person.Person. I aliased. Can you revise the explanation?
freakin' frustrating.
This looks simple enough and I understand the sentences and logic, but when I look at output of Adventureworks sample I get confused.
Think of it this way, the AND will satisfy all records on the right hand side and return NULL on the left hand side for those records not satisfying the condition, the WHERE will exclude the records on both sides of the join which have NULL on the left hand side.
But this returns everything from the left side
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM MAIN_TABLE MT
LEFT OUTER JOIN REF RF
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
And when I flip the tables I'm left joining on, this also only returns everything from the left side
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM REF RF
LEFT OUTER JOIN MAIN_TABLE MT
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
the persontype comes from Person.Person.
This is why all columns should be aliased in a multi-table query. Easy to know where each column comes from without having to see the DDL for the tables.
Remember, in a left join, the table on the LEFT is the one in which ALL rows will be returned even when there is no matching row in the table on the right. The table on the RIGHT of the LEFT OUTER JOIN will only return rows that match the join criteria.
Without having the database I can't really tell you much more without being able to actually run the code and variants to tell you what it is doing. I suggest that you play with the query and learn what it is doing.
October 19, 2014 at 9:28 pm
My current latest understanding:
In Left Outer JOIN everything is returned, period. That's because it's not an INNER JOIN but a LEFT OUTER.
Next, if there is a WHERE clause, results are filtered. But if there is no WHERE clause then everything on the LEFT is returned, PERIOD.
From the right are returned those things which satisfy the filtering criteria.
Good morning in Afghanistan. I imagine it is shaping up to be another not too cold day. 🙂 Thanks Lynn and best.
October 19, 2014 at 10:00 pm
KoldCoffee (10/19/2014)
My current latest understanding:In Left Outer JOIN everything is returned, period. That's because it's not an INNER JOIN but a LEFT OUTER.
Next, if there is a WHERE clause, results are filtered. But if there is no WHERE clause then everything on the LEFT is returned, PERIOD.
From the right are returned those things which satisfy the filtering criteria.
Good morning in Afghanistan. I imagine it is shaping up to be another not too cold day. 🙂 Thanks Lynn and best.
In Left Outer JOIN everything is returned, period. That's because it's not an INNER JOIN but a LEFT OUTER.
That is what I said above, all rows from the table on the left are returned.
From the right are returned those things which satisfy the filtering criteria.
Correct, except it isn't the filtering criteria, it's the join criteria. You may call this semantics, but just making sure that the correct terms are used. Filter criteria are in the WHERE clause as this reduces the result set returned from the join between tables.
Actually, right now it is cold and wet having rained most of yesterday and into the night. I think the rain stopped around midnight, and it is now 08:30 (8:30 AM) Afghan time.
October 19, 2014 at 10:23 pm
I guess you're not in Kabul, then.
OK.
so I incremented the JOIN condition in Eirikur's LEFT JOIN example and found that the entire left table is returned, whether the JOIN condition is <1, <2, <3, <4
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM MAIN_TABLE MT
LEFT OUTER JOIN REF RF
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
The AND is joining to only those rows in the right table that satisfy the JOIN criteria.
I think I understand finally.
In the case of Adventurework, everything from Person.Person is being returned, and only the rows from Sales.SalesTerritoryHistory that satisfy the JOIN condition.
and a WHERE would turn the LEFT OUTER JOIN back to an inner.
Yes?? Did you pull me through and stick it out with me to the bitter end? I think so.
Thank you Lynn. Have a wonderful day.
October 19, 2014 at 10:27 pm
KoldCoffee (10/19/2014)
...I incremented the JOIN condition in Eirikur's LEFT JOIN example and found that the entire left table is returned, whether the JOIN condition is <1, <2, <3, <4
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM MAIN_TABLE MT
LEFT OUTER JOIN REF RF
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
The AND is joining to only those rows in the right table that satisfy the JOIN criteria.
I think I understand finally.
Spot on;-)
😎
October 20, 2014 at 12:45 am
If you are using a LEFT Join with a restricting WHERE then you need to be careful that your WHERE condition does not remove the additional rows your LEFT join was retrieving, that would defeat the purpose of using LEFT join..
Remember , the order of processing is :
1) FROM
2) JOIN ON
3) Add Outer rows (Only in case of Outer Joins)
4) WHERE
5) SELECT
ON is processed first followed by WHERE( and in case of LEFT join, in fact any Outer Join for that matter,WHERE could remove the outer rows)
So if you really want a LEFT Join, place that WHERE condtion in the JOIN ON clause.
But check first what you actually want to do in your logic..
October 20, 2014 at 12:47 am
.
October 20, 2014 at 4:51 am
Eirikur Eiriksson (10/19/2014)
On the contrary, it is very useful as it respects the left outer join, which in essence return all records from the right hand side and only those on the left hand side that satisfy the condition. Using a where clause will turn the join into a inner join, returning only those record combinations which satisfy the conditions. It is not an "output" filter, it's a join specification.😎
Absolutely.
It's that conversion of the JOIN from OUTER to INNER because of the placement of the filtering criteria that can really mess up data sets. Logically it makes sense too.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 20, 2014 at 5:08 am
Eirikur Eiriksson (10/19/2014)
On the contrary, it is very useful as it respects the left outer join, which in essence return all records from the right hand side and only those on the left hand side that satisfy the condition. Using a where clause will turn the join into a inner join, returning only those record combinations which satisfy the conditions. It is not an "output" filter, it's a join specification.😎
I think you mean a left outer join returns all records from the left hand side and only those on the right hand side that satisfy the condition. And a right outer join returns all records from the right hand side and only those on the left hand side that satisfy the condition.
I suppose an alternative to the method used by the original query would be:
LEFT JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
WHERE ((st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000' AND persontype = 'sp') OR st.BusinessEntityID IS NULL)
This gets the filter rules into the WHERE clause which you may find more readable, and the extra rule allowing st.BusinessEntityID to be NULL will allow the LEFT JOIN to continue behaving as expected.
October 20, 2014 at 7:13 am
KoldCoffee (10/19/2014)
I guess you're not in Kabul, then.OK.
...
I was in Kabul initially, but we got "kicked out" of ISAF back in November 2013. I am now at Bagram Airfield. Not sure how far that is from Kabul.
Glad to be able to help.
October 20, 2014 at 7:35 am
KoldCoffee (10/18/2014)
using Adventureworks2008R2 database I'd like to illustrate syntax I ran into and am unfamiliar with and I want to know if it's a mistake that SSMS doesn't catch.
SELECT TOP 1000 p.[BusinessEntityID]
,[PersonType]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[Suffix]
,[EmailPromotion]
,[AdditionalContactInfo]
,st.ModifiedDate
FROM [AdventureWorks2008R2].[Person].[Person] p
JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
--at this point the JOIN ON clause is followed by AND instead of WHERE
AND st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
order by modifieddate desc
The above, to my surprise, doesn't error. It also doesn't constrain the st.modifieddate from returning NULLS.
When, if ever, would you follow a JOIN ON clause with an AND instead of WHERE? Is it exactly for the purpose of returning everything between the dates PLUS any modifieddate values having NULLS?
The technical side of this question has been thoroughly answered here, I just wanted to touch on the practical application. Imagine you are creating a report for the sales manager of Adventureworks. This company sells bicycles and accessories. The manager wants to see a list of all the contact info for their sales persons who have a record in the salesterritoryhistory table. If they don't have a record in that table, the manager still wants to see the person on the report, i.e. they shouldn't be excluded.
I don't have Adventureworks in front of me so I don't know what's in that salesterritoryhistory table, but perhaps it records when sales people change territory or if they are new and have a sales territory assigned to them. That would mean a report showing people who had changed territory in that year range, or null for that date if they stayed in the same territory. If you were running this on 2006-06-24, then the question might have been "Who are all the current sales people, and who is new or changed territory in the last year, and show that on the report as the date they started or changed territory".
This same principle could be used to e.g. show people who had sales in the last month, including those who had no sales at all. Sales is a cliche example, but it applies to so many other domains.
October 20, 2014 at 9:44 am
davoscollective (10/20/2014)
KoldCoffee (10/18/2014)
using Adventureworks2008R2 database I'd like to illustrate syntax I ran into and am unfamiliar with and I want to know if it's a mistake that SSMS doesn't catch.
SELECT TOP 1000 p.[BusinessEntityID]
,[PersonType]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[Suffix]
,[EmailPromotion]
,[AdditionalContactInfo]
,st.ModifiedDate
FROM [AdventureWorks2008R2].[Person].[Person] p
JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
--at this point the JOIN ON clause is followed by AND instead of WHERE
AND st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
order by modifieddate desc
The above, to my surprise, doesn't error. It also doesn't constrain the st.modifieddate from returning NULLS.
When, if ever, would you follow a JOIN ON clause with an AND instead of WHERE? Is it exactly for the purpose of returning everything between the dates PLUS any modifieddate values having NULLS?
The technical side of this question has been thoroughly answered here, I just wanted to touch on the practical application. Imagine you are creating a report for the sales manager of Adventureworks. This company sells bicycles and accessories. The manager wants to see a list of all the contact info for their sales persons who have a record in the salesterritoryhistory table. If they don't have a record in that table, the manager still wants to see the person on the report, i.e. they shouldn't be excluded.
I don't have Adventureworks in front of me so I don't know what's in that salesterritoryhistory table, but perhaps it records when sales people change territory or if they are new and have a sales territory assigned to them. That would mean a report showing people who had changed territory in that year range, or null for that date if they stayed in the same territory. If you were running this on 2006-06-24, then the question might have been "Who are all the current sales people, and who is new or changed territory in the last year, and show that on the report as the date they started or changed territory".
This same principle could be used to e.g. show people who had sales in the last month, including those who had no sales at all. Sales is a cliche example, but it applies to so many other domains.
The main concern might be a readability issue. Knowing the difference in behavior is something that could get less senior developers into trouble. While I personally like the AND notation, just because I know others will end up having to read and maintain my code, I tend to rewrite that to be clearer.
Something like:
SELECT TOP (1000) p.BusinessEntityID, PersonType, NameStyle, Title, FirstName, MiddleName, LastName, Suffix, EmailPromotion, AdditionalContactInfo,
st.ModifiedDate
FROM AdventureWorks2008R2.Person.Person AS p INNER JOIN
Person.PersonPhone AS pp ON p.BusinessEntityID = pp.BusinessEntityID
LEFT OUTER JOIN
(
select *
from Sales.SalesTerritoryHistory
where st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
) AS st ON pp.BusinessEntityID = st.BusinessEntityID
order by modifieddate desc
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 20, 2014 at 1:07 pm
thank you gward, davo collective, and matt miller for further clarification.
This experience has led me with conviction to understand that in the scenario where I was seeing the LEFT JOIN with AND, it was actually an error. I could see the error but I was tremendously bothered by not understanding this syntax and applicability.
Thanks to everyone who helped me come up to speed on this advanced join type ....well I do think it is. It's beguiling, at the very least.
October 20, 2014 at 3:28 pm
Absolutely beguiling. I think everyone knows it so well because we've all been bitten by it at one point or another.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 21, 2014 at 7:56 am
To me it looks like someone was looking to add the code persontype = 'sp', but just put it in the wrong place. I've seen code like that before, when you tell a new person to just add that code to the join and they put it on the wrong one. Doesn't fail, but sure was slower.
My thoughts that it should be more written like one of these, depending on the results needed as other have stated.
SELECT TOP 1000 p.BusinessEntityID,
PersonType,
NameStyle,
Title,
FirstName,
MiddleName,
LastName,
Suffix,
EmailPromotion,
AdditionalContactInfo,
st.ModifiedDate
FROM AdventureWorks2008R2.Person.Person p
JOIN Person.PersonPhone pp ON
p.BusinessEntityID = pp.BusinessEntityID
AND p.persontype = 'sp' -- moved to the correct join
LEFT JOIN Sales.SalesTerritoryHistory st ON
pp.BusinessEntityID = st.BusinessEntityID
AND st.modifieddate BETWEEN '2005-06-24 00:00:00.000' AND '2006-06-24 00:00:00.000'
ORDER BY modifieddate DESC
-- OR
SELECT TOP 1000 p.BusinessEntityID,
PersonType,
NameStyle,
Title,
FirstName,
MiddleName,
LastName,
Suffix,
EmailPromotion,
AdditionalContactInfo,
st.ModifiedDate
FROM AdventureWorks2008R2.Person.Person p
JOIN Person.PersonPhone pp ON
p.BusinessEntityID = pp.BusinessEntityID
LEFT JOIN Sales.SalesTerritoryHistory st ON
pp.BusinessEntityID = st.BusinessEntityID
AND st.modifieddate BETWEEN '2005-06-24 00:00:00.000' AND '2006-06-24 00:00:00.000'
WHERE p.persontype = 'sp' -- or moved to the WHERE
ORDER BY modifieddate DESC
-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.
Viewing 15 posts - 16 through 29 (of 29 total)
You must be logged in to reply to this topic. Login to reply