October 18, 2014 at 11:37 pm
using Adventureworks2008R2 database I'd like to illustrate syntax I ran into and am unfamiliar with and I want to know if it's a mistake that SSMS doesn't catch.
SELECT TOP 1000 p.[BusinessEntityID]
,[PersonType]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[Suffix]
,[EmailPromotion]
,[AdditionalContactInfo]
,st.ModifiedDate
FROM [AdventureWorks2008R2].[Person].[Person] p
JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
--at this point the JOIN ON clause is followed by AND instead of WHERE
AND st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
order by modifieddate desc
The above, to my surprise, doesn't error. It also doesn't constrain the st.modifieddate from returning NULLS.
When, if ever, would you follow a JOIN ON clause with an AND instead of WHERE? Is it exactly for the purpose of returning everything between the dates PLUS any modifieddate values having NULLS?
October 19, 2014 at 12:12 am
Quick thought, conditioning a left outer join query on a column on the left side of the join effectively turns the join into an inner join, when the same predicates are part of the join specification it remains as a left outer join.
😎
Quick demo
USE tempdb;
GO
DECLARE @MAIN_TABLE TABLE
(
MT_ID INT IDENTITY(1,1) NOT NULL
,MT_DATE DATE NOT NULL
,REF_ID INT NULL
);
DECLARE @REF TABLE
(
REF_ID INT NOT NULL
,REF_VAL VARCHAR(12) NOT NULL
);
INSERT INTO @MAIN_TABLE (MT_DATE,REF_ID)
VALUES
('2014/01/01',1)
,('2014/02/01',1)
,('2014/03/01',2)
,('2014/04/01',NULL)
,('2014/05/01',3)
,('2014/06/01',NULL);
INSERT INTO @REF(REF_ID,REF_VAL)
VALUES
(1,'REF 1')
,(2,'REF 2')
,(3,'REF 3');
/* Left outer join turned to inner join */
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM @MAIN_TABLE MT
LEFT OUTER JOIN @REF RF
ON MT.REF_ID = RF.REF_ID
WHERE RF.REF_ID < 4;
/* Left outer join "untouched" */
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM @MAIN_TABLE MT
LEFT OUTER JOIN @REF RF
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
Results 1
MT_ID MT_DATE REF_ID REF_VAL
----------- ---------- ----------- ------------
1 2014-01-01 1 REF 1
2 2014-02-01 1 REF 1
3 2014-03-01 2 REF 2
5 2014-05-01 3 REF 3
Results 2
MT_ID MT_DATE REF_ID REF_VAL
----------- ---------- ----------- ------------
1 2014-01-01 1 REF 1
2 2014-02-01 1 REF 1
3 2014-03-01 2 REF 2
4 2014-04-01 NULL NULL
5 2014-05-01 3 REF 3
6 2014-06-01 NULL NULL
October 19, 2014 at 6:55 am
First is the LEFT JOIN as described above, but second, compound primary keys. The JOIN criteria could simply require multiple columns. I've seen lots and lots of uses for multiple sets of JOIN criteria because of these two requirements.
I've also seen a few edge cases, in queries that were actually quite a mess and very problematic anyway, where moving the filtering criteria from the WHERE clause to the JOIN clause resulted in better execution plans (again, serious edge cases, in most circumstances this wouldn't work at all) for INNER JOINs.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 19, 2014 at 8:28 am
The best thing you can do is see what is the difference between the query you posted and one where you believe the WHERE clause should begin.
I don't have the sample database available out here in Afghanistan so you will have try these two queries and report the differences:
SELECT TOP 1000
p.[BusinessEntityID],
[PersonType],
[NameStyle],
[Title],
[FirstName],
[MiddleName],
[LastName],
[Suffix],
[EmailPromotion],
[AdditionalContactInfo],
st.ModifiedDate
FROM
[AdventureWorks2008R2].[Person].[Person] p
INNER JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT OUTER JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
--at this point the JOIN ON clause is followed by AND instead of WHERE
AND st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
order by
modifieddate desc;
GO
SELECT TOP 1000
p.[BusinessEntityID],
[PersonType],
[NameStyle],
[Title],
[FirstName],
[MiddleName],
[LastName],
[Suffix],
[EmailPromotion],
[AdditionalContactInfo],
st.ModifiedDate
FROM
[AdventureWorks2008R2].[Person].[Person] p
INNER JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT OUTER JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
WHERE
st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
order by
modifieddate desc;
GO
I fully expect you to see a difference in the result sets returned by the two queries.
October 19, 2014 at 10:32 am
in Eirikur's example the LEFT JOIN with the AND returns everything.
In the Adventureworks example:
The LEFT JOIN with the WHERE returns 9 records.
The LEFT JOIN with the AND returns 19973 rows.
Total possible records 19975.
I don't know how to find the two records that are not returned with the LEFT JOIN with AND. I exported to excel and concatenated names but there are many duplicate names. Any ideas on how I may figure out the difference between what AND returns and the total result set?
I don't understand the use of AND <some filter requirement> following a JOIN. It seems a mistake. In the case of my example, all person types are returned, even though 'sp' is specified.
October 19, 2014 at 11:19 am
KoldCoffee (10/19/2014)
in Eirikur's example the LEFT JOIN with the AND returns everything.In the Adventureworks example:
The LEFT JOIN with the WHERE returns 9 records.
The LEFT JOIN with the AND returns 19973 rows.
Total possible records 19975.
I don't know how to find the two records that are not returned with the LEFT JOIN with AND. I exported to excel and concatenated names but there are many duplicate names. Any ideas on how I may figure out the difference between what AND returns and the total result set?
I don't understand the use of AND <some filter requirement> following a JOIN. It seems a mistake. In the case of my example, all person types are returned, even though 'sp' is specified.
Can't really help as I don't have the database available to me out here in Afghanistan. As for seeing the differences in results sets try using the EXCEPT operator between the two queries. You will need to flip the queries depending on which one you want the records from where they don't exist in the other.
October 19, 2014 at 11:59 am
Think of it this way, the AND will satisfy all records on the right hand side and return NULL on the left hand side for those records not satisfying the condition, the WHERE will exclude the records on both sides of the join which have NULL on the left hand side.
😎
October 19, 2014 at 12:19 pm
OK, that's what it looked like but it appears *not useful*. Why use a filter, only to use AND in front of it, and make the usage of a filter misleading?
....everything from Right side PLUS all NULLS on left? I fail to see application for this.
October 19, 2014 at 12:26 pm
On the contrary, it is very useful as it respects the left outer join, which in essence return all records from the right hand side and only those on the left hand side that satisfy the condition. Using a where clause will turn the join into a inner join, returning only those record combinations which satisfy the conditions. It is not an "output" filter, it's a join specification.
😎
October 19, 2014 at 12:45 pm
Eirikur Eiriksson (10/19/2014)
On the contrary, it is very useful as it respects the left outer join, which in essence return all records from the right hand side and only those on the left hand side that satisfy the condition. Using a where clause will turn the join into a inner join, returning only those record combinations which satisfy the conditions. It is not an "output" filter, it's a join specification.😎
Very well said.
October 19, 2014 at 3:24 pm
(1) the LEFT JOIN to Person.Person table returns everything from Person.Person that satisfies the JOIN condition (p.BusinessEntityID = st.BusinessEntityID) and the filters for persontype and dates.
(2) the AND usage retrieves all remaining records from Person.Person after satisfying the requirements of the filters?
October 19, 2014 at 5:34 pm
nope, it literally returns all records from the right hand side regardless of join or filter, right?
October 19, 2014 at 6:27 pm
Lynn, when I try the EXCEPT I get error:
"
Msg 421, Level 16, State 1, Line 3
The xml data type cannot be selected as DISTINCT because it is not comparable.
"
[/
SELECT
p.[BusinessEntityID],
[PersonType],
[NameStyle],
[Title],
[FirstName],
[MiddleName],
[LastName],
[Suffix],
[EmailPromotion],
[AdditionalContactInfo],
st.ModifiedDate
FROM
[AdventureWorks2008R2].[Person].[Person] p
INNER JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT OUTER JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
--at this point the JOIN ON clause is followed by AND instead of WHERE
AND st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp'
except
SELECT
p.[BusinessEntityID],
[PersonType],
[NameStyle],
[Title],
[FirstName],
[MiddleName],
[LastName],
[Suffix],
[EmailPromotion],
[AdditionalContactInfo],
st.ModifiedDate
FROM
[AdventureWorks2008R2].[Person].[Person] p
INNER JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID
LEFT OUTER JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
code]
October 19, 2014 at 6:32 pm
Hope this helps:
...
FROM
[AdventureWorks2008R2].[Person].[Person] p
INNER JOIN Person.PersonPhone pp
ON p.BusinessEntityID = pp.BusinessEntityID -- << Return all rows from Person.Person and Person.PersonPhone
-- where EntityID's match
LEFT OUTER JOIN Sales.SalesTerritoryHistory st
ON pp.BusinessEntityID = st.BusinessEntityID
AND st.modifieddate between '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000'
AND persontype = 'sp' -- << First assumpation since I don't have the database available
-- is that persontype is a column in Sales.SalesTerritoryHistory
-- This returns all rows from the above INNER JOIN plus the rows
-- of data from Sales.SalesTerritoryHistory where BusunessEntityIDs match
-- and modifieddate in Sales.SalesTerritoryHistory is between
-- '2005-06-24 00:00:00.000' and '2006-06-24 00:00:00.000' and
-- persontype = 'sp', nulls where there is no match
...
October 19, 2014 at 6:41 pm
Hi Lynn, thanks for bearing with me.
the persontype comes from Person.Person. I aliased. Can you revise the explanation?
freakin' frustrating.
This looks simple enough and I understand the sentences and logic, but when I look at output of Adventureworks sample I get confused.
Think of it this way, the AND will satisfy all records on the right hand side and return NULL on the left hand side for those records not satisfying the condition, the WHERE will exclude the records on both sides of the join which have NULL on the left hand side.
But this returns everything from the left side
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM MAIN_TABLE MT
LEFT OUTER JOIN REF RF
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
And when I flip the tables I'm left joining on, this also only returns everything from the left side
SELECT
MT.MT_ID
,MT.MT_DATE
,MT.REF_ID
,RF.REF_VAL
FROM REF RF
LEFT OUTER JOIN MAIN_TABLE MT
ON MT.REF_ID = RF.REF_ID
AND RF.REF_ID < 4;
Viewing 15 posts - 1 through 15 (of 29 total)
You must be logged in to reply to this topic. Login to reply