January 9, 2006 at 6:48 am
If I use an INNER JOIN to gather records from table A that have corresponding records in table B, what would I use to get records in table A that DON'T have corresponding records in table B?
--Paul
January 9, 2006 at 7:06 am
Have you tried using a LEFT or RIGHT JOIN with a WHERE eliminating all the values where the two table join. Your resultset would be something like this
TABLEA | TABLEB |
Name | Name |
FOO | NULL |
BOO | NULL |
ABC | NULL |
You ANSI SQL would be something like the following:
SELECT TABLEA.name,TABLEBName
FROM tablea LEFT JOIN tableb
ON tablea.name = tableb.name
WHERE tableb IS NULL
You'll need to play around with the SQL
Thanks,
January 10, 2006 at 1:39 am
For the same result you could try this:
select *
from tableA
where not exists (Select Field1
from TableB
where Field1=TableA.Field1)
January 10, 2006 at 6:15 am
You could use "not exists", until you compare query plans and execution time, and then you'd go back to using an OUTER JOIN...
January 10, 2006 at 8:14 am
Do you have anything specific to back that up with? For instance, take a look at this old thread where the same thing was discussed:
January 10, 2006 at 8:49 am
From personal experience, outer joins mostly give better performance.
I can't recall where it has been worse - the same, yes, but not worse.
Indexing, output columns (from one table or both), overall and relative row counts, other filters in the where clause, and other factors all affect the query plan and hence performance.
SQL Server 2000 unleashed, page 626 for an alternative view.
January 10, 2006 at 8:59 am
Paul,
An INNER JOIN is always left to right. So the opposite of:
FROM TableA INNER JOIN TableB
is:
FROM TableB INNER JOIN TableA.
-SQLBill
January 10, 2006 at 9:24 am
I am still not sure. But anyway, the only correct thing to do is to try a few different alternatives and choose the one that performs the best.
I do not have SQL Server 2000 Unleashed so I cannot check what it says there.
January 10, 2006 at 9:26 am
Uhm, I would say those two are equivalent. Are you saying they would produce different results?
January 10, 2006 at 10:47 am
I have never observed a LEFT JOIN checking for NULL behave any differently than a NOT EXISTS in terms of query plans.
What I have observed are some nasty insidious bugs caused by using LEFT JOIN in code that is later modified.
All it takes is someone to modify the code at a later date to add an additional filter to the LEFT JOIN'ed table and all of a sudden you have SQL server doing an implicit INNER JOIN behind your back and changing the resultset.
eg:
SELECT a.name, b.name
FROM tablea As a
LEFT JOIN tableb As b
ON a.name = b.name
WHERE b.name IS NULL
-- Joe the new hire who doesn't know SQL joins very well is asked
-- to modify the code to add an additional check on tableb
AND b.SomeOtherColumn = 'SomeFilter' -- Bug!!
If the requirement is to Select some rows where they don't exist somewhere else, using the NOT EXISTS makes the code more readable, makes the code's intention more easily understood and prevents bugs like above from creeping in.
January 10, 2006 at 11:10 am
Not if it's written as:
SELECT a.name, b.name
FROM tablea As a
LEFT JOIN tableb As b
ON a.name = b.name
WHERE b.name IS NULL
-- Joe the new hire who doesn't know SQL joins very well is asked
-- to modify the code to add an additional check on tableb
AND ( b.SomeOtherColumn = 'SomeFilter' or b.SomeOthercolumn is null)
You wrote:
If the requirement is to Select some rows where they don't exist somewhere else, using the NOT EXISTS makes the code more readable, makes the code's intention more easily understood and prevents bugs like above from creeping in.
It *can* make the query more readable, but not if it's buried within a bunch of other stuff like other subqueries, if the correlated subquery is a PITA to write, any parts of either the outer or subquery change dramatically, etc...
I did have a query in Oracle that used Outer Joins that was a dog (took a couple of minutes to run), that when the outer join was replaced with a NOT EXISTS, ran in about 2 seconds. The app's developers were happy...
January 10, 2006 at 11:27 am
Thanks for proving my point. In the example below, you get the wrong resultset using a LEFT JOIN and a filter in the WHERE, even if you do phrase it as "AND ( b.SomeOtherColumn = 'SomeFilter' or b.SomeOthercolumn is null)" :
Declare @TableA Table
(
KeyColumn int not null,
SomeData varchar(50)
)
Declare @TableB Table
(
KeyColumn int not null,
SomeCode char(1) not null
)
Insert into @TableA
Select 1, 'row 1' union
Select 2, 'row 2' union
Select 3, 'row 3' union
Select 4, 'row 4'
Insert into @TableB
Select 1, 'A' union
Select 1, 'B' union
Select 2, 'A' union
Select 2, 'C' union
Select 3, 'B' union
Select 4, 'D'
-- Requirements: Show me all rows in TableA, where there is
-- no row in TableB with a code of B
--
-- Expected result: Rows 2 and 4
SELECT a.KeyColumn
FROM @tablea As a
LEFT JOIN @tableb As b
ON a.KeyColumn = b.KeyColumn
WHERE b.KeyColumn IS NULL
AND ( b.SomeCode = 'B' or b.SomeCode is null ) -- BUG!!
SELECT a.KeyColumn
FROM @tablea As a
WHERE NOT EXISTS (
SELECT * FROM @TableB As b
WHERE a.KeyColumn = b.KeyColumn
AND b.SomeCode = 'B'
)
January 10, 2006 at 1:39 pm
SELECT a.KeyColumn
FROM @tablea As a
LEFT JOIN (SELECT * FROM @tableb WHERE b.SomeColumn = 'B') b
ON a.KeyColumn = b.KeyColumn OR b.KeyColumn IS NULL
Use this (I have no SQL box to test on).
This is a known feature of outer joins.
http://support.microsoft.com/kb/176480/en-us
Using the derived table reduces the scope for error.
IIRC (from an Itzak Ben Gan presentation and elsewhere), MSSQL may apply the WHERE or JOIN in the order it thinks best which is why this kind of thing happens..
Also, see http://www.sql-server-performance.com/tuning_joins.asp and search for "If you have the choice of using a JOIN or a subquery"
PW said "I have never observed a LEFT JOIN checking for NULL behave any differently than a NOT EXISTS in terms of query plans" - I have. What I haven't seen is it where it is *worse*... a few rows in the subquery = OK, but with 1000s on both sides, it can change.
Try the previous post code with 1000s of rows in each and test the NOT EXISTS and my re-written code above. Is it worse? Is it better?
January 10, 2006 at 2:01 pm
>>SELECT a.KeyColumn
>>FROM @tablea As a
>>LEFT JOIN (SELECT * FROM @tableb WHERE b.SomeColumn = 'B') b
>>ON a.KeyColumn = b.KeyColumn OR b.KeyColumn IS NULL
That returns all four rows. Which is incorrect.
January 10, 2006 at 3:12 pm
Good point. I also agree that using NOT EXISTS often makes the queries more readable. Furthermore, I think the most readable query in this example is one using the new EXCEPT keyword from SQL Server 2005. And as I wrote in the other thread I referred to, that query uses the exact same execution plan as the one using NOT EXISTS.
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply