Why are these results different (SELECT In versus JOIN <>)

Question

Post reply

Why are these results different (SELECT In versus JOIN <>)

bleroy

Hall of Fame

Points: 3052
More actions
May 8, 2012 at 6:34 am

#256087

Hi,
I do not understand why I get different results when running the two SQL statements below:
SELECT DISTINCT a.ID FROM T1 a
INNER JOIN T2 b on b.ID <> a.ID
SELECT DISTINCT a.ID FROM T1 a
WHERE a.ID NOT IN (SELECT DISTINCT ID FROM T2)
(obviously, the ID values are same (when same businessentity) in both tables and I should have a bunch of rows in one table but not in the other)
Hoping someone can show me the light 🙂
Kind regards
B

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Because they are two completely different queries.

The <> join matches anything that isn't equal, so let's say we have these tables

Tbl1

ID Col1

1 'a'

2 'b'

5 'e'

Tbl2

ID Col1

1 'a'

3 'c'

4 'd'

The first query says that columns that are not equal are to be considered a match, so the row from table 1 that has an ID of 1 matches rows 3 and 4 in table 2. The row with ID 2 matches rows 1, 3 and 4, the row with ID 5 matches 1,3 and 4, so your results would be

1, 'a', 3, 'c'

1, 'a', 4, 'd'

2, 'b', 1, 'a'

2, 'b', 3, 'c'

2, 'b', 4, 'd'

5, 'e', 1, 'a'

5, 'e', 3, 'c'

5, 'e', 4, 'd'

It's a slightly restricted cartesian product.

The not in says to find all the rows in table 1 that to not have an exact match in table 1.

Now, the ID 1 is in both, so that doesn't match. There's no ID 2 or 5 in table 2, so both of those are returned

2, 'b'

5, 'e'

The almost equivalent query to the not in would be a left outer join

SELECT t1.ID, 1t.col1 FROM Tbl1 t1 left outer join Tbl2 t2 on t1.ID = t2.ID where t2.ID is NULL.

p.s. Drop the distincts, they're not necessary in the IN and they shouldn't be necessary in the outer query unless you have a bad design or bad data

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

carthik SSC Veteran Points: 273 More actions · Answer 2

Obviously it will be different....

Suppose

T1 has data ( 1, 2,3,4,5)

T2 has data( 2,3,5)

In the join condition which you wrote, it will check like

1 <> 2 -> true display 1

2<> 2 -> flase no result

3 <> 2 -> true display 3

4 <> 2 -> true display 4

5 <> 2 -> true display 5

and so on

If u remove distinct first 4 values will be 1,3,4,5...

To achive the same result as in the query

"SELECT DISTINCT a.A FROM #T1 a

WHERE a.A NOT IN (SELECT DISTINCT A FROM #T2)"

Use the below query using Joins

SELECT a.A FROM #T1 a

LEFT OUTER JOIN #T2 b on b.A = a.A where b.a IS NULL

----------------------

Create table #T1( A int)

Create table #T2( A int)

insert into #T1 values(1)

insert into #T1 values(2)

insert into #T1 values(3)

insert into #T1 values(4)

insert into #T1 values(5)

insert into #T2 values(2)

insert into #T2 values(3)

insert into #T2 values(5)

select * from #T1

select * from #T2

-----------------------------

bleroy Hall of Fame Points: 3052 More actions · Answer 3

....

Thanks a lot guys ... I needed this ....

and yes, different queries give different results ... no further comment.

I'm off to get a cold shower, a can of beer, and stand by the BBQ to do something completely different with my life ... 🙂