May 8, 2012 at 6:34 am
Hi,
I do not understand why I get different results when running the two SQL statements below:
SELECT DISTINCT a.ID FROM T1 a
INNER JOIN T2 b on b.ID <> a.ID
SELECT DISTINCT a.ID FROM T1 a
WHERE a.ID NOT IN (SELECT DISTINCT ID FROM T2)
(obviously, the ID values are same (when same businessentity) in both tables and I should have a bunch of rows in one table but not in the other)
Hoping someone can show me the light 🙂
Kind regards
B
May 8, 2012 at 6:44 am
Because they are two completely different queries.
The <> join matches anything that isn't equal, so let's say we have these tables
Tbl1
ID Col1
1 'a'
2 'b'
5 'e'
Tbl2
ID Col1
1 'a'
3 'c'
4 'd'
The first query says that columns that are not equal are to be considered a match, so the row from table 1 that has an ID of 1 matches rows 3 and 4 in table 2. The row with ID 2 matches rows 1, 3 and 4, the row with ID 5 matches 1,3 and 4, so your results would be
1, 'a', 3, 'c'
1, 'a', 4, 'd'
2, 'b', 1, 'a'
2, 'b', 3, 'c'
2, 'b', 4, 'd'
5, 'e', 1, 'a'
5, 'e', 3, 'c'
5, 'e', 4, 'd'
It's a slightly restricted cartesian product.
The not in says to find all the rows in table 1 that to not have an exact match in table 1.
Now, the ID 1 is in both, so that doesn't match. There's no ID 2 or 5 in table 2, so both of those are returned
2, 'b'
5, 'e'
The almost equivalent query to the not in would be a left outer join
SELECT t1.ID, 1t.col1 FROM Tbl1 t1 left outer join Tbl2 t2 on t1.ID = t2.ID where t2.ID is NULL.
p.s. Drop the distincts, they're not necessary in the IN and they shouldn't be necessary in the outer query unless you have a bad design or bad data
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 8, 2012 at 6:49 am
Obviously it will be different....
Suppose
T1 has data ( 1, 2,3,4,5)
T2 has data( 2,3,5)
In the join condition which you wrote, it will check like
1 <> 2 -> true display 1
2<> 2 -> flase no result
3 <> 2 -> true display 3
4 <> 2 -> true display 4
5 <> 2 -> true display 5
and so on
If u remove distinct first 4 values will be 1,3,4,5...
To achive the same result as in the query
"SELECT DISTINCT a.A FROM #T1 a
WHERE a.A NOT IN (SELECT DISTINCT A FROM #T2)"
Use the below query using Joins
SELECT a.A FROM #T1 a
LEFT OUTER JOIN #T2 b on b.A = a.A where b.a IS NULL
----------------------
Create table #T1( A int)
Create table #T2( A int)
insert into #T1 values(1)
insert into #T1 values(2)
insert into #T1 values(3)
insert into #T1 values(4)
insert into #T1 values(5)
insert into #T2 values(2)
insert into #T2 values(3)
insert into #T2 values(5)
select * from #T1
select * from #T2
-----------------------------
May 8, 2012 at 7:35 am
....
Thanks a lot guys ... I needed this ....
and yes, different queries give different results ... no further comment.
I'm off to get a cold shower, a can of beer, and stand by the BBQ to do something completely different with my life ... 🙂
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply