January 9, 2014 at 9:57 am
I must be missing something silly.
I have Table_A with 488 unique email addresses
select distinct(email) from Table_A -- 488
I have Table_B with several million records, including duplicate emails.
Table_A email varchar(255)
Table_B Email1 varchar(50)
Zero Rcds:
select A.Email from TableA A -- (0 row(s) affected)
where A.Email NOT in (select email1 from Table_B)
445 Rcds
select A.Email from TableA A -- (445 row(s) affected)
where A.Email in (select email1 from Table_B)
So I have 43 records that are neither IN nor NOT IN Table_B
(Query #2 is a copy of #1, just removed NOT. And Yes, the queries really are that simple, just changed the table names.)
Please administer a dope slap.
Edit: tweaked example a bit
January 9, 2014 at 10:02 am
If see 2 options:
A. You're using a different column from Table1 (email & email1), but it could be a transcription error.
B. You have null values on email column, check this:
select email from TableA A
where A.Email NOT in (select B.email from Table_B B WHERE B.email IS NOT NULL)
January 9, 2014 at 10:12 am
"B" also gives me 445.
January 9, 2014 at 10:24 am
What if you try this in both directions?
select a.Email
from TableA a
where not exists (select email1
from Table_B b
where b.email1 = a.email);
If it gives you the desired result, then you have nulls in your referenced table. See Gail's article at http://sqlinthewild.co.za/index.php/2009/08/17/exists-vs-in/ for an excellent explanation.
Edit: Oops. Hit {Enter} when typing the SQL. :crazy:
January 9, 2014 at 10:37 am
Ed, That gives me the 43 missing records.
January 9, 2014 at 10:52 am
So are they really missing? Check the first few and do a LIKE query to see if they really are in the second table or not? If not and the tables are supposed to match, then you have to ask where the data being populated from. If, as Luis pointed out, it's a transcription error, then you're never going to get the counts to match unless you correct the data.
All we've shown with the exists is that you have NULLs. I prefer EXISTS over IN most of the time. I forgot who said it, but I read it in a book written by someone smarter than I am "True is never equal to false, but NULL is always unknown, which is neither true nor false." The EXISTS doesn't care about the NULLs, where IN does.
January 9, 2014 at 11:22 am
Yes, Table_B has 45,000 nulls in Email1, but that shouldn't be related to the 43 records. I can't see what is different about them. All 488 came from an Excel spreadsheet. Some funny hidden characters ? But they should still either be IN or NOT IN.
I picked a few of the 43 missing, and did select * from Table_B where email1 like '%Joe@MSN.com%' and get 0
So, I guess the Nulls are the culprint. Either go with "not Exists" or omit Nulls.
January 10, 2014 at 1:43 pm
Your first query returns 0 rows because of the null email addresses in tableb.
IN is essentially a shorthand for a series of equality clauses, so
WHERE A NOT IN (1,2,3)
becomes
WHERE
A <> 1
AND A <> 2
AND A <> 3
The problem is when there is a null in your set, it becomes the equivalent of
WHERE
A <> 1
AND A <> 2
AND A <> 3
AND A <> NULL
Any evaluation against a null like that returns FALSE (actually 'unknown'), which is ANDed with the other clauses so the evaluation always returns FALSE no matter what the value of A is.
January 11, 2014 at 7:42 am
sestell1 (1/10/2014)
Your first query returns 0 rows because of the null email addresses in tableb.IN is essentially a shorthand for a series of equality clauses, so
WHERE A NOT IN (1,2,3)
becomes
WHERE
A <> 1
AND A <> 2
AND A <> 3
The problem is when there is a null in your set, it becomes the equivalent of
WHERE
A <> 1
AND A <> 2
AND A <> 3
AND A <> NULL
Any evaluation against a null like that returns FALSE (actually 'unknown'), which is ANDed with the other clauses so the evaluation always returns FALSE no matter what the value of A is.
Ah... be careful with that. That actually forms a "double triangular join" which is basically an (N-1)2 lookup. In other words, it's almost a full blown Cartesian Product.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply