Alternate technique to "NOT IN" operator

Question

Post reply

Alternate technique to "NOT IN" operator

pro

SSC Veteran

Points: 262
More actions
July 21, 2003 at 7:46 am

#80536

Hi all,
I would like to know, if any method exists to substitute a NOT IN comparison. Here is an example of the situation I encounter.
Table: EMP
EmpcodeSalary
10011500.00
10021250.00
Table: EMPLEAVE
EmpcodeNumDays
10013
10035
Here EMP is the master table and EMPLEAVE is the child table and they both can be linked using the column "Empcode". My requirement is to pick out those employee codes in EMPLEAVE that do not have a matching entry in EMP. The output should be like the one given below
EmpcodeNumDays
10035
I tried a sub-query coupled with a 'NOT IN' comparison and the query is given below for your convenience. This technique works, but when the number of rows in the sub-query is too high, the performance decreases (especially when we use TOP n) in the outer most query.
SELECT empleave.* FROM empleave WHERE empcode NOT IN (SELECT empcode FROM emp)
Is there any other way to get this output without using 'NOT IN'.
TIA

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Andy Warren SSC Guru Points: 119902 More actions · Answer 1

Often an outer join will do it.

Andy

http://www.sqlservercentral.com/columnists/awarren/

Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter

fromnaija SSCrazy Points: 2876 More actions · Answer 2

Something like the following code is what you want.

SELECT EMPLEAVE.* FROM EMPLEAVE

left outer join EMP on EMPLEAVE.Empcode = EMP.Empcode

Where EMP.Empcode is null

Joseph

guarddata SSCertifiable Points: 6583 More actions · Answer 3

Even NOT EXISTS is better than NOT IN

SELECT empleave.* FROM empleave WHERE NOT EXISTS (SELECT * FROM emp WHERE Emp.EmpCode = empleave.EmpCode)

Guarddata-

Hans Lindgren SSChampion Points: 10496 More actions · Answer 4

Hi! I would say that EXISTS is not always better then IN. Ok, BOL says they are semantically equal but that does not mean they are the same performance wise...

BOL IN:

quote:
USE pubs
SELECT pub_name
FROM publishers
WHERE pub_id IN
(SELECT pub_id
FROM titles
WHERE type = 'business')
(...) is evaluated in two steps. First, the inner query returns the identification numbers of the publishers that have published business books (1389 and 0736). Second, these values are substituted into the outer query (...)

What does it mean? Well, it means the inner query is just executed once.

BOL EXISTS:

quote:
USE pubs
SELECT pub_name
FROM publishers
WHERE EXISTS
(SELECT *
FROM titles
WHERE pub_id = publishers.pub_id
AND type = 'business')
(...)first pub_id = 1389. Are there any rows in the titles table in which pub_id is 1389 and type is business? If so, Algodata Infosystems should be one of the values selected. The same process is repeated for each of the other publisher names

So, what does this mean? The query is executed for every outer record.

Semantics are important. While they produce the same result, the ways of doing this can differ tremendously.

Imagine table TITLES as a x GB table. Imagine the case when column 'pub_id' or 'type' aren't indexed. Say that type = 'business' returns 100 rows. In the first case (using the operator IN) we have to do on tablescan (with subsequest in-memory iterations).

In the second case we are left out with a table scan for every existance check.

So if the tables in the EXISTS subquery is badly indexed (or index use is impossible due to computations or similar) having a static precomputed IN clause can sometimes improve performance beyond belief.

True that EXISTS is more powerful in it's capability as a sub-query but it also has the same flaws as a sub-query. Hence, as always, it is almost impossible to say: Always Write Your Queries Like This.

I agree with Andy saying 'Often...' instead of 'Always...'

Regards, Hans!

Edited by - hanslindgren on 07/22/2003 02:41:26 AM

AnzioBake SSCarpal Tunnel Points: 4134 More actions · Answer 5

Often when using exists or in subqueries I use the "top 1" in the select so query eecution stops of first row retrieved.

Hans Lindgren SSChampion Points: 10496 More actions · Answer 6

AnzioBake:

quote:
Often when using exists or in subqueries I use the "top 1" in the select so query eecution stops of first row retrieved.

...which is unneeded since the execution still stops as soon as it can be evaluated as 'true'. The EXISTS query, still has to be executed.

It does not matter if you put SELECT *, SELECT TOP 1 * or SELECT TOP 1 1, it still just returns 'true' or 'false'. It runs the query until it can decide what to return Which in the case of SELECTs always is the first row found.

Regards, Hans!

hamishahern SSC-Addicted Points: 498 More actions · Answer 7

I agree with hanslindgren, if you look at the Execution Plan of both the "NOT EXISTS" and "NOT IN" you'll find it is exactly the same! sometimes on complex queries the optimiser will provide a plan slightly different for the two statements, leading to peoples opinions of one being faster than the other, depending on what the optimiser thinks is fastest.

As always - the user can optimise queries further by keeping an eye on the execution plans to make sure it is the most efficient execution of a specific statement.

guarddata SSCertifiable Points: 6583 More actions · Answer 8

Thank you, gentlemen, for catching my mistake. Nice to have a second pair of eyes to keep out mistakes.

Guarddata-

Johan Bijnens SSC Guru Points: 135376 More actions · Answer 9

You've had the alternatives in previous replies. Having Declarative Referential Integrity (DRI) in place would have prevented those rows from getting inserted.

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me