'EXISTS' vs 'IN'

Question

Post reply

'EXISTS' vs 'IN'

kiran-670359

SSC Eights!

Points: 853
More actions
March 31, 2008 at 11:23 pm

#188628

Hi all,
:).I am confused with the usuage of 'EXISTS' and 'IN' in my queries.can anyone help me in this regard.
Thanks
Kiran

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004470 More actions · Answer 1

Yes... don't use either... use an INNER JOIN instead.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

kiran-670359 SSC Eights! Points: 853 More actions · Answer 2

Consider i have two tables University(parent table) and Branch(child table) The primarty key for university table is id ,the foriegn key for the branch table is uid

i have a query like this

A:select b.name from branch b inner join university u on u.id = b.uid

rewriting the above query as

B:select b.name from branch where b.uid exists( select u.id from university where u.id = b.uid)

If i run both queries in toad and observe the explain plan the second query B is performing better than query A.

so how can i decide where to use joins and where to use exists.

Kiran

Anipaul SSC-Insane Points: 24681 More actions · Answer 3

Kiran

Avoid them in your queries. But if you want to learn them then go thru the following articles:

http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

http://articles.techrepublic.com.com/5100-9592-6180272.html

http://www.sqlteam.com/article/using-exists

http://www.java2s.com/Code/SQLServer/Select-Query/EXISTS.htm

🙂

RBarryYoung SSC Guru Points: 143329 More actions · Answer 4

Hmm, I just posted something like this yesterday...:

It is important to realize that there are several types of Subquery:

...

2. IN subqueries: These return any number of rows, each with a single column and they look like this:WHERE value IN(Select col1 From ... ) They are typically used in WHERE clauses to test if a particular value is in the list of values returned.

3. EXISTS subqueries: These return any number of rows with any number of columns and they look like this:WHERE EXISTS(Select * From ... ) They are typically used in WHERE clauses to test if corresponding rows in another table exist. Because the column values returned are not actually used, typicaly you use a "*" instead of bothering to name any columns.

...

And as far as I'm concerned, IN and EXISTS OK to use. They are just as fast as Joins when used correctly and greatly aid code clarity.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Jeff Moden SSC Guru Points: 1004470 More actions · Answer 5

rbarryyoung (4/1/2008)
And as far as I'm concerned, IN and EXISTS OK to use. They are just as fast as Joins when used correctly and greatly aid code clarity.

WHERE IN is actually converted to an INNER JOIN by the optimizer and is treated exactly the same by the optimizer. Left to their own devices, I've found that Developers will improperly write triple and quadruple nested joins if allowed to use WHERE IN on a regular basis because it IS easy for them to read along with the resulting severe performance problems.

WHERE EXISTS is not... WHERE EXISTS creates a form of RBAR known as a CORRELATED SUB-QUERY. If written correctly, it's also quite fast but I don't allow people to use it in my shop because it makes the Execution Plan look especially effective (displays plan for first row only) when it's not (there are the rare exceptions in 2k but never needed in 2k5). Developers see that and start using it everywhere for every inner join.

I agree with Barry that WHERE IN is an easy read... but for the reasons above, I don't allow them in my shop (Yes, we code review everything).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Mauve SSChampion Points: 11316 More actions · Answer 6

Yes... don't use either... use an INNER JOIN instead

Not always. The matching values resulting from an INNER JOIN can be more than one (1). This will result in as many rows as there are matches. With an IN or EXISTS, only a single row will be returned per the match regardless of the number of values in the IN list (explicit or via a query) or that are present in the sub-table (EXISTS).

So there are three (3) different ways to perform matches -- INNER JOIN, IN, and EXISTS. Each query needs to be carefully examined for the proper use. I.e., the proper tool for the job...

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

tfifield SSCrazy Eights Points: 9655 More actions · Answer 7

I agree with Jeff on this one. I've had to go through 2 systems the used the WHERE IN syntax and replace them with an INNER JOIN. These were for reports that had to go through lots and lots of data. The performance increase was dramatic when using the INNER JOIN style.

When there were possibilities of multiple join conditions I used an INNER JOIN to a derived table that was either an aggregate or I used a DISTINCT in the SELECT for the derived table. Both of these were faster than WHERE IN in actual practice.

Todd Fifield

shamshudheen SSCrazy Points: 2390 More actions · Answer 8

Jeff Moden (3/31/2008)
Yes... don't use either... use an INNER JOIN instead.

hi jeff

your point is very usefull

as per my experience IN clause is usefull whene check Listed values

and EXISTS clause is for checking record exist in the table

INNER JOIN is very handy and powerfull

thanks

shamsudheen

SQLZ SSChampion Points: 12872 More actions · Answer 9

Here's another good reason why not to use IN:

Run this on the AdventureWorks database in SQL 2005:

begin tran

delete Sales.SalesOrderDetail where SalesOrderId in (select SalesOrderId from Doctor)

rollback tran

Whoops! You've just deleted 120,000 SalesOrders. Hopefully you included the "rollback tran" in that statement you just ran. Now consider the fact that the table Doctor has no column named SalesOrderId. If you run the inner query by itself it generates an error, but if you run the statement as a whole it deletes every row in SalesOrderDetail.

All it takes is for a developer to accidentally type in the wrong table name in the inner query and you could end up in serious trouble.

Karl
source control for SQL Server

shamshudheen SSCrazy Points: 2390 More actions · Answer 10

shamshudheen

SSCrazy

Points: 2390

April 3, 2008 at 2:27 am

#797670

nice example

shamsudheen

Jeff Moden SSC Guru Points: 1004470 More actions · Answer 11

JohnG (4/2/2008)
Yes... don't use either... use an INNER JOIN instead
Not always. The matching values resulting from an INNER JOIN can be more than one (1). This will result in as many rows as there are matches. With an IN or EXISTS, only a single row will be returned per the match regardless of the number of values in the IN list (explicit or via a query) or that are present in the sub-table (EXISTS).
So there are three (3) different ways to perform matches -- INNER JOIN, IN, and EXISTS. Each query needs to be carefully examined for the proper use. I.e., the proper tool for the job...

Thanks for the reminder (I forgot :blush:), John... you're correct... WHERE IN and WHERE EXISTS acts like a SELECT DISTINCT on the joined table.

The WHERE EXISTS has some benefit of usage like finding the TOP 2 of every group... but it does have the disadvantage of being RBAR in nature. Not many ways around that in 2k for things like the TOP 2 of each group but plenty of set based solutions in 2k5 for the same thing that are setbased. Both should be evaluated for performance in 2k5 when trying to decide which to use.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Mauve SSChampion Points: 11316 More actions · Answer 12

Here's another good reason why not to use IN:
Run this on the AdventureWorks database in SQL 2005:
begin tran
delete Sales.SalesOrderDetail where SalesOrderId in (select SalesOrderId from Doctor)
rollback tran
Whoops! You've just deleted 120,000 SalesOrders. Hopefully you included the "rollback tran" in that statement you just ran. Now consider the fact that the table Doctor has no column named SalesOrderId. If you run the inner query by itself it generates an error, but if you run the statement as a whole it deletes every row in SalesOrderDetail.

I got caught with this in a trigger referencing the deleted table. It was a syntax error that SQL Server should have caught and thrown an error. This is just a BAD BUG in SQL Server. Shame on you Microsoft!

That is why you should always qualify every column with the table name (or an alias).

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

SQLZ SSChampion Points: 12872 More actions · Answer 13

JohnG (4/3/2008)
I got caught with this in a trigger referencing the deleted table. It was a syntax error that SQL Server should have caught and thrown an error. This is just a BAD BUG in SQL Server. Shame on you Microsoft!
That is why you should always qualify every column with the table name (or an alias).

Actually this isn't a SQL Server issue rather than a SQL issue. I can't remember where I saw this but I remember reading about it and how it was a "feature" of SQL and how you'd get this same behaviour on Oracle or other RDMBS's.

I never went out to verify this so I might be wrong.

Can anyone confirm whether this is the case?

Karl
source control for SQL Server

Mauve SSChampion Points: 11316 More actions · Answer 14

Actually this isn't a SQL Server issue rather than a SQL issue. I can't remember where I saw this but I remember reading about it and how it was a "feature" of SQL and how you'd get this same behaviour on Oracle or other RDMBS's.

Confirmed.

Oracle exhibits the exact same behavior as per the example below. Note that the "users" table does not have a column named "content_id". However, if the column is properly qualified, it does throw an error.

SQL*Plus: Release 9.2.0.7.0 - Production on Thu Apr 3 09:13:08 2008

SQL> SELECT content_id FROM users;

SELECT content_id FROM users

*

ERROR at line 1:

ORA-00904: "CONTENT_ID": invalid identifier

SQL> DELETE FROM content WHERE content_id IN (SELECT content_id FROM users);

0 rows deleted.

SQL> DELETE FROM content WHERE content_id IN (SELECT x.content_id FROM users x);

DELETE FROM content WHERE content_id IN (SELECT x.content_id FROM users x)

*

ERROR at line 1:

ORA-00904: "X"."CONTENT_ID": invalid identifier

SQL>

However, notice that the following DOES throw an error:

SQL> DELETE FROM content WHERE content_id IN (SELECT foobar FROM users);

DELETE FROM content WHERE content_id IN (SELECT foobar FROM users)

*

ERROR at line 1:

ORA-00904: "FOOBAR": invalid identifier

SQL>

So the issue is not the IN syntax per se, but one of "scope" as it pertains to the column names being referenced.

[font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]