EXISTS vs IN vs .... COUNT?!

Question

EXISTS vs IN vs .... COUNT?!

PiMané

SSChampion

Points: 13965
More actions
January 8, 2013 at 4:47 am

#269071

Hi,
I made the following test since some of our developers where using the COUNT similar to EXISTS:
-- criar a tabela e inserir 10000 linhas
CREATE TABLE seats (Id INT IDENTITY(1,1), Seat INT NOT NULL, Reserved TINYINT NOT NULL DEFAULT 0, CONSTRAINT pk_seats PRIMARY KEY (Id))
CREATE TABLE seats2 (ID2 INT IDENTITY(1,1), SeatH INT NOT NULL, CONSTRAINT pk_seats2 PRIMARY KEY (Id2))
DECLARE @i INT = 1
WHILE @i <= 10000
BEGIN
INSERT INTO seats (seat) VALUES (@i)
SET @i = @i+1
END
SET @i = 0
WHILE @i <= 10000
BEGIN
INSERT INTO seats2 (seath) VALUES (@i)
SET @i = @i+1
END
SELECT * FROM Seats s WHERE (SELECT COUNT(1) FROM Seats2 s2 WHERE s.id = s2.id2) > 0
SELECT * FROM Seats s WHERE EXISTS (SELECT 1 FROM Seats2 s2 WHERE s.id = s2.id2)
SELECT * FROM Seats s WHERE s.ID IN (SELECT Id2 FROM Seats2 s2)
They all have the same execution plans, statistics IO, ....
IN can not be used if you what to compare more than one column, EXISTS does, but also does COUNT(...).
Is it Ok to use the COUNT as a "replacement" for EXISTS in these cases or should EXISTS always be used?!
Thanks,
Pedro

If you need to work better, try working less...

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 1

Personally I would stick with the Exists solution as IMHO its far easiernd understand compared with the Count(*) method.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

PiMané SSChampion Points: 13965 More actions · Answer 2

Jason-299789 (1/8/2013)
Personally I would stick with the Exists solution as IMHO its far easiernd understand compared with the Count(*) method.

yep.. it's more ANSI Standard but never the less a surprise for me knowing this...

I've read lots of post comparing IN vs EXISTS vs INNER JOIN but none mentioned the COUNT(1) trick..

Pedro

If you need to work better, try working less...

Sean Lange SSC Guru Points: 286590 More actions · Answer 3

Since you are using a correlated subquery why bother with any of those approaches?

This returns the same thing.

select *

from seats s

join seats2 s2 on s.ID = s2.id2

And FWIW, you should start using a tally table for generating data like that. It will make your life a lot easier if you learn how to use it. Using a tally table to generate the insert is insanely fast compared to the looping method. The loops you posted took nearly 14 seconds on my machine. Using the queries below it didn't even register a time in the status bar in SSMS.

insert seats (seat)

select * from Tally where N <= 10000

insert seats2 (seath)

select * from Tally where N <= 10000

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

PiMané SSChampion Points: 13965 More actions · Answer 4

Sean Lange (1/8/2013)
Since you are using a correlated subquery why bother with any of those approaches?
This returns the same thing.
select *
from seats s
join seats2 s2 on s.ID = s2.id2

Cause, as it says in http://sqlinthewild.co.za/index.php/2010/01/12/in-vs-inner-join/:

If all you need is to check for matching rows in the other table but don’t need any columns from that table, use IN. If you do need columns from the second table, use Inner Join.

And if you need to compare more than one columns use EXISTS instead of IN.

JOIN should only be used if any field from the 2nd table is needed... and if the 2nd table has more than one row per each from the 1st table the row will repeat, with EXISTS or IN it doesn't..

Pedro

If you need to work better, try working less...

Sean Lange SSC Guru Points: 286590 More actions · Answer 5

PiMané (1/8/2013)
Sean Lange (1/8/2013)
Since you are using a correlated subquery why bother with any of those approaches?
This returns the same thing.
select *
from seats s
join seats2 s2 on s.ID = s2.id2
Cause, as it says in http://sqlinthewild.co.za/index.php/2010/01/12/in-vs-inner-join/:
If all you need is to check for matching rows in the other table but don’t need any columns from that table, use IN. If you do need columns from the second table, use Inner Join.
And if you need to compare more than one columns use EXISTS instead of IN.
JOIN should only be used if any field from the 2nd table is needed... and if the 2nd table has more than one row per each from the 1st table the row will repeat, with EXISTS or IN it doesn't..
Pedro

Good point, didn't realize you weren't using them. Back to get another cup of coffee now. :blush:

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

GSquared SSC Guru Points: 260824 More actions · Answer 6

I've seen Count() kill a server where Exists() was fine.

A code-push was done on a Tuesday night. Everything went fine. Monday morning, however, the websites wouldn't come up, and the database servers were all completely slammed, to the point of crashing to the passive node in one case.

Followed the bug down, and it was a Count() > 0 check that was being run hundreds of times per second by the new code. (Turns out, QA tested the pages, but didn't load-test any of it.) I converted that to an Exists() check, on the fly, in production (don't do this), and immediately the database servers went down to about 20% load, with no other change in anything.

That was on big, heavy-traffic tables. Tens of millions of rows, that kind of thing.

Count() has to do more work, and requires more aggressive locking.

Think of it this way:

How fast can you find out if there's anyone named George in the phonebook?

vs

How long would it take you to count how many people named George there are in the exact same phonebook?

One, you do a quick scan till you find a George. The other, you need to sit down and keep a tally of every row. Plus you need to make sure there are no new editions (locking) while you're doing your count.

I believe SQL Server will sometimes shortcut a "Where (select count(*) blah blah blah) > 1" into an Exists() check. But don't count on the optimizer being smarter than you in that kind of way.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

PiMané SSChampion Points: 13965 More actions · Answer 7

Sean Lange (1/8/2013)
Good point, didn't realize you weren't using them. Back to get another cup of coffee now. :blush:

LOL 🙂

I'm on my 4th coffee today.... and still need some more caffeine.. probably a cappuccino 🙂

If you need to work better, try working less...

PiMané SSChampion Points: 13965 More actions · Answer 8

GSquared (1/8/2013)
How fast can you find out if there's anyone named George in the phonebook?
vs
How long would it take you to count how many people named George there are in the exact same phonebook?
One, you do a quick scan till you find a George. The other, you need to sit down and keep a tally of every row. Plus you need to make sure there are no new editions (locking) while you're doing your count.
I believe SQL Server will sometimes shortcut a "Where (select count(*) blah blah blah) > 1" into an Exists() check. But don't count on the optimizer being smarter than you in that kind of way.

Exactly... then the DEV department will ear from me 🙂

Just like the IF EXISTS (SELECT TOP 1 ... FROM)... useless 5 chars (TOP 1) 🙂

Thanks,

Pedro

If you need to work better, try working less...

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

GSquared (1/8/2013)
I believe SQL Server will sometimes shortcut a "Where (select count(*) blah blah blah) > 1" into an Exists() check. But don't count on the optimizer being smarter than you in that kind of way.

It seems to have an optimisation for COUNT(*) ... > 0, but that's a special case and just for 0, and doesn't work always.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

PiMané SSChampion Points: 13965 More actions · Answer 10

So, just to be on the safe side AVOID COUNT() and use EXISTS...

Thanks,

Pedro

If you need to work better, try working less...

GSquared SSC Guru Points: 260824 More actions · Answer 11

GilaMonster (1/8/2013)
GSquared (1/8/2013)
I believe SQL Server will sometimes shortcut a "Where (select count(*) blah blah blah) > 1" into an Exists() check. But don't count on the optimizer being smarter than you in that kind of way.
It seems to have an optimisation for COUNT(*) ... > 0, but that's a special case and just for 0, and doesn't work always.

Typo on my part. Started to type ">= 1", changed my mind to "> 0", ended up with half of each, which is just plain wrong. 🙂

Yeah, > 0 sometimes gets shortcut. Sometimes. Not reliably.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon