Is there a performance gain using If Not exists versus If (select coun(*) from ... see example

Question

Post reply

Is there a performance gain using If Not exists versus If (select coun(*) from ... see example

Pam Ozer

SSCommitted

Points: 1722
More actions
January 18, 2010 at 12:39 pm

#144358

IF (SELECT COUNT(*) FROM AOE_WebPageControlsAvail WHERE WebPageId = @PageId AND UserControlId = @ControlId AND EndDate > @CreateDate) = 0
Should actually be this:
IF NOT EXISTS (SELECT COUNT(PageControlsAvailId) FROM AOE_WebPageControlsAvail WHERE WebPageId = @PageId AND UserControlId = @ControlId AND EndDate > @CreateDate)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Conan The Canadian Ten Centuries Points: 1335 More actions · Answer 1

There is a performance gain by using the NOT EXISTS.

With NOT EXISTS, SQL Server will stop looking for any record matches after it finds the first one.

By using a count(*), SQL Server will continue to look for records after the first match to get an accurate count.

Kev -=Conan The Canadian=-
@ConanTheCdn

GOC-481399 SSC Veteran Points: 249 More actions · Answer 2

Also remember that the execution plan will tell you what the relative cost of the queries will be. If you execute both queries at the same time and look at the execution plan, it will tell you how expensive one query is relative to the other. For example if it says:

Query 1: Query cost (relative to the batch): 67%

.

Query 2: Query cost (relative to the batch): 33%

Then you know that query 1 is twice as expensive as query 2.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

As you have it written there, no. There is no performance difference and it's easy enough to test. SQL's smart enough to see that you've just checking if a row exists and it works accordingly

What does have a different (poorer) performance is when the count and the IF are not in the same line. Like this

SELECT @count = COUNT(*) FROM AOE_WebPageControlsAvail WHERE WebPageId = @PageId AND UserControlId = @ControlId AND EndDate > @CreateDate

IF @count = 0

...

As I said, trivial to prove. (the table Posts mentioned in the below query has 1.3 million rows)

Option 1

IF (SELECT COUNT(*) FROM Posts WHERE postdate> '2008/12/01') = 0

PRINT 'if 1'

Execution characteristics

Table 'Posts'. Scan count 1, logical reads 18245, physical reads 0.

SQL Server Execution Times:

CPU time = 125 ms, elapsed time = 128 ms.

Option 2

IF NOT EXISTS (SELECT 1 FROM Posts WHERE postdate> '2008/12/01')

PRINT 'if 2'

Execution characteristics

Table 'Posts'. Scan count 1, logical reads 18245, physical reads 0.

SQL Server Execution Times:

CPU time = 125 ms, elapsed time = 127 ms.

Option 3

DECLARE @count INT

SELECT @count = count(*) from Posts where postdate> '2008/12/01'

IF (@count=0)

PRINT 'if 3'

Execution characteristics

Table 'Posts'. Scan count 1, logical reads 199404, physical reads 0.

SQL Server Execution Times:

CPU time = 485 ms, elapsed time = 522 ms.

As can be seen, there's no difference between the count and the not exists when the count is specified inside the IF. If the count is done separately, the result assigned to a variable and then the variable checked, the performance is significantly worse.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

GOC-481399 (1/18/2010)
Also remember that the execution plan will tell you what the relative cost of the queries will be. If you execute both queries at the same time and look at the execution plan, it will tell you how expensive one query is relative to the other. For example if it says:
Query 1: Query cost (relative to the batch): 67%
.
.
.
Query 2: Query cost (relative to the batch): 33%
Then you know that query 1 is twice as expensive as query 2.

But bear in mind that those costs are estimates and hence may not be accurate. There are a number of things that can make those completely incorrect. I have a trivial example where two queries in a batch have relative costs of 1% and 99%. The one that's costed at 1% is orders of magnitude slower than the one costed at 99%

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

GOC-481399 SSC Veteran Points: 249 More actions · Answer 5

Ah. Thank you. I also failed to mention the value of using

DBCC DROPCLEANBUFFERS

before running the batch.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

That's something I never do. I prefer to run the queries a couple of times and ignore the first run (which is the one that would be affected by the compile and the loading of pages into cache)

Basically I prefer to test against a warm cache than a cold one.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Pam Ozer SSCommitted Points: 1722 More actions · Answer 7

thank you for all the information. It has been very helpful.