Which Query Is Performs Better

Question

Which Query Is Performs Better

K Currie

SSCommitted

Points: 1662
More actions
December 4, 2007 at 9:12 am

#119344

I'm trying to determine which query is better and when I say better, I mean will perform faster. I'm comparing data in two tables by using "not in" vs. "left join". My two queries are as follows:
select distinct tablea.docid, tablea.docname
from tablea
left join tableb on
tablea.parentid = tableb.docid
where tableb.docid is null
select distinct tablea.docid, tablea.docname
from tablea
where parentid not in (select docid from tableb)
I can see how the execution plans are different, but I'm not sure I can interpret these to determine which one is better. I'm guessing the left join is better, but I don't know if the nested loop I see in my execution plan is good or bad. Maybe I need a lesson in reading the execution plans...

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 1

K Currie (12/4/2007)
I'm trying to determine which query is better and when I say better, I mean will perform faster. I'm comparing data in two tables by using "not in" vs. "left join". My two queries are as follows:
select distinct tablea.docid, tablea.docname
from tablea
left join tableb on
tablea.parentid = tableb.docid
where tableb.docid is null
select distinct tablea.docid, tablea.docname
from tablea
where parentid not in (select docid from tableb)
I can see how the execution plans are different, but I'm not sure I can interpret these to determine which one is better. I'm guessing the left join is better, but I don't know if the nested loop I see in my execution plan is good or bad. Maybe I need a lesson in reading the execution plans...

Why don't you just test it to find out?

Unless your server is jammed up on a specific resource - fastest usually =best in my mind. Assuming they're about the same speed - I then check resources.

There's a third syntax to check out - not EXISTS.

Try this to see:

declare @g datetime

select @g=getdate()

select distinct tablea.docid, tablea.docname

from tablea

left join tableb on

tablea.parentid = tableb.docid

where tableb.docid is null

select datediff(ms,@g,getdate())

select @g=getdate()

select distinct

tablea.docid,

tablea.docname

from

tablea

where

parentid not in (select docid from tableb)

select datediff(ms,@g,getdate())

select @g=getdate()

select distinct

tablea.docid,

tablea.docname

from

tablea

where

NOT EXISTS (select docid from tableb where tableb.docid=tablea.docid)

select datediff(ms,@g,getdate())

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

John Rowan SSC Guru Points: 56440 More actions · Answer 2

From my experience the LEFT JOIN typically performs better, but in reality it depends. I will usually start with the LEFT JOIN and only change it if the reads seem high for what I'm trying to do. Your best bet is to follow Matt's advice and to test the difference for each scenario. There is no hard and fast rule where one will always out perform the other.

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

K Currie SSCommitted Points: 1662 More actions · Answer 3

Thanks. Both of my queries took less than a second to run so I couldn't tell the difference, but according to your calculation the milliseconds are being displayed, right? And in my case, the left join runs the fastest. Not Exists is next, but double the time and the not in is the worst, triple the time. (I guess I could have turned on profiler too.)

Although this does answer my question, can anyone tell me more info regarding the execution plans? If I reviewed each plan, how would I know which way was better? That would help in reviewing more complex queries.

John Rowan SSC Guru Points: 56440 More actions · Answer 4

Post the plans and we can take a peek at them. Also, run each query in query analyzer using SET STATISTICS IO ON. Make sure to clean out the buffers and cache in between runs. Look at the logical reads value and compare the two.

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

Antares686 SSC Guru Points: 125444 More actions · Answer 5

I will add that unless you have a significant amount of test data you may not be able to tell a difference becuae of the way data can be in the cache making and reducing reads. Generally LEFT JOIN and NOT EXISTS will out perform NOT IN. But I will say from experience you sometimes can get unexpected occurrances just becuase. I alwasy test on a copy of the production data each method to be sure I don't overlook some un thought of advantage one may present.

Antares686 SSC Guru Points: 125444 More actions · Answer 6

Also it is common to see DISTINCT replaced by GROUP BY becuase of the way each is handled GROUP BY generally will perform faster. As well, you should always use the schema owner. Sorry I knew the answer as to why but forget and don't have an articl I can point you too right now.

SELECT

A.docid

, A.docname

FROM

dbo.tablea A

LEFT JOIN

dbo.tableb B

ON

A.parentid = B.docid

WHERE

B.docid is null

GROUP BY

A.docid

, A.docname