Improving performance on joins to large tables

Question

Improving performance on joins to large tables

Jim Underwood

SSChasing Mays

Points: 623
More actions
May 26, 2004 at 3:43 pm

#359820

I just reviewed a script posted to this web site in which the author claims will significantly improve perforamnce on joins to large tables:
http://www.sqlservercentral.com/scripts/contributions/824.asp
Basicly the author suggested an alternative approach of :
FROM (SELECT * FROM Contact WHERE ContactID = @ContactID) c
LEFT JOIN (SELECT * FROM CONAddress WHERE ContactID = @ContactID)
instead of the conventional approach of:
FROM Contact c
LEFT JOIN CONAddress a
ON c.ContactID = a.ContactID
I am very surprised if this is true. I would have thought that SQL Server would have been able to optimize the conventional approach to be the equivalent or better than the alternate approach.
Can anyone comment on this?
Best Regards,
Jim

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

mSlava SSC Veteran Points: 258 More actions · Answer 1

I have mater and child tables. Mater table has primary key and detail table has index on detail field. Detail table is large.

select *

from mdan m

left join all_rlists r

on m.id = r.id__

where m.id = 20085

Both indexes are used. At first time used pk on master, then index on detail, than nested loops. This is fast.

select *

from (select * from mdan where id = 20085) m

left join (select * from all_rlists where id__ = 20085) d

on m.id = d.id__

It has equal plan.

My comment is what both tables have to be indexed. If there is no indexes separate scan is faster rather left join.

twolfinger SSC Enthusiast Points: 106 More actions · Answer 2

For those of you who used the script, try this way and you should see the same efficiency without losing the clear coding:

SELECT c.Name, a.Address, p.Phone

FROM Contact c

LEFT JOIN ContactAddress a

ON c.ContactID = @ContactID

and a.ContactID = @ContactID

and c.ContactID = a.ContactID

LEFT JOIN ContactPhone p

ON p.ContactID = @ContactID

and c.ContactID = p.ContactID

This effectively does the same without losing the Clear Join Syntax.

Steve SSCommitted Points: 1844 More actions · Answer 3

I do not have the expierence to say which method is better. But I do know how to definitively tell the difference using the execution plan in query analyzer. What I usually do when I have a question about the effects of a change to a tsql statement is to make two copies of it in QA. When you click the "display execution plan" icon you get back results saying that each query took 50% of the total batch cost. I then alter one of the queries to see if the execution plan gets better or worse for that particular query. This method would tell you if the

Select ?? from ?? inner join ??? on ...

or

Select ?? from (select ...) inner join (select ...) on

is the better method.

But just remember, the answer that you find is probably not an absolute for all statements just for the one you are testing.

Steve

Edward McGovern Ten Centuries Points: 1299 More actions · Answer 4

I gave it a try. I used a Solomon Accounting Database, I have some huge (10 gigs) and gave it a try. I used tables with a lot more complicated keys and foreign keys but still filtering the results by one field (reference number). Interesting enough the traditional execution plan was simpler (one less step).

The traditional was in this format...

Declare @refnbr char(10)

set @refnbr = '004640'

select *

from ardoc doc

left join artran trans

on doc.batnbr = trans.batnbr and doc.refnbr = trans.refnbr and

doc.doctype = trans.trantype and doc.custid = trans.custid

where doc.refnbr = @refnbr

The "new" (I'm sorry but derived tables are not new):

Declare @refnbr char(10)

set @refnbr = '004640'

select *

from (select * from ardoc where refnbr = @refnbr) doc

left join (select * from artran where refnbr = @refnbr) trans

on doc.batnbr = trans.batnbr and doc.refnbr = trans.refnbr and

doc.doctype = trans.trantype and doc.custid = trans.custid

Consistently in serveral databases the "traditional way" was faster. It took no seconds to perform in most cases while the derived tables took a second.

Lab results say, NOT!

I've had great luck using derived tables for doing some very complicated joins that i otherwise could not do, but to use it for speed improvment, I think not...

Calvin Lawson SSChampion Points: 11030 More actions · Answer 5

The author also says that UDF to UDF joins will be faster than straight joins. NOT! If this is occuring you've got some schema/index issues.

cl

Signature is NULL

TriggerMe SSChampion Points: 11752 More actions · Answer 6

Jim Underwood,

The author of this short spiel about Performance on large tables using JOINS needs to state just what exactly constitutes a large Table? AND, how many JOINS? He has opened up a can of worms here without some tangible facts!

Until we get some figures and test his theory on these figures, then its all conjecture!

Kindest Regards,