SQL JOINS BETTER WAY

Question

Post reply

SQL JOINS BETTER WAY

aditi.iyer

Old Hand

Points: 322
More actions
July 9, 2010 at 9:38 am

#223079

Which way is better way of joining two large tables :
Select a.col2
,b.col2
from table1 a
Inner join table2 b
on a.col1 = b.col1
Vs.
Select a.col2
,b.col2
from table1 a
Inner join ( select col1, col2 from table2 ) b
on a.col1 = b.col1
Will second query result in better perfornance ?

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 1

They will more than likely have identical execution plans.

You should look at the query plans for both so that you can tell for yourself.

Alan Vogan SSCrazy Points: 2513 More actions · Answer 2

I ran 3 quick tests over some tables on our test server...

Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.

Test 2: I ran over the same data set, with just 2 columns. Equal time.

Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.

I'm not seeing much of a difference. Is there a certain instance where we can expect a significant difference in this approach?

aditi.iyer Old Hand Points: 322 More actions · Answer 3

I think it would make difference where table2 will have a large number of columns . I am not sure where I read it but it seems to make a difference when you write Select col1... query rather than doing a join on entire table.

I suppose when JOIN is on entire table , internally it is translating as 'Select * from table2' and hence if we write 'Select col1, col2 from table2" where table2 has more than 50 columns , it would make a difference

What do you think ?

jeff.mason SSCrazy Eights Points: 9585 More actions · Answer 4

The optimizer won't grab a field if it doesn't have to. If you use, for instance, a covering index, it won't even hit the main table at all but just use the index. Neither query you were showing use all of the fields (assuming that table B has more fields). Therefore if there is a covering index for those two fields, it won't even use the base table and there would be no difference in performance. And if you DID need more fields, then it won't matter whether it's in a derived table or a straight join. The work to get the data should basically be the same.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Alan Vogan (7/9/2010)
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.
Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.

One second out of how many seconds total? How many times did you run each test? Are the execution plans different, if so how?

Small variations of execution time are expected and normal.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Alan Vogan SSCrazy Points: 2513 More actions · Answer 6

GilaMonster (7/9/2010)
Alan Vogan (7/9/2010)
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.
Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.
One second out of how many seconds total? How many times did you run each test? Are the execution plans different, if so how?
Small variations of execution time are expected and normal.

Test 1: 14 vs. 13 seconds

Test 2: 14 vs. 14 seconds

Test 3: 5:51 vs. 5:50 (min:sec)

I only ran the test once each time. The execution plans were the same for each test pair.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

Alan Vogan (7/9/2010)
GilaMonster (7/9/2010)
Alan Vogan (7/9/2010)
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.
Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.
One second out of how many seconds total? How many times did you run each test? Are the execution plans different, if so how?
Small variations of execution time are expected and normal.
Test 1: 14 vs. 13 seconds
Test 2: 14 vs. 14 seconds
Test 3: 5:51 vs. 5:50 (min:sec)
I only ran the test once each time. The execution plans were the same for each test pair.

The last one is no where close to statistically significant. Small variations in execution time are normal and I suspect a second run would produce different results. If the exec plans are identical, it means SQL will execute those two queries identically.

p.s. If you had the actual execution plan enabled for the tests, that would account for at least a portion of the execution time, generating and rendering of it.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 8

If you look closely at the execution plan for your first example, you should see that it only pulls the column(s) necessary for filtering or output from the joined table. The optimizer is intelligent enough to know just what is needed without your having to specify them in advance, as you did in the second example.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Adam Haines SSC-Insane Points: 23217 More actions · Answer 9

There should be no difference in performance because the plans should be the same. The plans should be the same because the optimizer is smart enough to only output the columns you are selecting. So the reality is the derived table query and the traditional query will be treated the same; therefore, making the plans the same.

There are scenrarios where putting complex queries in derived tables can save IO and increase performance, but this simplistic example is not one.

My blog: http://jahaines.blogspot.com