July 9, 2010 at 9:38 am
Which way is better way of joining two large tables :
Select a.col2
,b.col2
from table1 a
Inner join table2 b
on a.col1 = b.col1
Vs.
Select a.col2
,b.col2
from table1 a
Inner join ( select col1, col2 from table2 ) b
on a.col1 = b.col1
Will second query result in better perfornance ?
July 9, 2010 at 9:46 am
They will more than likely have identical execution plans.
You should look at the query plans for both so that you can tell for yourself.
July 9, 2010 at 10:51 am
I ran 3 quick tests over some tables on our test server...
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.
Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.
I'm not seeing much of a difference. Is there a certain instance where we can expect a significant difference in this approach?
July 9, 2010 at 11:06 am
I think it would make difference where table2 will have a large number of columns . I am not sure where I read it but it seems to make a difference when you write Select col1... query rather than doing a join on entire table.
I suppose when JOIN is on entire table , internally it is translating as 'Select * from table2' and hence if we write 'Select col1, col2 from table2" where table2 has more than 50 columns , it would make a difference
What do you think ?
July 9, 2010 at 11:17 am
The optimizer won't grab a field if it doesn't have to. If you use, for instance, a covering index, it won't even hit the main table at all but just use the index. Neither query you were showing use all of the fields (assuming that table B has more fields). Therefore if there is a covering index for those two fields, it won't even use the base table and there would be no difference in performance. And if you DID need more fields, then it won't matter whether it's in a derived table or a straight join. The work to get the data should basically be the same.
July 9, 2010 at 11:20 am
Alan Vogan (7/9/2010)
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.
One second out of how many seconds total? How many times did you run each test? Are the execution plans different, if so how?
Small variations of execution time are expected and normal.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 9, 2010 at 11:53 am
GilaMonster (7/9/2010)
Alan Vogan (7/9/2010)
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.
One second out of how many seconds total? How many times did you run each test? Are the execution plans different, if so how?
Small variations of execution time are expected and normal.
Test 1: 14 vs. 13 seconds
Test 2: 14 vs. 14 seconds
Test 3: 5:51 vs. 5:50 (min:sec)
I only ran the test once each time. The execution plans were the same for each test pair.
July 9, 2010 at 1:36 pm
Alan Vogan (7/9/2010)
GilaMonster (7/9/2010)
Alan Vogan (7/9/2010)
Test 1: I ran with several columns selected (not exactly your query but one that made sense for the data). Query B ran 1 second faster than Query A for 2.5 million records.Test 2: I ran over the same data set, with just 2 columns. Equal time.
Test 3: I ran over 24 million records, with just 2 columns selected. Query B ran 1 second faster than Query A.
One second out of how many seconds total? How many times did you run each test? Are the execution plans different, if so how?
Small variations of execution time are expected and normal.
Test 1: 14 vs. 13 seconds
Test 2: 14 vs. 14 seconds
Test 3: 5:51 vs. 5:50 (min:sec)
I only ran the test once each time. The execution plans were the same for each test pair.
The last one is no where close to statistically significant. Small variations in execution time are normal and I suspect a second run would produce different results. If the exec plans are identical, it means SQL will execute those two queries identically.
p.s. If you had the actual execution plan enabled for the tests, that would account for at least a portion of the execution time, generating and rendering of it.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 9, 2010 at 3:10 pm
If you look closely at the execution plan for your first example, you should see that it only pulls the column(s) necessary for filtering or output from the joined table. The optimizer is intelligent enough to know just what is needed without your having to specify them in advance, as you did in the second example.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
July 13, 2010 at 12:50 pm
There should be no difference in performance because the plans should be the same. The plans should be the same because the optimizer is smart enough to only output the columns you are selecting. So the reality is the derived table query and the traditional query will be treated the same; therefore, making the plans the same.
There are scenrarios where putting complex queries in derived tables can save IO and increase performance, but this simplistic example is not one.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply