Inner Join question

Question

Post reply

Inner Join question

Khades

SSCrazy

Points: 2067
More actions
January 13, 2011 at 6:57 pm

#245351

SELECT T1.col1 FROM Table1 T1
INNER JOIN Table2 T2
ON T2.col2 = 'somestring'
WHERE T1.col2 <> T2.col1
T1.Col2 and T2.col1 are INT columns. T2.col1 is the Primary Key for Table2.
There is only one row that matches T2.col2 = 'somestring'. And lets say the T2.col1 for that row is 111.
This query can return more than a million rows (maybe a few million). Is there a real performance hinderance when it's not trying to match the rows and it's just simply attaching the row from table2 to all the rows from table1?
Is there any reason at all why the code should be done the way it's done above? Other than that row in table2 getting updated later on. Wouldnt the following code be much better?
SELECT col1 FROM Table1
WHERE col2 <> 111

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259135 More actions · Answer 1

The second code will surely be much faster, but you've hard coded the condition into your statement.

Are you sure there will always be one row in table2 with col2 = 'something'?

Will you always be sure it is row number 111?

I don't really see a business justification in taking a subset of a table (the set of table2 with col2 = 'something'), then joining this set to another table with a join condition that the ID's can't match (which is almost a cross-join), but in the end you only take data from table1, namely table1.col1. It just seems strange to me.

What are you actually trying to achieve?

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Khades SSCrazy Points: 2067 More actions · Answer 2

I'm just trying to understand this code. It's not my own.

And no I dont know if the information in the table will stay the same, or if it will always be 111 as the ID.

dant12 SSCrazy Points: 2712 More actions · Answer 3

you should follow up the code to get context on whats actually being done and why. asking around is your best bet

--
Thiago Dantas
@DantHimself

Khades SSCrazy Points: 2067 More actions · Answer 4

dant12 (1/14/2011)
you should follow up the code to get context on whats actually being done and why. asking around is your best bet

Working on that.

In the meantime, I guess I can do:

SELECT T1.col1 FROM Table1 T1

LEFT JOIN Table2 T2

ON T1.col2 = T2.col1 AND T2.col2 = 'somestring'

WHERE T2.col1 IS NULL

dant12 SSCrazy Points: 2712 More actions · Answer 5

siamak.s16 (1/14/2011)
dant12 (1/14/2011)
you should follow up the code to get context on whats actually being done and why. asking around is your best bet
Working on that.
In the meantime, I guess I can do:
SELECT T1.col1 FROM Table1 T1
LEFT JOIN Table2 T2
ON T1.col2 = T2.col1 AND T2.col2 = 'somestring'
WHERE T2.col1 IS NULL

from some testings, that will perfom poorer than the first one

--
Thiago Dantas
@DantHimself

Khades SSCrazy Points: 2067 More actions · Answer 6

Koen (da-zero) (1/14/2011)
The second code will surely be much faster, but you've hard coded the condition into your statement.
Are you sure there will always be one row in table2 with col2 = 'something'?
Will you always be sure it is row number 111?
I don't really see a business justification in taking a subset of a table (the set of table2 with col2 = 'something'), then joining this set to another table with a join condition that the ID's can't match (which is almost a cross-join), but in the end you only take data from table1, namely table1.col1. It just seems strange to me.
What are you actually trying to achieve?

They are trying to exclude rows in table1 that meet the criteria in table2.

I produced a left join that does the same thing, but it seems to go slower, at least from a statistics standpoint. When I run them both of them together the execution plan says 67% of the resources go to the cross join and 33% of the resources go to the left join. However in the statistics, the left join has a much higher logical reads and execution time number.

dant12 SSCrazy Points: 2712 More actions · Answer 7

siamak.s16 (1/18/2011)
Koen (da-zero) (1/14/2011)
The second code will surely be much faster, but you've hard coded the condition into your statement.
Are you sure there will always be one row in table2 with col2 = 'something'?
Will you always be sure it is row number 111?
I don't really see a business justification in taking a subset of a table (the set of table2 with col2 = 'something'), then joining this set to another table with a join condition that the ID's can't match (which is almost a cross-join), but in the end you only take data from table1, namely table1.col1. It just seems strange to me.
What are you actually trying to achieve?
They are trying to exclude rows in table1 that meet the criteria in table2.
I produced a left join that does the same thing, but it seems to go slower, at least from a statistics standpoint. When I run them both of them together the execution plan says 67% of the resources go to the cross join and 33% of the resources go to the left join. However in the statistics, the left join has a much higher logical reads and execution time number.

same results here

--
Thiago Dantas
@DantHimself