Inner Join question

  • SELECT T1.col1 FROM Table1 T1

    INNER JOIN Table2 T2

    ON T2.col2 = 'somestring'

    WHERE T1.col2 <> T2.col1

    T1.Col2 and T2.col1 are INT columns. T2.col1 is the Primary Key for Table2.

    There is only one row that matches T2.col2 = 'somestring'. And lets say the T2.col1 for that row is 111.

    This query can return more than a million rows (maybe a few million). Is there a real performance hinderance when it's not trying to match the rows and it's just simply attaching the row from table2 to all the rows from table1?

    Is there any reason at all why the code should be done the way it's done above? Other than that row in table2 getting updated later on. Wouldnt the following code be much better?

    SELECT col1 FROM Table1

    WHERE col2 <> 111

  • The second code will surely be much faster, but you've hard coded the condition into your statement.

    Are you sure there will always be one row in table2 with col2 = 'something'?

    Will you always be sure it is row number 111?

    I don't really see a business justification in taking a subset of a table (the set of table2 with col2 = 'something'), then joining this set to another table with a join condition that the ID's can't match (which is almost a cross-join), but in the end you only take data from table1, namely table1.col1. It just seems strange to me.

    What are you actually trying to achieve?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I'm just trying to understand this code. It's not my own.

    And no I dont know if the information in the table will stay the same, or if it will always be 111 as the ID.

  • you should follow up the code to get context on whats actually being done and why. asking around is your best bet

    --
    Thiago Dantas
    @DantHimself

  • dant12 (1/14/2011)


    you should follow up the code to get context on whats actually being done and why. asking around is your best bet

    Working on that.

    In the meantime, I guess I can do:

    SELECT T1.col1 FROM Table1 T1

    LEFT JOIN Table2 T2

    ON T1.col2 = T2.col1 AND T2.col2 = 'somestring'

    WHERE T2.col1 IS NULL

  • siamak.s16 (1/14/2011)


    dant12 (1/14/2011)


    you should follow up the code to get context on whats actually being done and why. asking around is your best bet

    Working on that.

    In the meantime, I guess I can do:

    SELECT T1.col1 FROM Table1 T1

    LEFT JOIN Table2 T2

    ON T1.col2 = T2.col1 AND T2.col2 = 'somestring'

    WHERE T2.col1 IS NULL

    from some testings, that will perfom poorer than the first one

    --
    Thiago Dantas
    @DantHimself

  • Koen (da-zero) (1/14/2011)


    The second code will surely be much faster, but you've hard coded the condition into your statement.

    Are you sure there will always be one row in table2 with col2 = 'something'?

    Will you always be sure it is row number 111?

    I don't really see a business justification in taking a subset of a table (the set of table2 with col2 = 'something'), then joining this set to another table with a join condition that the ID's can't match (which is almost a cross-join), but in the end you only take data from table1, namely table1.col1. It just seems strange to me.

    What are you actually trying to achieve?

    They are trying to exclude rows in table1 that meet the criteria in table2.

    I produced a left join that does the same thing, but it seems to go slower, at least from a statistics standpoint. When I run them both of them together the execution plan says 67% of the resources go to the cross join and 33% of the resources go to the left join. However in the statistics, the left join has a much higher logical reads and execution time number.

  • siamak.s16 (1/18/2011)


    Koen (da-zero) (1/14/2011)


    The second code will surely be much faster, but you've hard coded the condition into your statement.

    Are you sure there will always be one row in table2 with col2 = 'something'?

    Will you always be sure it is row number 111?

    I don't really see a business justification in taking a subset of a table (the set of table2 with col2 = 'something'), then joining this set to another table with a join condition that the ID's can't match (which is almost a cross-join), but in the end you only take data from table1, namely table1.col1. It just seems strange to me.

    What are you actually trying to achieve?

    They are trying to exclude rows in table1 that meet the criteria in table2.

    I produced a left join that does the same thing, but it seems to go slower, at least from a statistics standpoint. When I run them both of them together the execution plan says 67% of the resources go to the cross join and 33% of the resources go to the left join. However in the statistics, the left join has a much higher logical reads and execution time number.

    same results here

    --
    Thiago Dantas
    @DantHimself

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply