Join vs Where

  • I've been researching this for the past hour and I cannot find an explanation for what I'm seeing here (or I'm just not understanding it correctly).

    Can someone help me understand the difference between putting a filter in a LEFT JOIN vs the WHERE?

    Example Query:

    SELECT

    t1.*

    , t2.*

    FROM

    tableA t1

    LEFT JOIN tableB t2

    ON t1.ID = t2.ID

    WHERE

    t2.ID IS NULL

    AND t1.Status IN ('A', 'B')

    vs

    SELECT

    t1.*

    , t2.*

    FROM

    tableA t1

    LEFT JOIN tableB t2

    ON t1.ID = t2.ID

    AND t1.Status IN ('A', 'B')

    WHERE

    t2.ID IS NULL

    For some reason when I run this query I get many more records in the second query than I do in the first. I don't understand what is going on here. When I run the execution plan I see that the table scan for t1 in the first query has the filter applied to it whereas the hash match that joins the two tables in the second query has the filter applied to listed under Probe Residual. I thought it would have worked exactly opposite that.

    Any understanding would be much appreciated, it's driving me nuts.

    - John

  • it's the left join that is confusing you.

    in the first query, the left table has all rows, and matching records from the second table get joined.

    after that, you are filtering the RIGHT table to remove everything that doesn't match effectively changing that to an INNER JOIN, then filtering on status.

    in the second query, the rows in TableB that were eliminated with the IS NULL are still in the results, so the dataset is bigger. change the second query to an INNER JOIN, and the two sqls will be have the same results.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • See Kimberly Tripp's blog for a detailed explanation of what Lowell is talking about: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Determining-the-position-of-search-arguments-in-a-join.aspx

    It's the difference between only joining on the records with that condition (e.g. condition in the Left Join), or filtering your results after joining the two complete sets (condition in the WHERE, which ends up looking like an inner join, since you only return the ones that meet)

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • The real issue here is that moving the criteria for TableA to the ON clause for the join changes what you are asking SQL Server.

    In the first query, you are asking for all rows from TableA that have a status of A or B that do not have a matching row in TableB.

    In the second query, you are asking for all rows from TableA that do not have a matching row from TableB for those rows in TableA that have a status of A or B.

    The reason you get more rows in the second query is because there are more rows in TableA with a status other than A or B which will not be filtered out because the join doesn't find a match.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thank you all for the responses. I understand where the counts I'm seeing are coming from and I can confirm the #s now (and WHY!)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply