When criteria in the where clause result is different than the same criteria in the on clause. Why?

  • Can someone explain why these two queries return different results? I have the criteria of the interfaceID in two different places and want to understand the processing differences between the two. Thank you.

    Query #1 interfaceID criteria in the join

    select ne.NotificationEventID,imp.NotificationEventID as impp,*

    from cad.NotificationEvent ne

    left join functional.InterfaceMonitorProcessLog imp

    on ne.NotificationEventID = imp.NotificationEventID

    and imp.InterfaceID = '256E9011-450A-4745-83DC-E01FF452CBA4'

    where NotificationEventTypeID = '0F81644D-BC41-4879-B3B5-397B1B94FE18'

    Query#2 interfaceID criteria in the where clause

    select ne.NotificationEventID,imp.NotificationEventID as impp,*

    from cad.NotificationEvent ne

    left join functional.InterfaceMonitorProcessLog imp

    on ne.NotificationEventID = imp.NotificationEventID

    where NotificationEventTypeID = '0F81644D-BC41-4879-B3B5-397B1B94FE18'

    and imp.InterfaceID = '256E9011-450A-4745-83DC-E01FF452CBA4'

  • I think we could figure this out a little easier if you included the query plan the each query produces. Could you run this with " include actual query plan" turned on and post it?

    Also, how do you know the result sets are in fact different? Are the record counts different?

    If so, to better understand why the two result sets are different - consider how SQL Server processes queries: The ON clause is evaluated before the WHERE clause. In your first query you are filtering for things where imp.InterfaceID = '256E9011-450A-4745-83DC-E01FF452CBA4' during the ON phase (during the first filter), in the second it is evaluated in the WHERE phase which is evaluated after the initial filter.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • I am going to use a simplified set of tables to help explain what you are seeing.

    create table dbo.TableA(

    ID int,

    CustomValue int);

    create table dbo.TableB(

    ID int,

    CustomValue int);

    insert into dbo.TableA

    values (1,1),(1,2),(2,1);

    insert into dbo.TableB

    values (1,21),(2,20);

    SELECT *

    FROM

    dbo.TableA A

    INNER JOIN dbo.TableB B

    ON (A.ID = B.ID)

    WHERE

    B.CustomValue = 21;

    SELECT *

    FROM

    dbo.TableA A

    LEFT OUTER JOIN dbo.TableB B

    ON (A.ID = B.ID AND B.CustomValue = 21);

    SELECT *

    FROM

    dbo.TableA A

    LEFT OUTER JOIN dbo.TableB B

    ON (A.ID = B.ID)

    WHERE

    B.CustomValue = 21;

    go

    drop table dbo.TableA;

    drop table dbo.TableB;

    go

    The first query is a simple inner join returning all values from both tables where they are joined by a common ID value and B.CustomValue = 21.

    Now looking at the next two LEFT OUTER JOINs you will see queries similar to the ones you posted.

    The first LEFT OUTER JOIN returns all rows from TableA plus those from TableB where B.CustomValue = 21 and the ID values match. This filter on TableB, CustomValue = 21, occurs before the tables are joined on the ID column so you will get values from TableA where there is no matching ID in TableB.

    In the second query the join occurs first then the resulting dataset is filtered for values where B.CustomValue = 21.

    If you run the code you will notice that the first and third queries return the same result set. The third query is basically an INNER JOIN. The reason for this is that you will lose rows of data where the column being filtered is NULL since NULL does not equal 21 (nor is it NOT EQUAL (<>) 21).

    Does this help some?

  • http://www.sqlskills.com/blogs/kimberly/determining-the-position-of-search-arguments-in-a-join/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply