LEFT OUTER JOIN IS NULL or WHERE xx NOT IN(select..)

  • Which query is better for performance? It's not readily apparent to me by looking at the Execution Plans, neither method utilizes a Table Scan.. So is it a coin toss?

    Does anyone have any best practices or standards on this topic?

     

    Select inv.*

    FROM dbo.Inventory inv

    LEFT OUTER JOIN dbo.InvHist hst

    ON hst.InvID = inv.InvID

    WHERE hst.InvID IS NULL

    OR THIS QUERY

    Select inv.*

    FROM dbo.Inventory inv

    WHERE inv.IndID NOT IN(SELECT InvID FROM dbo.InvHist)

     

    - tia

    Mark

  • These queries are not equal, they potentially return different results.

    See http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=236443

    _____________
    Code for TallyGenerator

  • Actually, is this case only, the two SQL statements are logically equivalant and would produce identical results.

    The join criteria is "ON hst.InvID = inv.InvID" and since nulls = null evaluates to unknown, not true, the InvHist row with nulls would not be included. Then the restriction "WHERE hst.InvID IS NULL", elimates all rows that are joined.

    The SQL optimizer is smart enought to recognize this and changes the 'LEFT OUTER JOIN" to "LEFT SEMI JOIN" sometimes also known as "EXISTENTIAL JOIN".

    To the original question, although in this case, both SQL statements will have identical execution plans, there are a more complex case, such as when the join criteria has multiple columns, where neither solution is optimal.

    IMHO, the below SQL is easier to understand than either of the posted SQL statements and handles multiple join columns and where clauses in a clearer fashion.

    select inventory.*

    from inventory

    where NOT EXISTS

    (select 1

    from InvHist

    where InvHist.PK_Column_One = inventory.PK_Column_One

    and InvHist.PK_Column_Two = inventory.PK_Column_Two

    )

    SQL = Scarcely Qualifies as a Language

  • 2 Carl

    Did you open the link?

    Your mistake is very common.

    Comparing (Null = Null) returns FALSE, that's why lines with inv.InvID = NULL will not be included in resultset by NOT IN but will be included by LEFT JOIN.

     

    _____________
    Code for TallyGenerator

  • Yes, I did open the link but do not see how using a left outer join will help in those case where the join key in the dependent table is NULL. I also converted the SQL to use the Northwind database substiting the Products and [Order Details] tables.

    As to Comparing (Null = Null), we agree on the final result set, which is that the rows would not be joined.

    Can we first agree that Null = Null is a shorthand to mean that the we are comparing two variables that are set to "missing" (The variables could be columns) ?

    If so, we do disagree on the result of comparison operators which I believe to be a "three valued" Boolean with "values" of true, false or unknown. Then "where any comparison operator" in three value logic with either variable being null evaluates to UNKNOWN. Since UNKNOWN is NOT true, the combination of rows fails the comparison, would not be joined, and would not appear in the result set. Amusingly, NOT (null = null) also evaluates to UNKNOWN, which the means the not comparison is also not true.

    Was this the point of the referenced post?

    With nulls, Aristole's axiom of A or NOT A is FALSE (poor Ayn Rand, one of her book chapters is false).

    Using Northwind and after adding a new Product, try the below 3 SQL statements and look at the result set and the execution plans.

    Select Products.*

    FROM dbo.Products

    LEFT OUTER JOIN dbo.[Order Details] as OrderDetails

    ON Products.ProductId = OrderDetails.ProductId

    WHERE OrderDetails.ProductId IS NULL

    Select Products.*

    FROM dbo.Products

    WHERE Products.ProductId

    NOT IN (SELECT OrderDetails.ProductId from dbo.[Order Details] as OrderDetails)

    Select Products.*

    FROM dbo.Products

    WHERE NOT EXISTS

    (SELECT "AnyOldConstant"

    from dbo.[Order Details] as OrderDetails

    where OrderDetails.ProductId = Products.ProductId

    )

    Just for fun, imagine what the result set would be if the OrderDetails had a row where the productId was null and you then ran this SQL:

    Select Products.*

    FROM dbo.Products

    WHERE NOT EXISTS

    (SELECT "AnyOldConstant"

    from dbo.[Order Details] as OrderDetails

    where OrderDetails.ProductId = Products.ProductId

    )

    AND NOT EXISTS

    (SELECT "AnyOldConstant"

    from dbo.[Order Details] as OrderDetails

    where OrderDetails.ProductId IS NULL

    )

    Cheers

    SQL = Scarcely Qualifies as a Language

  • Yes, you made big effort, you are good educated, you even know who was Aristotel, but you are just mising the point.

    Hope this will explain the problem without long scientific disputes.

    create table #Thing (

     Name varchar(50),

     TypeId int NULL)

    create table #Type (

     Id int identity(1,1),

     Name varchar(50) NOT NULL)

    insert into #Type (Name)

    Select 'Type A'

    UNION

    Select 'Type B'

    insert into #Thing (Name, TypeId)

    Select 'Thing A', 1

    UNION

    Select 'Thing B', 2

    UNION

    Select 'Thing C', 3

    UNION

    Select 'Thing D', NULL

    UNION

    Select 'Thing E', NULL

    Select * from #Thing

    where TypeId not in (select Id from #Type)

    Select * from #Thing

     left join #Type on #Thing.TypeId = #Type.Id

    Where #Type.Id IS NULL

    DROP TABLE #THING

    DROP TABLE #TYPE

     

    _____________
    Code for TallyGenerator

  • Oh well, who is John Galt?


    Dutch Anti-RBAR League

  • just my 2ct

    how about ...

    Select inv.*

    FROM dbo.Inventory inv

    where not exists ( select *

                             FROM dbo.InvHist IH

                             where IH.InvID = inv.InvID)

     

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Sergiy, you are absolutely correct that the "LEFT OUTER JOIN" produces a different result compared to the "NOT IN"

    I did not think about the case where the columns could be null. 

    There is SQL below for "what suppliers do not have a customer in the same country?".

    For the cases of "LEFT OUTER JOIN" versus "NOT EXISTS", the results are the same, but the "NOT EXISTS" performs fewer logical reads, although when there is an index on customer.country, the difference is small.

    The logical reads are:

      Without an index on customer.country After adding an index on customer.country

    Table        "LEFT OUTER JOIN" "NOT EXISTS"  "LEFT OUTER JOIN" "NOT EXISTS"

    Supplier         6     3    6     3

    Customer    230    83   31    25

    TOTAL   236  86   37  28

    Here the reproduction SQL:

    Insert into dbo.Customers

     (CustomerID, CompanyName, Country, ContactName, ContactTitle, Address, City, Region, PostalCode,  Phone, Fax)

    SELECT 'NCnty'

    , 'Customer with No Country'

    , null

    , ContactName, ContactTitle, Address, City, Region, PostalCode,  Phone, Fax

    FROM  dbo.Customers

    where CustomerID = N'ALFKI'

    go

    Insert into dbo.Customers

     (CustomerID, CompanyName, Country, ContactName, ContactTitle, Address, City, Region, PostalCode,  Phone, Fax)

    SELECT 'NCnt2'

    , 'Customer with No Country - Another'

    , null

    , ContactName, ContactTitle, Address, City, Region, PostalCode,  Phone, Fax

    FROM  dbo.Customers

    where CustomerID = N'ALFKI'

    go

    insert into  dbo.suppliers

    (CompanyName, Country, ContactName, ContactTitle, Address, City, Region, PostalCode,  Phone, Fax, HomePage)

    select 'Supplier with No Country'

    , null

    , ContactName, ContactTitle, Address, City, Region, PostalCode,  Phone, Fax, HomePage

    from  dbo.suppliers

    where supplierId = 1

    go

    dbcc FREEPROCCACHE

    go

    update statistics dbo.suppliers

    go

    update statistics dbo.Customers

    go

    set statistics io on

    go

    select  suppliers.supplierId, suppliers.country, suppliers.CompanyName

    from  dbo.suppliers

    left outer join customers

    on customers.country = suppliers.country

    where customers.country is null

    go

    select  suppliers.supplierId, suppliers.country, suppliers.CompanyName

    from  dbo.suppliers

    where suppliers.country NOT IN

     (select customers.country

     from dbo.customers

    &nbsp

    go

    select  suppliers.supplierId, suppliers.country, suppliers.CompanyName

    from  dbo.suppliers

    where NOT EXISTS

     (select 1

     from dbo.customers

     where customers.country = suppliers.country

    &nbsp

    go

    set statistics io off

    go

    SQL = Scarcely Qualifies as a Language

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply