Query to return parent records with no child records

  • Hi all,

    In my DB i have a 1:M relationship between contact and incident tables(a contact can have many incidents). I would like to know what query can return all the contacts without incidents.

    Many Thanks

    P.

  • You can use a LEFT OUTER JOIN or WHERE NOT EXISTS or WHERE NOT IN. The method is really up to you.

    I have typically used LEFT OUTER JOIN's for this, but have moved more towards WHERE NOT EXISTS because there is the possibility for a slight performance gain. Here are 2 examples:

    SELECT

    C.contact_id

    FROM

    contacts C LEFT JOIN

    incidents I ON

    C.contact_id = I.contact_id

    WHERE

    I.contact_id IS NULL -- this eliminates the ones that exist

    Or

    SELECT

    C.contact_id

    FROM

    contacts C

    WHERE

    NOT EXISTS (SELECT 1 FROM incidents I WHERE C.contact_id = I.contact_id)

  • Select ContactId from Contacts

    EXCEPT

    Select ContactId from Incidents

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Sometimes, winning is not an issue but trying.
    You can check my BLOG
    [font="Arial Black"]here[/font][/url][/right]

  • Hi all,

    It worked, thanks.

  • Out of interest which method did you take?

    I'm with Jack on this, I have started moving to NOT EXISTS as I also find that I often get better performance...

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • I tried both but ended up using NOT EXISTS because like you it perfomed better for me. I think its always best to use it.

  • be careful, because it isn't always the best option...

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Atif Sheikh (4/24/2009)


    Select ContactId from Contacts

    EXCEPT

    Select ContactId from Incidents

    I always forget about EXCEPT as it was new to 2005 and the majority of my experience has been in 2000. I'll need to do some tests.

  • I took a couple of minutes to load up the AdventureWorks database Product and SalesOrderDetail tables with 100,000 and 1,000,000 rows respectively followed by update statistics full scan and then ran the 3 scenarios like this:

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DBCC DropCleanBuffers()

    GO

    PRINT 'Left Join'

    SELECT

    P.ProductID

    FROM

    Production.Product AS P LEFT JOIN

    Sales.SalesOrderDetail AS SOD ON P.ProductID = SOD.ProductID

    WHERE

    SOD.ProductID IS NULL

    DBCC DropCleanBuffers()

    GO

    PRINT 'Not Exists'

    SELECT

    P.ProductID

    FROM

    Production.Product AS P

    WHERE

    NOT EXISTS (SELECT 1 FROM Sales.SalesOrderDetail AS SOD WHERE P.ProductID = SOD.ProductID)

    DBCC DropCleanBuffers()

    GO

    PRINT 'Except'

    SELECT

    P.ProductID

    FROM

    Production.Product AS P

    EXCEPT

    SELECT

    SOD.ProductID

    FROM

    Sales.SalesOrderDetail AS SOD

    DBCC DropCleanBuffers()

    GO

    PRINT 'Not Exists no Parallelism'

    SELECT

    P.ProductID

    FROM

    Production.Product AS P

    WHERE

    NOT EXISTS (SELECT 1 FROM Sales.SalesOrderDetail AS SOD WHERE P.ProductID = SOD.ProductID)

    OPTION (MAXDOP 1)

    DBCC DropCleanBuffers()

    GO

    PRINT 'Except no Parallelism'

    SELECT

    P.ProductID

    FROM

    Production.Product AS P

    EXCEPT

    SELECT

    SOD.ProductID

    FROM

    Sales.SalesOrderDetail AS SOD OPtion (MAXDOP 1)

    SET STATISTICS IO OFF

    SET STATISTICS TIME OFF

    Here are the results:

    Left Join

    Table 'SalesOrderDetail'. Scan count 1, logical reads 3510, physical reads 3, read-ahead reads 3645, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Product'. Scan count 1, logical reads 2407, physical reads 2, read-ahead reads 2398, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 344 ms, elapsed time = 5894 ms.

    Not Exists

    Table 'SalesOrderDetail'. Scan count 3, logical reads 3878, physical reads 6, read-ahead reads 3621, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Product'. Scan count 3, logical reads 1464, physical reads 3, read-ahead reads 384, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 344 ms, elapsed time = 6251 ms.

    Except

    Table 'SalesOrderDetail'. Scan count 3, logical reads 3878, physical reads 6, read-ahead reads 3621, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Product'. Scan count 3, logical reads 1464, physical reads 3, read-ahead reads 384, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 375 ms, elapsed time = 6063 ms.

    Not Exists no Parallelism

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Product'. Scan count 1, logical reads 369, physical reads 2, read-ahead reads 392, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'SalesOrderDetail'. Scan count 1, logical reads 3510, physical reads 3, read-ahead reads 3645, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 375 ms, elapsed time = 6092 ms.

    Except no Parallelism

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Product'. Scan count 1, logical reads 369, physical reads 2, read-ahead reads 392, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'SalesOrderDetail'. Scan count 1, logical reads 3510, physical reads 3, read-ahead reads 3645, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:

    CPU time = 422 ms, elapsed time = 5741 ms.

    They all took about the same amount of time. LEFT JOIN seems to be a bit more I/O intensive (More Reads), and NOT EXISTS/EXCEPT appear to be more CPU intensive.

    Attached also are the query plan(s) generated which show the LEFT JOIN being the most expensive and the EXCEPT/NOT EXISTS generating the same plan. Just remove the ".txt" and it will open in SSMS.

  • I think it all depends upon the indexes on your data. The frist choice, as per my opinion, should be LEFT OUTER JOIN for consistent performance.

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Sometimes, winning is not an issue but trying.
    You can check my BLOG
    [font="Arial Black"]here[/font][/url][/right]

  • Hello Atif Sheikh/Jack Corbett,

    Thanks for the useful posts, it helped me a lot.

    I'm doing the classic stuff in SQL Server update existing rows from Table1 to Table2 and Insert new rows in Table1 into Table2.

    while working with 100,000 rows I found the performance in the following descending order

    CPU time

    EXCEPT < LEFT OUTER JOIN < NOT EXISTS

    Can you guys help me out here, suggest what would be the best thing to do?

    -- update

    update Table2

    FROM TABLE1 t1, Table2 t2 where T1.Id = T2.Id

    --insert (LEFT OUTER JOIN)

    INSERT INTO Table2 (Id ... Column List)

    SELECT from Table1 T1

    LEFT OUTER JOIN Table2 T2

    ON T1.Id = T2.Id

    WHERE T2.Id IS NULL

    --insert (EXCEPT)

    INSERT INTO Table2 (Id ... )

    SELECT from Table T1,

    ((SELECT Id from Table1)

    EXCEPT

    (SELECT Id from Table2)) E2

    WHERE T1.Id = E2.Id

    --insert (NOT EXISTS)

    INSERT INTO Table2 (Id ... )

    SELECT from Table1 T1

    WHERE NOT EXIST (SELECT 1 FROM Table2 NT WHERE T1.Id = NT.Id)

  • AA Hamed,

    I have done multiple updates/inserts using left outer joins and they worked like a charm. I could have used EXCEPT but.. I have 9 update/inserts executing in 19 seconds. That's not too bad. They scan through the first table which contains about 78k records and then update/insert into my second table. Hope this helps you.

    Thanks,

    S

    --
    :hehe:

  • Thanks for reply, yea looks like LEFT OUTER JOIN works better in this scenario.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply