January 21, 2009 at 1:25 pm
I have a parent child relation between to tables. Not all parents have children, some parents have 1 or more children. I want all of the Parent rows that do not have any Child rows.
There are about 1 million Parent rows, and about 500 thousand Child rows, so I am sure there are at least 500 thousand Parent rows (there are several more because some Parents have multiple children). I am sure that my query is correct, it is in the form;
SELECT *
FROM Parent
WHERE ParentID NOT IN (SELECT ParentID FROM Child)
But, I am getting 0 results. My query plan follows as best I can visualize it here;
SELECT <-- HASH Match (Right Anti Semi Join) <-- Index Scan (Child non-clustered Index)
^-- Nested Loops (Left Anti Semi Join [Warning: No Join Predicate]) <-- Table Scan (Parent)
^-- Row Count Spool (Lazy Spool) <-- Clustered Index Seek (Child)
My query plan seems to show a broken query plan, but I have no clue what is causing it to choose this plan nor how to fix the plan.
Any suggestions would be greatly appreciated!
Dennis
January 21, 2009 at 1:40 pm
Dennis D. Allen (1/21/2009)
I have a parent child relation between to tables. Not all parents have children, some parents have 1 or more children. I want all of the Parent rows that do not have any Child rows.There are about 1 million Parent rows, and about 500 thousand Child rows, so I am sure there are at least 500 thousand Parent rows (there are several more because some Parents have multiple children). I am sure that my query is correct, it is in the form;
SELECT *
FROM Parent
WHERE ParentID NOT IN (SELECT ParentID FROM Child)
But, I am getting 0 results. My query plan follows as best I can visualize it here;
SELECT <-- HASH Match (Right Anti Semi Join) <-- Index Scan (Child non-clustered Index)
^-- Nested Loops (Left Anti Semi Join [Warning: No Join Predicate]) <-- Table Scan (Parent)
^-- Row Count Spool (Lazy Spool) <-- Clustered Index Seek (Child)
My query plan seems to show a broken query plan, but I have no clue what is causing it to choose this plan nor how to fix the plan.
Any suggestions would be greatly appreciated!
Dennis
Try this:
select
p.*
from
Parent p
left outer join Child c
on (p.ParentID = c.ParentID)
where
c.ParentID is null
Let us know if this works.
January 21, 2009 at 1:42 pm
Depending on your indexes and such this could be rather a bit on the long side but why not something like the following???
SELECT Parent.*
FROM Parent Left Join Child ON Parent.ParentID = Child.ParentID
WHERE child.ParentID IS NULL
January 21, 2009 at 1:43 pm
January 21, 2009 at 1:46 pm
Use the Force Luke...
Actually, nothing wrong with posting an answer that's already been posted. One thing you'll see is a difference in formating styles.
January 21, 2009 at 1:52 pm
Yeah, I used to run all my posts through a SQL formatter, but My HIPS software doesn't like any of the online ones anymore (keeps thinking I'm trying to run a SQL injection attack against them) so I just went with a copy and paste from the OP and added the appropriate syntax. Typically I'd end up having mine look eerily similar to yours, at least in my production code.
-Luke.
January 21, 2009 at 1:55 pm
Format by hand do I.
January 21, 2009 at 1:57 pm
Greetings Lynn,
Yes that does work and is a fine workaround.
I want to understand why this does not work as it should work correctly and the fact that it is not might point to an error in my system that I would like to find and fix. Also, I am trying to learn to read the execution plan better.
Any ideas?
January 21, 2009 at 1:58 pm
January 21, 2009 at 2:06 pm
Dennis D. Allen (1/21/2009)
Greetings Lynn,Yes that does work and is a fine workaround.
I want to understand why this does not work as it should work correctly and the fact that it is not might point to an error in my system that I would like to find and fix. Also, I am trying to learn to read the execution plan better.
Any ideas?
Hi Dennis, the reason is really simple. When you run the query:
SELECT ParentID FROM Child;
You are returning at least one row where the ParentID is NULL. That forces the IN statement to evaluate to UNKNOWN which is not TRUE and you get no rows.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
January 21, 2009 at 2:13 pm
Dennis D. Allen (1/21/2009)
Greetings Lynn,Yes that does work and is a fine workaround.
I want to understand why this does not work as it should work correctly and the fact that it is not might point to an error in my system that I would like to find and fix. Also, I am trying to learn to read the execution plan better.
Any ideas?
Not that what I gave you is a workaround. I actually think that it is more efficient than using NOT IN. The execution plan showed a nested loop. Think about this for a moment. For each row in Parent, you have to loop through 1,000,000 values returned from Child before knowing if it does not exist in Child. Not very efficient.
Now, the left outer join returns all records from Parent regardless of a matching record in Child, and you then filter on the Child ParentID, which will be null if there was no match to the ParentID from Parent.
January 21, 2009 at 2:46 pm
Jeffrey Williams (1/21/2009)
Hi Dennis, the reason is really simple. When you run the query:SELECT ParentID FROM Child;
You are returning at least one row where the ParentID is NULL. That forces the IN statement to evaluate to UNKNOWN which is not TRUE and you get no rows.
That was indeed the issue. Since the child table should not allow NULLs in the ParentID column, I never thought to look for that.
Thank you so much!
Dennis
January 21, 2009 at 3:10 pm
Lynn Pettis (1/21/2009)
Not that what I gave you is a workaround. I actually think that it is more efficient than using NOT IN. The execution plan showed a nested loop. Think about this for a moment. For each row in Parent, you have to loop through 1,000,000 values returned from Child before knowing if it does not exist in Child. Not very efficient.Now, the left outer join returns all records from Parent regardless of a matching record in Child, and you then filter on the Child ParentID, which will be null if there was no match to the ParentID from Parent.
I intended workaround in the sense that the technique I was employing was causing unexpected results, and using a different technique was avoiding finding the cause of the unexpected behavior.
Thank you for pointing out the efficiency gain in your technique. In comparing the executing plan and both the IO and TIME statistics, it is evident you are correct. Your query produces a 50% reduction in table scans on the Child and 15% reduction in logical reads on the child and a 15% gain in execution time.
Thank you Lynn 🙂
January 21, 2009 at 3:19 pm
Lynn Pettis (1/21/2009)
Use the Force Luke...
Doggone It! I've been waiting for weeks for an excuse to use that line on Luke!
😎
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 21, 2009 at 3:23 pm
Ya know, I'd say that "I'd never join you" in my whiniest text possible, however who knows what the future will bring...
And I've been wondering for weeks why we have 2 Vaders running around here, or are you and Perry one in the same?
-Luke.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply