June 15, 2005 at 4:06 pm
This is not a critical production question, only something I have found that I can't figure out.
I have this query that returns 103,038 records:
Select
a.RecId
From
dbo.TableA a
Left Outer Join dbo.TableB b
on b.TableA_RecId = a.RecId
Where
b.RecId is NULL
If I modify this query by adding a filter in the right join table, the query now returns 103,353 records:
Select
a.RecId
From
dbo.TableA a
Left Outer Join dbo.TableB b
on b.TableA_RecId = a.RecId
and
b.Flag = 'Y'
Where
b.RecId is NULL
Since it's a Left Outer Join, and I'm only selecting those records where the right table returns a NULL record (b.RecId is a non-nullable identity column) shouldn't both queries return the same number of records? But, where clause aside, shouldn't a query with a filter return the same or lesser amount of records? Maybe having the filter in the join causes some interaction that I don't understand.
Is there a guru around who can tell me what is happening, here?
June 15, 2005 at 4:19 pm
Can you post the table DDL and a small sample set of data that causes the problem ?
I can't explain it, but IMO, it helps maintenance-wise to use NOT EXISTS to make the purpose of the query clearer. Left-Joining and checking for NULL on the table Left Joined to is a common coding construct, but I find NOT EXISTS to be a much clearer in expressing the intention of the query:
Select
a.RecId
From
dbo.TableA a
Where Not Exists (
Select *
From dbo.TableB b
Where b.TableA_RecId = a.RecId
And b.Flag = 'Y'
)
June 15, 2005 at 6:06 pm
Another reason for the # of records to go up is that NOT all of the records from table B have a Y flag. It stands to reason that if you add the criteria of where the flag = Y and there are records where it is NOT Y then wouldn't that change the counts?
Good Hunting!
AJ Ahrens
webmaster@kritter.net
June 15, 2005 at 6:41 pm
I have an answer. You're going to kick yourself when you realize what happens here... so allow me to explain.
Consider the simplified dataset below:
TableA
ID
1
3
4
6
7
8
TableB
ID Flag
1 Y
4 N
7 N
8 Y
The first query (Left outer join on ID only) would return:
2 (3 & 6 have no matching records)
The second query (LOJ on ID and Flag) would return:
4 (3 & 6 have no matching records: 4 & 7 do not match the 'Y' criteria for flag)
Got it?
June 15, 2005 at 7:35 pm
Your WHERE clause means that you only get the rows in tableA that don't match something in tableB. Since you're being MORE RESTRICTIVE in finding matching rows in your second queries, there are more records in tableA that don't match something in tableB, hence the number of records in your result goes up.
RobF
Rob Farley
LobsterPot Solutions & Adelaide SQL Server User Group
Company: http://www.lobsterpot.com.au
Blog: http://blogs.lobsterpot.com.au
June 16, 2005 at 7:50 am
OK. I can see now that I was so used to using queries to return matched records that I was thinking of this query in the same way. But it isn't the same, its the opposite. This query is returning unmatched records. So if I insert a filter which decreases the number of matched records, the number of unmatched records must, obviously (since the total number of records in TableA remains static), increase.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply