March 24, 2008 at 2:12 pm
Hello,
I have an assigned_properties and a properties table.
I want to select all property_Ids that are not found in assigned_properties. This sounds so simple but for some reason I can't get my mind around it.
Examples?
Thanks!
March 24, 2008 at 2:23 pm
What have you tried?
______________________________________________________________________
Personal Motto: Why push the envelope when you can just open it?
If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.
Jason L. SelburgMarch 24, 2008 at 2:30 pm
Select ID
From Properties
Where ID Not IN (Select Assigned.PropertyID From Assigned)
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 24, 2008 at 2:31 pm
I know this is completely wrong as it selects the assigned records (and sorta duplicates efforts) but just so you can have an example:
Select * From Properties INNER JOIN (Select P1.* From Properties As P1 INNER JOIN Assigned_Project_Properties
ON Assigned_Project_Properties.Property_Id = P1.Property_Id) As P2 ON P2.Property_id = Properties.Property_Id
March 24, 2008 at 2:35 pm
Thanks! I'm on the right track now.
March 24, 2008 at 2:35 pm
SELECT *
FROM Property a
left JOIN Assigned_properties b
on a.propertyid = b.propertyid
where b.property_id is null
Also, you could use except
SELECT Property_id
from Property
except
select property_id
from assigned_properties
March 24, 2008 at 2:42 pm
FYI, I only asked what you've tried to make sure this wasn't homework. It still may be, but you've shown that you've put forth effort.
So ....
SELECT
*
FROM
Properties AS p
LEFT JOIN Assigned_Project_Properties AS ap
ON Assigned_Project_Properties.Property_Id = P1.Property_Id
WHERE
p.Property_Id IS NULL
Think of it this way, You are combining "Properties" and "Assigned_Project_Properties" data together on the "Property_Id". The records that do not have a cooresponding record in the "Assigned_Project_Properties" or RIGHT table because it's on the RIGHT side of the JOIN, will have a vaule of NULL for each field. So limiting the results to those gets you what you want.
Take of the WHERE clause, and it may become more apparent.
______________________________________________________________________
Personal Motto: Why push the envelope when you can just open it?
If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.
Jason L. SelburgMarch 25, 2008 at 6:06 am
Just my 2 cents on this one... I've found that the WHERE NOT IN that Barry used is usually more performant than outer joins. The speed difference isn't huge over a million rows, but it is faster. Compare the execution plans to see why.
I've not tried the EXCEPT intersection (or rather lack of intersection) yet.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 25, 2008 at 6:38 am
Jeff Moden (3/25/2008)
Just my 2 cents on this one... I've found that the WHERE NOT IN that Barry used is usually more performant than outer joins. The speed difference isn't huge over a million rows, but it is faster. Compare the execution plans to see why.I've not tried the EXCEPT intersection (or rather lack of intersection) yet.
Really? I'll have to keep that in mind and do some tests. It's not that I don't believe you, I'm just a see for myself kind of guy.
Now you've got me curious enought to see why this is the case, and to look deeper into the EXCEPT keyword.
Thanks for the 2 cents Jeff. Keep it up and I will have a whole $3.27 from you. 😉
______________________________________________________________________
Personal Motto: Why push the envelope when you can just open it?
If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.
Jason L. SelburgMarch 25, 2008 at 7:03 am
Except operates as an outer join. It compares the results of both queries and returns distinct values of the left table. These are the values that exist in the left table, but not in the right. Thus, it would provide the same execution plan as the left outer join.
March 25, 2008 at 9:05 am
Adam Haines (3/25/2008)
Except operates as an outer join. It compares the results of both queries and returns distinct values of the left table. These are the values that exist in the left table, but not in the right. Thus, it would provide the same execution plan as the left outer join.
Thanks for the info, Adam... have you actually done a comparison/performance test on EXCEPT?
--Jeff Moden
Change is inevitable... Change for the better is not.
March 25, 2008 at 9:06 am
Jason Selburg (3/25/2008)
Really? I'll have to keep that in mind and do some tests. It's not that I don't believe you, I'm just a see for myself kind of guy.Now you've got me curious enought to see why this is the case, and to look deeper into the EXCEPT keyword.
I've got some old tests laying around somewhere... I'll see if I can dig them up, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 25, 2008 at 10:19 am
I have done a small scale test (not the million row type you do) and it did in fact use the same query plan. Obviously, the main difference here is the tables are not joined. The results of each query is compared, not just key columns.
For every result that differs, it is considered distinct and thus returned. To avoid this behavior on keys you should only list values that should be common among the two queries. For example, if a column named update_dt is different but everything else in the row is the same, the row is still considered distinct, therefore, the update_dt column should not be included.
March 25, 2008 at 10:38 am
After a little research it seems that it is equivlant to a left join, but the actual plan is more like a NOT IN statement.
See the link below:
http://blog.sqlauthority.com/2007/05/22/sql-server-2005-comparison-except-operator-vs-not-in/
March 25, 2008 at 8:53 pm
Thanks for the link, Adam... that reminded me where I put my test code... and I added "EXCEPTION" to it...
First, I just don't trust execution plans, estimated or actual... especially where the % of batch comes in.
Remember I said that WHERE NOT IN was better than LEFT OUTER JOIN with a null search? Here's the proof... with a caveat that I'll explain in a minute...
--===== Select the database to use
USE AdventureWorks
GO
--===== Test the WHERE NOT IN method
PRINT REPLICATE('=',80)
PRINT '========== Test the WHERE NOT IN method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT ProductID
FROM Production.Product
WHERE ProductID
NOT IN (SELECT ProductID
FROM Production.WorkOrder)
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
--===== Test the WHERE NOT EXISTS method
PRINT REPLICATE('=',80)
PRINT '========== Test the WHERE NOT EXISTS method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT p.ProductID
FROM Production.Product p
WHERE NOT EXISTS (SELECT 1 FROM Production.WorkOrder w WHERE w.ProductID = p.ProductID)
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
--===== Test the EXCEPT method
PRINT REPLICATE('=',80)
PRINT '========== Test the EXCEPT method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT ProductID
FROM Production.Product
EXCEPT --------------------->>>
SELECT ProductID
FROM Production.WorkOrder
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
--===== Test the LEFT OUTER JOIN method
PRINT REPLICATE('=',80)
PRINT '========== Test the LEFT OUTER JOIN method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT p.ProductID
FROM Production.Product p
LEFT OUTER JOIN Production.WorkOrder w
ON p.ProductID = w.ProductID
WHERE w.ProductID IS NULL
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
... the caveat is that the fewer number of rows there are, the worse the LEFT OUTER JOIN does... but use a million rows looking for only 50,000 missing rows, and everyone becomes equal...
--===== Create the test tables
SELECT TOP 1000000
RowNum = IDENTITY(INT,1,1)
INTO #JBMTestBig
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
ALTER TABLE #JBMTestBig
ADD CONSTRAINT PK_JBMTestBig_RowNum PRIMARY KEY CLUSTERED (RowNum)
SELECT TOP 950000
RowNum = IDENTITY(INT,1,1)
INTO #JBMTestSmall
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
ALTER TABLE #JBMTestSmall
ADD CONSTRAINT PK_JBMTestSmall_RowNum PRIMARY KEY CLUSTERED (RowNum)
GO
--===== Rerun from here down for addition testing in same connection
--===== Test the WHERE NOT IN method
PRINT REPLICATE('=',80)
PRINT '========== Test the WHERE NOT IN method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT RowNum
FROM #JBMTestBig
WHERE RowNum
NOT IN (SELECT RowNum
FROM #JBMTestSmall)
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
--===== Test the WHERE NOT EXISTS method
PRINT REPLICATE('=',80)
PRINT '========== Test the WHERE NOT EXISTS method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT p.RowNum
FROM #JBMTestBig p
WHERE NOT EXISTS (SELECT 1 FROM #JBMTestSmall w WHERE w.RowNum = p.RowNum)
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
--===== Test the EXCEPT method
PRINT REPLICATE('=',80)
PRINT '========== Test the EXCEPT method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT RowNum
FROM #JBMTestBig
EXCEPT --------------------->>>
SELECT RowNum
FROM #JBMTestSmall
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
--===== Test the LEFT OUTER JOIN method
PRINT REPLICATE('=',80)
PRINT '========== Test the LEFT OUTER JOIN method =========='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT p.RowNum
FROM #JBMTestBig p
LEFT OUTER JOIN #JBMTestSmall w
ON p.RowNum = w.RowNum
WHERE w.RowNum IS NULL
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
If they're all the same on the big tables and the OUTER JOIN is the slowest on smaller tables, why would you ever use the OUTER JOIN method to find the exceptions?
Also, on the link that Adam was good enough to share, there may be some problems with using TOP with EXCEPTION (dunno why you might want to do that, anyway, but who knows).
For a single column comparison, WHERE NOT IN is the easiest to write.
Like I said before, WHERE NOT IN is better than the LEFT OUTER JOIN... if more than 1 column is involved, WHERE NOT EXISTS would be my next choice. I doubt if I'd use EXCEPTION because I don't trust it yet, especially when folks say it has problems in certain areas.
But, as long as this post is, it's just my opinion. 😉 Use what you feel more comfortable with.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 20 total)
You must be logged in to reply to this topic. Login to reply