LEFT JOIN vs EXCEPT

Question

LEFT JOIN vs EXCEPT

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 1

Christopher Stobbs (8/1/2008)
Jeff,
How would you best two queries to see which is performing better?
thanks
Chris

One way is already demonstrated in the code I posted previously in this thread. Another way is to set up Profiler for the SPID I'm using to test through and have each code snippet separated from the other using GO.

Of course, you can't test something for performance unless you have lot's of data, so I use a "standard" test table for certain things. The code to generate the "standard" test table can be easily modified to suit a wide variety of requirements. The code generator is based on the same principle as the code I use to very quickly generate a Tally table. Here it is...

DROP TABLE JBMTest

GO

--===== Create and populate a 1,000,000 row test table.

-- Column "RowNum" has a range of 1 to 1,000,000 unique numbers

-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers

-- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings

-- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times

-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'

-- for all rows.

-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

-- Jeff Moden

SELECT TOP 1000000

RowNum = IDENTITY(INT,1,1),

SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,

SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)

+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),

SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),

SomeHex12 = RIGHT(NEWID(),12)

INTO dbo.JBMTest

FROM Master.dbo.SysColumns t1,

Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

--===== A table is not properly formed unless a Primary Key has been assigned

-- Takes about 1 second to execute.

ALTER TABLE dbo.JBMTest

ADD PRIMARY KEY CLUSTERED (RowNum)

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RBarryYoung SSC Guru Points: 143329 More actions · Answer 2

Jeff Moden (8/1/2008)
THAT was a lot of fun! Barry is a heck of a good sport!

Likewise, Jeff!

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Carl Federl One Orange Chip Points: 25384 More actions · Answer 3

It is interesting that the result of EXCEPT may be different than a NOT EXISTS.

EXCEPT considers NULLS to equal but NOT EXISTS considers NULLS to be unequal. Here is a reproduction:

After populating JBMTest per the provide SQL, add some additional rows with null.

INSERT INTO dbo.JBMTest

(SomeLetters2)

SELECT TOP 1000

CHAR(ABS(CHECKSUM(NEWID()))%26+65) + CHAR(ABS(CHECKSUM(NEWID()))%26+65)

FROM Master.dbo.SysColumns t1,

Master.dbo.SysColumns t2

EXCEPT returns zero rows:

select *

from dbo.JBMTest

except

select *

from dbo.JBMTest

NOT EXISTS returns all of the rows with NULLs

select *

from dbo.JBMTest as T1

where NOT EXISTS

(select 1

fromdbo.JBMTest as T2

whereT2.RowNum= T1.RowNum

andT2.SomeInt= T1.SomeInt

andT2.SomeCSV= T1.SomeCSV

andT2.SomeMoney= T1.SomeMoney

andT2.SomeDate= T1.SomeDate

andT2.SomeHex12= T1.SomeHex12

)

SQL = Scarcely Qualifies as a Language

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 4

Good point, Carl. Thanks.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

sharon.bender SSCrazy Points: 2853 More actions · Answer 5

I'd like to add something here... I think a few people have touched on it, but it's pretty important.

If you are pulling out a very small amount of records, say, just one or two, then using Except would probably be the better way to go, if you can filter to this level in the except part of your clause.

If you are, however, pulling out a rather large set of data, the left outer joins will be the better choice.

At least, that's what I generally find with nested queries rather than joined queries.

If you can filter down to only a few records early in the game, then nesting is good. Otherwise, it makes your DB take too much of a time hit to query all the records twice, rather than things like hash matches.

Carl Federl One Orange Chip Points: 25384 More actions · Answer 6

Regarding using LEFT OUTER JOIN instead of EXCEPT or NOT EXISTS, how do you write such a SQL Statement ? Please use the AdventureWorks demo database to answer the following question: What Vendors do NOT supply Products whose color is Blue or Grey?

EXCEPT SQL

select*

fromPurchasing.Vendor

JOIN(SELECT Purchasing.Vendor.VendorID

FROMPurchasing.Vendor

EXCEPT

(Select Purchasing.ProductVendor.VendorID

FROMProduction.Product

JOINPurchasing.ProductVendor

ONProduction.Product.ProductID= Purchasing.ProductVendor.ProductID

WHEREProduction.Product.color in ( 'Blue','Grey')

)

) as VendorNonColor

on VendorNonColor.VendorID = Purchasing.Vendor.VendorID

;

NOT EXISTS SQL:

select *

fromPurchasing.Vendor

whereNOT EXISTS

(Select 1

FROMProduction.Product

JOINPurchasing.ProductVendor

ONProduction.Product.ProductID= Purchasing.ProductVendor.ProductID

WHEREProduction.Product.colorin ( 'Blue','Grey')

ANDPurchasing.ProductVendor.VendorID = Purchasing.Vendor.VendorID

)

The schema:

CREATE TABLE [Purchasing].[Vendor]

([VendorID] [int] IDENTITY(1,1) NOT NULL

,[Name] varchar(255) NOT NULL

, CONSTRAINT [PK_Vendor_VendorID] PRIMARY KEY CLUSTERED

([VendorID] ASC)

)

CREATE TABLE [Production].[Product]

([ProductID] [int] IDENTITY(1,1) NOT NULL

,[Color] [nvarchar](15) NULL

,CONSTRAINT [PK_Product_ProductID] PRIMARY KEY CLUSTERED ([ProductID] ASC)

)

CREATE TABLE [Purchasing].[ProductVendor]

([ProductID] [int] NOT NULL

,[VendorID] [int] NOT NULL

,CONSTRAINT [PK_ProductVendor_ProductID_VendorID] PRIMARY KEY CLUSTERED

([ProductID] ASC,[VendorID] ASC)

,CONSTRAINT [FK_ProductVendor_Product_ProductID] FOREIGN KEY([ProductID])

REFERENCES [Production].[Product] ([ProductID])

,CONSTRAINT [FK_ProductVendor_Vendor_VendorID] FOREIGN KEY([VendorID])

REFERENCES [Purchasing].[Vendor] ([VendorID])

)

GO

CREATE NONCLUSTERED INDEX [IX_ProductVendor_VendorID] ON [Purchasing].[ProductVendor] ([VendorID] ASC)

go

SQL = Scarcely Qualifies as a Language

J Frizzle Valued Member Points: 57 More actions · Answer 7

Can you use EXCEPT ALL like you can with UNION ALL to display even the non-distinct rows?

Dwain Camps SSC Guru Points: 86948 More actions · Answer 8

J Frizzle (12/21/2012)
Can you use EXCEPT ALL like you can with UNION ALL to display even the non-distinct rows?

I'm pretty sure you cannot. Neither can use use INTERSECT ALL.

INTERSECT I'm pretty sure also returns only DISTINCT values.

Thus UNION, INTERSECT and EXCEPT all operate in a similar fashion, returning only DISTINCT values.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St