Unmatched Records query performance problems

Question

Unmatched Records query performance problems

philipdub

SSC Enthusiast

Points: 116
More actions
February 22, 2011 at 2:02 pm

#234811

I think this should be an easy thing, but I am not really sure.
Using SQL Server 2008r2, Core I7 950 @ 4.0 ghz, 12GB memory, and primary storage is a RevoR2 Solid State PCI device.
I have table A and table B
Table A has all records. Table B has a subset of those records. (Aproximatly 150,000 records in each table)
I need to find all records in table A that do not have a coresponding entry in B.
I have indexes on the comparison field.
I did a Join where the matching field is null. Average run time is 5 minutes. Execution plan shows proper parallelism and index scans.
I did a join where not in subselect. Average run time is 3 minutes (same data). Execution plan shows a full table scan an no parallelism.
I was expecting the join on null to run faster, especially since the execution plan shows a full table scan.
I was also expecting the entire thing to run faster all around.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 1

Since you're in SQL 2k8, look up the syntax for the EXCEPT clause in a SELECT. It's like the LEFT JOIN/NULL setup, but built in and optimized, you may find better performance out of that.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Mike John SSCertifiable Points: 7216 More actions · Answer 2

For that data volume it seems far too slow either way.

Can you post the DDL of both tables, specifically what data type is/are the join columns? What constraints are there on the joining column(s), and what else do you have in the query - any functions, multiple OR conditions etc. Best bet - post the full text of the query.

Mike

philipdub SSC Enthusiast Points: 116 More actions · Answer 3

I think we found the issue. The select statement was wrapped in an IF that was behaving erraticly. We corrected the IF statement and it runs sub-second times now.

A bit of testing lead us to use the except statement. Thank you for the help!

HowardW One Orange Chip Points: 29109 More actions · Answer 4

Depending on the width of the join columns (e.g. if it's just on a PK field or a large combination of columns you're performing the match on) you could potentially get the best performance from using NOT EXISTS.

E.g.

Select ID, column1, column2 from Table1

WHERE NOT EXISTS (SELECT 1 FROM TABLE2 WHERE TABLE2.ID=TABLE1.ID)

As always, worth testing a few different ways.

SwePeso SSC-Dedicated Points: 39757 More actions · Answer 5

I did a series of tests here

http://weblogs.sqlteam.com/peterl/archive/2009/06/12/Timings-of-different-techniques-for-finding-missing-records.aspx

http://weblogs.sqlteam.com/peterl/archive/2007/09/20/Finding-records-in-one-table-not-present-in-another-table.aspx

http://weblogs.sqlteam.com/peterl/archive/2007/09/24/Finding-records-in-one-table-not-present-in-another-tableAgain.aspx

N 56°04'39.16"
E 12°55'05.25"

Paul White SSC Guru Points: 150468 More actions · Answer 6

There are many ways to write these types of query - not all of which are strictly the same. There are frequently slight semantic differences which mean that LEFT JOIN is asking a different question from NOT EXISTS, or EXCEPT for example. Usually, these differences concern the treatment of duplicates and NULLs. There is nothing magic or optimized about EXCEPT by the way.

The only thing that matters is the execution plan. Different TSQL constructions that are exactly logically equivalent may still be optimized slightly differently, resulting in different plans. It really makes no sense to compare 'performance' of different TSQL constructions that do not have the same semantic.

Paul

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

tfifield SSCrazy Eights Points: 9655 More actions · Answer 7

Gail Shaw did some interesting bench marks on this. I found them quite informative.

http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/

Todd Fifield