Best possible join

Question

Post reply

Best possible join

SQLRO

SSCrazy

Points: 2923
More actions
June 3, 2011 at 6:14 am

#240784

Hello,
We have two tables from two different sources. Say, EMP1 and EMP2
Both the tables have billions of rows. The columns are first name and last name in both the tables.
The need is to find out all the records in EMP2 which are not there in EMP1 What should be the best way to do it? Will the simple outer join suffice?

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

spaghettidba SSC Guru Points: 105732 More actions · Answer 1

Something like this should do:

SELECT *

FROM EMP2

WHERE NOT EXISTS (

SELECT 1

FROM EMP1

WHERE EMP2.Name = EMP1.Name

AND EMP2.[Last Name] = EMP1.[Last Name]

The physical join operator will be determined by the optimizer based on indexes and stats, don't worry about that. It should turn into a LEFT ANTI SEMI-JOIN.

Hope this helps

Gianluca

-- Gianluca Sartori

SQLRO SSCrazy Points: 2923 More actions · Answer 2

Thanks for the reply. What should be the ideal index plan for these two tables?

spaghettidba SSC Guru Points: 105732 More actions · Answer 3

It should be an index with both fields in it. It should not matter which one first, especially when both tables have billions of rows, as it would probably end up with a SCAN + HASH.

If one of the tables has a significant amount of rows less than the other one, the optimizer could decide to implement the join with a MERGE operator, and then the order of the fields should be the same on both indexes, having the most selective column first.

-- Gianluca Sartori