Why would I get a difference in using a inner join vs intersect?

  • Hi,
     I have two tables that I want to see what records are in both tables. So, I used an intersect and my boss looked at the results and did not feel that it was right. He used an inner join and got a different number. I always thought that they did the same thing and that you could use either and get the same results.
    Does anyone know why  we would get different results?
    Below is the two methods that we used Name in the one table holds the first and the last name and that is why it is separated out.

    --Here it returns 590

      SELECT  DISTINCT
                    LEFT(CAST(al.Name AS VARCHAR(25)), CASE WHEN charindex(' ', CAST(al.Name AS VARCHAR(25))) = 0 THEN
                    LEN(CAST(al.Name AS VARCHAR(25))) ELSE charindex(' ', CAST(al.Name AS VARCHAR(25))) - 1 END) as FristName,
                    RIGHT(al.Name, CASE WHEN charindex(' ', al.Name) = 0 THEN
                    LEN(CAST(al.Name AS VARCHAR(25))) ELSE charindex(' ', CAST(al.Name AS VARCHAR(25))) - 1 END) as LastNmae,
                    Right(al.[SSN],4) AS SSN  
    FROM            [dbo].[AllLoans] al

    intersect
    SELECT  DISTINCT
                    [first_name],
                    [last_name],
                    RIGHT([ssn_no],4)
    FROM            [DbInfo]

    --Here I get 16534

    SELECT DISTINCT
                    LEFT(CAST(al.Name AS VARCHAR(25)), CASE WHEN charindex(' ', CAST(al.Name AS VARCHAR(25))) = 0 THEN
                    LEN(CAST(al.Name AS VARCHAR(25))) ELSE charindex(' ', CAST(al.Name AS VARCHAR(25))) - 1 END) as FristName,
                    RIGHT(al.Name, CASE WHEN charindex(' ', al.Name) = 0 THEN
                    LEN(CAST(al.Name AS VARCHAR(25))) ELSE charindex(' ', CAST(al.Name AS VARCHAR(25))) - 1 END) as LastNmae,
                               Right(al.[SSN],4) AS SSN  
    FROM            [dbo].[AllLoans] al INNER JOIN  [DbInfo] di
                    ON Right(al.[SSN],4) = RIGHT(cast(di.ssn_no as nvarchar(9)), 4)

  • Your INTERSECT query is comparing all of the columns whereas the JOIN version compares only right(SSN,4) ... explaining why the second returns more matches.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Basically, matching only on the last 4 isn't going to give you a correct result, as two people can easily have the last 4 of their SSN be the same.   That kind of matching is going to generate what amounts to tying records from one person to records of some other person.   Bad idea.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Also, INTERSECT matches on NULL = NULL whereas Joins usually do not (default server setting).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks guys this all really helps

  • itmasterw 60042 - Tuesday, May 22, 2018 10:02 AM

    Thanks guys this all really helps

    one last item - from MS docs:

      INTERSECT returns distinct rows that are output by both the left and right input queries operator.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply