Join VS IN VS Exists

  • I was reading an interesting article related to the JOIN vs IN vs EXISTS since i had a simlar issue at our place

    and hence i came up withe following script of code which would give this relation.

    http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

    Both the queries send in same amount of data; however Performancewise the ofcourse the one with a JOIN is fast.

    Am i missing something in the grand scheme of things, since i was not able to relate it with the article that i was mentioning.(specifically in this scenario)

    Here is the SQL code I was working with ;

    CREATE TABLE #Department(

    DepartmentID [int] NOT NULL,

    Name varchar (250) NOT NULL)

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (1,'Engineering')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (2,'Administration')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (3,'Sales')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (4,'Marketing')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (5,'Finance')

    CREATE TABLE #Employee(

    [EmployeeID] [int] NOT NULL,

    [FirstName] [varchar](250) NOT NULL,

    [LastName] [varchar](250) NOT NULL,

    [DepartmentID] [int] NOT NULL)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (1,'Mark','Benson',1)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (2,'Janet','kreydin',2)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (3,'Phil','krayen',2)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (4,'Aran','john',3)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (5,'Jake','Omar',3)

    /*TO GET THE DIFFERENCE BETWEEN THESE STATEMENTS*/

    SELECT EmployeeID,FirstName,LastName

    FROM #Employee

    WHERE ( CASE WHEN DepartmentID IN ( SELECT DepartmentID FROM #Department WHERE DepartmentID = 2 ) THEN 1

    ELSE 0 END ) = 1

    SELECT EmployeeID,FirstName,LastName

    FROM #Employee E

    INNER JOIN ( SELECT DepartmentID FROM #Department WHERE DepartmentID = 2 )A

    ON E.DepartmentID = A.DepartmentID

    Thanks

  • ulteriorm (10/1/2010)


    I was reading an interesting article related to the JOIN vs IN vs EXISTS since i had a simlar issue at our place

    and hence i came up withe following script of code which would give this relation.

    http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

    Both the queries send in same amount of data; however Performancewise the ofcourse the one with a JOIN is fast.

    Am i missing something in the grand scheme of things, since i was not able to relate it with the article that i was mentioning.(specifically in this scenario)

    Here is the SQL code I was working with ;

    CREATE TABLE #Department(

    DepartmentID [int] NOT NULL,

    Name varchar (250) NOT NULL)

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (1,'Engineering')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (2,'Administration')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (3,'Sales')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (4,'Marketing')

    INSERT INTO #Department (DepartmentID,Name)

    VALUES (5,'Finance')

    CREATE TABLE #Employee(

    [EmployeeID] [int] NOT NULL,

    [FirstName] [varchar](250) NOT NULL,

    [LastName] [varchar](250) NOT NULL,

    [DepartmentID] [int] NOT NULL)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (1,'Mark','Benson',1)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (2,'Janet','kreydin',2)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (3,'Phil','krayen',2)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (4,'Aran','john',3)

    INSERT INTO #Employee (EmployeeID,FirstName,LastName,DepartmentID)

    VALUES (5,'Jake','Omar',3)

    /*TO GET THE DIFFERENCE BETWEEN THESE STATEMENTS*/

    SELECT EmployeeID,FirstName,LastName

    FROM #Employee

    WHERE ( CASE WHEN DepartmentID IN ( SELECT DepartmentID FROM #Department WHERE DepartmentID = 2 ) THEN 1

    ELSE 0 END ) = 1

    SELECT EmployeeID,FirstName,LastName

    FROM #Employee E

    INNER JOIN ( SELECT DepartmentID FROM #Department WHERE DepartmentID = 2 )A

    ON E.DepartmentID = A.DepartmentID

    Thanks

    So, what is your problem exactly? Are you wondering why the IN() isn't as fast? It isn't exactly clear what you are asking here.

  • @getoffmyfoot

    Even after considering various scenario's i was not very confident in both the queries performing the same tasks( Or maybe they are). I was replacing the logic in some of the procs and hence was still wondering if they achieve the same results. Ofcouse IN is the bad performer when run against large set of data.

  • ulteriorm (10/1/2010)


    @getoffmyfoot

    Even after considering various scenario's i was not very confident in both the queries performing the same tasks( Or maybe they are). I was replacing the logic in some of the procs and hence was still wondering if they achieve the same results. Ofcouse IN is the bad performer when run against large set of data.

    In this case they will produce the same results, because you are joining/comparing based on a many-to-one relationship (so the lookup table will only have one row matching the corresponding main table row).

    Where they DIFFER is when you can have multiple records on either side, in which case the join would behave unexpectedly.

    But I'm not sure why you have that case statement in the where clause of the first one

  • The join is faster because that CASE expression is not SARGable. Not sure why you have the CASE there in the first place.

    This may be of interest: http://sqlinthewild.co.za/index.php/2010/04/27/in-exists-and-join-a-roundup/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/1/2010)


    The join is faster because that CASE expression is not SARGable. Not sure why you have the CASE there in the first place.

    This may be of interest: http://sqlinthewild.co.za/index.php/2010/04/27/in-exists-and-join-a-roundup/

    :ermm: That was unexpected. Useful, but not the results I expected to see. Thanks!

    Brad's blog link from there was useful too.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • But I'm not sure why you have that case statement in the where clause of the first one

    @Nevyn: The reason I have a CASE statement since this is a part of the code snippet which i am working on and the second table has a temp table wherein it contains data based on the parameters passed.

    (CASE WHEN DepartmentID IN ( SELECT ID FROM @s_deptID)

    However there is indeed a possibility that in my CASE related temptable, the data could get repeated and as you said i wouldnot be able to use it as an inner join.

    Thanks for everyone's reply and absolutely very nice piece of information in those blogs. Indeed cleared up my basics and the performance aspects on all the three cases.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply