In Line Subquery versus Join query

  • Guys,

    I am trying to figure out the advantages and disadvantages in terms of performance for in line sub query versus join based query.

    In the below query the EMPLOYEE_TYP and EMPLOYEE_STATUS table have only 10 rows each whereas EMPLOYEE and EMPLOYEE_DEM have 17mill rows each.

    Inline Query

    SELECT E.FIRST_NAME, E.LAST_NAME, E.DISPLAYNAME, D.SSN, D.DOB, (SELECT EMP_TYP_CD FROM EMPLOYEE_TYP T WHERE T.EMP_TYP_ID = E.EMP_TYP_ID) as EMP_TYP_CD,

    (SELECT EMP_STATUS_CD FROM EMPLOYEE_STATUS S WHERE S.EMP_STATUS_ID = E.EMP_STATUS_ID) AS EMP_STATUS_CD

    FROM EMEPLOYEE E INNER JOIN EMPLOYEE_DEM D

    ON E.ID = D.EMPID

    ---- OUTPUT 17MILL ROWS

    Join Query

    SELECT E.FIRST_NAME, E.LAST_NAME, E.DISPLAYNAME, D.SSN, D.DOB, T.EMP_TYP_CD, S.EMP_STATUS_CD

    FROM EMEPLOYEE E INNER JOIN EMPLOYEE_DEM D

    ON E.ID = D.EMPID

    INNER JOIN EMPLOYEE_TYP T ON E.EMP_TYP_ID = T.EMP_TYP_ID

    INNER JOIN EMPLOYEE_STATUS S ON E,EMP_STATUS_ID = T.EMP_STATUS_ID

    ---- OUTPUT 17MILL ROWS

    Any suggestions or inputs would help.

    Thanks

  • Run them both and look at the actual execution plans. You may be surprised to find they are the same or "trivially different" - maybe.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • The optimizer would probably treat them the same in this particular case. Personally tend to avoid the in line subquery method. If the value needs to be used in other parts of the query you would need/want to use the direct Join methodology.

    The real difference is that the Join gives you the option of eliminating rows from the result set where no match to the status & types table exist. Of course you can use the Outer Join to produce a NULL which produces the same result as your subquery.

    The probability of survival is inversely proportional to the angle of arrival.

  • ChrisM@home (10/27/2010)


    Run them both and look at the actual execution plans. You may be surprised to find they are the same or "trivially different" - maybe.

    Oh, be careful.... Execution plans frequently lie like a rug. My recommendation is always to test against a substantial amount of data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/27/2010)


    ChrisM@home (10/27/2010)


    Run them both and look at the actual execution plans. You may be surprised to find they are the same or "trivially different" - maybe.

    Oh, be careful.... Execution plans frequently lie like a rug. My recommendation is always to test against a substantial amount of data.

    Couldn't agree more, Jeff.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • am-244616 (10/27/2010)


    Guys,

    I am trying to figure out the advantages and disadvantages in terms of performance for in line sub query versus join based query.

    In the below query the EMPLOYEE_TYP and EMPLOYEE_STATUS table have only 10 rows each whereas EMPLOYEE and EMPLOYEE_DEM have 17mill rows each.

    Inline Query

    SELECT E.FIRST_NAME, E.LAST_NAME, E.DISPLAYNAME, D.SSN, D.DOB, (SELECT EMP_TYP_CD FROM EMPLOYEE_TYP T WHERE T.EMP_TYP_ID = E.EMP_TYP_ID) as EMP_TYP_CD,

    (SELECT EMP_STATUS_CD FROM EMPLOYEE_STATUS S WHERE S.EMP_STATUS_ID = E.EMP_STATUS_ID) AS EMP_STATUS_CD

    FROM EMEPLOYEE E INNER JOIN EMPLOYEE_DEM D

    ON E.ID = D.EMPID

    [font="Arial Black"]---- OUTPUT 17MILL ROWS[/font]

    Join Query

    SELECT E.FIRST_NAME, E.LAST_NAME, E.DISPLAYNAME, D.SSN, D.DOB, T.EMP_TYP_CD, S.EMP_STATUS_CD

    FROM EMEPLOYEE E INNER JOIN EMPLOYEE_DEM D

    ON E.ID = D.EMPID

    INNER JOIN EMPLOYEE_TYP T ON E.EMP_TYP_ID = T.EMP_TYP_ID

    INNER JOIN EMPLOYEE_STATUS S ON E,EMP_STATUS_ID = T.EMP_STATUS_ID

    [font="Arial Black"]---- OUTPUT 17MILL ROWS[/font]

    Any suggestions or inputs would help.

    Thanks

    Before I make any suggestions, please tell me... where are you outputing those 17 million rows to? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply