Order Tables in JOINS

  • Guys,

    I have two queries which produce the same results, as you would notice there is difference in the way the joins are done in each query. The first query goes from the lesser row table SCHOOL to EMPLOYEE table. I have noticed that QUERY1 gives better performance since doesnt do full table scan on large table EMPLOYEE. Below is the summary of the indexes. The question that I have is correct to conclude that lesser row column table should be the first table of the join.

    S.STATE_ID - PK CLUSTERED INDEX

    H.SCHOOL_ID - PK CLUSTERED INDEX

    D.DEPARTMENT_ID - PK CLUSTERED INDEX

    E.EMPLOYEE_ID - PK CLUSTERED INDEX

    H.STATE_ID - NON CLUSTERED INDEX

    D.SCHOOL_ID - NON CLUSTERED INDEX

    E.DEPARTEMENT_ID - NON CLUSTERED INDEX

    ROW COUNT

    STATE - 52

    SCHOOL - 67000

    DEPARTEMENT - 7200

    EMPLOYEE - 4370000

    QUERY1

    SELECT S.STATENAME, H.SCHOOLNAME, D.DEPARTMENTNAME, E.EMPLOYEENAME

    FROM STATE S INNER JOIN SCHOOL H ON S.STATE_ID = H.STATE_ID

    INNER JOIN DEPARTMENT D ON D.SCHOOL_ID = H.SCHOOL_ID

    INNER JOIN EMPLOYEE E ON E.EMPLOYEE_ID = D.EMPLOYEE_ID

    WHERE S.SCHOOLNAME = 'UND'

    GO

    QUERY2

    SELECT S.STATENAME, H.SCHOOLNAME, D.DEPARTMENTNAME, E.EMPLOYEENAME

    FROM STATE S INNER JOIN SCHOOL H ON S.STATE_ID = H.STATE_ID

    INNER JOIN DEPARTMENT D ON D.SCHOOL_ID = H.SCHOOL_ID

    INNER JOIN EMPLOYEE E ON E.EMPLOYEE_ID = D.EMPLOYEE_ID

    WHERE S.SCHOOLNAME = 'UND'

    GO

    Any suggestions inputs would help.

    Thanks

  • Can u attach execution plan for both the queries? Thanks

  • sorry maybe I'm blind, but I can't see any difference in the two queries.

    as stated could you supply the execution plan.

    I would be supprised if SQL didn't re-organise the tables based on the correct statistics to haev the tables in the correct order when it generate a query plan.

    I've written an article about this on my blog. if you want to follow my signature...

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • I think might be outside of what you are wanting to discuss, but it might be worth adding a non-clustered index to your School table that contains the SchoolName as this is in the where clause but does not seem to be indexed.

    I'll know more when I see the query plans

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • My bad the second query is

    SELECT S.STATENAME, H.SCHOOLNAME, D.DEPARTMENTNAME, E.EMPLOYEENAME

    FROM EMPLOYEE E INNER JOIN DEPARTMENT D ON E.EMPLOYEE_ID = D.EMPLOYEE_ID

    INNER JOIN SCHOOL H ON H.SCHOOL_ID = D.SCHOOL_ID

    INNER JOIN STATE S ON S.STATE_ID = H.STATE_ID

    WHERE S.SCHOOLNAME = 'UND'

    GO

  • The queries look identical.

    Post the execution plans.

    Depending on the complexity of the query, the table order in the JOINs doesn't matter because SQL Server can rearrange them as needed, based on the statistics, to arrive at the optimal query.

    I have seen very complex queries benefit from carefully arrange the order of the tables because it can help SQL Server find an optimal plan faster. You can see the reason that it dropped out of the optimizer when you look at the execution plan and check out the optimization level. If the reason is that it found a good enough plan, great. If the reason is that it timed out, you may have work to do and rearranging table order could help.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • As Grant says, unless you have SET FORCEPLAN ON, use OPTION(FORCE ORDER), or use a join hint (like INNER HASH JOIN), the order in which you write the joins does not really matter much.

    To post the execution plans, ensure you have Show Actual Execution Plan turned on in Management Studio, right click on the graphical plan produced and save it to a *.sqlplan file. You will need to either zip that file or rename it as *.txt in order to post it as an attachment - the attachment thing doesn't allow sqlplan files directly, if I recall correctly. (Yes that is kinda dumb).

    All that said, I am personally a fan of writing joins in the rough order which I expect to be optimal. That is just a style point though and absolutely not required practice, subject to the caveats I mentioned previously.

    As an aside, not everyone realizes that forcing the order of one join using a hint like INNER HASH JOIN, forces the order of all joins referenced in the query, not just the direct join participants...

    Paul

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply