Join performance question

  • I asked question this at another forum and the answer was there is no difference because the optimizer will pick the best path regardless of the join order.  Not that I don't beleive the answer I just wanted more opinions.  What's the consensus on this question?

    Is there any real performance hit when a joining on a foriegn key to the primary key and visa versa?  For example:

    select  DOSFrom,DOSTo

    From ExpenseDetail (NoLock)

    Join ExpenseHeader (nolock) on pkExpenseHeader =  fkExpenseHeader

    and pkExpenseHeader =  677067

    Versus:

    select  DOSFrom,DOSTo

    From ExpenseDetail (NoLock)

    Join ExpenseHeader (nolock) on fkExpenseHeader = pkExpenseHeader

    and pkExpenseHeader =  677067

    The DOSFrom and DOSTo are on the ExpenseDetail.  With a one Header to many detail.  Would I have been better off with:

    select  DOSFrom,DOSTo

    From ExpenseHeader (NoLock)

    Join ExpenseDetail (nolock) on pkExpenseHeader = fkExpenseHeader

    and pkExpenseHeader =  677067

    I looked at the execution plan and didn't see any difference in all three.  I put the foriegn key jpoin to the primary key into production and I'm thinking I should back it out.

  • Try both solutions and check the query plan. I don't think there should be a difference. However the granularity of each column could be of importance, but I think the query optimizer is smart enough.

     


    N 56°04'39.16"
    E 12°55'05.25"

  • One of your examples is...

    select  DOSFrom,DOSTo

    From ExpenseHeader (NoLock)

    Join ExpenseDetail (nolock) on pkExpenseHeader = fkExpenseHeader

    and pkExpenseHeader =  677067

    You should also try looking at

    select  DOSFrom,DOSTo

    From ExpenseHeader (NoLock)

    Join ExpenseDetail (nolock) on fkExpenseHeader = 677067

    and pkExpenseHeader =  677067

    Again, I don't think it will make a difference but you never know.  Of course this syntax limits you to a single value - if you used an in statement here you'd get a cartesian product of the two smaller tables.

  • Logically speaking, the join order per se is irrelevant to the optimiser, unless you use join hints or the force_order query hint, inwhich case the joins are performed in the exaact order specified.

    In more complex queries, the starting point the optimiser uses can affect which plans it cost-estimates. The optimiser doesn't necessarily model every possible plan - it stops trying after a while and goes with the best plan it has generated so far. So one starting point (here, join order) may cause it to miss out the optimal plan where a different starting point might allow it to try that plan before it gives up. But in a two-table query it is highly unlikely that the optimiser will overlook the optimal plan (optimal relative to the information it has, e.g. stats).

    Tim Wilkinson

    "If it doesn't work in practice, you're using the wrong theory"
    - Immanuel Kant

  • Thanks stax68 and others!

    That was a much more complete answer then I had gotten at the other site.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply