June 29, 2006 at 5:22 am
I asked question this at another forum and the answer was there is no difference because the optimizer will pick the best path regardless of the join order. Not that I don't beleive the answer I just wanted more opinions. What's the consensus on this question?
Is there any real performance hit when a joining on a foriegn key to the primary key and visa versa? For example:
From ExpenseDetail (NoLock)
Join ExpenseHeader (nolock) on pkExpenseHeader = fkExpenseHeader
and pkExpenseHeader = 677067
Versus:
From ExpenseDetail (NoLock)
Join ExpenseHeader (nolock) on fkExpenseHeader = pkExpenseHeader
and pkExpenseHeader = 677067
The DOSFrom and DOSTo are on the ExpenseDetail. With a one Header to many detail. Would I have been better off with:
From ExpenseHeader (NoLock)
Join ExpenseDetail (nolock) on pkExpenseHeader = fkExpenseHeader
and pkExpenseHeader = 677067
I looked at the execution plan and didn't see any difference in all three. I put the foriegn key jpoin to the primary key into production and I'm thinking I should back it out.
June 29, 2006 at 5:28 am
Try both solutions and check the query plan. I don't think there should be a difference. However the granularity of each column could be of importance, but I think the query optimizer is smart enough.
N 56°04'39.16"
E 12°55'05.25"
June 29, 2006 at 6:23 am
One of your examples is...
select DOSFrom,DOSTo
From ExpenseHeader (NoLock)
Join ExpenseDetail (nolock) on pkExpenseHeader = fkExpenseHeader
and pkExpenseHeader = 677067
You should also try looking at
select DOSFrom,DOSTo
From ExpenseHeader (NoLock)
Join ExpenseDetail (nolock) on fkExpenseHeader = 677067
and pkExpenseHeader = 677067
Again, I don't think it will make a difference but you never know. Of course this syntax limits you to a single value - if you used an in statement here you'd get a cartesian product of the two smaller tables.
June 29, 2006 at 6:39 am
Logically speaking, the join order per se is irrelevant to the optimiser, unless you use join hints or the force_order query hint, inwhich case the joins are performed in the exaact order specified.
In more complex queries, the starting point the optimiser uses can affect which plans it cost-estimates. The optimiser doesn't necessarily model every possible plan - it stops trying after a while and goes with the best plan it has generated so far. So one starting point (here, join order) may cause it to miss out the optimal plan where a different starting point might allow it to try that plan before it gives up. But in a two-table query it is highly unlikely that the optimiser will overlook the optimal plan (optimal relative to the information it has, e.g. stats).
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
June 29, 2006 at 7:40 am
Thanks stax68 and others!
That was a much more complete answer then I had gotten at the other site.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply