November 20, 2008 at 5:30 am
Hi all.
I've been a special attention to every post in jumpstart about indexes from Grant Fritchey. One in particular caught my attention, when he said that when we have a join which above we have little rows and below a lot of rows, than it was a efficient join. the opposite, was a inefficiente join.
So, I bring a simple join below 2 tables and would like to make some question.
There is, in attachments, 2 query plans. Both exection plans are the same, I only change the order from the "From" clause.
The query is this:
SELECT
H.*
FROM QS10H_ACTIVIDADES H WITH(NOLOCK,INDEX(U_QS10H_ACTIVIDADES))
INNER JOIN QS10F_ESTADO_ACTIVIDADE F WITH(NOLOCK) ON (H.ID_AVARIA=F.ID_AVARIA)
QS10H_ACTIVIDADES is a table with a unique non-clustered index on ID_AVARIA field. QS10F_ESTADO_ACTIVIDADE has this same field but is not unique.
Questions:
1) In execution plan I have a join with a lot of rows above, and just one below...As Grant Fritchey said, this would be a inneficient join?
2) When I said I changed the from clause and run execution plan again, this was thinking that it might firts read from QS10H_ACTIVIDADES where there will be just one row and then join with QS10F_ESTADO_ACTIVIDADE with a lot of rows, making this a efficiente join. But this is not like I thought...Is there a way to change the execution plan so he reads first from QS10H_ACTIVIDADES so then joins with ESTADO_ACTIVIDADE?
3) Assume that I want to join ACTIVIDADES, ESTADO_ACTIVIDADE and now a 3rd table. Can I say to query optimizer (or better, is it possible) that ACTIVIDADES join with ESTADO_ACTIVIDADE by one index and ACTIVIDADES join with the 3rd table with another index?
FROM ACTIVIDADES A WITH(INDEX(U_QS10H_ACTIVIDADES,I_QS10H_ACTIVIDADES))
INNER JOIN ESTADO_ACTIVIDADE B ON (A.ID_AVARIA=B.ID_AVARIA)
INNER JOIN Third_Table C ON (A.NUM_POSTO=C.NUM_POSTO)
I don't really know what am I doing with "WITH(INDEX(U_QS10H_ACTIVIDADES,I_QS10H_ACTIVIDADES))"...It works, but I can't know what he does...if he selects one of the 2 indexes wich better fit to joins...but this doesn't do what I was thinking..that with table ESTADO_ACTIVIDADE he join with index U_QS10H_ACTIVIDADES and with third_table he joins with I_QS10H_ACTIVIDADES right?
Best regards and thanks in advance.
Marco Francisco
November 20, 2008 at 1:22 pm
Marco André Duarte Francisco (11/20/2008)
The query is this:SELECT
H.*
FROM QS10H_ACTIVIDADES H WITH(NOLOCK,INDEX(U_QS10H_ACTIVIDADES))
INNER JOIN QS10F_ESTADO_ACTIVIDADE F WITH(NOLOCK) ON (H.ID_AVARIA=F.ID_AVARIA)
QS10H_ACTIVIDADES is a table with a unique non-clustered index on ID_AVARIA field. QS10F_ESTADO_ACTIVIDADE has this same field but is not unique.
If you want to force the join order, you can try a query hint: OPTION (FORCE ORDER) at the end of your SELECT statement:
http://msdn.microsoft.com/en-us/library/ms181714(SQL.90).aspx
I'd still compare performance of the forced and non forced though to see if it really has a positive effect or not.
In general, you probably don't want to specify in your query what index to use. If you find that the database engine doesn't optimize the query properly without a hint, you'd start by only specifying the index on the table that the execution plan doesn't seem to be using that you think would help, in this case most likely on the ESTADO_ACTIVIDAD table. Don't try to specify all the indexes you think all the joins should be using. Again, compare the actual performance with and without index hints.
The database engine typically does pick the optimal plan, if you're consistantly having problems with it, check if your statistics are up to date, or maybe try a different approach in your query.
November 22, 2008 at 10:38 am
Thanks for your reply.
Well, I'm starting to use query hints a lot...in my particular case, the index hints. This because I had a report wich had a simple query. But this simple query, turned query optmizer to use one particular index. Wich the report took more thank 5m to retrieve...
Then, I use a index hint to use the one wich I thought it was much better, and the report retrieved in about 10-15s... Maybe statistics, I doubt, cuz we reorganize the indexes at the end of our processes...
Thanks for the tip of FORCE ORDER. didn't knew it.
November 22, 2008 at 12:00 pm
Marco André Duarte Francisco (11/22/2008)
Thanks for your reply.Well, I'm starting to use query hints a lot...in my particular case, the index hints.
Don't. Hints should be a last resort when there's no other way to get the optimiser to pick a good plan and they should be reserved for the cases where you know exactly what the hint is going to do and you are 120% sure that you know better than the optimiser. I've had to use a hint once in 5 years of doing performance tuning.
In most cases by changing the query, tweaking the indexes or updating stats, the optimiser will pick a good plan. It's pretty good these days.
The other problem with hints is that while they may work well today, there's no guarantee that they'll work well tomorrow or next week.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 24, 2008 at 12:57 pm
I was trying to address the same problem "Which JOIN order is better ?" and came to the following conclusion. It IS BETTER when the first table in JOIN is small, because in this case SQL Server puts small table in memory and then loops thru the larger table. This strategy leads to lesser I/O activity with the disk.
However my experiments with Query Analyzer (SQL Server 2005) confirm that regardless of whatever order you use in JOIN there is no difference in query plans! From here out I concluded that Query Analyzer is smart enougth to understand which table to put on top (into the memory). So it does it for us automatically!
Some notes however:
1. You should not expect that SQL Server will always understand what you want; so do your job as you would do.
2. Query Optimizer uses statistics heavily to get optimal execution plan and there is no guarantee that statistics are up to date at all the times. So it is safer to write your code correctly with performance in mind.
So, the conclusion - make a habit to put smaller table first to assure optimal execution plans.
November 24, 2008 at 1:33 pm
shekihan (11/24/2008)
I was trying to address the same problem "Which JOIN order is better ?" and came to the following conclusion. It IS BETTER when the first table in JOIN is small, because in this case SQL Server puts small table in memory and then loops thru the larger table. This strategy leads to lesser I/O activity with the disk.However my experiments with Query Analyzer (SQL Server 2005) confirm that regardless of whatever order you use in JOIN there is no difference in query plans! From here out I concluded that Query Analyzer is smart enougth to understand which table to put on top (into the memory). So it does it for us automatically!
Some notes however:
1. You should not expect that SQL Server will always understand what you want; so do your job as you would do.
2. Query Optimizer uses statistics heavily to get optimal execution plan and there is no guarantee that statistics are up to date at all the times. So it is safer to write your code correctly with performance in mind.
So, the conclusion - make a habit to put smaller table first to assure optimal execution plans.
The conclusion only holds if you use the FORCE ORDER hint, otherwise the optimizer will figure out the most appropriate join order regardless of how the query is coded. The accuracy of statistics is only pertinent to determining whether the optimizer can make the best choice, not to whether it will make A choice.
As Gail said, for the most part you won't need to use query hints. Even if you do, FORCE ORDER is probably one of the least used and most dangerous.
November 25, 2008 at 1:17 pm
I've never had to use FORCE ORDER in SQL Server, the engine does pick the most efficient order in the work that I've done. I only mentioned it because I know it is a problem that does occasionally appear in Oracle (at least versions 9 and earlier, not sure about newer) and maybe just haven't encountered this situation in SQL Server.
November 25, 2008 at 1:21 pm
SQL Server needs statistics to figure out the row counts for to be joined tables. So, it does affect the order in joins.
November 25, 2008 at 1:39 pm
shekihan (11/24/2008)
2. Query Optimizer uses statistics heavily to get optimal execution plan and there is no guarantee that statistics are up to date at all the times. So it is safer to write your code correctly with performance in mind.
shekihan (11/25/2008)
SQL Server needs statistics to figure out the row counts for to be joined tables. So, it does affect the order in joins.
Correct. Therefore it makes no difference how you order the tables in your query - the optimizer doesn't factor that into the creation of the query plan by default.
November 26, 2008 at 7:28 am
shekihan (11/24/2008)
However my experiments with Query Analyzer (SQL Server 2005) confirm that regardless of whatever order you use in JOIN there is no difference in query plans! From here out I concluded that Query Analyzer is smart enougth to understand which table to put on top (into the memory). So it does it for us automatically!Some notes however:
1. You should not expect that SQL Server will always understand what you want; so do your job as you would do.
2. Query Optimizer uses statistics heavily to get optimal execution plan and there is no guarantee that statistics are up to date at all the times. So it is safer to write your code correctly with performance in mind.
So, the conclusion - make a habit to put smaller table first to assure optimal execution plans.
It's not query analyser. It's the query optimiser.
It makes absolutely no difference which way you put the tables. SQL will join them in the order it thinks is best. If the stats are bad, it may pick a bad order, but that has absolutely nothing to do with the order specified in the query. By putting the smaller table first, you are assuring absolutely nothing.
Same with predicates in the where clause.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 26, 2008 at 5:13 pm
By putting small table first you do assure that if the Query Optimizer (not a Query Analyzer of course) for some reason fails and decides to go your way the order of tables is already correct. This is true unless you have a special personal contract with Query Optimizer 🙂 and know upfront what SQL Server will do for your query in every particular case..
November 26, 2008 at 8:26 pm
shekihan (11/26/2008)
By putting small table first you do assure that if the Query Optimizer (not a Query Analyzer of course) for some reason fails and decides to go your way the order of tables is already correct. This is true unless you have a special personal contract with Query Optimizer 🙂 and know upfront what SQL Server will do for your query in every particular case..
The optimizer cannot "fail" unless you are forcing an invalid plan with hints. If this happens then the query won't run and an error will be returned. Your statement is 100% untrue.
November 27, 2008 at 12:47 am
shekihan (11/26/2008)
By putting small table first you do assure that if the Query Optimizer (not a Query Analyzer of course) for some reason fails and decides to go your way the order of tables is already correct.
Not at all. The optimiser doesn't care in the slightest about the order that you specify the tables in. It will always put the tables in the order it thinks best, regardless of the order that you specify them in.
Also, by the time the query reaches the optimiser, it bears absolutely no resemblance to the T-SQL statement as it will be in the form of an algebratised tree, so the 'order' of tables is already lost
This is true unless you have a special personal contract with Query Optimizer 🙂 and know upfront what SQL Server will do for your query in every particular case..
Not true at all.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 27, 2008 at 11:17 am
One thing I noted in the earlier part of this thread is the statement that there was no concern about the statistics being up to date because they reorganize the indexes.
Just be aware that reorganizing indexes does not update statistics. Rebuilding indexes will update statistics. So, if all you are doing is reorganizing your indexes then you definitely need to include a step to update statistics.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply