January 1, 2008 at 5:54 am
Difference between 'hash match/inner join' and 'nested loops/inner join' in execution query plan?
What is the meaning of each?
When is each used in a query plan?
January 1, 2008 at 6:04 am
Forgive me for again directing you elsewhere, but I wrote on this just a couple days ago
http://sqlinthewild.co.za/index.php/2007/12/30/execution-plan-operations-joins/
Again, if you have more questions, I'll be very happy to answer them.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 1, 2008 at 8:39 am
Nice blog, Gail...great explanations!
--Jeff Moden
Change is inevitable... Change for the better is not.
January 2, 2008 at 10:03 am
A nested loop query plan is usually optimal when there is a small number of rows in one table (think 10s to perhaps 100s in most cases) that can be probed into another table that is either very small or has an index that allows a seek for each input. One thing to watch out for here is when the optimizer THINKS there will be few rows (check the estimated rows in the popup for the estimated query plan graphic) but in reality there are LOTS of rows (thousands or even millions). This is one of the worst things that can happen to a query plan and is usually the result of out-of-date statistics, a cached query plan, or data that is unevenly distributed. Each of these can be addressed to minimize the likelihood of the problem.
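To make the mechanism concrete, here is a minimal sketch of a nested loops join in Python (the table names and data are invented for illustration): for each row in the small outer input, the engine does one index seek into the inner input, modeled here as a dict lookup.

```python
# Hypothetical sketch of a nested loops join. The dict stands in for
# a seekable index on the inner table's join column; total work is
# roughly (outer rows) x (cost of one seek), which is why this plan
# wins when the outer input is genuinely small.
orders = [  # small outer input
    {"order_id": 1, "cust_id": "A"},
    {"order_id": 2, "cust_id": "B"},
]
customers = [  # inner input, "indexed" on cust_id
    {"cust_id": "A", "name": "Alice"},
    {"cust_id": "B", "name": "Bob"},
    {"cust_id": "C", "name": "Carol"},
]
cust_index = {c["cust_id"]: c for c in customers}

joined = [
    {**o, "name": cust_index[o["cust_id"]]["name"]}
    for o in orders
    if o["cust_id"] in cust_index  # one index seek per outer row
]
```

If the outer input unexpectedly holds millions of rows, that per-row seek is repeated millions of times, which is exactly the estimate-mismatch problem described above.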
A hash-match plan is optimal when a relatively small number of rows is joined into a relatively large number of rows. Gail's post explains the basics of the mechanism. It can get REALLY slow on machines with insufficient buffer RAM to hold the hash tables in memory, because they will have to be laid down to disk at a HUGE relative cost.
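The two phases of a hash match can be sketched in a few lines of Python (names and data invented for illustration): build a hash table from the smaller input, then scan the larger input once, probing the table for matches.

```python
from collections import defaultdict

def hash_match_join(build_rows, probe_rows, key):
    # Build phase: hash the smaller input into in-memory buckets.
    # (When the build input exceeds the memory grant, SQL Server
    # spills it to tempdb -- the "laid down to disk" cost above.)
    buckets = defaultdict(list)
    for b in build_rows:
        buckets[b[key]].append(b)
    # Probe phase: one pass over the larger input, probing the buckets.
    for p in probe_rows:
        for b in buckets.get(p[key], []):
            yield {**b, **p}

small = [{"k": 1, "x": "a"}, {"k": 2, "x": "b"}]
big = [{"k": i % 3, "y": i} for i in range(6)]  # keys 0,1,2,0,1,2
result = list(hash_match_join(small, big, "k"))
```

Note that each input is read exactly once, so the cost is roughly linear in the combined row counts, independent of whether the probe side has an index.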
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 2, 2008 at 11:20 am
Good point. Thanks.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
December 17, 2010 at 9:09 am
TheSQLGuru (1/2/2008)
A nested loop query plan is usually optimal when there is a small number of rows in one table (think 10s to perhaps 100s in most cases) that can be probed into another table that is either very small or has an index that allows a seek for each input. One thing to watch out for here is when the optimizer THINKS there will be few rows (check the estimated rows in the popup for the estimated query plan graphic) but in reality there are LOTS of rows (thousands or even millions). This is one of the worst things that can happen to a query plan and is usually the result of out-of-date statistics, a cached query plan, or data that is unevenly distributed. Each of these can be addressed to minimize the likelihood of the problem.
A hash-match plan is optimal when a relatively small number of rows is joined into a relatively large number of rows. Gail's post explains the basics of the mechanism. It can get REALLY slow on machines with insufficient buffer RAM to hold the hash tables in memory, because they will have to be laid down to disk at a HUGE relative cost.
I know this is an old thread, but it seems to address the problem I'm facing now. I'll preface this description by saying that I deal mostly in BI, so you won't hurt my feelings if you point out a fundamental error in my performance tuning logic 😉
I've got a relational query run from SSRS that seems to be generating an inefficient execution plan. As part of the query, I'm joining an 18m row table with a 2k row table, and the resulting join shows up as a Nested Loop in my execution plan. Statistics on both tables are up to date, I've dumped the plan cache, and the columns on which the tables are joined are both indexed and are of the same data type (VARCHAR(7)).
My research tells me that a nested loop is best for small data sets only, and that a hash join might be a better performing method. I think the nested loop might be chosen because the expected number of rows (1) is wildly different from the actual number of rows (2.5 million).
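That estimate mismatch is enough on its own to explain the plan choice. A toy cost model (the numbers here are invented for illustration and are not SQL Server's actual costing) shows how the optimizer's decision flips between the estimated one row and the actual 2.5 million:

```python
# Invented, back-of-the-envelope costs: nested loops pays roughly one
# index seek per outer row; hash match pays roughly one pass over each
# input plus the hash build.
def nested_loops_cost(outer_rows, seek_cost=4):  # ~B-tree depth pages per seek
    return outer_rows * seek_cost

def hash_match_cost(build_rows, probe_rows):
    return build_rows * 2 + probe_rows  # build phase + single probe scan

# With the ESTIMATED 1 outer row, nested loops looks far cheaper...
nl_wins_on_estimate = nested_loops_cost(1) < hash_match_cost(2_000, 1)
# ...but with the ACTUAL 2.5 million rows, hash match is far cheaper.
nl_wins_on_actual = nested_loops_cost(2_500_000) < hash_match_cost(2_000, 2_500_000)
```

So the most productive fix is usually to repair the estimate rather than to force the join with a hint, which is why the question of where the 1-row estimate comes from matters most.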
So the question is, is there anything else I can do to help the query optimizer choose a different join structure? Further, are there any suggestions to improve the affinity between the expected and actual number of rows?
By the way, I'm checking with my client to make sure they have no concerns about uploading their execution plan to this thread.
Thanks,
Tim
Tim Mitchell
TimMitchell.net | @tmitch.net | Tyleris.com
ETL Best Practices
December 17, 2010 at 10:11 am
I suppose the important thing here is why the mismatch in row estimates.
Can you maybe start a new thread and post the query in question and the execution plan please?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 16, 2011 at 6:51 am
I want to understand HASH MATCH. What is the process behind it?
September 16, 2011 at 6:59 am
Did you read the blog post that was given in this thread? Did you read the post it links to?
If you did and still have questions, please start a new thread and ask specific questions.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability