Difference between 'hash match/inner join' and 'nested loops/inner join' in execution query plan?

  • Difference between 'hash match/inner join' and 'nested loops/inner join' in execution query plan?

    what is meaning of them?

    when it is used in query plan?

    _____________________________________________________________________________________________________________
    Paresh Prajapati
    ➡ +919924626601
    http://paresh-sqldba.blogspot.com/[/url]
    LinkedIn | Tweet Me | FaceBook | Brijj

  • Forgive me for again directing you elsewhere, but I wrote on this just a couple days ago

    http://sqlinthewild.co.za/index.php/2007/12/30/execution-plan-operations-joins/[/url]

    Again, if you have more questions, I'll be very happy to answer them.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Nice blog, Gail...great explanations!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • A nested loop query plan is usually optimal when there are small numbers of rows in one table (think 10s to perhaps 100s in most cases) that can be probed into another table that is either very small or has an index that allows a seek for each input. One thing to watch out for here is when the optimizer THINKS there will be few rows (check the estimated rows in the popup for the estimated query plan graphic) but in reality there are LOTS of rows (thousands or even millions). This is one of the worst things that can happen to a query plan and is usually the result of either out-of-date statistics, a cached query plan or data that is unevenly distributed. Each of these can be addressed to minimize the likelyhood of the problem.

    A hash-match plan is optimal when there is a relatively few rows joined into a relatively large number of rows. Gail's post explains the basics of the mechanism. It can get REALLY slow on machines with insufficient buffer RAM to contain the hash tables in memory because they will have to be laid down to disk at a HUGE relative cost.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Good point. Thanks.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • TheSQLGuru (1/2/2008)


    A nested loop query plan is usually optimal when there are small numbers of rows in one table (think 10s to perhaps 100s in most cases) that can be probed into another table that is either very small or has an index that allows a seek for each input. One thing to watch out for here is when the optimizer THINKS there will be few rows (check the estimated rows in the popup for the estimated query plan graphic) but in reality there are LOTS of rows (thousands or even millions). This is one of the worst things that can happen to a query plan and is usually the result of either out-of-date statistics, a cached query plan or data that is unevenly distributed. Each of these can be addressed to minimize the likelyhood of the problem.

    A hash-match plan is optimal when there is a relatively few rows joined into a relatively large number of rows. Gail's post explains the basics of the mechanism. It can get REALLY slow on machines with insufficient buffer RAM to contain the hash tables in memory because they will have to be laid down to disk at a HUGE relative cost.

    I know this is an old thread, but it seems to address the problem I'm facing now. I'll preface this description by saying that I deal mostly in BI, so you won't hurt my feelings if you point out a fundamental error in my performance tuning logic 😉

    I've got a relational query run from SSRS that seems to be generating an inefficient execution plan. As part of the query, I'm joining an 18m row table with a 2k row table, and the resulting join shows up as a Nested Loop in my execution plan. Statistics on both tables are up to date, I've dumped the plan cache, and the columns on which the tables are joined are both indexed and are of the same data type (VARCHAR(7)).

    My research tells me that a nested loop is best for small data sets only, and that a hash join might be a better performing method. I think the nested loop might be chosen because the expected number of rows (1) is wildly different from the actual number of rows (2.5 million).

    So the question is, is there anything else I can do to help the query optimizer choose a different join structure? Further, are there any suggestions to improve the affinity between the expected and actual number of rows?

    By the way, I'm checking with my client to make sure they have no concerns about uploading their execution plan to this thread.

    Thanks,

    Tim

  • I suppose the important thing here is why the mismatch in row estimates.

    Can you maybe start a new thread and post the query in question and the execution plan please?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I want to understand HASH MATCH. What is Procession of it.

  • Did you read the blog post that was given in this thread? Did you read the post it links to?

    If you did and still have questions, please start a new thread and ask specific questions.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply