Sqlserver Joins

  • hi all,

    what is the difference between Merge join and Hash join ?

    thanks

    Regards,

    pradeep.P

  • is it some interview question or you have some problem in JOIN query?

    By the way sql server uses nested loops and index

    nested loops join.

    Regards,
    Nitin

  • hi ,

    actually this question is asked by some one .... i searched so many links....

    but no body giving Proper answer ....thats why posting this query...

    thanks,

    regards

    pradeep

  • I am sure this is interview question. But merge and hash join are Oracle terminology. It is not good idea to ask this question here.

    Any way here is the link for your answer

    http://www.geekinterview.com/question_details/15261

    Regards,
    Nitin

  • The concept of Merge Join IS also used in SQL Server.

    Here is a link to BOL

    http://msdn.microsoft.com/en-us/library/ms141775.aspx

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • nitinpatel31 (1/6/2009)


    I am sure this is interview question. But merge and hash join are Oracle terminology. It is not good idea to ask this question here.

    Any way here is the link for your answer

    http://www.geekinterview.com/question_details/15261

    I'm sorry to say, that's not accurate.

    Actually Merge and Hash joins are join types in SQL Server. You don't normally call them out specifically. You normally only specify INNER, OUTER, CROSS APPLY, etc. From there, the optimizer will determine, based on data and statistics of the data, which type of join, MERGE, HASH, LOOP. You can pass, as a query hint, that you want a specific join type, but this should only be done after VERY thorough testing to ensure you're not mucking up the optimizer. It usually makes the right choice.

    You can see these in action in the execution plans for the query

    A hash join is done, usually when dealing with large sets of data but with one of the two sets being compared is substantially smaller than the other. A hash table is built with the top set of data and then the bottom set of data is probed against the top set.

    A merge join is done when you have ordered sets of data, usually both about the same size. It simply shuffles the data together.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Knowing the differences on these can make a big difference in your ability to read execution plans, so I suggest looking all of them up in Books Online, which has specifics of which ones are used when and how each one works.

    In 8 years of writing queries, I've had to tell the server once which type of join math to use, and that was because the execution plans for different inputs were so vastly different that the query worked great for the two most common parameters, and horribly the other 25% of the time. With a join hint, it worked slightly less great, but still well enough, most of the time, and quite well with the exceptions. (Difference between 2-minute run time vs 1 second, and difference between 100 milliseconds and 200 milliseconds for the most common.) Other than that one time, I've not used them in hints.

    But in reading execution plans, and knowing how the data is set up, it can make a difference how much you can tune a query if you know how each of these works.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • For a summary: http://sqlinthewild.co.za/index.php/2007/12/30/execution-plan-operations-joins/

    For the really technical details, follow the links on that page.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • nitinpatel31 (1/6/2009)


    By the way sql server uses nested loops and index

    nested loops join.

    Not true. Nested loops, merge and hash are the three algorithms that SQL uses to do joins. Nested loops is one of the better known and is usually used on smaller result sets. Merge and hash usually work better on larger result sets.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • thanks for giving valuable solutions........

    Regards

    Pradeep.P

  • Hi

    Merge Join :

    SQL Server performs a join between two sets of row data using the merge

    join algorithm. The inputs are two separate sets of row data. Output is

    the results of the join. Oracle reads rows from both inputs in an

    alternating fashion and merges together matching rows in order to

    generate output. The two inputs are sorted on join column.

    Hash Join :

    SQL Server performs a join between two sets of row data using hash join

    algorithm. Input and Output same as Merge Join. Oracle reads all rows

    from the second input and builds a hash structure (like has table in

    java), before reading each row from the first input one at a time. For

    each row from the first input, the hash structure is probed and matching

    rows generate output.

    Thanks -- Vj

    http://dotnetvj.blogspot.com

    http://oravj.blogspot.com

  • Vijaya Kadiyala (1/8/2009)


    Merge Join :

    SQL Server performs a join between two sets of row data using the merge

    join algorithm. The inputs are two separate sets of row data. Output is

    the results of the join. Oracle reads rows from both inputs in an

    alternating fashion and merges together matching rows in order to

    generate output. The two inputs are sorted on join column.

    Ordered sets is how SQL Server does it as well.

    Hash Join :

    SQL Server performs a join between two sets of row data using hash join

    algorithm. Input and Output same as Merge Join. Oracle reads all rows

    from the second input and builds a hash structure (like has table in

    java), before reading each row from the first input one at a time. For

    each row from the first input, the hash structure is probed and matching

    rows generate output.

    Again, the hash mechanism in SQL Server very closely mirrors that of Oracle.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply