January 6, 2009 at 1:53 am
hi all,
what is the difference between Merge join and Hash join ?
thanks
Regards,
pradeep.P
January 6, 2009 at 2:04 am
is it some interview question or you have some problem in JOIN query?
By the way sql server uses nested loops and index
nested loops join.
Regards,
Nitin
January 6, 2009 at 2:14 am
hi ,
actually this question is asked by some one .... i searched so many links....
but no body giving Proper answer ....thats why posting this query...
thanks,
regards
pradeep
January 6, 2009 at 2:34 am
I am sure this is interview question. But merge and hash join are Oracle terminology. It is not good idea to ask this question here.
Any way here is the link for your answer
http://www.geekinterview.com/question_details/15261
Regards,
Nitin
January 6, 2009 at 2:45 am
The concept of Merge Join IS also used in SQL Server.
Here is a link to BOL
http://msdn.microsoft.com/en-us/library/ms141775.aspx
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]
January 6, 2009 at 6:05 am
nitinpatel31 (1/6/2009)
I am sure this is interview question. But merge and hash join are Oracle terminology. It is not good idea to ask this question here.Any way here is the link for your answer
I'm sorry to say, that's not accurate.
Actually Merge and Hash joins are join types in SQL Server. You don't normally call them out specifically. You normally only specify INNER, OUTER, CROSS APPLY, etc. From there, the optimizer will determine, based on data and statistics of the data, which type of join, MERGE, HASH, LOOP. You can pass, as a query hint, that you want a specific join type, but this should only be done after VERY thorough testing to ensure you're not mucking up the optimizer. It usually makes the right choice.
You can see these in action in the execution plans for the query
A hash join is done, usually when dealing with large sets of data but with one of the two sets being compared is substantially smaller than the other. A hash table is built with the top set of data and then the bottom set of data is probed against the top set.
A merge join is done when you have ordered sets of data, usually both about the same size. It simply shuffles the data together.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 6, 2009 at 12:02 pm
Knowing the differences on these can make a big difference in your ability to read execution plans, so I suggest looking all of them up in Books Online, which has specifics of which ones are used when and how each one works.
In 8 years of writing queries, I've had to tell the server once which type of join math to use, and that was because the execution plans for different inputs were so vastly different that the query worked great for the two most common parameters, and horribly the other 25% of the time. With a join hint, it worked slightly less great, but still well enough, most of the time, and quite well with the exceptions. (Difference between 2-minute run time vs 1 second, and difference between 100 milliseconds and 200 milliseconds for the most common.) Other than that one time, I've not used them in hints.
But in reading execution plans, and knowing how the data is set up, it can make a difference how much you can tune a query if you know how each of these works.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
January 6, 2009 at 12:55 pm
For a summary: http://sqlinthewild.co.za/index.php/2007/12/30/execution-plan-operations-joins/
For the really technical details, follow the links on that page.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 6, 2009 at 12:57 pm
nitinpatel31 (1/6/2009)
By the way sql server uses nested loops and indexnested loops join.
Not true. Nested loops, merge and hash are the three algorithms that SQL uses to do joins. Nested loops is one of the better known and is usually used on smaller result sets. Merge and hash usually work better on larger result sets.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 6, 2009 at 9:25 pm
thanks for giving valuable solutions........
Regards
Pradeep.P
January 8, 2009 at 10:29 am
Hi
Merge Join :
SQL Server performs a join between two sets of row data using the merge
join algorithm. The inputs are two separate sets of row data. Output is
the results of the join. Oracle reads rows from both inputs in an
alternating fashion and merges together matching rows in order to
generate output. The two inputs are sorted on join column.
Hash Join :
SQL Server performs a join between two sets of row data using hash join
algorithm. Input and Output same as Merge Join. Oracle reads all rows
from the second input and builds a hash structure (like has table in
java), before reading each row from the first input one at a time. For
each row from the first input, the hash structure is probed and matching
rows generate output.
Thanks -- Vj
January 8, 2009 at 11:39 am
Vijaya Kadiyala (1/8/2009)
Merge Join :SQL Server performs a join between two sets of row data using the merge
join algorithm. The inputs are two separate sets of row data. Output is
the results of the join. Oracle reads rows from both inputs in an
alternating fashion and merges together matching rows in order to
generate output. The two inputs are sorted on join column.
Ordered sets is how SQL Server does it as well.
Hash Join :
SQL Server performs a join between two sets of row data using hash join
algorithm. Input and Output same as Merge Join. Oracle reads all rows
from the second input and builds a hash structure (like has table in
java), before reading each row from the first input one at a time. For
each row from the first input, the hash structure is probed and matching
rows generate output.
Again, the hash mechanism in SQL Server very closely mirrors that of Oracle.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply