hash join

Question

hash join

sharon-472085

Ten Centuries

Points: 1182
More actions
May 3, 2011 at 3:28 am

#238938

hi,
i have this quay:
select * from data_accounts da
left join dbo.data_users as us
on da.userid=us.id
on both table i have index in join condition
there is reason why sql choose hash join insted of mereg join
thanks alot sharon

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

The indexes probably aren't covering. With no filter conditions, SQL has to scan either the index or the table. I'll only scan the indexes if they are covering (otherwise too expensive to do lookups for every row).

Since SQL has chosen to do a clustered index scan on both, the rows are likely not in order for the join, and SQL's calculated that a sort will be more expensive than a hash join.

The hash join is one of the best joins for really large resultsets (like here) and it's the one that parallels the best.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Dave Ballantyne SSC-Dedicated Points: 33667 More actions · Answer 2

In addition to what Gail said you can force a merge join using 'LEFT MERGE JOIN'.

It will *probably* be a lot more expensive

Clear Sky SQL
My Blog[/url]

sharon-472085 Ten Centuries Points: 1182 More actions · Answer 3

first thank you for your reply,

second i run this quary :

select da.userid,us.id from usagc.dbo.data_accounts da

inner join usagc.dbo.data_users as us

on da.userid=us.id

on both cloumn i put an index

second thay are same data type

and again sql using hashing insted of merge

thanks sharon

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

Hash joins are the best join type for really huge resultsets. They parallel the best of any of the join types.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Latheesh NK SSCertifiable Points: 6559 More actions · Answer 5

As Gail said, SQL optimizer had choosen HASH join is the best option here as long as you have plenty of data in both tables for SORT is costly operation.

sharon-472085 Ten Centuries Points: 1182 More actions · Answer 6

hi,

maybe i miss something , but both clolumn hava index mean that cloumn ordered ( sql no need to sort them , thay all ready sort)

all so the select is only on this cloumn ?

so what i shuld do to make optimizer to decide merge join in plan?

thank sharon

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

If the optimiser has decided on a hash join, it's because it's the cheapest join for the number of rows involved..

Merge doesn't parallel well (iirc) and your query is running in parallel.

Why do you want a merge join?

Do you absolutely, 100%, definitely know that it's better for this query in all cases and circumstances?

p.s. Please post table and index definitions.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

Well maybe I'm missing something but what's the point of returning 12M + rows to anything?

I'd understand the left join if you wanted to find unactive users. But with no where condition it's hard to guess what you want to do.

More to the point, returning 12M rows will take time and there's little opportunities for tuning other than hardware here.