May 3, 2011 at 3:28 am
hi,
i have this quay:
select * from data_accounts da
left join dbo.data_users as us
on da.userid=us.id
on both table i have index in join condition
there is reason why sql choose hash join insted of mereg join
thanks alot sharon
May 3, 2011 at 3:37 am
The indexes probably aren't covering. With no filter conditions, SQL has to scan either the index or the table. I'll only scan the indexes if they are covering (otherwise too expensive to do lookups for every row).
Since SQL has chosen to do a clustered index scan on both, the rows are likely not in order for the join, and SQL's calculated that a sort will be more expensive than a hash join.
The hash join is one of the best joins for really large resultsets (like here) and it's the one that parallels the best.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 3, 2011 at 4:02 am
In addition to what Gail said you can force a merge join using 'LEFT MERGE JOIN'.
It will *probably* be a lot more expensive
May 3, 2011 at 5:37 am
first thank you for your reply,
second i run this quary :
select da.userid,us.id from usagc.dbo.data_accounts da
inner join usagc.dbo.data_users as us
on da.userid=us.id
on both cloumn i put an index
second thay are same data type
and again sql using hashing insted of merge
thanks sharon
May 3, 2011 at 5:43 am
Hash joins are the best join type for really huge resultsets. They parallel the best of any of the join types.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 3, 2011 at 5:44 am
As Gail said, SQL optimizer had choosen HASH join is the best option here as long as you have plenty of data in both tables for SORT is costly operation.
May 3, 2011 at 6:10 am
hi,
maybe i miss something , but both clolumn hava index mean that cloumn ordered ( sql no need to sort them , thay all ready sort)
all so the select is only on this cloumn ?
so what i shuld do to make optimizer to decide merge join in plan?
thank sharon
May 3, 2011 at 6:21 am
If the optimiser has decided on a hash join, it's because it's the cheapest join for the number of rows involved..
Merge doesn't parallel well (iirc) and your query is running in parallel.
Why do you want a merge join?
Do you absolutely, 100%, definitely know that it's better for this query in all cases and circumstances?
p.s. Please post table and index definitions.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 3, 2011 at 6:27 am
Well maybe I'm missing something but what's the point of returning 12M + rows to anything?
I'd understand the left join if you wanted to find unactive users. But with no where condition it's hard to guess what you want to do.
More to the point, returning 12M rows will take time and there's little opportunities for tuning other than hardware here.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply