Hash Joins and Indexes

  • I have two tables with 75mil records between the two of them. I'm joining the tables on 4 different columns and each table has a non-clustered index on the four columns.

    I'm pretty sure that a hash join is unavoidable, but looking at my execution plan the hash doesn't look to be using the indexes. I've ran a second execution plan in the absence of the indexes and I got the same results with the same I/O and CPU cost.

    So do hash joins us indexes? If so what might I be doing wrong with the indexes?

    Thanks

  • It's not a question of the type of join being used. It's a question of whether SQL Server thinks the indexes will help or not.

    If, for example, you're joining all the rows in the tables (no Where clause), then indexes probably won't help, so probably won't be used.

    If you're querying columns not included in the indexes, the indexes may not be used to avoid bookmark lookups.

    If the columns don't have enough distinct values (selectivity), the indexes won't be used.

    If the statistics are out of date, the indexes might not be used.

    If the indexes are too fragmented to perform well, they might not be used.

    And so on.

    Do any of those seem to apply?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • markmeiller (1/6/2009)


    I'm pretty sure that a hash join is unavoidable, but looking at my execution plan the hash doesn't look to be using the indexes. I've ran a second execution plan in the absence of the indexes and I got the same results with the same I/O and CPU cost.

    The join type does not determine whether or not indexes are used. Sometimes whether or not indexes are used determines the join type though (sometimes)

    Can you post the table definitions, the index definitions, the query and the exec plan. I'm sure someone will be able to tell you why the indexes are been ignoed. As a wild guess, it's because they're not covering.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Are the columns in the two indexes in the same order? That can also impact the join type.

  • To really get an answer, post what Gail says.

    For another guess... You're moving so much data that SQL Server is just ignoring the indexes.

    BTW, looking at the execution plan, to determine if the indexes are getting used, don't look at the join operator, look at the data access operator. Are you getting an index seek or and index scan or a table scan. If it's an index scan, is it on the index you think it should be on or is it on a different one?

    Post the stuff like Gail says and end the speculation.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Mark, do this - FORCE sql server to use the two indexes you think it should use (from mytable WITH (INDEX=myNCindex)...). Now show the estimated query plan for that query and compare the cost and plan to the one where you let sql server pick what it thinks is optimal (table scan hash). I bet it will become obvious why it does NOT use the indexes.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply