Querying small table against very large table, indexing strategy involved?

  • We have a table with gift certificates which we create and has a redemption date set. Every once in a while, I get an email from an analyst or loss prevention officer with a list of certificate codes (attached as a spreadsheet) and asking which of them have been redeemed. I typically copy those codes into a temp table with one column called GCID and execute:

    select * from #tmp_giftcerts where gcid in

    (select gcid from MasterGiftCertTable where redeemedStatus = 'true')

    I make sure that my temp table is indexed and the gcid columin in MasterGiftCertTable is clustered as well.

    I was wondering if the query was re-written if there would be much of a difference, such as:

    select * from MasterGiftCertTable where redeemedStatus = 'true' and gcid in

    (select gcid from #tmp_giftcerts)

    While compared side by side, estimated execution plan says they take the same time.

    My question is, on properly indexed tables, does it matter which version of the two queries is run, or is it more a matter of personal preference and esthetics as they behave the exact same way. Is it possible I haven't noticed the difference because I'm not comparing my lookup list against a truly massive table (currently the master table is only about 478,000 rows)?

    Thanks.

    Gaby

    Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

  • I would expect those queries to execute in the same way. SQL is probably going to turn both of them into a nested loops join, which means they'll do exactly the same thing.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks GSquared. I had a sneaky suspicision that was the case, but had to be sure in cases where the table is truly massive.

    Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

  • If the Query Processor is compiling to the same plan I don't think either really matters. My preference would be to query the main table and use the smaller temp table as the criteria. You may also want to try out these 2 options if you are really concerned about performance:

    SELECT

    M.*

    FROM

    MasterGiftCertTable M LEFT JOIN

    #tmp_giftcerts T

    ON M.gcid = T.gcid

    WHERE

    M.redeemed = 'TRUE'

    or

    SELECT

    *

    FROM

    MasterGiftCertTable M

    WHERE

    M.redeemed = 'true' AND

    EXISTS ( SELECT

    1

    FROM

    #tmp_giftcerts T

    Where

    M.gcid = T.gcid )

  • Take a look in Books Online (or MSDN) at how the various join types work. You're looking for "Nested Loops Join", "Hash Join" and "Merge Join". Draw a few diagrams for each, to see how they work, and suddenly a LOT of how SQL server does stuff will make tons of sense, and you'll see why these two queries are basically the same.

    It's worth doing.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply