Temp Table Vs Table Variable

  • Ninja's_RGR'us (12/7/2011)


    Dev (12/7/2011)


    Ninja's_RGR'us (12/7/2011)


    Dev (12/7/2011)


    Ninja's_RGR'us (12/7/2011)


    WHERE Items.ItemCategoryCode IN (SELECT CategoryCode FROM @CaTs)

    Convert to #tbl with PK to help the stats, search and replace to find the 5-10 occurrences and boom, 98% reads improvement ;-). Around 50% duration improvement as well.

    Will Table Variable @tbl with PK not work?

    No stats on @t.

    That made the whole difference.

    Do we really need stats for 2 rows? 😀

    YES. Especially in filters. This will bite you in the arse so hard when you filter on skewed data distributions.

    1 filter might return 1% of the data, but the next one 25%. The QO needs to be able to know about that, and it can't without stats on that #tbl. The PK then tells it that there's no duplicate value (especially useful in joins).

    And that's precisely my point. Plus, it can cache complete table in memory.
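    For readers following along, the conversion being discussed might look something like this. `Items` and `ItemCategoryCode` come from the thread; `#CaTs`/`@CaTs` and the inserted values are made up for illustration:

```sql
-- Table variable version: no column statistics, so the optimizer
-- guesses at cardinality when compiling the filter.
DECLARE @CaTs TABLE (CategoryCode INT PRIMARY KEY);
INSERT INTO @CaTs (CategoryCode) VALUES (12), (27);

SELECT i.*
FROM Items AS i
WHERE i.ItemCategoryCode IN (SELECT CategoryCode FROM @CaTs);

-- Temp table version: the PK gives the optimizer statistics on
-- CategoryCode plus a uniqueness guarantee, which helps both the
-- filter estimate and the join strategy.
CREATE TABLE #CaTs (CategoryCode INT PRIMARY KEY);
INSERT INTO #CaTs (CategoryCode) VALUES (12), (27);

SELECT i.*
FROM Items AS i
WHERE i.ItemCategoryCode IN (SELECT CategoryCode FROM #CaTs);

DROP TABLE #CaTs;
```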

  • Dev (12/7/2011)


    And that's precisely my point. Plus, it can cache complete table in memory.

    Temp tables can stay in memory too.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Dev (12/7/2011)


    And that's precisely my point. Plus, it can cache complete table in memory.

    As I already said before, I want this myth to effing die!

    http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

  • Dev (12/7/2011)


    Plus, it can cache complete table in memory.

    So if I have a SQL instance with 2GB of memory and put 10GB of data into the table variable it'll be completely in memory?

    That table variables live solely in memory is a plain myth, and it just keeps being propagated. There is no difference between how temp tables are cached (memory vs disk) and how table variables are cached (memory vs disk).

    Do we really need stats for 2 rows? 😀

    Yes, absolutely. It's the lack of statistics (and nothing more) that can skew the plans for table variables terribly. The general advice is to not ever use table vars when plan costing is important.
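    This is quick to see for yourself in SSMS with actual execution plans enabled (Ctrl+M). A sketch; exact estimates vary by version and settings:

```sql
-- Compare estimated vs actual rows in the two plans.
DECLARE @t TABLE (id INT PRIMARY KEY);
INSERT INTO @t (id) SELECT object_id FROM sys.objects;

-- Table variable: the statement is compiled with no statistics,
-- so the scan is typically estimated at 1 row regardless of content.
SELECT * FROM @t;

CREATE TABLE #t (id INT PRIMARY KEY);
INSERT INTO #t (id) SELECT object_id FROM sys.objects;

-- Temp table: statistics exist, so the estimate tracks the real
-- row count (it is still an estimate, as noted later in the thread).
SELECT * FROM #t;

DROP TABLE #t;
```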

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ninja's_RGR'us (12/7/2011)


    As I already said before, I want this myth to effing die!

    http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

    I wish it would die. I'm so sick of seeing it propagated and people getting terrible performance because they believe what they're told (it is non-trivial to test).

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (12/7/2011)


    I wish it would die. I'm so sick of seeing it propagated and people getting terrible performance because they believe what they're told (it is non-trivial to test).

    Well, it's trivial to test, as in it takes 2 minutes. The cost savings can be staggering though!

    Maybe it's time for another round of articles!

  • @Gail, @Remi & @Grant: I agree with what you all said.

    The only point of argument is that the table is toooooo small, i.e. 2 rows. I am not a victim of the myth, but I am trying to understand why 2 rows need stats.

    BTW... I found the following in the article suggested by Remi.

    Statistics on @tbl

    The optimizer cannot create any statistics on columns, so it treats the table variable as having 1 record when creating execution plans.

    Statistics on #tbl

    The optimizer can create statistics on columns. Uses the actual row count for generating the execution plan.

  • Dev (12/7/2011)


    The only point of argument is that the table is toooooo small, i.e. 2 rows. I am not a victim of the myth, but I am trying to understand why 2 rows need stats.

    It's not the amount of rows, it's what you do with them.

    In my case we have 50-ish categories, so "in theory" each category is 2% of the data. In REALITY, 5 of them account for 80% of the data.

    If you have 2 rows in that filter @table and the optimizer assumes only 1 row, the plan will be vastly different from one built knowing the 2 rows in the table and how many matching rows are in the other tables. That's why the stats are crucial in that case.

    We're not even talking about massive tables here. Only a few hundred MB and less than 1M rows.
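    The skew argument can be made concrete with invented numbers: 50 categories, 5 of which hold 80% of the rows. A sketch (all names and counts hypothetical; assumes AUTO_CREATE_STATISTICS is on so the filter column gets a histogram):

```sql
CREATE TABLE #Items (
    ItemID           INT IDENTITY PRIMARY KEY,
    ItemCategoryCode INT NOT NULL
);

-- 80% of rows land in 5 "hot" categories (codes 1-5)...
INSERT INTO #Items (ItemCategoryCode)
SELECT TOP (80000) 1 + ABS(CHECKSUM(NEWID())) % 5
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

-- ...and 20% spread across the other 45 (codes 6-50).
INSERT INTO #Items (ItemCategoryCode)
SELECT TOP (20000) 6 + ABS(CHECKSUM(NEWID())) % 45
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

-- A uniform assumption says every category is ~2% (2,000 rows),
-- but the histogram on ItemCategoryCode shows the real picture:
-- a hot category returns tens of times more rows than a cold one.
SELECT COUNT(*) FROM #Items WHERE ItemCategoryCode = 3;  -- hot
SELECT COUNT(*) FROM #Items WHERE ItemCategoryCode = 40; -- cold

DROP TABLE #Items;
```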

  • Ninja's_RGR'us (12/7/2011)


    If you have 2 rows in that filter @table and the optimizer assumes only 1 row, the plan will be vastly different from one built knowing the 2 rows in the table and how many matching rows are in the other tables. That's why the stats are crucial in that case.

    So in this case I should summarise it as an issue with skewed data, or non-uniform distribution. Even temporary tables will fail there if the stored procedure is not created with the RECOMPILE or OPTIMIZE FOR hint.

  • Dev (12/7/2011)


    Even Temporary Tables will fail there if the stored procedure is not created with RECOMPILE or OPTIMIZE FOR hint.

    No, they won't. Temp tables have statistics. When the number of rows changes past a threshold (and it's small) and there are statistics, there is an automatic recompile. No recompile hint is needed.
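    A sketch of what this looks like in practice. `Items` comes from the thread; the temp table and row counts are invented, and the exact thresholds (roughly 6 rows for a table that was empty at compile time, then 500, then 500 plus 20% of the row count) are documented in Microsoft's plan-caching whitepapers:

```sql
CREATE TABLE #f (ItemCategoryCode INT PRIMARY KEY);

INSERT INTO #f (ItemCategoryCode) VALUES (1);

-- Compiled against statistics that see 1 row in #f.
SELECT COUNT(*)
FROM Items AS i
JOIN #f AS f ON f.ItemCategoryCode = i.ItemCategoryCode;

-- Grow the temp table well past the recompilation threshold.
INSERT INTO #f (ItemCategoryCode)
SELECT TOP (1000) 1 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns;

-- Re-running the same statement triggers a statement-level
-- recompile against fresh statistics; no RECOMPILE hint needed.
SELECT COUNT(*)
FROM Items AS i
JOIN #f AS f ON f.ItemCategoryCode = i.ItemCategoryCode;

DROP TABLE #f;
```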

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Dev (12/7/2011)


    Even temporary tables will fail there if the stored procedure is not created with the RECOMPILE or OPTIMIZE FOR hint.

    #tbls cause recompiles!

    Did you read the article???

  • Dev (12/7/2011)


    Optimizer can create statistics on columns. Uses the actual row count for generating the execution plan.

    Estimated row count, not actual. The optimiser doesn't go off and count the actual number of rows that will be affected. It estimates.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (12/7/2011)


    Estimated row count, not actual. The optimiser doesn't go off and count the actual number of rows that will be affected. It estimates.

    +1

    But in my case, since I have only ±50 values, the estimates are pretty darn accurate. Especially with daily full-scan stats updates on all tables.

  • Ninja's_RGR'us (12/7/2011)


    Did you read the article???

    Yes. But it will recompile only if the number of rows crosses the recompilation threshold; otherwise no.

  • Dev (12/7/2011)


    Yes. But it will recompile only if number of rows cross Recompilation Threshold else NO.

    And tell me, what are the recompilation thresholds for a temporary table? Since you appear to know more about this than Ninja and I do...

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 16 through 30 (of 59 total)
