temp table vs. table variable

  • Hi all,

I've come across a (for me, at least) strange behavior of SQL Server. If I run a query using two temp tables, I get ~900'000 reads. If I run the same query with two table variables, I get only 700 reads.

    Here is the Query:

select internbelnum = b.internbelnum,
       zeilennbr = ve_tbreferenz.bemerknbr,
       anzeigenr = ve_entitylink.anzeigenr,
       textbaustein = ve_tbstamm.dokutext,
       zeilentyp = 40,
       gruppierung = case when ve_entitylink.enttyp = 1060 then 2 else 1 end, -- text blocks with enttyp 1060 belong to the AZ lines
       refnum = ve_tbreferenz.refnum
from @belege b, ve_entitylink, ve_tbreferenz, ve_textbausteine, ve_tbstamm, sprachauswahl
where ve_entitylink.oberentid = b.internbelnum
  and ve_entitylink.oberenttyp = 20
  and ve_tbreferenz.bemerknbr = ve_entitylink.entid
  and ve_tbreferenz.enttyp = ve_entitylink.enttyp
  and isnull(ve_tbreferenz.intbemnum, 0) > 0
  and ve_tbreferenz.bemerknbr not in (select z.zeilennbr from @zeilen z where z.zeilentyp = 40)
  and isnull(ve_tbreferenz.bemerkung, '') like ''
  and ve_textbausteine.textbausteinnum = ve_tbreferenz.intbemnum
  and ve_textbausteine.mandid = @mandidin
  and sprachauswahl.base = 1
  and ve_tbstamm.textbausteinnum = ve_textbausteine.textbausteinnum
  and ve_tbstamm.mandid = @mandidin
  and ve_tbstamm.lcid = sprachauswahl.fallback

Some of the tables have many rows (ve_entitylink, ve_tbstamm).

If I substitute @belege and @zeilen with temp tables, the performance is far worse.

(belege has 4 rows and zeilen has 20 rows)

Why is that?

    How could I improve the version with temp tables?

Thanks for any hints (I'm sure there is an explanation for this behavior).

    Reto Eggenberger

It looks like your query has hash marks where it should have periods and parentheses. Either that, or you're using some version of SQL other than Microsoft's T-SQL.

When you mention the quantity of reads, do you mean from your I/O statistics, something in an execution plan, or something else?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

You're right. Somehow the periods and parentheses got mutated during copy-paste.

I use SQL Server 2008, and the reads come from Profiler.

The execution plans for the queries are also different. The query with table variables uses index seeks. The other has lots of table scans.

Maybe I could create indexes on the temp tables. But I think with so few rows, this shouldn't make such a difference.
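Something like this is what I have in mind (just a sketch - only internbelnum, zeilennbr and zeilentyp come from the query above; the data types are my guesses):

create table #belege (internbelnum int not null primary key clustered)
create table #zeilen (zeilennbr int not null, zeilentyp int not null)
create index ix_zeilen_typ on #zeilen (zeilentyp, zeilennbr)

A temp table also gets statistics on those indexes, which a table variable never does, so the optimizer at least sees the real row counts.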

    greetings

    Reto E.

May I ask why you prefer temporary tables over table variables?

    Almost all the books I have read about SQL Server state that it's better to use table variables than temporary tables.

    http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

  • jeedee (3/28/2011)


May I ask why you prefer temporary tables over table variables?

Almost all the books I have read about SQL Server state that it's better to use table variables than temporary tables.

http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

    Oddly enough, if what you say is true, then I strongly disagree with all of those books. Even Wayne's fine article (which you provided a link to) draws no conclusion as to which is better other than "It Depends".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In fact, I'll state that I normally avoid Table Variables because they don't persist during troubleshooting and they don't act like Temp Tables would. Here's an example I recently documented for myself from a post here on SSC where just changing from a Table Variable to a Temp Table fixes the problem. And, to be sure, Table Variables are NOT a "memory only" structure any more than Temp Tables are a "disk only" structure... they both start out in memory and spill to disk if they don't fit in memory.

    Run the following more than once and see what I mean.

/**********************************************************************************************************************
The goal is to delete 10 randomly selected rows from a table. It doesn't work with a Table Variable alone.

This is one of those places where SQL Server sometimes loses its mind, and it's known as "Halloweening". It's when SQL
Server basically decides to make an "internal Cross-Join" to solve a problem. For every row in Table "A", it makes a
full scan of Table "B". If you don't think so, take a look at the table scan symbol in the lower right corner of the
Actual Execution Plan and look at the arrow coming out of it. It has 256 actual rows coming out of it, which is the
number of rows in the table times itself (16 in this case).

In this case, there's a bit of obfuscation of the problem caused by the TOP 10... if you read all 16 rows from the table
and take the TOP 10, you should get what you want EXCEPT that it's doing that 16 times. If you take the TOP 10 of the
same set of 16 rows of data in random order, you end up with more or fewer than 10 unique IDs being chosen because of
the random sort. Even though the Inner Join that follows will "uniquify" the IDs, there are still more than 10,
which is why you end up deleting more or fewer than 10.

There are three fixes for this. One is to trick SQL Server into doing the right thing by using an INNER MERGE JOIN.
Another thing you can do is use a Temp Table so SQL Server can make a better "guess" at the Execution Plan, where it will
throw in an extra table spool to sort the mess out. The reason it'll make a better guess with the Temp Table is that
it can actually estimate the rows, instead of estimating that there's just one row in the table (which won't
need the table spool), because Table Variables are estimated as having just one row. You can prove
that by using a third possible fix... force a recompile for the DELETE by using OPTION(RECOMPILE) and it'll work
correctly every time.

As a side bar, this (being evaluated as a single row no matter how many rows it has) is why I don't use Table Variables
in ANY T-SQL except in functions and special "rollback proof" code (very rarely needed), even if they are occasionally
faster. I just don't trust the damned things. They're also a PITA to troubleshoot because they don't persist data in
SSMS.

The explanation for a QOD should be that it'll delete more or fewer than 10 rows in an unpredictable fashion, because the
Halloweening will cause the SELECT TOP 10 to run and return 10 random IDs once for each row in the table, and there may
be more or fewer than 10 unique SomeValue values returned by those (in this case) 16 executions.

Run this a dozen times and see what you get.
**********************************************************************************************************************/

--===== Create a table variable to store the example data in.
DECLARE @t TABLE (RowNum INT IDENTITY(1,1), SomeValue VARCHAR(9))
;
--===== Populate the table with example data
 INSERT @t (SomeValue)
 SELECT 'a' UNION ALL SELECT 'b' UNION ALL SELECT 'c' UNION ALL SELECT 'd' UNION ALL
 SELECT 'e' UNION ALL SELECT 'f' UNION ALL SELECT 'g' UNION ALL SELECT 'h' UNION ALL
 SELECT 'i' UNION ALL SELECT 'j' UNION ALL SELECT 'k' UNION ALL SELECT 'l' UNION ALL
 SELECT 'm' UNION ALL SELECT 'n' UNION ALL SELECT 'o' UNION ALL SELECT 'p'
;
--===== Try to do the random delete of ten rows
 DELETE t
   FROM @t AS t
  INNER JOIN (SELECT TOP 10 RowNum FROM @t ORDER BY NEWID()) AS b
     ON b.RowNum = t.RowNum
;
--===== Show that it didn't work
 SELECT @@ROWCOUNT AS RowsDeleted
;
 SELECT COUNT(*) AS RowsKept FROM @t
;
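And here's the OPTION (RECOMPILE) fix mentioned in the comments, so you can see it work (rerun the DECLARE/INSERT above first, since the delete will already have removed some rows):

--===== The same delete with a forced recompile... the optimizer now sees the real row count
 DELETE t
   FROM @t AS t
  INNER JOIN (SELECT TOP 10 RowNum FROM @t ORDER BY NEWID()) AS b
     ON b.RowNum = t.RowNum
 OPTION (RECOMPILE)
;
 SELECT @@ROWCOUNT AS RowsDeleted
;

This deletes exactly 10 rows every time.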

    --Jeff Moden



  • jeedee (3/28/2011)


May I ask why you prefer temporary tables over table variables?

Almost all the books I have read about SQL Server state that it's better to use table variables than temporary tables.

http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

I strongly disagree. Could you provide a reference from said books? As Jeff stated, not even the article from Wayne recommends one above the other. Whether one works better than the other depends on many variables in each case - and both (temp table and table variable) really should be tested.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • For example Microsoft SQL Server 2008 R2 Unleashed, under the "General T-SQL Performance Recommendations" section. There (page 1655) I read the following:

You should use table variables instead of temporary tables in stored procedures whenever possible or feasible. Table variables are memory resident and do not incur the I/O overhead and system table and I/O contention that can occur in tempdb with normal temporary tables. However, remember that table variables exist only for the duration of the SQL batch or stored procedure in which they are defined.

Of course, according to the above quote, it depends (just like Jeff Moden said) whether table variables have advantages over temporary tables in a given case.

However, since I'm reading these guidelines in a "General T-SQL Performance Recommendations" section, I would assume table variables are preferred in most cases.

Furthermore, when I delve into this matter on the Internet, most people seem to be in favor of one particular option.

But perhaps I should start making use of "it depends!" more often from now on. 😉

  • jeedee (3/28/2011)


    For example Microsoft SQL Server 2008 R2 Unleashed, under the "General T-SQL Performance Recommendations" section. There (page 1655) I read the following:

You should use table variables instead of temporary tables in stored procedures whenever possible or feasible. Table variables are memory resident and do not incur the I/O overhead and system table and I/O contention that can occur in tempdb with normal temporary tables. However, remember that table variables exist only for the duration of the SQL batch or stored procedure in which they are defined.

Of course, according to the above quote, it depends (just like Jeff Moden said) whether table variables have advantages over temporary tables in a given case.

However, since I'm reading these guidelines in a "General T-SQL Performance Recommendations" section, I would assume table variables are preferred in most cases.

Furthermore, when I delve into this matter on the Internet, most people seem to be in favor of one particular option.

But perhaps I should start making use of "it depends!" more often from now on. 😉

It looks like somebody needs an errata submission for this book. Table variables are not entirely memory resident. They do incur I/O cost and can be quite the bottleneck in your queries if working with large data sets. Wayne and Gail Shaw both did a great job of proving that with their research on the matter.
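You can see it for yourself with something along these lines (a rough sketch - the CHAR(8000) column just forces roughly one 8KB page per row so the allocation is easy to spot):

-- Fill a table variable with rows that each occupy a full data page
DECLARE @big TABLE (filler CHAR(8000) NOT NULL);

INSERT @big
SELECT TOP (1000) REPLICATE('x', 8000)
FROM sys.all_columns;

-- Pages this session has allocated in tempdb; table variables
-- should show up here as user objects
SELECT user_objects_alloc_page_count
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID;

If table variables were purely memory resident, that allocation count would stay at zero.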

Jason...AKA CirqueDeSQLeil

  • jeedee (3/28/2011)


    For example Microsoft SQL Server 2008 R2 Unleashed, under the "General T-SQL Performance Recommendations" section. There (page 1655) I read the following:

You should use table variables instead of temporary tables in stored procedures whenever possible or feasible. Table variables are memory resident and do not incur the I/O overhead and system table and I/O contention that can occur in tempdb with normal temporary tables. However, remember that table variables exist only for the duration of the SQL batch or stored procedure in which they are defined.

Of course, according to the above quote, it depends (just like Jeff Moden said) whether table variables have advantages over temporary tables in a given case.

However, since I'm reading these guidelines in a "General T-SQL Performance Recommendations" section, I would assume table variables are preferred in most cases.

Furthermore, when I delve into this matter on the Internet, most people seem to be in favor of one particular option.

But perhaps I should start making use of "it depends!" more often from now on. 😉

This is why I don't recommend any books to anyone. The quote you dug out is proof enough that even well known and generally trusted authors make what are, IMHO, some pretty glaring mistakes. Combine that with the glaring error caused by table variables in the code I posted a couple of posts above, and it brings two things to mind... one, most definitely, is "It Depends". The other is, "One test is worth a thousand expert opinions." 😉

Thanks for the feedback, jeedee.

    --Jeff Moden



  • Sergiy,

    If you're still out there somewhere, thanks again for the wonderful pearl you gave me years ago... it's saved my hiney more than once...

"A Developer must not guess... a Developer must KNOW!" :-)

    --Jeff Moden



  • Hi all

    I did some further investigation.

When I delete the "zeilentyp = 40" part of the select statement (see the --<<--- marker below), I get only about 1'100 reads instead of 110'000!

But this is only a constant and shouldn't impact performance like that, right?

    What might be the reason, that the query optimizer generates a completely different execution plan for those two nearly identical queries?

select internbelnum = b.internbelnum,
       zeilennbr = ve_tbreferenz.bemerknbr,
       anzeigenr = ve_entitylink.anzeigenr,
       textbaustein = ve_tbstamm.dokutext,
       --zeilentyp = 40, --<<---
       gruppierung = case when ve_entitylink.enttyp = 1060 then 2 else 1 end, -- text blocks with enttyp 1060 belong to the AZ lines
       refnum = ve_tbreferenz.refnum
from #belege b
inner join ve_entitylink
        on ve_entitylink.oberentid = b.internbelnum
       and ve_entitylink.oberenttyp = 20
       and ve_entitylink.mandid = @mandidin
       and ve_entitylink.enttyp in (60, 1060)
inner join ve_tbreferenz
        on ve_tbreferenz.bemerknbr = ve_entitylink.entid
       and ve_tbreferenz.enttyp = ve_entitylink.enttyp
       and isnull(ve_tbreferenz.intbemnum, 0) > 0
       and ve_tbreferenz.bemerkung is null
inner join ve_textbausteine
        on ve_textbausteine.textbausteinnum = ve_tbreferenz.intbemnum
       and ve_textbausteine.mandid = @mandidin
inner join sprachauswahl
        on sprachauswahl.base = 1
inner join ve_tbstamm
        on ve_tbstamm.textbausteinnum = ve_textbausteine.textbausteinnum
       and ve_tbstamm.mandid = @mandidin
       and ve_tbstamm.lcid = sprachauswahl.fallback
where ve_tbreferenz.bemerknbr not in (select z.zeilennbr from #zeilen z where z.zeilentyp = 40)

  • Hi all

I was able to dramatically improve the procedure, but I had to rewrite the whole section. I didn't want to do this (since it's not an application we wrote), but the vendor wasn't able to come up with a solution...

The original procedure had the above select statement three times (once for the invoice language, once for the fallback language, once for the base language).

Now I combined those 3 selects into a single one, using ISNULL() to fall back onto the higher language level.
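Schematically, the combined version looks like this (heavily simplified - @invoicelcid, @fallbacklcid and @baselcid are just stand-ins for how the procedure really resolves the three language levels):

select textbaustein = isnull(inv.dokutext, isnull(fb.dokutext, bas.dokutext))
from ve_tbreferenz r
left join ve_tbstamm inv on inv.textbausteinnum = r.intbemnum and inv.mandid = @mandidin and inv.lcid = @invoicelcid
left join ve_tbstamm fb  on fb.textbausteinnum  = r.intbemnum and fb.mandid  = @mandidin and fb.lcid  = @fallbacklcid
left join ve_tbstamm bas on bas.textbausteinnum = r.intbemnum and bas.mandid = @mandidin and bas.lcid = @baselcid

So instead of three separate passes over the big tables, there is now only one.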

    Problem solved.

By the way:

If I used a large amount of data with the above query, there was no difference between the temp table and the table variable! The table variable version was only faster with a small set of data.

I guess that's because the table variable spilled to disk with the large data sets...

Greetings

    Reto E.

  • jeedee (3/28/2011)


May I ask why you prefer temporary tables over table variables?

Almost all the books I have read about SQL Server state that it's better to use table variables than temporary tables.

http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

    Well, since you referenced my article, I'd like to point out that it does not say that it's better to use one over the other. (It does say that MS recommends table variables.) I do state that you need to evaluate the differences between the two, and to do testing.

    There are two reasons why I rarely use table variables:

1. Lack of statistics, etc. makes SQL Server create an execution plan with the table variable having just 1 row. If you happen to be on SQL 2008 SP2 and use the OPTION (RECOMPILE) query hint, then it can see just how many rows are in the table... but if you have a where clause on it, it then guesses a row quantity. That guess appears to be about 35% of the number of rows in the table variable - and can again cause an inappropriate execution plan to be generated (see the sketch after this list).

    2. Debugging - it's easier to fill a temp table once, and then just run selects off of it in SSMS. For table variables, you have to declare & populate it first every time you want to run a query against it.
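Here's a quick sketch of what I mean in point 1 (the table and numbers are made up purely for illustration):

DECLARE @t TABLE (id INT NOT NULL);

INSERT @t
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns;

-- Without the hint, the plan estimates 1 row coming out of @t
SELECT * FROM @t WHERE id > 500;

-- With the hint (on 2008 SP2), the optimizer sees 1000 rows,
-- then guesses roughly 35% of them for the where clause
SELECT * FROM @t WHERE id > 500 OPTION (RECOMPILE);

Compare the estimated row counts in the actual execution plans of the two selects.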

    In short, MS says that it's better to use table variables over temp tables. Most people don't agree with this blanket statement. There are benefits to tables variables, but there are also plenty of valid reasons to avoid them.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • CirquedeSQLeil (3/28/2011)


It looks like somebody needs an errata submission for this book. Table variables are not entirely memory resident. They do incur I/O cost and can be quite the bottleneck in your queries if working with large data sets. Wayne and Gail Shaw both did a great job of proving that with their research on the matter.

Actually, Gail did an EXCELLENT job of proving this with her research - and she has graciously allowed me to use that in my presentations. Please visit her blog at http://sqlinthewild.co.za/index.php/2010/10/12/a-trio-of-table-variables/ for proof of the data in table variables being written to the data file.

Wayne
