Table Variable Issue

  • The issue I’m seeing is that when I join a table variable to a persistent table, the query plan is scanning my persistent table instead of using the non-clustered index. I discovered that if I run the exact same code, except using a temp table instead of a table variable, the query plan uses the non-clustered index on my persistent table instead of scanning the table. I’ve also noticed that a DISTINCT TOP X in the SELECT statement seems to be contributing to the issue.

    I’ve provided the T-SQL code necessary to reproduce the issue. If anyone has any input, I’d be delighted to hear an explanation.

    Thanks,

    Adam

    --Create our test table

    CREATE TABLE [dbo].[myTest](

    [rowid] [int] IDENTITY(1,1) NOT NULL,

    [mydata] [varchar](100) NULL,

    CONSTRAINT [PK_myTest] PRIMARY KEY CLUSTERED

    ([rowid] ASC

    ))

    --Create our nonclustered index

    CREATE NONCLUSTERED INDEX [IX_mydata] ON [dbo].[myTest]

    ([mydata] ASC)

    --Populate our table with some data (this should take about a minute)

    declare @total int, @current int

    set @current = 1

    set @total = 100000

    while @total >= @current

    begin

    insert into myTest (mydata) values (cast(@current as varchar(10)) + ' row data')

    set @current = @current + 1

    end

    --Join our test table to a temp table. This uses the nonclustered index.

    create table #temp (mydata varchar(100) NULL)

    insert into #temp (mydata)values ('54 row%')

    insert into #temp (mydata)values ('9510%')

    select distinct top 100 m.rowid, m.mydata--This will do an index seek using my nonclustered index

    --select m.rowid, m.mydata--This will do an index seek using my nonclustered index

    from myTest m

    inner join #temp i on m.mydata like i.mydata

    drop table #temp

    --Now join our test table to a table variable. The SELECT DISTINCT TOP 100 does not use the index. Instead, it will scan the table.

    declare @temp table(mydata varchar(100) NULL)

    insert into @temp (mydata)values ('54 row%')

    insert into @temp (mydata)values ('9510%')

    select distinct top 100 m.rowid, m.mydata--**This will table scan**

    --select m.rowid, m.mydata--This will do an index seek using my nonclustered index

    from myTest m

    inner join @temp i on m.mydata like i.mydata

  • One of the differences between table variables and temp tables is that no statistics are kept on table variables, while statistics are kept on temp tables. In the absence of such statistics, the optimizer may e choosing a more conservative (slower) execution plan.

    You could always try giving it a hint. Or alternatively, you might try making the column of your temp table a primary key.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (2/4/2010)


    One of the differences between table variables and temp tables is that no statistics are kept on table variables, while statistics are kept on temp tables. In the absence of such statistics, the optimizer may e choosing a more conservative (slower) execution plan.

    You could always try giving it a hint. Or alternatively, you might try making the column of your temp table a primary key.

    I agree with Bob.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • On our actual code (not my test code) we chose to use a hint to force the query to use the index and that works. But I just don't understand what's going on.

    I get that SQL keeps statistics on temp tables, but no on table variables. But I'm not clear how statistics on my temp table (or lack thereof on my table variable) is going to help SQL decide whether or not it should scan my persistent table or use the NC index.

    Also, I'm failing to understand why the DISTINCT TOP X makes a difference when I'm using a table variable.

    Thanks for the quick replies, I appreciate the feedback.

    Thanks,

    Adam

  • Oh yeah, and I forgot to mention that if you change the LIKE to "=" then SQL will use the index. So if you change this statement that does a table scan:

    select distinct top 100 m.rowid, m.mydata

    from myTest m

    inner join @temp i on m.mydata like i.mydata

    To this statement, it will use the nonclustered index:

    select distinct top 100 m.rowid, m.mydata

    from myTest m

    inner join @temp i on m.mydata = i.mydata

  • I get that SQL keeps statistics on temp tables, but no on table variables. But I'm not clear how statistics on my temp table (or lack thereof on my table variable) is going to help SQL decide whether or not it should scan my persistent table or use the NC index.

    Well, since SQL doesn't keep statistics on table variables, the optimizer assumes that the table variable only has one row, and makes a plan accordingly. Which is to just scan the table for the one row, instead of going to the index. Then when it hits 100/1000 rows, the plan is no longer the correct plan.

    (I've seen temp tables outperform table variables with as little as 36 rows.)

    You might be interested in this article: Comparing Table Variables to Temporary Tables[/url]

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • But the issue isn't in scanning the table variable, or the temp table. The issue is that SQL is scanning my static table that has 100,000 rows (or in the actual PROD environment, it has over 5 million). When we use a table variable, SQL does a table scan on my static/physical table.

    My table variable only has 2 rows. Scanning it is fine. Scanning my physical table with 5 million rows is the problem.

    Thanks,

    Adam

  • Post the execution plans?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I've attached the SQL plans, named accordingly. Please let me know if there's anything else I can post that would be helpful.

    Thanks,

    Adam

  • The estimated row count on your actual physical table is wrong. SQL thinks that the physical table contain only 1000 rows and hence that a scan of the physical table isn't so bad. The row estimate is wrong in both, so it's not just due to the table variable.

    Try a stats update or index rebuild. See if it fixes that inaccuracy.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I rebuilt the clustered index, and attached the new plans. I also attached a plan for the table variable after removing the "DISTINCT TOP 100" from the SELECT statement. This is interesting to me. Notice the plan with the DISTINCT in there; it's estimating 1 row for our table variable, but the actual number of rows is 199,999. My static table has 100,000 rows, and my table variable has 2 rows. So it almost seems like the plan is doing some sort of cartesian join??

    Notice the plan for the table variable where I've removed the DISTINCT TOP 100. In this plan, the actual number of rows returned from the table variable is 2. And, the plan is doing a nonclustered index seek for my static table.

    -Adam

  • None of the three execution plans attached to your last post use the temp table. All three use the table variable. The first and third are identical queries, yet show different execution plans. What changed between those two? Was the table variable not indexed until the last query?

    --

    -------------------------------------------------------------------------------------------------------------

    -- temp_table2 execution plan

    -------------------------------------------------------------------------------------------------------------

    select m.rowid, m.mydata--This will do an index seek using my nonclustered index

    from myTest m

    inner join @temp i on m.mydata LIKE i.mydata

    -------------------------------------------------------------------------------------------------------------

    -- table_variable2 execution plan

    -------------------------------------------------------------------------------------------------------------

    select distinct top 100 m.rowid, m.mydata--**This will table scan**

    --select m.rowid, m.mydata--This will do an index seek using my nonclustered index

    from myTest m

    inner join @temp i on m.mydata LIKE i.mydata

    -------------------------------------------------------------------------------------------------------------

    -- table_variable2_NoDISTINCT execution plan

    -------------------------------------------------------------------------------------------------------------

    select m.rowid, m.mydata--This will do an index seek using my nonclustered index

    from myTest m

    inner join @temp i on m.mydata LIKE i.mydata

    --

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • amoericke (2/5/2010)


    On our actual code (not my test code) we chose to use a hint to force the query to use the index and that works. But I just don't understand what's going on.

    I get that SQL keeps statistics on temp tables, but no on table variables. But I'm not clear how statistics on my temp table (or lack thereof on my table variable) is going to help SQL decide whether or not it should scan my persistent table or use the NC index.

    Also, I'm failing to understand why the DISTINCT TOP X makes a difference when I'm using a table variable.

    Thanks for the quick replies, I appreciate the feedback.

    Thanks,

    Adam

    My question would be... if the Temp Table does the trick so much better, why are you good folks hell bent on using Table Variables? Unless you need the ability to do a rollback without rolling back a temporary structure, there's not much of a need for table variables in this situation.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You got me, Jeff. Me and table variables were just a summer thing, y'know. That said, any time someone has the ah-hah! moment that execution plans are somehow important, I think they should be encouraged. Speaking of which...

    Amoericke: Grant Frichey has written an excellent book on execution plans. You should order it if you really want to get into this. Doing your homework will pay you great dividends in years to come, because we aren't going to discuss all the variations online. We just don't have the time.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • If you're still curious, I uploaded the correct plan for the temp table (sorry for getting the wrong plan uploaded earlier).

    I'm not hell bent on using a table variable, I was just posting a question to see if anyone could explain why my static table would be scanned when joined to a table variable vs. using an index when joined to a temp table.

    I appreciate all the feedback.

    -Adam

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply