Date string in WHERE causes table scan instead of index seek

  • Simple query on a table with > 10 million rows. Non-clustered index with TRANS_DATE (smalldatetime) field as the first column.

    This query uses the index and is lightning fast:

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL

    where TRANS_DATE between '2007-12-01 00:00:00' and '2007-12-01 23:59:59'

    This query uses a table scan and takes > 60 seconds:

    declare @ddate varchar(20)

    declare @dstartdate varchar(20)

    declare @denddate varchar(20)

    set @dDate = '2007-12-01'

    set @dStartDate = @dDate + ' 00:00:00'

    set @dEndDate = @dDate + ' 23:59:59'

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL

    where trans_date between @dStartDate and @dEndDate

    @dDate would actually be a string passed into the sp, above is just for illustrative purposes. Why is query optimizer not using the index when I add the time portion onto the date? I've tried CONVERT and CAST with the same result. Any suggestions are appreciated!

    Steve

  • It's not the concat that's the problem, it's the variable usage.

    If a query uses either parameters or hard coded values, the optimiser knows, at the point that it's compiling and optimising the query what the values are. It can use those values, along with the column statistics to get an estimate of the number of rows that the query will affect. Based on the number of rows, if can pick a good execution plan. If only a very small percentage of the table will be returned, it picks a index seek.

    If the query uses variables, the optimiser can't see the value of those variables at compile and optimise time. Since it can't see the values, it has no idea if the between is going to affect 1 row, or the entire table. The optimiser makes a guess that 1/3 of the table will be returned. Based on that guess it picks a plan. Since it doesn't know that only a few rows will be returned, it picks a plan optimal for around 3 million rows - table scan.

    The best solution would be to pass to the stored proc, the two dates that you're using in the query. If you can't change the proc, try doing the date manipulation in the where clause.

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL

    where trans_date >= @dDate and trans_date < DATEADD(dd,1,@dDate)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the input. Unfortunately, that didn't help either. Seems that manipulating the date at all is enough for SQL to ignore the index. I can't pass the actual dates to the sp because this is running in a scheduled DTS package with getdate as the reference. However, I finally found a solution by using an index hint. This works perfectly:

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL (INDEX=IX_TRAN_FULL)

    where TRANS_DATE >= @dStartDate and TRANS_DATE <= @dEndDate

  • If memory serves correctly, I believe that form of index hint is "deprecated". Try this, instead...

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL WITH (INDEX(IX_TRAN_FULL))

    where TRANS_DATE >= @dStartDate and TRANS_DATE <= @dEndDate

    Also, if your dates have a time other than midnight, then the best way to do this is...

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL WITH (INDEX(IX_TRAN_FULL))

    where TRANS_DATE >= @dStartDate and TRANS_DATE < @dEndDate + 1

    Better to do it that way most of the time because requirements on dates can easily change...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Steve Ervolino (1/4/2008)


    Thanks for the input. Unfortunately, that didn't help either. Seems that manipulating the date at all is enough for SQL to ignore the index.

    Hmmm. Sometimes I love parameter sniffing and sometimes I hate it.

    Ok, here's another possibility. Create a wrapper proc that the DTS calls. The wrapper proc recalculates the dates and calls a second proc that takes 2 parameters. something like

    CREATE PROCEDURE OuterProc @dt

    as

    declare @dstartdate datetime

    declare @denddate datetime

    set @dStartDate = dateadd(dd, datediff(dd,0, @dt),0)

    set @dEndDate = dateadd(dd, datediff(dd,0, @TheDate) + 1,0)

    EXEC InnerProc @dStartDate, @dEndDate

    GO

    CREATE PROCEDURE InnerProc @dStartDate, @dEndDate

    AS

    select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM

    from TRANS_J_DATA_FULL

    where trans_date >= @dStartDate and trans_date < @dEndDate

    GO

    The DTS package calls OuterProc. That should get you an optimal plan, without handicapping the query optimiser.

    The problem with index hints are twofold. Firstly, if that index is ever renamed or dropped, your procedure will break. Second, while you can force SQL to us a particular index, you cannot force it to seek on that index. It may decide to do an index scan. Also, if things change so that that index is no longer the optimal one to use, the query will still use the non-optimal index.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Steve,

    index on a column used in "range" selection ("BETWEEN", ">" or/and "<", TRANS_DATE in your case) must be clustered.

    Fix it and the SP will be always fast.

    _____________
    Code for TallyGenerator

  • I see another potential problem here...what is the datatype of your column in the table? You are declaring your variables as type varchar, but if your columns are a datetime, then the system has to convert behind the scenes. If your column datatypes are datetime, then try changing the datatypes of your variables to match. Perhaps the reason the first one works with a "varchar literal" is that behind the scenes, it makes them dates.

    Let me know if I'm off base.

    Hope this helps.

  • Sergiy (1/5/2008)


    Steve,

    index on a column used in "range" selection ("BETWEEN", ">" or/and "<", TRANS_DATE in your case) must be clustered.

    Fix it and the SP will be always fast.

    I don't agree with you there. A covering noncluster works as well (or better) than a cluster for a range selection. A non-covering nonclustered may or may not be used for a range, depending on how many rows qualify. Lookups to the cluster/heap are expensive

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/6/2008)


    Sergiy (1/5/2008)


    Steve,

    index on a column used in "range" selection ("BETWEEN", ">" or/and "<", TRANS_DATE in your case) must be clustered.

    Fix it and the SP will be always fast.

    I don't agree with you there. A covering noncluster works as well (or better) than a cluster for a range selection. A non-covering nonclustered may or may not be used for a range, depending on how many rows qualify. Lookups to the cluster/heap are expensive

    Apologies for dragging up an old thread, but I've just come across this myself.

    A simple query on a Date field and a Bit field didn't use the covering non-clustered index

    SELECT * FROM MyTable Where MyDateField < '2009-06-01' and MyBitField = 0 -- DOES A TABLE SCAN

    It did use the non-clustered index if I change the MyDateField condition to an "="

    SELECT * FROM MyTable Where MyDateField < '2009-06-01' and MyBitField = 0 -- USES THE INDEX

    Turns out the table didn't have any clustered indexes - would this affect whether the index was used or not ?

  • Joseph Fallon what order are the fields in your in covering index, this will make a difference.

    The other thing to look out for is that I'm pretty sure, but happy to be told otherwise, that because you have a select * , the optimizer might think it's cheaper to scan the table than to use the index and then have to do a Look-up afterwords...

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Christopher Stobbs (6/11/2009)


    Joseph Fallon what order are the fields in your in covering index, this will make a difference.

    It will, because one of the columns is an inequality.

    http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

    p.s. Please in future start a new thread for a new problem. Thanks

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail am I correcting in saying and I think I might have learn't or if I am wrong missunderstood, but ify ou have two columns in your where clause and 1 is an inequality the inequality should be on the right of the index list.

    So for example:

    WHERE DateField > '2009-01-01' AND BitField = 0

    Index should be: (BitField, DateField).

    I tested this earlier and it would do a seek with both columns in the predicate but it did depend on what my output was.

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Christopher Stobbs (6/11/2009)


    Gail am I correcting in saying and I think I might have learn't or if I am wrong missunderstood, but ify ou have two columns in your where clause and 1 is an inequality the inequality should be on the right of the index list.

    Yup.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Right ...

    When I don't use "SELECT * FROM " the query always uses the covering index. Thanks for the tip Mr Stobbs.

    When I DO use "SELECT * FROM " then the query will do either a table scan or an index scan depending on whether I use ">" or " '2009-06-01'

    and prealloc = 1[/i]

    Returns 50 records, uses index

    select * from [investing fund transactions] where allocation_date < '2009-06-01'

    and prealloc = 1

    Returns 3625 records, does not use index

    select fund_no from [investing fund transactions] where allocation_date < '2009-06-01'

    and prealloc = 1

    Always uses the index

  • Joseph Fallon (6/11/2009)


    When I don't use "SELECT * FROM " the query always uses the covering index. Thanks for the tip Mr Stobbs.

    Is the index really covering for SELECT *? Does it have every single column in the table somewhere in the index (key or include)?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply