The Cost of Function Use In A Where Clause

  • Thanks for the Response...

    Again, I'd have to see the exact DB to be able to reproduce the results.

    All of our tables have clustered indexes and stats are updated.

    I can pretty much reproduce our behavior 100% of the time.

    Interesting behavior. I suppose the moral to the story is always check the query plan to make sure you're gettig the index relief you expect.

    GAJ

    Gregory A Jackson MBA, CSM

  • Nick Rawlins (2/28/2008)


    Good artible, however a full table scan isn't being performed, but an Index Scan is which is different to a table scan...

    There really isnt much of a difference between an index scan and a table scan on a non-clustered index, and absolutely no difference on a clustered index scan vs. a table scan, cept the name. You will never see "table scan" on a table with a clustered index, not because it doesnt happen, but because the Clustered Index contains the data at the leaf level of the B-Tree, so you are scanning effectively...the table.

  • jeremy.giaco (2/28/2008)


    I liked the article, but I don't seem to have the problem with my queries that you are describing when using functions like IsNull in the where clause.

    Im running SQL Server 2005. I created a Table to test this with, [Table1] with a non-clustered index on [Column1] (no included columns on that index). It is important to say that my table has a clustered index on it, but not on Column1. The following queries both have identical execution plans.

    Select *

    From Table1

    Where IsNull(Column1,'') = ''

    Select *

    From Table1

    Where Column1 = '' or Column1 is null

    ...and both utilize my index on Column1.

    That being MY case (though my case may be one of the "not always as expected cases")....concerning the post about the Created_Date index. Is it possible that your table does not have a clustered index, thus a HEAP, or that your statistics are out of date? You may not want to rule anything out until you've tried it.

    Yes, they both "utilize" the index... but how? I think you'll find that utilization is limited to an Index SCAN and that you cannot get an Index SEEK out of the ISNULL(), ever.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jeremy.giaco (2/28/2008)


    Nick Rawlins (2/28/2008)


    Good artible, however a full table scan isn't being performed, but an Index Scan is which is different to a table scan...

    There really isnt much of a difference between an index scan and a table scan on a non-clustered index, and absolutely no difference on a clustered index scan vs. a table scan, cept the name. You will never see "table scan" on a table with a clustered index, not because it doesnt happen, but because the Clustered Index contains the data at the leaf level of the B-Tree, so you are scanning effectively...the table.

    If it's a covering non-clustered index, then an index scan is VERY different from a table scan....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • The most common function I use in a WHERE clause is a convert. Usually "WHERE Convert(char(10),ModifiedOn,101) = Convert(char(10),GetDate(),101)". Usually because I'm in the middle of a data update and want to look at all the records I just changed.

    BTW, I hate the BETWEEN keyword because it always misses some of my records or adds records in I didn't expect.

    I do like the suggestion someone else made with the >= and <= date comparisons with the DateAdd in the max comparison part.

    Interesting article. Not something I ever thought about, but now that I have, I'm going back over some of my most important code and seeing if I have anything that might affect me.

    EDIT: It occurs to me... What about the YEAR() and MONTH() functions? How do you deal when you want to do a WHERE Year(MyDate) = 2006 or something like that? I can't find a good alternative to using a function in situations like this.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Jeff Moden (2/28/2008)


    Yes, they both "utilize" the index... but how? I think you'll find that utilization is limited to an Index SCAN and that you cannot get an Index SEEK out of the ISNULL(), ever.

    I now completely agree to remove the IsNull() from WHERE clause after more testing, however I was able to get a seek with it...(this is kind of funny)...

    ...I neglected to set my Column1 to NULL. I'm assuming, (and glad to learn that) the optimizer is smart enough to realize that there cannot be a null value in my test table because of the NOT NULL, so it did not care about the IsNull() in my WHERE clause. Anyone in the post who did get an index seek with it probably took any ol' table they had layin around and ran a query against it using the IsNull(), so some of us got seeks, and some didnt, probably because of the mocked up nature of the queries we were writing.

    So, to summarize I did get a seek using IsNull in my test table, however it will probably never happen under normal circumstances, and I REALLY appreciate this article now. I feel enlightened.

    Thank you.

    -Jeremy

  • --My Test

    Set NoCount On

    Go

    Create Table #Testing(

    TestingId int identity not null,

    SomeColumn varchar(25) not null

    )

    Go

    Declare @x int

    Set @x = 1000

    While @x > 0

    Begin

    Insert Into #Testing (SomeColumn)

    Select 'John' Union

    Select 'James' Union

    Select 'Mary' Union

    Select 'Paul' --Union

    --Select Null

    Set @x = @x -1

    End

    Go

    Insert Into #Testing (SomeColumn)

    Values ('Jeremy')

    Go

    Create Nonclustered Index IX_SomeColumn on #Testing(SomeColumn)

    Create Clustered Index IX_TestingId on #Testing(TestingId)

    Go

    Select SomeColumn

    From #Testing

    Where IsNull(SomeColumn,'') = 'Jeremy'

    Go

    --Drop Table #Testing

    Go

  • Jeremy,

    Excellent work.

    it's important to prove assertions in articles like this.

    all too often, folks just read stuff and believe it as the gospel.

    Thanks for the kudos and for a great discussion thread.

    GAJ

    Gregory A Jackson MBA, CSM

  • It looks like a multikey index will be used ("index seek") if the first key in the index is referenced in the WHERE clause without a function and the second key in the index is referenced with a function. From what I can tell based on IO stats the index is used differently but is still a seek on a covering index but the IO is higher than it is with no function at all in the where clause. Here's a modified version of Jeremy's code. There's some superfluous stuff in there from me experimenting trying to replicate what I was seeing in my testing. The use of a function does not completely remove the possibility of index usage:

    Set NoCount On

    Go

    Create Table #Testing(

    TestingId int identity not null,

    SomeColumn int null

    )

    Go

    Declare @x int

    Set @x = 10000

    While @x > 0

    Begin

    Insert Into #Testing (SomeColumn)

    Select 1.1 Union

    Select 2.2 Union

    Select 3.3 Union

    Select 4.4 --Union

    --Select Null

    Set @x = @x -1

    End

    Go

    --Insert Into #Testing (SomeColumn)

    --Values ('Jeremy')

    Go

    --drop index #Testing.IX_SomeColumn

    Create clustered Index IX_SomeColumn on #Testing(SomeColumn,TestingId)

    --Create unique clustered Index IX_SomeColumn on #Testing(TestingId)

    --Create Clustered Index IX_TestingId on #Testing(TestingId)

    Go

    Select *

    From #Testing with (index(IX_SomeColumn))

    Where ceiling(TestingId) > 124 and ceiling(TestingId) < 500 and SomeColumn = 4

    --Where TestingId > 124 and TestingId < 500 and SomeColumn = 4

    option (recompile)

    Go

    --Drop Table #Testing

    Go

    select SomeColumn, count(*) from #Testing group by SomeColumn

  • I did not find the same results. The results for functions upper and substring did cause an index scan but the isnull function consistently returned results using index seek. The table had 8 indexes on it with various first key columns. I tried with various first key columns always the same results. Albeit it was a limited test, involving only this one table, warrants keeping an eye on it to see if I come up with isnull causing an index scan.

  • yes agreed.

    I would try to break the habit of using isNull() and other functions in your where clauses unless you are very certain you can consistently get the index Seeks.

    definately monitor this. if you see IO start to go UP or if you start to see the number of Full Scans\sec increase, then the data in your DB may have changed enough to modify the query plan and you may lose the Seek relief that you are now getting.

    the DB I tested this on for the Article is our production DB

    about 40GB in total size (not huge) but many of the tables in question have 100s of thousands or even millions of records so the impact was far from trivial.

    Cheers,

    GAJ

    Gregory A Jackson MBA, CSM

  • Update: I was reviewing some of the other posts and see that the column must be nullable, made sense re-ran the test on a nullable column happen to be a numeric column, the isnull function returned the results using the index seek but a using the convert function to char returned the results using the index scan. The index seek was used even when the numeric value was compared to a string.

  • Jeremy Giaco (2/28/2008)


    Jeff Moden (2/28/2008)


    Yes, they both "utilize" the index... but how? I think you'll find that utilization is limited to an Index SCAN and that you cannot get an Index SEEK out of the ISNULL(), ever.

    I now completely agree to remove the IsNull() from WHERE clause after more testing, however I was able to get a seek with it...(this is kind of funny)...

    ...I neglected to set my Column1 to NULL. I'm assuming, (and glad to learn that) the optimizer is smart enough to realize that there cannot be a null value in my test table because of the NOT NULL, so it did not care about the IsNull() in my WHERE clause. Anyone in the post who did get an index seek with it probably took any ol' table they had layin around and ran a query against it using the IsNull(), so some of us got seeks, and some didnt, probably because of the mocked up nature of the queries we were writing.

    So, to summarize I did get a seek using IsNull in my test table, however it will probably never happen under normal circumstances, and I REALLY appreciate this article now. I feel enlightened.

    Thank you.

    -Jeremy

    Wow... nice job, Jeremy... I ran your test and, by gum, there's an index SEEK both in SQL Server 2000 and 2005 ... I've never seen such a thing. Certainly opens up some "possibilites" and certainly seems to put the whole function thing into a "myth" category.

    You just know I'm going to do some more testing on this... like you said, I feel enlightened, but in a differenet way. I just love it when someone comes up with a real live "myth buster" of their own! Well done!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • if you look up sargeable criteria you will find that

    column name on left hand side

    any function(where humanly avoidable) on the right hand side

    basic rules of query writing , and while i applaud the author for writing this to enforce good practice, it always amazes me that people are still doing it the wrong way.

    MVDBA

  • select * from table_name where created_date>=convert(varchar(11), Created_Date, 106) and

    created_date<convert(varchar(11), dateadd(d,-1,Created_Date), 106)

    completly sargeable

    MVDBA

Viewing 15 posts - 46 through 60 (of 98 total)

You must be logged in to reply to this topic. Login to reply