Select 1 day of data

  • TABLE1 - contains 5 column and one column is

    DATECOL DATATYPE smalldatetime

    Total number of rows 400K+

    Q1.

    SELECT * FROM Table1 WHER datecol >= '2009-10-01' and datecol < '2009-10-02'

    Q2.

    SELECT * FROM Table1 WHER datecol BETWEEN '2009-10-01 00:00:00' and '2009-10-01 23:59:59'

    which query is better for selecting one day of data?

  • Q1 is better, since it actually includes the whole day.

  • [Edit] Previously suggested a cast method for datecol that as Michael pointed out could prevent index seeks. Removing the suggestion so as not to steer people who don't read the whole thing in the wrong direction. Go with his.

    [/Edit]

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Garadin (10/12/2009)


    Neither, use this:

    SELECT * FROM Table1 WHER CAST(CAST(datecol-.5 as int) as datetime) =

    CAST('2009-10-01' as datetime)

    The dateadd/datediff method that someone will likely post shortly is almost identical, I just find this one easier to remember and from the results I've seen, the performance difference is mostly academic.

    That is not a good method, since it means that datecol will be cast it an int and back to datetime for every row, and the query cannot use an index because of that.

  • Michael Valentine Jones (10/12/2009)


    That is not a good method, since it means that datecol will be cast it an int and back to datetime for every row, and the query cannot use an index because of that.

    That's not what I see in my tests. That said, the method is matching up exactly with the original supplied methods, which I could have sworn was not the case. I thought getting rid of the range significantly increased performance. Could have been mistaken there.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Garadin (10/12/2009)

    The dateadd/datediff method that someone will likely post shortly......

    Just to honour it 🙂

    SELECT dateadd(dd,datediff(dd,0,getdate()),0)

    What Mike is saying could apply in this case also since its using function on a column!

    ---------------------------------------------------------------------------------

  • Garadin (10/12/2009)


    Michael Valentine Jones (10/12/2009)


    That is not a good method, since it means that datecol will be cast it an int and back to datetime for every row, and the query cannot use an index because of that.

    That's not what I see in my tests. That said, the method is matching up exactly with the original supplied methods, which I could have sworn was not the case. I thought getting rid of the range significantly increased performance. Could have been mistaken there.

    I just tested both methods against a large table (44,000,000 rows) with a datetime column that had an index. See sample queries below.

    The first did an index seek and took 0.017 seconds to complete. The second did an index scan and took 14.987 seconds to complete.

    As I said before, converting the datetime column to an int and back to a datetime prevents an index lookup.

    select

    MyDateCol

    from

    MyTable

    where

    MyDateCol >= '2009-10-01' and

    MyDateCol < '2009-10-02'

    select

    MyDateCol

    from

    MyTable

    where

    CAST(CAST(MyDateCol-.5 as int) as datetime) = CAST('2009-10-01' as datetime)

  • Michael Valentine Jones (10/12/2009)


    I just tested both methods against a large table (44,000,000 rows) with a datetime column that had an index. See sample queries below.

    The first did an index seek and took 0.017 seconds to complete. The second did an index scan and took 14.987 seconds to complete.

    As I said before, converting the datetime column to an int and back to a datetime prevents an index lookup.

    Hrm, I stand corrected. I may have gotten this mixed up at some point. (I was getting index scans on both methods in mine). I know you don't normally want to do conversions / use functions like this in a WHERE clause, but for some reason it was stuck in my head that date ranges were an exception to this rule. Thanks for the clarification Michael.

    [Edit] Just Re-tested on a larger table and my results match yours. [/Edit]

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Garadin (10/12/2009)


    [Edit] Previously suggested a cast method for datecol that as Michael pointed out could prevent index usage. Removing the suggestion so as not to steer people who don't read the whole thing in the wrong direction. Go with his.

    [/Edit]

    Are the tests you've run postable? And keep in mind that an Index SCAN isn't what most people think of when they talk about "index usage".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/12/2009)


    Garadin (10/12/2009)


    [Edit] Previously suggested a cast method for datecol that as Michael pointed out could prevent index usage. Removing the suggestion so as not to steer people who don't read the whole thing in the wrong direction. Go with his.

    [/Edit]

    Are the tests you've run postable? And keep in mind that an Index SCAN isn't what most people think of when they talk about "index usage".

    Unfortunately not, I just tested on a couple live tables that were 6M or so rows. I haven't completely given up on this one, so if you think it's worth exploring, I'll look into it more. It does allow index scans, I just haven't seen it use a seek in any tests today.

    Not sure why I can't shake the feeling that this one circumvented the rule tho.

    And yeah, I'll clarify the seeks, not usage.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Thanks for the offer, Seth, but in light of the other posts, there's no need unless you can remember the exception you're thinking of. It also seems to me that there was an exception somewhere along the line but I can't remember what it may have been.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply