Smalldatetme (Query with Where, Cast, DATE vs Others)

  • I have a Smalldatetime field (logdate) in a 2008 R2 Table.

    If I use any of the Where statements below in a basic query 55 Rows are returned.

    Select *

    From Table.

    --where logdate = '20150911' -- 55 Rows (logdate declared as smalldatetime field in db)

    --where convert(smalldatetime, logdate) ='20150911' -- 55 Rows

    --where cast(logdate as smalldatetime) ='20150911' -- 55 Rows

    --where convert(datetime, logdate) ='20150911' -- 55 Rows

    --where convert(datetimeoffset, logdate) ='20150911' -- 55 Rows

    However if I cast or convert to DATE 429 Rows are returned.

    where cast(logdate as date) ='20150911' -- 429 Rows

    --where convert(date, logdate) ='20150911' -- 429 Rows

    If I do NOT cast as DATE only records with time stamp of 00.00.00 (2015-09-11 00:00:00) are returned for that day.

    If I cast or convert to DATE all records are returned for the date regardless of the time stamp (e.g 2015-09-11 17:01:00)

    Is this normal ?

    Apologies I'm fairly new to SQL.

    Thanks,

    Neil

  • 2 things jump out.

    1. Why do you need to use cast or convert if the data type is already smalldatetime?

    2. I (and this is just me speaking) never use a single equality comparison when working with dates. I always use a construct along these lines:

    declare @beg_date smalldatetime = '2015-09-01'

    declare @end_date smalldatetime = '2015-09-01'

    Select * from whateverTable

    where TableDate >= @beg_date

    and TableDate < Dateadd(dd, @end_date, 1)

    This eliminates the time component from any comparison regardless of the temporal data type

    __________________________________________________________________________________________________________
    How to Post to get the most: http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • neil.michael (9/15/2015)


    Is this normal ?

    Yup. Dunno if it is a good analogy, but consider that you have a floating point number column with values 1.0, 1.1, 1.2 ... and you say

    WHERE MyFloatColumn = 1

    you are only going to get the rows where MyFloatColumn matches 1.0

    TIME is like a fraction of the day.

    where cast(logdate as date) ='20150911' -- 429 Rows

    sadly this is very inefficient because it is not SARGable (well it might be, but a more complex expression/function used on a column in a Where clause or Join condition won't be).

    If you say

    WHERE logdate = @SomeValue

    or

    WHERE logdate >= @SomeValue AND logdate < @SomeOtherValue

    (i.e. a range) then SQL can use any index on [logdate] very effectively.

    But if you involve [logdate] in a funciton then SQL will scan the whole table looking for matches instead.

    So basically you want

    WHERE logdate >= @MyStartDate

    AND logdate < @LastDateToIncludePlusOneDay

    Now then 🙂 if instead you have a DATE, rather than a DATETIME, datatype column you can definitely say

    WHERE logdate = @MyReportDay

    and for a range you can say

    WHERE logdate >= @MyStartDate

    AND logdate <= @LastDateToInclude

  • Yes, it's normal. Indeed, it's the only way to get consistently accurate results.

    In general terms, if you're comparing a value including a time, you need the time to get an accurate comparison. For example, if you tell someone to meet you on Sep 11 at 9:00AM, you don't expect them to pick any random time during the day to meet you. But if you tell someone just to meet you on Sep 11, by default it's the same as telling them any time, so all times should match.

    In SQL Server terms, the important underlying concept is that when you compare different data types, SQL uses one of its rules, "data type precedence", to decide which type to convert to the other type to make them compatible for whatever operation is to be done.

    Some examples:

    1) Comparing smalldatetime to varchar. Since smalldatetime has a higher precedence than varchar, the varchar will first be converted to smalldatetime, then they will be compared as smalldatetime. Since the value includes a time, the time must match.

    2) Comparing date to smalldatetime. The date will be converted to smalldatetime, etc..

    3) UPDATE table SET smalldatetime_column = smalldatetime_column + 'A'

    You correctly get an error, because you can't add "A" to a datetime. SQL detects this error as it tries to convert the 'A' to a smalldatetime.

    Now consider this expression:

    SELECT 4 + '1'

    Is the answer 5 or '41'? Since any integer value has a higher precedence than a varchar, the answer is 5.

    So what do you think happens when you do this:

    SELECT 4 + 'A'

    Right! You get an error, because SQL can't convert the 'A' to an int.

    Likewise, that explains why you get a (frustrating) error when you do this:

    PRINT 'Date run is: ' + GETDATE()

    Since varchar has lower precedence than a datetime, SQL tries to convert 'Date run is:' to datetime and CRASH! You yourself have to put GETDATE() into varchar so that SQL can concat two varchars, just like you really want to happen:

    PRINT 'Date run is: ' + CAST(GETDATE() AS varchar(30))

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Kristen-173977 (9/15/2015)


    where cast(logdate as date) ='20150911' -- 429 Rows

    sadly this is very inefficient because it is not SARGable (well it might be, but a more complex expression/function used on a column in a Where clause or Join condition won't be).

    Actually, it's SARGable. This doesn't mean that all conversions will be SARGable, but converting from a datetime or smalldatetime to date maintains the condition SARGable.

    Here's a small test to prove it.

    IF OBJECT_ID('tempdb..#SomeTestTable','U') IS NOT NULL

    DROP TABLE #SomeTestTable;

    DECLARE @StartDate SMALLDATETIME = '20150101';

    SELECT TOP (1000000)

    SomeId = IDENTITY(int,1,1),

    SomeValue = RAND(CHECKSUM(NEWID())) * 8546 ,

    logdate = RAND(CHECKSUM(NEWID())) * 365 + @StartDate

    INTO #SomeTestTable

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2;

    CREATE CLUSTERED INDEX CI_Test ON #SomeTestTable(SomeId);

    CREATE NONCLUSTERED INDEX IX_Test ON #SomeTestTable(logdate);

    -- This query is still SARGable

    SELECT logdate

    FROM #SomeTestTable

    WHERE CAST(logdate AS date) = '20150911';

    GO

    -- DROP TABLE #SomeTestTable;

    Be careful when using functions in your WHERE clauses, avoid them when possible and defend them when necessary or useful. If you can't give a convincing argument to use them, simply don't use them.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (9/16/2015)


    Here's a small test to prove it.

    Interesting, thanks. I tried it without the CAST and got a query plan that looks like less work - its still using INDEX SEEK, but does the lack of WHERE / CONVERT / etc. mean that there is any, tangible, CPU saving?

    |--Nested Loops(Inner Join, OUTER REFERENCES: ([Expr1006], [Expr1007], [Expr1005]))

    |--Compute Scalar(DEFINE: (([Expr1006],[Expr1007],[Expr1005])

    =GetRangeThroughConvert('2015-09-11','2015-09-11',(62))))

    | |--Constant Scan

    |--Index Seek(OBJECT: ([tempdb].[dbo].[#SomeTestTable]),

    SEEK: ([tempdb].[dbo].[#SomeTestTable].[logdate] > [Expr1006]

    AND [tempdb].[dbo].[#SomeTestTable].[logdate] < [Expr1007]),

    WHERE: (CONVERT(date,[tempdb].[dbo].[#SomeTestTable].[logdate],0)='2015-09-11')

    ORDERED FORWARD)

    WHERE logdate >= '20150911' AND logdate < '20150912'

    gave me this:

    |--Index Seek(OBJECT: ([tempdb].[dbo].[#SomeTestTable]),

    SEEK: ([tempdb].[dbo].[#SomeTestTable].[logdate] >= '2015-09-11 00:00:00.000'

    AND [tempdb].[dbo].[#SomeTestTable].[logdate] < '2015-09-12 00:00:00.000')

    ORDERED FORWARD)

  • The statistics are about the same for both cases. I was actually testing it quite deeply last week as i wasn't sure if CAST(logdate AS date) = '20150911' is a SARG.

  • Kristen-173977 (9/16/2015)


    Luis Cazares (9/16/2015)


    Here's a small test to prove it.

    Interesting, thanks. I tried it without the CAST and got a query plan that looks like less work - its still using INDEX SEEK, but does the lack of WHERE / CONVERT / etc. mean that there is any, tangible, CPU saving?

    There's a small CPU saving, but not tangible.

    I do agree that explicitly defining the range is a better option to write queries.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks so much to all for the very informative and helpful replies.

    where logdate >= '20150911' and logdate < '20150912'

    works well for me.

  • Luis Cazares (9/16/2015)


    There's a small CPU saving, but not tangible.

    Thanks, as I assumed. I was pleasantly surprised how the plan turned out 🙂

    I'm now going to have to spend some time fiddling around seeing how other similar situations actually plan ... I might be spending far too much time trying to avoid assumed inefficient plans when in fact I would get a perfectly decent plan without my hard work!

    Bit like someone pointing out to me the other day that I don't need to use RTrim() in

    NullIf(RTrim(@SomeParameter), '')

    I must have wasted hours - days? - typing RTrim() in statement like that over the years!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply