isdate priority

  • I need help identifying a workaround for an optimizer problem I am having.

    Basically, I have a table with dates stored in a varchar field. (Yes I know it is bad. I didn't design it and fixing it breaks the software).

    Here is the query

    select datediff( m, CHARDATE, '20090630' ), *

    from dbo.ComputeScalarProblem

    where isdate( CHARDATE ) = 1 and datediff( m, CHARDATE, '20090630' ) < 6[/code] The problem I am seeing is the scalar datediff is being calculated before the isdate function gets applied. Here is test data to try with.
    [code="sql"]drop table ComputeScalarProblem

    create table ComputeScalarProblem(
    CHARDATE varchar( 8 )
    )

    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080101' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080201' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080301' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080401' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080501' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080601' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080701' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080801' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20080901' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20081001' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20081101' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20081201' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20090101' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20090201' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20090301' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20090401' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20090501' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '20090601' )
    insert into ComputeScalarProblem ( CHARDATE ) values ( '00000000' )

    select datediff( m, CHARDATE, '20090630' ), *
    from dbo.ComputeScalarProblem
    where isdate( CHARDATE ) = 1 and datediff( m, CHARDATE, '20090630' ) < 6
    [/code]

    I know I can do a subselect but that is rather ugly for a straightforward query.
    Any ideas on a better approach

  • Well, you could use a CASE statement or a UDF, but you would need to use it in both your SELECT clause and the WHERE clause. Otherwise you're going to have to use a CTE or a subquery. I think that the CTE is slightly cleaner. Here is what I came up with.

    WITH DateConvert AS (

    SELECT CASE isdate(CharDate)

    WHEN 1 THEN DateDiff(m,CHARDATE, '20090630' )

    ELSE Null

    END AS DateDifference

    , *

    FROM dbo.ComputeScalarProblem

    )

    SELECT *

    FROM DateConvert

    WHERE DateDifference < 6

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • First convert the chardate into date format then you can apply all date functions you want.

    convert(datetime,chrdate,103)

  • Thanks Drew, that is exactly what I was hoping to learn.

    I haven't used CTE's before, but they look very interesting.

  • Personally, I like the Case method better. The execution plan is the same as for the CTE, but to me it is easier to see what is going on.

    select CASE isDate(CHARDATE) WHEN 1 THEN datediff( m, CHARDATE, '20090630' ) END, *

    from dbo.ComputeScalarProblem

    where CASE isDate(CHARDATE) WHEN 1 THEN datediff( m, CHARDATE, '20090630' ) END < 6

  • shamassaeedmr (7/28/2009)


    First convert the chardate into date format then you can apply all date functions you want.

    convert(datetime,chrdate,103)

    What happens to your query when chrdate isn't actually convertable to a datetime datatype??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • This works OK for me.

    select

    datediff( m, CHARDATE, '20090630' ), *

    from

    dbo.ComputeScalarProblem

    where

    case

    when isnull(isdate( CHARDATE ),0) 1

    then 0

    when datediff( m, CHARDATE, '20090630' ) >= 6

    then 0

    else 1

    end = 1

    Results:

    CHARDATE

    ----------- --------

    5 20090101

    4 20090201

    3 20090301

    2 20090401

    1 20090501

    0 20090601

    (6 row(s) affected)

  • I think that last case statement is the best solution so far.

    Thanks everyone for your help.

    In the end, I am really interested in why the Optimizer thinks applying a scalar function to every row in the table will be faster then applying it after the other filter(s) are applied? You would think that calculating the diff for 6 rows would be faster then calculating the diff for 12 rows.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply