July 27, 2009 at 5:01 pm
I need help identifying a workaround for an optimizer problem I am having.
Basically, I have a table with dates stored in a varchar field. (Yes I know it is bad. I didn't design it and fixing it breaks the software).
Here is the query
select datediff( m, CHARDATE, '20090630' ), *
from dbo.ComputeScalarProblem
where isdate( CHARDATE ) = 1 and datediff( m, CHARDATE, '20090630' ) < 6[/code]
The problem I am seeing is the scalar datediff is being calculated before the isdate function gets applied.
Here is test data to try with.
[code="sql"]drop table ComputeScalarProblem
create table ComputeScalarProblem(
CHARDATE varchar( 8 )
)
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080101' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080201' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080301' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080401' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080501' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080601' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080701' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080801' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20080901' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20081001' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20081101' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20081201' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20090101' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20090201' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20090301' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20090401' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20090501' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '20090601' )
insert into ComputeScalarProblem ( CHARDATE ) values ( '00000000' )
select datediff( m, CHARDATE, '20090630' ), *
from dbo.ComputeScalarProblem
where isdate( CHARDATE ) = 1 and datediff( m, CHARDATE, '20090630' ) < 6
[/code]
I know I can do a subselect but that is rather ugly for a straightforward query.
Any ideas on a better approach
July 27, 2009 at 7:27 pm
Well, you could use a CASE statement or a UDF, but you would need to use it in both your SELECT clause and the WHERE clause. Otherwise you're going to have to use a CTE or a subquery. I think that the CTE is slightly cleaner. Here is what I came up with.
WITH DateConvert AS (
SELECT CASE isdate(CharDate)
WHEN 1 THEN DateDiff(m,CHARDATE, '20090630' )
ELSE Null
END AS DateDifference
, *
FROM dbo.ComputeScalarProblem
)
SELECT *
FROM DateConvert
WHERE DateDifference < 6
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
July 28, 2009 at 5:54 am
First convert the chardate into date format then you can apply all date functions you want.
convert(datetime,chrdate,103)
July 28, 2009 at 8:21 am
Thanks Drew, that is exactly what I was hoping to learn.
I haven't used CTE's before, but they look very interesting.
July 29, 2009 at 8:03 am
Personally, I like the Case method better. The execution plan is the same as for the CTE, but to me it is easier to see what is going on.
select CASE isDate(CHARDATE) WHEN 1 THEN datediff( m, CHARDATE, '20090630' ) END, *
from dbo.ComputeScalarProblem
where CASE isDate(CHARDATE) WHEN 1 THEN datediff( m, CHARDATE, '20090630' ) END < 6
July 29, 2009 at 9:43 am
shamassaeedmr (7/28/2009)
First convert the chardate into date format then you can apply all date functions you want.convert(datetime,chrdate,103)
What happens to your query when chrdate isn't actually convertable to a datetime datatype??
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
July 29, 2009 at 10:33 am
This works OK for me.
select
datediff( m, CHARDATE, '20090630' ), *
from
dbo.ComputeScalarProblem
where
case
when isnull(isdate( CHARDATE ),0) 1
then 0
when datediff( m, CHARDATE, '20090630' ) >= 6
then 0
else 1
end = 1
Results:
CHARDATE
----------- --------
5 20090101
4 20090201
3 20090301
2 20090401
1 20090501
0 20090601
(6 row(s) affected)
July 30, 2009 at 9:26 am
I think that last case statement is the best solution so far.
Thanks everyone for your help.
In the end, I am really interested in why the Optimizer thinks applying a scalar function to every row in the table will be faster then applying it after the other filter(s) are applied? You would think that calculating the diff for 6 rows would be faster then calculating the diff for 12 rows.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply