July 11, 2014 at 7:01 pm
This is probably a silly question so don't put me in the worst questions posted thread (plz), however I have tried to google it with no luck. What is the performance difference between using Scalar Functions vs a Calculated Field and how are they processed?
The reason I ask is, I am trying to figure out if my method of using a Calculated field for determining someone's age is the same, better or worse than using a scalar function. I haven't had a chance to test with different row counts as I have started my new job and they are burring me with backlogged work all the way to January. A generalization is fine as I know "depends" is a real thing.
Thanks guys!
***SQL born on date Spring 2013:-)
July 11, 2014 at 7:50 pm
I am assuming that you are not talking about a persisted calculated column. That calculation, iirc, is done once and then stored. It isn't updated unless the column it is based on is updated. If I am wrong here, please, some one correct me.
An ordinary calculated column is calculated each time the row is accessed including the column itself. If the column is not accessed I don't believe the calculation is done, just seems to make logical sense.
This would be done on a row by row basis. Is this faster than a scalar function on the same column, don't know as I haven't tested it. Sounds like a something you might want to try and report back on, either in this thread or perhaps by writing an article about it.
If you are going to use a function in a query, I would steer away from the scalar function and look at using an inline table valued function written to return a one row table and use the cross apply in the from clause.
July 12, 2014 at 8:42 am
Lynn Pettis (7/11/2014)
I am assuming that you are not talking about a persisted calculated column. That calculation, iirc, is done once and then stored. It isn't updated unless the column it is based on is updated. If I am wrong here, please, some one correct me.An ordinary calculated column is calculated each time the row is accessed including the column itself. If the column is not accessed I don't believe the calculation is done, just seems to make logical sense.
This would be done on a row by row basis. Is this faster than a scalar function on the same column, don't know as I haven't tested it. Sounds like a something you might want to try and report back on, either in this thread or perhaps by writing an article about it.
If you are going to use a function in a query, I would steer away from the scalar function and look at using an inline table valued function written to return a one row table and use the cross apply in the from clause.
Thanks Lynn, and you are correct in your assumption I am not referring to a persisted computed column. I don't know very much about functions yet. So I guess maybe I should start researching it much more instead of bugging on here first. I'll look up the Table Valued Functions and see how to create and use those. As far as the article goes I don't have the smarts for something like that yet.
This was the example that sparked my curiosity on which one would be better used.
AGE = convert(int,DATEDIFF(d, 'DateofBirth', getdate())/365.25)
VS. this Scalar Function.
BEGIN
IF @BirthDate > @CurrentDate
RETURN -1
DECLARE @Age INT
SELECT @Age = DATEDIFF(YY, @BirthDate, @CurrentDate) -
CASE WHEN(
(MONTH(@BirthDate)*100 + DAY(@BirthDate)) >
(MONTH(@CurrentDate)*100 + DAY(@CurrentDate))
) THEN 1 ELSE 0 END
RETURN @Age
END
GO
***SQL born on date Spring 2013:-)
July 12, 2014 at 10:05 am
Thought you'd like to see some code that will help:
declare @BirthDate date;
set @BirthDate = '1959-07-29';
select datediff(year,@BirthDate,getdate()) - case when dateadd(year,datediff(year,@BirthDate,getdate()),@BirthDate) > getdate()
then 1
else 0
end;
set @BirthDate = '1959-07-06';
select datediff(year,@BirthDate,getdate()) - case when dateadd(year,datediff(year,@BirthDate,getdate()),@BirthDate) > getdate()
then 1
else 0
end;
go
-- The following is itvf function and the code that follows shows how it is used.
create function dbo.BirthAge (@BirthDate date)
returns table
with schemabinding
as return(
select datediff(year,@BirthDate,getdate()) - case when @BirthDate > dateadd(year,-datediff(year,@BirthDate,getdate()),getdate())
then 1
else 0
end Age);
go
create table #TestFunc(
Birthdate date
);
insert into #TestFunc
values ('1960-09-18'),('1966-05-23'),('1990-11-07'),('1994-05-13'),('1998-07-12');
select
tf.Birthdate,
ba.Age
from
#TestFunc tf
cross apply (select Age from dbo.BirthAge(tf.Birthdate))ba
drop table #TestFunc;
go
July 12, 2014 at 10:52 am
Lynn, if you haven't been told before "Your Amazin" 😀
***SQL born on date Spring 2013:-)
July 12, 2014 at 10:57 am
Thank you.
I did make a slight change to itvf function in my post above.
July 12, 2014 at 11:48 am
I made that into a permanent table and populated dates of birth all the way back to 01/01/1900 and then ran the Itvf against the computed column. The execution plans where identical below is th stats. This was against approx. 40,000 rows. I'm not sure exactly what it means though as I'm still learning this stuff.
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT *
,AGE = convert(int,DATEDIFF(d, t.Birthdate, getdate())/365.25)
FROM dbo.testfunc1 t
------------------Above is Calculated Field----------
------------Below IVT Function-----------------------
SELECT
tf.Birthdate,
ba.Age
FROM
TestFunc1 tf
cross apply (SELECT Age FROM dbo.BirthAge(tf.Birthdate))ba
I got this SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 4 ms, elapsed time = 4 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(40025 row(s) affected)
Table 'TestFunc1'. Scan count 1, logical reads 68, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 284 ms.
(40025 row(s) affected)
Table 'TestFunc1'. Scan count 1, logical reads 68, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 282 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
***SQL born on date Spring 2013:-)
July 14, 2014 at 11:53 am
A computed age column could not be persisted in any event because it uses a non-deterministic value (todays_date).
Whether you use the MONTH+DAY version or the -YEAR version could be based on the results you want for Feb 29th birthdays. The former will treat Mar 1 as the birthday in non-leap years, the latter will treat Feb28th as the birthday. (In some jurisdictions, for certain anniversary calculations, there are actually laws/ordinances that specify which date must be used.)
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply