Blog Post

Comparing Hardcoded functions, In-Line TVF's, and Scalar Functions

,

While perusing some of the threads on SQLServerCentral.com one of the questions that hit me was regarding the use of scalar functions in queries.  In this particular case, in the column list of a SELECT statement.  Did it make sense to encapsulate a routine to strip of the time portion of a datetime value or should you put the actual function call(s) in the SELECT list.

I decided to run some tests so I created to user-defined functions that would strip off the time portion of a datetime value, one was created as a scalar function and the second as an in-line table valued function.  In my SandBox database I have a table, dbo.LAPTest, that has 17,996,362 records.  One of the columns in this table is a datetime column with random datetime values.

Here is the code for the two functions I wrote for the test:

CREATE FUNCTION [dbo].[ufnDayOnly](@DateValue DateTime) RETURNS table AS RETURN (select dateadd(dd, datediff(dd,0, @DateValue),0) as DateOnly ); GO

CREATE FUNCTION [dbo].[fn_DayOnly] (@DateValue DateTime) RETURNS DateTime AS BEGIN RETURN dateadd(dd, datediff(dd,0, @DateValue),0) END GO

I then wrote three separate T-SQL scripts, shown below, that will run 8 time each selecting 1 to 10,000,000 rows.  Each script strips the time portion of datetime values using three methods; hardcoded date functions, the table-valued function, and the scalar function.  Each time through the loop we will clear the caches.

Set NoCount On;
Declare @cpu_ int,
        @lreads_ int,
        @eMsec_ int;
declare @AcctID int,
        @Amount money,
        @TDate datetime,
        @ADate datetime;
declare @LoopCnt int;
set @LoopCnt = 0;
while @LoopCnt < 8
begin
    dbcc freeproccache with no_infomsgs;
    dbcc freesystemcache('ALL') with no_infomsgs;
    Select
        @cpu_ = cpu_time 
        , @lreads_ = logical_reads 
        , @eMsec_ = total_elapsed_time 
    From
        sys.dm_exec_requests 
    Where
        session_id = @@spid;
    select top (power(10, @LoopCnt))
        @AcctID = AccountID,
        @Amount = Amount,
        @TDate  = TDate,
        @ADate  = dateadd(dd,datediff(dd,0,TDate),0)
    from
        dbo.LAPTest;
     
    Select
        power(10, @LoopCnt) as RowsProcessed,
        cpu_time-@cpu_ as CpuMs 
        , logical_reads- @lreads_ as LogRds 
        , total_elapsed_time - @eMsec_ as Elapsed 
    From
        sys.dm_exec_requests 
    Where
        session_id = @@spid;
    set @LoopCnt = @LoopCnt + 1;
end

 

Set NoCount On;
Declare @cpu_ int,
        @lreads_ int,
        @eMsec_ int;
declare @AcctID int,
        @Amount money,
        @TDate datetime,
        @ADate datetime;
declare @LoopCnt int;
set @LoopCnt = 0;
while @LoopCnt < 8
begin
    dbcc freeproccache with no_infomsgs;
    dbcc freesystemcache('ALL') with no_infomsgs;
    Select
        @cpu_ = cpu_time 
        , @lreads_ = logical_reads 
        , @eMsec_ = total_elapsed_time 
    From
        sys.dm_exec_requests 
    Where
        session_id = @@spid;
    select top (power(10, @LoopCnt))
        @AcctID  = AccountID,
        @Amount = Amount,
        @TDate   = TDate,
        @ADate   = DateOnly
    from
        dbo.LAPTest
        cross apply dbo.ufnDayOnly(TDate);
     
    Select
        power(10, @LoopCnt) as RecordsProcessed,
        cpu_time-@cpu_ as CpuMs 
        , logical_reads- @lreads_ as LogRds 
        , total_elapsed_time - @eMsec_ as Elapsed 
    From
        sys.dm_exec_requests 
    Where
        session_id = @@spid;
    set @LoopCnt = @LoopCnt + 1;
end

 

Set NoCount On;
Declare @cpu_ int,
        @lreads_ int,
        @eMsec_ int;
declare @AcctID int,
        @Amount money,
        @TDate datetime,
        @ADate datetime;
declare @LoopCnt int;
set @LoopCnt = 0;
while @LoopCnt < 8
begin
    dbcc freeproccache with no_infomsgs;
    dbcc freesystemcache('ALL') with no_infomsgs;
    Select
        @cpu_ = cpu_time 
        , @lreads_ = logical_reads 
        , @eMsec_ = total_elapsed_time 
    From
        sys.dm_exec_requests 
    Where
        session_id = @@spid;
    select top (power(10,@LoopCnt))
        @AcctID = AccountID,
        @Amount = Amount,
        @TDate  = TDate,
        @ADate   = dbo.fn_DayOnly(TDate)
    from
        dbo.LAPTest;
     
    Select
        power(10,@LoopCnt) as RecordsProcessed,
        cpu_time-@cpu_ as CpuMs 
        , logical_reads- @lreads_ as LogRds 
        , total_elapsed_time - @eMsec_ as Elapsed 
    From
        sys.dm_exec_requests 
    Where
        session_id = @@spid;
    set @LoopCnt = @LoopCnt + 1;
end

The results of running these scripts are as follows:

Hardcoded FunctionsIn-Line TVF/Cross ApplyScalar Function

RowsProcessed

CpuMs

LogRds

Elapsed

CpuMs

LogRds

Elapsed

CpuMs

LogRds

Elapsed

1

0

26

1

0

52

1

0

40

2

10

0

4

0

0

4

1

0

18

1

100

0

4

1

0

4

0

0

18

2

1000

0

14

1

0

14

2

16

28

36

10000

16

58

12

16

58

13

94

72

102

100000

125

523

143

125

523

115

703

537

699

1000000

703

5130

730

703

5130

715

7687

5144

7958

10000000

6984

51171

7072

7063

51171

7157

67016

51185

67414

As you can see there isn't much difference between the hardcoded functions and the In-Line TVF/Cross Apply, however, as the number of rows increase there is a definite disadvantage to using the scalar function.

Does this mean you should convert all your scalar functions to In-Line TVF's?  Not necessarily, as you may need those functions else where, such as a WHERE clause where it may only be called once.

Does this mean you should always use the CROSS APPLY with In-Line TVF's? Again, not necessarily as it may not be performant either in all cases.

What can I suggest?  Test, test, and test again is the best advice.  The testing I ran above is limited and not necessarily what you may do on a normal, day to day basis.  But it does give you some food for thought.

 

See you on the forums.  I look forward to seeing what you all have to say.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating