udf very slow?

Eric Mamet

SSChampion

Points: 11728
October 31, 2013 at 3:53 am

#395756

I created a User Defined Function that performs some arithmetic calculations on a few columns and returns an amount.
The UDF does not do any lookup on the database, just uses the parameters passed (and calls another UDF with similar characteristics).
I organised my underlying table so that all required columns are available in that table.
I expected the UDF performance to be extremely quick but it happens to be VERY slow.
Retrieving ~300,000 rows with the UDF takes about 10 minutes.
Just removing the function brings that down to about 15 seconds (scanning the table and displaying the rows in SSMS).
There is no index at all on my table and I don't specify any criteria, so both tests involve just a table scan.
I was under the impression that I would have no performance problem as long as there was no database access within the UDF but this seems wrong.
I suppose I could get the required performance using a CLR function but I would rather avoid that because it is way beyond the technical skills of my customer (I am a consultant).
My bottleneck is entirely CPU.
Any idea how to improve this?


Koen Verbeeck SSC Guru Points: 259207 · Answer 1

If you have 300,000 rows, the UDF will be called 300,000 times if you defined a scalar UDF.

More info:

User Defined Functions and Performance

Using a table valued function could improve performance.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Eric Mamet SSChampion Points: 11728 · Answer 2

I know but I was hoping it would be fast...

Just talked to my customer and I'll try using CLR!

Yipee!!! 😀

Koen Verbeeck SSC Guru Points: 259207 · Answer 3

Eric Mamet (10/31/2013)
I know but I was hoping it would be fast...
Just talked to my customer and I'll try using CLR!
Yipee!!! 😀

Never seen someone so excited about CLR 🙂


Sean Pearce SSCoach Points: 15999 · Answer 4

It is much faster to apply a table function.

-- Build a one-million-row test table.
CREATE TABLE Test1 (ID INT);
GO
INSERT INTO Test1
SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY a.object_id)
FROM sys.all_columns a
CROSS JOIN sys.all_columns b;
GO
-- Despite its name, this is a scalar UDF (BEGIN...END body),
-- so it is evaluated once per row.
CREATE FUNCTION UDF_INLINE (@Input INT)
RETURNS INT
AS
BEGIN
    DECLARE @I INT;
    SET @I = @Input * 0.14;
    RETURN @I;
END;
GO
-- Inline table-valued function: a single RETURN (SELECT ...),
-- which the optimizer expands into the calling query.
CREATE FUNCTION UDF_APPLY (@Input INT)
RETURNS TABLE
AS
RETURN (SELECT @Input * 0.14 AS Result);
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Return the column with no function
SELECT ID
INTO #test1
FROM Test1;
/*
Table 'Test1'. Scan count 1, logical reads 3345
CPU time = 437 ms, elapsed time = 533 ms.
*/

-- Return the scalar function
SELECT dbo.UDF_INLINE(ID) AS RowName
INTO #test2
FROM Test1;
/*
Table '#test2'. Scan count 0, logical reads 1001607
Table 'Test1'. Scan count 1, logical reads 3345
CPU time = 10389 ms, elapsed time = 13965 ms.
*/

-- Apply the table function
SELECT b.Result
INTO #test3
FROM Test1 t
CROSS APPLY dbo.UDF_APPLY(t.ID) AS b;
/*
Table 'Test1'. Scan count 1, logical reads 3345
CPU time = 577 ms, elapsed time = 578 ms.
*/

The SQL Guy @ blogspot

@SeanPearceSQL

About Me

Alan Burstein SSC Guru Points: 61152 · Answer 5

I know but I was hoping it would be fast...
Just talked to my customer and I'll try using CLR!
Yipee!!!

I don't know if a CLR is going to be the way to go. I have been a big proponent of CLRs for some things because they do have their place but, in this case, there is no reason to believe that a CLR is going to perform these calculations faster than a well-written T-SQL function. Remember, creating and implementing a CLR is not a trivial task and introduces new risks and overhead for your SQL environment. You may just be adding more work without any added benefit.

Take a look at this article: How to Make Scalar UDFs Run Faster (SQL Spackle). As Sean said and demonstrated, you should get much better results by turning your function into an inline table-valued function. You will have to change your query logic to include a CROSS APPLY, but that is something numerous people on this site can help you with if you get stuck.
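
For a function that calls another function, as the original post describes, the same conversion can be applied to both levels. The following is only a sketch under stated assumptions: the table dbo.Orders, its columns, the function names dbo.itvf_NetPrice and dbo.itvf_LineAmount, and the arithmetic itself are all hypothetical, since the actual UDF was never posted in this thread.

```sql
-- All object and column names below are hypothetical, for illustration only.
CREATE TABLE dbo.Orders (
    OrderID     INT           NOT NULL PRIMARY KEY,
    Quantity    INT           NOT NULL,
    UnitPrice   DECIMAL(19,4) NOT NULL,
    DiscountPct DECIMAL(5,2)  NOT NULL
);
GO
-- Inner inline TVF: stands in for the scalar UDF that the outer UDF used to call.
CREATE FUNCTION dbo.itvf_NetPrice (@UnitPrice DECIMAL(19,4), @DiscountPct DECIMAL(5,2))
RETURNS TABLE
AS
RETURN (SELECT @UnitPrice * (1 - @DiscountPct / 100.0) AS NetPrice);
GO
-- Outer inline TVF: selects from the inner one instead of making a scalar call.
CREATE FUNCTION dbo.itvf_LineAmount (@Quantity INT, @UnitPrice DECIMAL(19,4), @DiscountPct DECIMAL(5,2))
RETURNS TABLE
AS
RETURN (SELECT @Quantity * np.NetPrice AS Amount
        FROM dbo.itvf_NetPrice(@UnitPrice, @DiscountPct) AS np);
GO
-- The calling query uses CROSS APPLY where it previously called the scalar UDF.
SELECT o.OrderID, la.Amount
FROM dbo.Orders AS o
CROSS APPLY dbo.itvf_LineAmount(o.Quantity, o.UnitPrice, o.DiscountPct) AS la;
```

Because both functions consist of a single RETURN (SELECT ...), the optimizer can expand them into the calling query rather than invoking them row by row, which is the effect Sean's timings illustrate.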

P.S. I believe there is a new SQL Server Central Stairway on CLRs coming soon. I expect that it will be a good read. 😉

"I can't stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Eric Mamet SSChampion Points: 11728 · Answer 6

Actually, I am not sure I understand what the table function is doing in all this...

As for the speed, it's shockingly different!

Retrieving about 1.5 million rows using a simple T-SQL UDF took about 4 minutes.

Replacing the T-SQL UDF with a CLR function shrinks the 4 minutes to 18 seconds.

Removing the function altogether still gives me about the same (17 seconds).

In other words, the CLR function is practically invisible in terms of performance!

In fact, I should have remembered, because this is an experiment that Itzik Ben-Gan had already demonstrated in his wonderful book "Inside Microsoft SQL Server 2005: T-SQL Querying".

Silly me...

Luis Cazares SSC Guru Points: 183706 · Answer 7

CLR might perform very well, but it seems to me that you're cracking nuts with a sledgehammer (or, as we say in Spanish, killing flies with cannonballs).

Your bottleneck was CPU because a query that calls scalar UDFs is limited to a single CPU (in other words, you're not using parallelism).

Good T-SQL should be enough for your problem, but it's all up to you.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Xedni SSCertifiable Points: 6951 · Answer 8

What's the nature of the function you're trying to apply to the data? As Koen says, using functions against large sets of rows can have serious performance implications. If you can do it all inline, you may consider a computed column on the table, which SQL might be able to optimize better than calling a function for each row.
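
As a rough illustration of the computed-column idea, here is a minimal sketch. The table dbo.OrderLines, its columns, and the formula are assumptions made up for the example; the real calculation was not posted in this thread.

```sql
-- Hypothetical table and formula, for illustration only.
CREATE TABLE dbo.OrderLines (
    OrderLineID INT           NOT NULL PRIMARY KEY,
    Quantity    INT           NOT NULL,
    UnitPrice   DECIMAL(19,4) NOT NULL,
    DiscountPct DECIMAL(5,2)  NOT NULL,
    -- Computed column: the arithmetic is written inline, with no function call.
    Amount AS (Quantity * UnitPrice * (1 - DiscountPct / 100.0))
);
GO
-- The column is queried like any other; the expression is evaluated by the engine.
SELECT OrderLineID, Amount
FROM dbo.OrderLines;
```

This only works when the whole calculation can be expressed from columns of the same row, which matches the situation described in the original post.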

Executive Junior Cowboy Developer, Esq.

Eric Mamet SSChampion Points: 11728 · Answer 9

The machine I used was a dual-processor box and, yes, CPU use was limited to 1 processor.

So I had one processor flat out for 4 minutes.

I doubt that using 2 processors instead of one would improve the performance to a couple of seconds...

I don't have access to the actual procedure now but I'll try to post it tomorrow.

Luis Cazares SSC Guru Points: 183706 · Answer 10

The lack of parallelism is just one of the reasons why UDFs slow down code. I'll be glad to help (and I'm sure others will be as well) once we know what the function is intended to do.


Jeff Moden SSC Guru Points: 1004704 · Answer 11

Sean Pearce (10/31/2013)
It is much faster to apply a table function.

+1. I don't know what others call the type of function you wrote for the "Apply" function but I call them "iSF" or "Inline Scalar Function". Of course, they're really just an Inline Table Valued Function (iTVF) that returns a single element.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004704 · Answer 12

Eric Mamet (10/31/2013)
I know but I was hoping it would be fast...
Just talked to my customer and I'll try using CLR!
Yipee!!! 😀

Heh... hate to throw a wet blanket on your fire but you don't need to resort to an SQLCLR function for something so simple (although if you're comfortable with that, then fire away! It can be a great solution when done properly. :-D). Please see the example that Sean Pearce provided in his post above and please see the following article that demonstrates the problem and the fix (the "iSF").

How to Make Scalar UDFs Run Faster (SQL Spackle)


Sean Pearce SSCoach Points: 15999 · Answer 13

Jeff Moden (10/31/2013)
+1. I don't know what others call the type of function you wrote for the "Apply" function but I call them "iSF" or "Inline Scalar Function". Of course, they're really just an Inline Table Valued Function (iTVF) that returns a single element.

I was really struggling with terminology when I wrote that.


Eric Mamet SSChampion Points: 11728 · Answer 14

Indeed this is what I have done and the result is very impressive.

I don't think I can detect any load due to the SQLCLR function.

I have even been able to convince the DBA in charge to CLR enable the database 😛

Perfect!