February 19, 2004 at 8:40 am
I keep seeing reference to not using functions because of poor performance. I make extensive use of functions in my procedures to look up full names based on codes. See sample function 'fSalesUser' below and a simplified Select statement extracted from one of my procedures where I use this function twice.
February 19, 2004 at 8:42 am
Should I be writing the Select statement in a different way to improve performance?
February 19, 2004 at 8:43 am
Sorry could not post the code, because I only get one line to type in and I can't paste in either. Hope my question is clear without the example.
February 20, 2004 at 12:10 am
The problem is not so much "using functions in stored procedures" as it is "using functions in queries or views".
Since functions are really just stored procedures that return a value, and a query that references one is likely to call it once for (at least) every row in the result set, their overhead can be quite high.
This is especially true when compared to the "in-line alias" model that everyone sort of assumed functions would use.
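To illustrate the difference (a sketch only; the table and column names here are made up, since the original sample code was not posted): both queries return the same data, but the first invokes the scalar function once per row, while the second lets the optimizer plan the whole lookup as one join.

```sql
-- Per-row scalar function call: the function runs once for every row of Orders
SELECT o.OrderID, dbo.fSalesUser(o.SalesCode) AS SalesName
FROM dbo.Orders o;

-- Equivalent in-line join: the optimizer can choose a single join strategy
-- for the whole result set instead of repeated per-row calls
SELECT o.OrderID, u.FullName AS SalesName
FROM dbo.Orders o
INNER JOIN dbo.SalesUsers u ON u.Code = o.SalesCode;
```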
February 20, 2004 at 5:05 am
Your use of name-lookup functions will definitely hamper performance compared to a function-less approach based on joins. With a small database and/or fast hardware you may be able to tolerate this.
RBYoung is exactly right, best reply I've seen on the topic.
February 20, 2004 at 8:46 am
We have found that functions in stored procedures work fine if the result set using the function is small. The larger ones tend to run too slow. I would simply test what you have on existing data and try to test against a worst case scenario and see if performance is still acceptable. If so then you should be fine.
For us a worst case scenario is adding 500,000 or more rows and trying again.
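A quick way to time such a test is with SET STATISTICS TIME (a sketch; the table and function names are placeholders):

```sql
SET STATISTICS TIME ON;

-- Run the query that uses the function against worst-case data volume
SELECT dbo.fSalesUser(SalesCode)
FROM dbo.BigTable;

SET STATISTICS TIME OFF;
```

The CPU and elapsed times reported in the Messages pane can then be compared against the join-based version of the same query.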
Robert W. Marda
Billing and OSS Specialist - SQL Programmer
MCL Systems
February 20, 2004 at 6:24 pm
My rule of thumb: If it's re-usable and the performance isn't bad then put it in a function (Think OO). Otherwise use Procs. Functions don't necessarily get executed for each row. I think it has something to do with determinism, but I haven't figured out what.
Lookups have a lot more to do with indexes than with functions. For example:
CREATE TABLE [Lookup2] (
[LookupID] [int] NOT NULL ,
[Last_Name] [char] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[First_Name] [char] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[ZIP_Lookup] [char] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[BirthDate] [int] NOT NULL ,
PRIMARY KEY CLUSTERED
(
[ZIP_Lookup],
[First_Name],
[Last_Name],
[BirthDate]
) ON [PRIMARY]
) ON [PRIMARY]
GO
-----------------------------------------------------------------------------------
CREATE function LOOKUP (
@vcZip_Lookup varchar(5),
@vcFirst_Name varchar(17),
@vcLast_Name varchar(17),
@iBirthDate int)
returns int
AS
begin
Return
(Select top 1 LookupID
from Lookup2 WITH (nolock)
where ZIP_Lookup = @vcZip_Lookup and
First_Name = @vcFirst_Name and Last_Name = @vcLast_Name and BirthDate = @iBirthDate)
end
This comes back in a click. It doesn't matter how many rows are in the table, as it always does an index seek (and will always return one row per the PK).
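A usage sketch, with made-up parameter values:

```sql
-- Single-row lookup: the equality predicates cover the clustered PK,
-- so this resolves to one index seek
SELECT dbo.LOOKUP('90210', 'John', 'Smith', 19700101) AS LookupID;
```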
Signature is NULL
February 20, 2004 at 6:25 pm
OK, a ')' is not a wink, it's syntax! What the heck happened to the Code option on this website? I love it when new tech doesn't do everything old tech did... talk about cutting off your nose to spite your face.
Signature is NULL
February 24, 2004 at 8:49 am
I have a procedure with lots of lookups, a number of them used more than once.
I tend to start procedures in a View and then copy and paste into Query Analyzer where I can edit them. If I don't use functions, the views end up looking like a spider's web, and in QA I have great difficulty understanding the logic.
If I use Functions then the code is really easy to understand.
See an example below of one of my most complicated functions. dbo.Sales.userID and dbo.Countries.[Country Code] are both primary keys; Sales currently has 504 rows (say max 2,000) and Countries has 238.
(Yes I know I should not be using spaces in field names)
All the other functions are against single small lookup tables.
Are we saying that in these circumstances functions are OK, and it's only against very large tables that there is a problem?
CREATE FUNCTION fSalesCountry
(@UserId varchar(7))
RETURNS varchar(25) AS
BEGIN
DECLARE @SalesCountry varchar(50)
SELECT @SalesCountry =
(SELECT dbo.Countries.Country
FROM dbo.Sales INNER JOIN
dbo.Countries ON dbo.sales.countrycode = dbo.Countries.[Country Code]
WHERE dbo.Sales.userID =@UserId)
RETURN @SalesCountry
END
February 25, 2004 at 9:47 am
Are we saying that in these circumstances functions are OK, and it's only against very large tables that there is a problem?
No. A function gets optimized the same as any other procedure or query. What matters in particular with functions is "How many times is it called?"
This is because the performance overhead that you incur with a function that you do not have with an in-line Join or Subselect is having to invoke the function (as a stored procedure) once for each function result. If this is only once, then it is probably no big deal. However, if it is say 10,000 times, then it could be a problem.
There is an additional implicit overhead associated with function usage in queries because the SQL optimizer cannot do good global query optimizations across procedure boundaries. Again, unlike in-line Joins and Subselects.
If I use Functions then the code is really easy to understand.
This is the basic quandary of SQL Server Functions: they are much easier to read and write than complex compound Joins and Subselects. However, they can have performance problems, depending on how they are used.
The general rule of thumb that I use is: Functions are OK for queries that return a single result, such as OLTP and typical Create/Read/Update/Delete user apps. However they are not OK for procedures that return or copy or manipulate many rows such as reports and typical DB maintenance tasks.
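The two cases in that rule of thumb can be sketched like this, using names from earlier in the thread:

```sql
-- OK: single-result OLTP-style lookup; the function runs exactly once
SELECT dbo.fSalesCountry(@UserId) AS Country;

-- Not OK: report-style query over many rows; the function runs once per row,
-- and the optimizer cannot fold the lookup into the outer query's plan
SELECT c.OrganisationName, dbo.fSalesCountry(c.Sales1) AS Country
FROM dbo.Companies c;
```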
February 26, 2004 at 4:24 am
The procedure I am thinking about returns several hundred rows for each country and puts these into individual zipped spreadsheets. So an awful lot of calls to the functions. The whole DTS package runs at night when nothing else is happening on the Server. The run time including zipping the files onto another server is 26 seconds.
So the current timings are not an issue, but there is no point getting into bad habits, so I will take a closer look at the procedure in isolation to see how long it takes before and after rewriting.
I am also using functions in procedures which return a single row to Access and in these circumstances I understand from your reply that using functions is OK.
Incidentally I am wondering whether there is any impact on performance if I do everything in the SELECT part of the query with a nested SELECT rather than using a join. I find this much easier to read.
Also if I am using a look up table which will only return one row, is it more efficient to use MIN() or not?
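For reference, the nested-SELECT form mentioned above is a correlated subquery in the SELECT list (a sketch using the table names from this thread). When the lookup key is unique, the subquery returns at most one row either way, so wrapping the result in MIN() versus using TOP 1 usually makes little difference; both typically resolve to the same index seek.

```sql
-- Correlated subquery in the SELECT list instead of an explicit join
SELECT c.OrganisationName,
       (SELECT co.Country
        FROM dbo.Countries co
        INNER JOIN dbo.Sales s ON co.[Country Code] = s.CountryCode
        WHERE s.UserID = c.Sales1) AS AccountManagerCountry
FROM dbo.Companies c;
```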
February 26, 2004 at 6:06 am
One quick optimization is the @SalesCountry variable declaration.
Change it to varchar(25), as your function only returns varchar(25).
Another piece of optimisation would be an index on dbo.sales.countrycode
see if this helps.
February 26, 2004 at 6:07 am
Now I am confused. The results I am getting are the opposite from what I would have expected.
When I run the following code:
SELECT OrganisationName, dbo.fsalescountry(dbo.Companies.Sales1) AS AccountManagerCountry
from dbo.Companies
where dbo.Companies.CountryCode = 'nl'
GO
SELECT OrganisationName, dbo.Countries.Country AS AccountManagerCountry
FROM dbo.Countries RIGHT OUTER JOIN
dbo.Sales ON dbo.Countries.[Country Code] = dbo.Sales.CountryCode RIGHT OUTER JOIN
dbo.Companies ON dbo.Sales.UserID = dbo.Companies.Sales1
WHERE (dbo.Companies.CountryCode = 'nl')
GO
I find that the query using the function has a cost of 44.12% compared to 55.88% without.
I am fairly new to this so might be missing something!
I also tried without the WHERE clause, and here the performance advantage of the function was even greater: 36.92% compared to 63.08%.
The Companies table has 10,233 records in total (1,921 = NL).
The evidence in the above would seem to indicate that it is better to use function than joins when looking up data.
Have I missed something?
February 26, 2004 at 9:54 am
I have not used the Execution Plan windows in Query Analyzer much before but today seemed a good time to do this.
I took the SQL Statement from my procedure and gradually removed the joins and fields where I was looking up a single row and replaced these with functions.
Each one I replaced with a function reduced the estimated cost: it dropped from 1.13 to 0.848.
I then amended the code in the original procedure and found that the actual cost went the other way, from 0.762 to 0.848; however, the code is much easier to read. See the FROM clause before and after below.
I reckon that an 11% increase is worth it for easier-to-maintain code, but I wonder why the results differ between a standalone SQL statement and a procedure.
Before:
FROM dbo.Classifications RIGHT OUTER JOIN
dbo.tblPotentials RIGHT OUTER JOIN
dbo.Sales Sales_1 RIGHT OUTER JOIN
dbo.Companies LEFT OUTER JOIN
dbo.Sales Sales_2 ON dbo.Companies.Sales2 = Sales_2.UserID ON Sales_1.UserID = dbo.Companies.Sales1 ON
dbo.tblPotentials.OrganisationID = dbo.Companies.OrganisationID ON dbo.Classifications.ClassificationID = dbo.Companies.ClassificationID
After:
FROM dbo.tblPotentials RIGHT OUTER JOIN
dbo.Companies ON dbo.tblPotentials.OrganisationID = dbo.Companies.OrganisationID