November 21, 2004 at 5:47 pm
I need to improve performance
in a complex stored procedure "sp_mbvarInit"
that was written 2 years ago by somebody.
It's a bunch of updates and error reporting.
It runs 6-8 min which is outrageous.
I was trying to find bottlnecks,
where sp slows down most of all
and apparently all blocks like
-----------------------------------------------------------
update mbvar_Portfolio
set Status = 160
from Issuers
WHERE
mbvar_Portfolio.IssuerShortName = Issuers.ShortName
and Issuers.TickerSymbol is not null
and Issuers.TickerSymbol not in
(
select [ID] from dbo.fn_EqOnlyBeta(@INTERVAL_ID)
where BETA <> 0
 
--------------------------------------------------------------
are running very slow. The one above updates only 36 records
but runs 57 sec. I mean whenever function is involved it's slow.
fn_EqOnlyBeta is fairly simple.
So I decided to replace function with the actual SELECT from the function.
It ran 2 sec.
My question is -
Is calling a function is generally slow or something is wrong with the function?
November 22, 2004 at 1:25 am
Scalar UDF's in SQL Server are pretty much like cursors. They process row-by-row, which can be performance killers on larger tables.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 22, 2004 at 6:50 am
Frank!
Thank you for a quick response.
What UDF's stand for?
Sorry I've never used SQL Server functions before.
Should I get rid of them or
change something to improve performance?
Thanks,
Robert
November 22, 2004 at 7:31 am
Sorry, UDF stands for User-Defined Functions.
And no, you shouldn't get rid of them, but rather carefully evaluate your alternatives given.
While this is not the best example, it demonstrates quite nicely that a scalar UDF is processed row-by-row, which in turn potentially can make them very slow on larger tables.
USE northwind
GO
CREATE VIEW foolview
AS
SELECT
GETDATE() AS Jetzt
GO
CREATE FUNCTION fool_me()
RETURNS DATETIME
AS
BEGIN
RETURN (
SELECT
Jetzt
FROM
foolview
 
END
GO
CREATE function you_dont_fool_me(@Jetzt datetime)
RETURNS DATETIME
AS
BEGIN
RETURN @Jetzt
END
GO
DECLARE @Jetzt datetime
SET @Jetzt = GETDATE()
--Test 1 viele Zeilen
SELECT DISTINCT
dbo.fool_me()
FROM
[Order Details] AS od
INNER JOIN
Orders AS o
ON
o.OrderId = od.OrderID
--Test2 eine Zeile
SELECT DISTINCT
dbo.you_dont_fool_me(@Jetzt)
FROM
[Order Details] AS od
INNER JOIN
Orders AS o
ON
o.OrderId = od.OrderID
GO
DROP FUNCTION fool_me
DROP FUNCTION you_dont_fool_me
DROP VIEW foolview
IMHO, you should use scalar UDF's only as some kind of last resort. If you can find a set based solution you should use this.
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 22, 2004 at 8:20 am
So , Frank
The idea is (if I understood your example correctly)
to avoid calling other objects from UDF (sp,views)
and instead to retrieve all data
inside the UDF.
Am I right?
November 22, 2004 at 8:34 am
I said, the example isn't the best
Either that way or to try and avoid the UDF completely. However, you might want to read about UDF's in BOL. Contains "best practices", examples and a lot more. Search for "user-defined functions"
--
Frank Kalis
Microsoft SQL Server MVP
Webmaster: http://www.insidesql.org/blogs
My blog: http://www.insidesql.org/blogs/frankkalis/[/url]
November 22, 2004 at 8:50 am
Frank,
I just read a couple of articles
in BOL and the article
"SQL User-Defined Functions" has an example
with CREATE FUNCTION LargeOrderShippers ( @FreightParm money )...
This is the exact method that is being used widely
in my case but all these functions are terribly slow!
I'm afraid i'll have to replace them by SELECT ... statements
November 22, 2004 at 9:05 am
The main problem with encapsulating such logic within a function is that the query processing might not be able to optimize based on indexes and relationships that it "can't see".
You can take a look with the execution plan viewer on the query with the function and the query with a correlated subquery or join (a kind of "in-lining" of the function).
If there are other criteria in the query that allow the use of indexes to filter the data right away, then the function will be called fewer times, thus speeding the query. That is a situation where using the function has a net gain.
There is some penalty in calling the function, but that is usually minor compared to the need to call it for every row in the main table in order to decide if the row should be included.
November 23, 2004 at 11:37 am
and Issuers.TickerSymbol not in
(
select [ID] from dbo.fn_EqOnlyBeta(@INTERVAL_ID)
where BETA <> 0
This looks very ugly to me. Make sure that not very many records are in the TABLE data_type returned - are you running out of RAM. Also, find an alternate to "not in" or "in" if possible. A temporary table might be better if the same data is used more than once.
RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉
November 23, 2004 at 11:39 am
...maybe a left join and a where clause looking for no record joined (i.e., [ID] is null)....
RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉
November 23, 2004 at 11:46 am
A lot of things in the existing procedures look ugly to me.
The problem is I don't have a mandate to spend time on optimization of the existing code.
I just need to find the worst bottlenecks and optimize them otherwise our C++ code calling these procedures will take forever to execute.
So, if
select [ID] from dbo.fn_EqOnlyBeta(@INTERVAL_ID)
returns to many records the whole thing
slowas down considerably?
November 23, 2004 at 4:06 pm
The udf is returning TABLE data_type - it's used in a FROM. Unlike in a temporary table, the storage will be in RAM. Something to worry about if there is a lot of data. If you run out of RAM, it will not be pretty. I doubt this is the case.
Another thing to worry about is how often the udf is called. Perhaps it only needs to be called once rather then potentially be in a subquery nested loop - check the execution plan to see if it is. If so, consider using a local table variable loaded with the udf call results. The local table variable can then be placed in the query rather than the udf.
The "not in" requires the entire list to be searched. This can be slow if the list is large. I don't know if SQL Server will optimize the search in a binary fashion. I would rather load a local table data_type (few records) or a temporary table (lots of rows) and use a left join. A group by or distinct might be required when loading the temporary table to prevent extra records in final result. With an index on the joined column, I'd expect the join to proceed much faster than the "not in" search on a large list.
RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply