Functions

  • I need to improve performance

    in a complex stored procedure "sp_mbvarInit"

    that was written 2 years ago by somebody.

    It's a bunch of updates and error reporting.

    It runs 6-8 min which is outrageous.

    I was trying to find bottlnecks,

    where sp slows down most of all

    and apparently all blocks like

    -----------------------------------------------------------

    update mbvar_Portfolio

    set Status = 160

    from Issuers

     WHERE 

       mbvar_Portfolio.IssuerShortName = Issuers.ShortName

        and Issuers.TickerSymbol is not null

        and Issuers.TickerSymbol not in

      (

      select [ID] from dbo.fn_EqOnlyBeta(@INTERVAL_ID)

       where BETA <> 0

     &nbsp

    --------------------------------------------------------------

    are running very slow. The one above updates only 36 records

    but runs 57 sec. I mean whenever function is involved it's slow.

    fn_EqOnlyBeta is fairly simple.

    So I decided to replace function with the actual SELECT from the function.

    It ran 2 sec.

    My question is -

    Is calling a function is generally slow or something is wrong with the function?

     

  • Scalar UDF's in SQL Server are pretty much like cursors. They process row-by-row, which can be performance killers on larger tables.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Frank!

    Thank you for a quick response.

    What UDF's stand for?

    Sorry I've never used SQL Server functions before.

    Should I get rid of them or

    change something to improve performance?

    Thanks,

    Robert

  • Sorry, UDF stands for User-Defined Functions.

    And no, you shouldn't get rid of them, but rather carefully evaluate your alternatives given.

    While this is not the best example, it demonstrates quite nicely that a scalar UDF is processed row-by-row, which in turn potentially can make them very slow on larger tables.

    USE northwind

    GO

    CREATE VIEW foolview

    AS

     SELECT

      GETDATE() AS Jetzt

    GO

    CREATE FUNCTION fool_me()

    RETURNS DATETIME

    AS

     BEGIN

      RETURN (

          SELECT

           Jetzt

          FROM

           foolview

         &nbsp

     END

    GO

    CREATE function you_dont_fool_me(@Jetzt datetime)

    RETURNS DATETIME

    AS

     BEGIN

      RETURN @Jetzt

     END

    GO

    DECLARE @Jetzt datetime

    SET @Jetzt = GETDATE()

    --Test 1 viele Zeilen

    SELECT DISTINCT

     dbo.fool_me()

    FROM

     [Order Details] AS od

    INNER JOIN

     Orders AS o

    ON

     o.OrderId = od.OrderID

    --Test2 eine Zeile

    SELECT DISTINCT

     dbo.you_dont_fool_me(@Jetzt)

    FROM

     [Order Details] AS od

    INNER JOIN

     Orders AS o

    ON

     o.OrderId = od.OrderID

    GO

    DROP FUNCTION fool_me

    DROP FUNCTION you_dont_fool_me

    DROP VIEW foolview

    IMHO, you should use scalar UDF's only as some kind of last resort. If you can find a set based solution you should use this.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • So , Frank

    The idea is (if I understood your example correctly)

    to avoid calling other objects from UDF (sp,views)

    and instead to retrieve all data

    inside the UDF.

    Am I right?

  • I said, the example isn't the best

    Either that way or to try and avoid the UDF completely. However, you might want to read about UDF's in BOL. Contains "best practices", examples and a lot more. Search for "user-defined functions"

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Frank,

    I just read a couple of articles

    in BOL and the article

    "SQL User-Defined Functions" has an example

    with CREATE FUNCTION LargeOrderShippers ( @FreightParm money )...

    This is the exact method that is being used widely

    in my case but all these functions are terribly slow!

    I'm afraid i'll have to replace them by SELECT ... statements

  • The main problem with encapsulating such logic within a function is that the query processing might not be able to optimize based on indexes and relationships that it "can't see".

    You can take a look with the execution plan viewer on the query with the function and the query with a correlated subquery or join (a kind of "in-lining" of the function).

    If there are other criteria in the query that allow the use of indexes to filter the data right away, then the function will be called fewer times, thus speeding the query. That is a situation where using the function has a net gain.

    There is some penalty in calling the function, but that is usually minor compared to the need to call it for every row in the main table in order to decide if the row should be included.

  • and Issuers.TickerSymbol not in

      (

      select [ID] from dbo.fn_EqOnlyBeta(@INTERVAL_ID)

       where BETA <> 0

    This looks very ugly to me.  Make sure that not very many records are in the TABLE data_type returned - are you running out of RAM.  Also, find an alternate to "not in" or "in" if possible.  A temporary table might be better if the same data is used more than once. 

    RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉

  • ...maybe a left join and a where clause looking for no record joined (i.e., [ID] is null)....

    RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉

  • A lot of things in the existing procedures look ugly to me.

    The problem is I don't have a mandate to spend time on optimization of the existing code.

    I just need to find the worst bottlenecks and optimize them otherwise our C++ code calling these procedures will take forever to execute.

    So, if

    select [ID] from dbo.fn_EqOnlyBeta(@INTERVAL_ID)

    returns to many records the whole thing

    slowas down considerably?

  • The udf is returning TABLE data_type - it's used in a FROM.  Unlike in a temporary table, the storage will be in RAM.  Something to worry about if there is a lot of data.  If you run out of RAM, it will not be pretty.  I doubt this is the case. 

    Another thing to worry about is how often the udf is called.  Perhaps it only needs to be called once rather then potentially be in a subquery nested loop - check the execution plan to see if it is.  If so, consider using a local table variable loaded with the udf call results.  The local table variable can then be placed in the query rather than the udf. 

    The "not in" requires the entire list to be searched.  This can be slow if the list is large.  I don't know if SQL Server will optimize the search in a binary fashion.  I would rather load a local table data_type (few records) or a temporary table (lots of rows) and use a left join.  A group by or distinct might be required when loading the temporary table to prevent extra records in final result.  With an index on the joined column, I'd expect the join to proceed much faster than the "not in" search on a large list. 

    RandyHelpdesk: Perhaps Im not the only one that does not know what you are doing. 😉

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply