Are tabular udfs helpful or hurtful?

  • I have a report (SSRS 2005) that is based on a stored procedure, but the store procedure includes joins to three tabular user-defined functions in addition to a couple of other tables. These tabular udfs are parameter driven; consequently, only relevant data is being returned from the udfs.

    I've begun to detect a performance lag with this report and I'm wondering if the problem is related to the tabular udfs. Perhaps tabular udfs don't scale well? I'm far from being an optimization detective, so I'm not sure how best to prove if the udfs are the source of the bottleneck.

    I've done some assessment of missing indexes, though, and everything appears OK. Note, I am not using scalar udfs, which I do realize can cause significant performance problems if not leveraged correctly in a query.

    Thanks in advance,

    Pete

  • We're gonna need more details from you to give you a satisfactory answer. Think we can get you to post some code/table definitions of the poorly performing proc?

    A good way to figure out what part of a proc is causing slowness is to have the proc print timestamps after each statement. you can measure diffs to see what code segment is taking all the time.

  • whenever i see a performance issue featuring "stored procedure" and "parameter driven", i think either out of date statistics or parameter sniffing as the core causes to look at first.

    We'd really need to see the stored proc to help, i think...are there "default" values on the procs parameters?(parameter sniffing)

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Typically table-based UDFs are not in themselves a performance issue.

    So the issue is more likely what you and others have mentioned:

    indexes, stats, or the actual code in the functions.

    Scott Pletcher, SQL Server MVP 2008-2010

  • Inline or multi-statement table valued udfs?

    The multi-statement can be nasty with larger row counts.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Wow, thanks for the quick replies --

    First, as useful as sharing examples of the code can be for others to help assess the problem with query performance, I'm going to hold off for now simply because the code is very long (i.e., a lot of required fields are being pulled along with several joins) within the stored proc, not to mention all of the code in the various tabular udfs to be reviewed as well.

    Second, I'd like to tackle this by at least knowing if tabular udfs are better/worse/no different than using views, CTEs, or indexed temp tables. When it comes to "scalar" udfs, I've certainly seen the severe impact they can have if not properly used; I just haven't seen much written about the "gotcha's" of tabular udfs, though.

    Third, "parameter sniffing" -- good idea. I've read articles that suggest reducing parameter sniffing by declaring variables within the query and then assigning the store proc parameters to these variables. (In fact, I recall a workshop taught by Itzik Ben-Gan where he discusses how to reduce parameter sniffing -- how could I forget..) The stored proc does not include this technique; I can certainly build this into the query. Also, I'm not using any Optimize hints in the stored proc.

    Lastly -- as I'm writing this response, I see that more people have responded to my inquiry. So, please note that I'm posting this without having fully read other peoples comments.

    thanks,.

    pete

    -- Pete

  • Regarding parameter sniffing, how much of a performance impact does it have? Has anyone seen a query/report improve from hours to minutes or from a minute to a few seconds when a query is re-written to avoid parameter sniffing? Just wondering how much gain others have have seen.

  • GilaMonster (10/19/2010)


    Inline or multi-statement table valued udfs?

    The multi-statement can be nasty with larger row counts.

    GilaMonster - are you saying that joining to tabular udfs on large row counts can drastically reduce query performace? If so, what is the recommended approach?

  • You really need to specify if these are inline or multi statement. The SQL of Inline UDFs is decomposed into the entire execution plan. Multi statement table UDFs are like a black box to the optimiser as far as the statement as a whole is concerned.

    In short, Inline: good. Multi statement: potentially bad.

  • peterzeke (10/20/2010)


    Regarding parameter sniffing, how much of a performance impact does it have? Has anyone seen a query/report improve from hours to minutes or from a minute to a few seconds when a query is re-written to avoid parameter sniffing? Just wondering how much gain others have have seen.

    Parameter sniffing more than almost anything else can utterly destroy otherwise good query performance.

    For example, if the optimiser sniffs a very highly selective value as a predicate on an index it may cache a plan that results in an index seek, returning only a very few rows for that value. If it then uses that same plan for a poorly selective value it may performs hundreds or more seeks that in total are substantially more expensive than a single scan would have been.

  • peterzeke (10/20/2010)


    GilaMonster - are you saying that joining to tabular udfs on large row counts can drastically reduce query performace? If so, what is the recommended approach?

    Multi-statement ones can be, yes. Recommended approach - depends what you're doing.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • peterzeke (10/20/2010)


    Regarding parameter sniffing, how much of a performance impact does it have? Has anyone seen a query/report improve from hours to minutes or from a minute to a few seconds when a query is re-written to avoid parameter sniffing?

    Absolutely. I've seen procedures go from not completing after a couple of hours to under a second because of parameter sniffing.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • OK -- I'm getting a better sense of parameter sniffing performance issues.

    Since people have been asking about whether the udfs are inline or multi-statement, below is the code for one of the three tabular udfs being referenced in the stored proc. This udf identifies the performance rates (from the previous quarter) across our physician network for various clinical measures (e.g., lab tests, office visits) according to conditions (e.g., hypertension, diabetes, hypothyroidism...) that are relevant to the physican (aka UserID or Providerkey). The udf returns a list of rates per condition which ultimately become part of the output returned by the stored proc which contains this udf:

    /* PURPOSE:AGGREGATE NETWORK LEVEL CONDITION PERFORMANCE RATES */

    ALTER FUNCTION [dbo].[udf_RPT_NETWORK_PREVQTR_COND_RATE] ( @user-id INT )

    RETURNS @COND_RATES TABLE

    (

    CONDITIONKEY INT

    ,NETWORK_COND_PREVQTR_RATE DECIMAL(10,3)

    )

    AS

    BEGIN

    DECLARE

    @PREV_QTR_DATEKEY INT

    ,@DATEKEY INT

    ,@PROVIDERKEY INT

    /* ASSIGN CURRENT DATEKEY AND PREVIOUS QTR DATEKEY */

    SELECT

    @PREV_QTR_DATEKEY = [dbo].[udf_RPT_PREVIOUS_QTR_DATEKEY]( ),

    @DATEKEY = dbo.udf_RPT_CURRENT_DATEKEY()

    /* TRANSLATE THE USERID TO THE PROVIDERKEY */

    SELECT @PROVIDERKEY = PROVIDERKEY

    FROMDBO.DIM_PROVIDER

    WHERE[ProviderID] = @user-id

    /* RETRIEVE THE PERFORMANCE RATES PER CONDITIONKEY */

    INSERT

    @COND_RATES

    (

    CONDITIONKEY

    ,NETWORK_COND_PREVQTR_RATE

    )

    SELECT

    A.CONDITIONKEY

    ,NETWORK_COND_PREVQTR_RATE =

    1.* SUM(A.SERVICEMETRICCOUNT)/SUM(A.[DenominatorCount])

    FROM

    [NETWORK_ACHIEVEMENT] a WITH (NOLOCK)

    JOIN

    dbo.[PROVIDER_ACHIEVEMENT] b WITH (NOLOCK)

    ON a.CONDITIONKEY = B.CONDITIONKEY

    AND A.SUBPOPULATIONKEY = B.SUBPOPULATIONKEY

    AND A.SERVICEMETRICKEY = B.SERVICEMETRICKEY

    WHERE

    A.DATEKEY = @PREV_QTR_DATEKEY

    AND B.PROVIDERKEY = @PROVIDERKEY

    AND B.DATEKEY = @DATEKEY

    AND B.[PARIndicator] = 1

    GROUP BY A.CONDITIONKEY

    RETURN

    END

    All three tabular udfs are essentially similar in length and complexity. So, I imagine any suggestions for improvement with the udf above will have relevance to the other udfs.

    --Pete

  • You could write that as an inline UDF - an inline UDF is a single SQL statement. Think of it like a view but that accepts parameters.

    Do you see how you could rewrite it?

  • e.g.:

    ALTER FUNCTION [dbo].[udf_RPT_NETWORK_PREVQTR_COND_RATE] ( @user-id INT )

    RETURNS TABLE

    AS

    RETURN

    (

    SELECT

    A.CONDITIONKEY

    ,NETWORK_COND_PREVQTR_RATE = 1.* SUM(A.SERVICEMETRICCOUNT)/SUM(A.[DenominatorCount])

    FROM

    [NETWORK_ACHIEVEMENT] a WITH (NOLOCK)

    JOIN

    dbo.[PROVIDER_ACHIEVEMENT] b WITH (NOLOCK)

    ON a.CONDITIONKEY = B.CONDITIONKEY

    AND A.SUBPOPULATIONKEY = B.SUBPOPULATIONKEY

    AND A.SERVICEMETRICKEY = B.SERVICEMETRICKEY

    CROSS JOIN

    (

    SELECT

    PREV_QTR_DATEKEY = [dbo].[udf_RPT_PREVIOUS_QTR_DATEKEY]( ),

    CURRENT_DATEKEY = dbo.udf_RPT_CURRENT_DATEKEY()

    ) AS DATEKEYS

    JOIN

    DBO.DIM_PROVIDER

    ON B.PROVIDERKEY = DIM_PROVIDER.PROVIDERKEY

    WHERE

    A.DATEKEY = DATEKEYS.PREV_QTR_DATEKEY

    AND B.DATEKEY = DATEKEYS.CURRENT_DATEKEY

    AND B.[PARIndicator] = 1

    AND DIM_PROVIDER.[ProviderID] = @user-id

    GROUP BY A.CONDITIONKEY

    )

    GO

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply