TVF is slower when running as paramerized

  • Thanks Lynn. I updated the post to reflect the correct name of that table. You caught my refactoring error.

    Some of the tables that this query is dealing with are of a decent size (1 million and 10 million+) so I will look into what you pointed out. What initially didn't make sense to me, though, was the discrepancy between the time it took to iteratively run this in a dynamic query vs the time to run it with static parameters. I guess what you're saying is that this query just needs to be further optimized? Like I said in the original post, this runs in 2-5 seconds when I'm running the history function directly which seems completely reasonable to me.

  • No promises, but here is one possible rewrite:

    CREATE FUNCTION [dbo].[SubQuery] (

    @OnDate DATE,

    @EarliestDated DATE,

    @LatestDated DATE,

    @Number FLOAT,

    @Year INT,

    @Substitution INT

    )

    RETURNS TABLE

    AS

    RETURN (

    SELECT

    Object.*,

    Sums.ModifiedSize AS Amt,

    Year(Object.Date) as Year

    FROM (

    SELECT

    Transactions.Object,

    sum(ModifiedPar) AS ModifiedSize

    FROM (

    SELECT

    FilteredS.Object,

    CASE TypeIndicator

    WHEN 'P'

    THEN isnull(Amt, @Substitution) * - 1

    WHEN 'S'

    THEN isnull(Amt, @Substitution)

    ELSE 0

    END AS ModifiedSize

    FROM (

    SELECT

    o.*

    FROM

    Object o

    cross apply (select min(o1.Number), max(o1.Number), min(o1.Date), max(o1.Date) from Object o1)dt(MinNumber,MaxNumber,MinDate,MaxDate)

    WHERE

    o.DatedDate >= @EarliestDated and o.DatedDate <= isnull(@LatestDated,cast('9999-12-31' as date)) AND

    --AND (

    -- @LatestDated IS NULL

    -- OR DatedDate <= @LatestDated

    -- )

    o.Number between isnull(@Number,dt.MinNumber) and isnull(@Number,dt.MaxNumber) AND

    --AND (

    -- @Number IS NULL

    -- OR Object.Number = @Number

    -- )

    o.Date >= dateadd(year,isnull(@Year,year(MinDate)) - 1900, 0) AND o.Date < dateadd(year,(isnull(@Year,year(MaxDate)) - 1900) + 1, 0)

    --AND (

    -- @Year IS NULL

    -- OR year(Object.Date) = @Year

    -- )

    AND FinalSize IS NOT NULL

    ) FilteredS

    LEFT JOIN (

    SELECT

    *

    FROM

    transactions

    WHERE

    TypeIndicator != 'D' AND

    TransDate <= isnull(@OnDate,cast('9999-12-31' as date))

    --AND (

    -- @OnDate IS NULL

    -- OR TransDate <= @OnDate

    -- )

    ) Transactions

    ON FilteredS.Object = Transactions.Object

    ) AS Transactions

    GROUP BY

    Transactions.Object

    ) Sums

    LEFT JOIN Object

    ON Object.Object = Sums.Object

    );

  • Lynn Pettis (6/15/2015)


    No promises, but here is one possible rewrite:

    Thanks for taking the time to do this, but for some reason it actually worsened performance when I ran it using both methods. The dynamic query took 1:43 and the static query took 2-5 seconds. I was, however, able to get the static query down to 1 second by removing the year() function per your suggestion earlier.

    I'm still leaning towards this issue having something to do with the CTE/tally table/cross apply in the outer query. The execution plan shows the same operation taking up the majority (~85%) of the cost regardless of whether or not it takes 1 second or 1:30 to run.

    Here's the outer function I am currently working with (changing "where" to "top" reduced the dynamic query to 20 seconds, which is still 5-10x longer than running the static query). I was not able to successfully move the cross apply to nums since the subquery has a dependency on num. Also tried converting this to a multistatement TVF and declaring variables locally, as I've read helps in some cases, but that also resulted in worsened performance.

    ALTER FUNCTION [dbo].[history] (

    @Since DATE,

    @EarliestDated DATE = '6/1/2014',

    @LatestDated DATE = NULL,

    @Number FLOAT,

    @Year INT,

    @Substitution INT = 5000000

    )

    RETURNS TABLE

    AS

    RETURN (

    WITH N0 AS (

    SELECT 1 AS n

    UNION ALL

    SELECT 1

    ),

    N1 AS (

    SELECT 1 AS n

    FROM N0 t1,

    N0 t2

    ),

    N2 AS (

    SELECT 1 AS n

    FROM N1 t1,

    N1 t2

    ),

    N3 AS (

    SELECT 1 AS n

    FROM N2 t1,

    N2 t2

    ),

    nums AS (

    SELECT TOP (DATEDIFF(day, @Since, GETDATE()) + 1) ROW_NUMBER() OVER (

    ORDER BY (

    SELECT 1

    )

    ) AS num

    FROM N3

    )

    SELECT DATEADD(day, num - 1, @Since) AS thedate,

    *

    FROM nums

    CROSS APPLY (

    SELECT count(*) AS objects,

    sum(totalamt) AS totalamt,

    sum(size) AS totalsize

    FROM dbo.SubQuery(DATEADD(day, num - 1, @Since), @EarliestDated, @LatestDated, @Number, @Year, @Substitution)

    GROUP BY Number,

    Year

    ) thetable

    )

  • Maybe this:

    ALTER FUNCTION [dbo].[history] (

    @Since DATE,

    @EarliestDated DATE = '6/1/2014',

    @LatestDated DATE = NULL,

    @Number FLOAT,

    @Year INT,

    @Substitution INT = 5000000

    )

    RETURNS TABLE

    AS

    RETURN (

    WITH

    N0(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)), -- 10 rows

    N1(n) AS (SELECT 1 FROM N0 a cross join N0 b), -- 100 rows

    N2(n) AS (SELECT 1 FROM N1 a cross join N1 b), -- 10,000 rows

    Dates(TheDate) AS (SELECT TOP (DATEDIFF(day, @Since, GETDATE()) + 1) dateadd(day,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1,@Since) FROM N2 a cross join N1 b) -- up to 1,000,000 dates (probably over flow the date data type)

    select

    d.TheDate,

    thetable.*

    from

    Dates d

    CROSS APPLY ( SELECT

    count(*) AS objects,

    sum(totalamt) AS totalamt,

    sum(size) AS totalsize

    FROM

    dbo.SubQuery(d.TheDate, @EarliestDated, @LatestDated, @Number, @Year, @Substitution)

    GROUP BY

    Number,

    Year

    ) thetable

    );

    I still think the subquery function needs to be rewritten and perhaps some indexing to support it.

  • Lynn Pettis (6/15/2015)


    Maybe this:

    I still think the subquery function needs to be rewritten and perhaps some indexing to support it.

    Gave this a try, but really didn't change anything. It seems like no matter what I do, when I call the function from a parameterized query, it takes exponentially longer. I added some indexes (eliminated that 80% key lookup from query plan) which dropped the static call to 1s consistently, but had no impact on the parameterized call. I'd expect the 2 calls to be the same or close, especially with the same query plan?

    All of these changes have had some impact on the static call, but still the only one that really affected the parameterized call positively was changing that "where" clause to "top".

    I'm running 2008r2 RTM, wondering if that could have something to do with it? At this point I'm ready to write my application to just make the static call, but that will make things much more difficult when I eventually add a TVP for additional filtering.

  • So it seems this is definitely an issue of parameter sniffing.

    While option(recompile) did nothing, option(optimize for unknown) brought the dynamic query down to 3 seconds. Still slower, but I can live with this.

    Is there some more correct way to fix this, or would simply leaving that as part of the query be the best solution?

  • phosplait (6/16/2015)


    So it seems this is definitely an issue of parameter sniffing.

    While option(recompile) did nothing, option(optimize for unknown) brought the dynamic query down to 3 seconds. Still slower, but I can live with this.

    Is there some more correct way to fix this, or would simply leaving that as part of the query be the best solution?

    Just curious if you were to use a couple CTEs that create virtual tables with all the parameter values as records, whether that might eliminate the sniffing problem. It would add a join to every query where a parameter appears. It would go something like this:

    WITH DATE_PARMS AS (

    SELECT 'Since' AS PARM_NAME, @Since AS DATE_VALUE

    UNION ALL

    SELECT 'EarliestDated', @EarliestDated

    UNION ALL

    SELECT 'LatestDated', @LatestDated

    ),

    FLOAT_PARMS AS (

    SELECT 'Number' AS PARM_NAME, @Number AS FLOAT_VALUE

    ),

    INT_PARMS AS (

    SELECT 'Year' AS PARM_NAME, @Year AS INT_VALUE

    UNION ALL

    SELECT 'Substitution', @Substitution

    )

    ...

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • phosplait (6/16/2015)


    So it seems this is definitely an issue of parameter sniffing.

    While option(recompile) did nothing, option(optimize for unknown) brought the dynamic query down to 3 seconds. Still slower, but I can live with this.

    Is there some more correct way to fix this, or would simply leaving that as part of the query be the best solution?

    The first posting of the History function looks like this:

    CREATE FUNCTION [dbo].[history] (

    @First DATE,

    @DatedStart DATE = '6/1/2014',

    @DatedEnd DATE = NULL,

    @Number FLOAT,

    @Year INT,

    @Sub INT = 5000000

    )

    ...

    SELECT DATEADD(day, num - 1, @Sub) AS thedate,

    which of course throws an error. I might be wrong, but this suggests that you're in the middle of designing the process, rather than query tuning a process which has been identified as a bottleneck. If this is the case you might want to reconsider the design and configure the inner iTVF as a bogstandard query and reconfigure the outer iTVF as a new function returning a bunch of dates. I'm confident you'd have a far less messy structure to work with. Rejigging your inner iTVF results in something like this:

    SELECT

    [Year] = YEAR(o.Date),

    o.Object,

    Amt = SUM(x.ModifiedSize)

    FROM [Object] o

    INNER JOIN transactions t

    ON t.object = o.object

    CROSS APPLY (

    SELECT ModifiedSize = CASE t.TypeIndicator

    WHEN 'P' THEN isnull(Amt, @Substitution) * - 1

    WHEN 'S' THEN isnull(Amt, @Substitution)

    ELSE 0 END

    ) x

    WHERE o.DatedDate >= @EarliestDated

    AND (@LatestDated IS NULL OR o.DatedDate <= @LatestDated)

    AND (@Number IS NULL OR o.Number = @Number)

    AND (@Year IS NULL OR YEAR(o.Date) = @Year)

    AND o.FinalSize IS NOT NULL

    AND t.TypeIndicator != 'D'

    AND (@OnDate IS NULL OR t.TransDate <= @OnDate)

    GROUP BY

    YEAR(o.Date),

    o.Object

    which is simple, easy to adapt, and easy to mark as a problem catch-all query to which the usual caveats and cures apply. The point to remember is this: if you run the process with a single date and then with the widest date range which makes sense with your data, the two execution plans are unlikely to be the same.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply