June 15, 2015 at 11:21 am
Thanks Lynn. I updated the post to reflect the correct name of that table. You caught my refactoring error.
Some of the tables this query deals with are a decent size (1 million and 10 million+ rows), so I will look into what you pointed out. What initially didn't make sense to me, though, was the discrepancy between the time it takes to run this iteratively in a dynamic query versus the time it takes with static parameters. I guess what you're saying is that this query just needs to be optimized further? Like I said in the original post, it runs in 2-5 seconds when I call the history function directly, which seems completely reasonable to me.
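To be clear about terminology: by "static" I mean calling the history function with literal values, and by "dynamic" I mean the parameterized batch my application sends. Roughly like the sketch below, assuming the function signature posted further down (the parameter values are only illustrative):
-- "Static" call: literal values baked into the statement; this is the 2-5 second case.
SELECT * FROM dbo.history('2015-05-01', DEFAULT, DEFAULT, NULL, NULL, DEFAULT);

-- "Dynamic" call: the same statement sent as a parameterized batch; this is the slow case.
EXEC sys.sp_executesql
    N'SELECT * FROM dbo.history(@since, DEFAULT, DEFAULT, @number, @year, DEFAULT);',
    N'@since DATE, @number FLOAT, @year INT',
    @since = '2015-05-01', @number = NULL, @year = NULL;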
June 15, 2015 at 2:34 pm
No promises, but here is one possible rewrite:
CREATE FUNCTION [dbo].[SubQuery] (
    @OnDate DATE,
    @EarliestDated DATE,
    @LatestDated DATE,
    @Number FLOAT,
    @Year INT,
    @Substitution INT
)
RETURNS TABLE
AS
RETURN (
    SELECT
        Object.*,
        Sums.ModifiedSize AS Amt,
        year(Object.Date) AS [Year]
    FROM (
        SELECT
            Transactions.Object,
            sum(ModifiedSize) AS ModifiedSize
        FROM (
            SELECT
                FilteredS.Object,
                CASE TypeIndicator
                    WHEN 'P' THEN isnull(Amt, @Substitution) * -1
                    WHEN 'S' THEN isnull(Amt, @Substitution)
                    ELSE 0
                END AS ModifiedSize
            FROM (
                SELECT
                    o.*
                FROM
                    Object o
                    CROSS APPLY (SELECT min(o1.Number), max(o1.Number), min(o1.Date), max(o1.Date) FROM Object o1) dt (MinNumber, MaxNumber, MinDate, MaxDate)
                WHERE
                    o.DatedDate >= @EarliestDated AND o.DatedDate <= isnull(@LatestDated, cast('9999-12-31' AS date)) AND
                    --AND (
                    --    @LatestDated IS NULL
                    --    OR DatedDate <= @LatestDated
                    --    )
                    o.Number BETWEEN isnull(@Number, dt.MinNumber) AND isnull(@Number, dt.MaxNumber) AND
                    --AND (
                    --    @Number IS NULL
                    --    OR Object.Number = @Number
                    --    )
                    o.Date >= dateadd(year, isnull(@Year, year(MinDate)) - 1900, 0) AND o.Date < dateadd(year, (isnull(@Year, year(MaxDate)) - 1900) + 1, 0)
                    --AND (
                    --    @Year IS NULL
                    --    OR year(Object.Date) = @Year
                    --    )
                    AND FinalSize IS NOT NULL
            ) FilteredS
            LEFT JOIN (
                SELECT
                    *
                FROM
                    transactions
                WHERE
                    TypeIndicator != 'D' AND
                    TransDate <= isnull(@OnDate, cast('9999-12-31' AS date))
                    --AND (
                    --    @OnDate IS NULL
                    --    OR TransDate <= @OnDate
                    --    )
            ) Transactions
                ON FilteredS.Object = Transactions.Object
        ) AS Transactions
        GROUP BY
            Transactions.Object
    ) Sums
    LEFT JOIN Object
        ON Object.Object = Sums.Object
);
June 15, 2015 at 4:09 pm
Lynn Pettis (6/15/2015)
No promises, but here is one possible rewrite:
Thanks for taking the time to do this, but for some reason it actually made performance worse with both methods: the dynamic query took 1:43 and the static query took 2-5 seconds. I was, however, able to get the static query down to 1 second by removing the year() function, per your earlier suggestion.
I'm still leaning towards this issue having something to do with the CTE/tally table/cross apply in the outer query. The execution plan shows the same operation taking up the majority (~85%) of the cost regardless of whether the query takes 1 second or 1:30 to run.
Here's the outer function I'm currently working with (changing "where" to "top" reduced the dynamic query to 20 seconds, which is still 5-10x longer than running the static query). I was not able to move the cross apply onto nums, since the subquery depends on num. I also tried converting this to a multistatement TVF and declaring the variables locally, which I've read helps in some cases, but that made performance worse as well.
ALTER FUNCTION [dbo].[history] (
    @Since DATE,
    @EarliestDated DATE = '6/1/2014',
    @LatestDated DATE = NULL,
    @Number FLOAT,
    @Year INT,
    @Substitution INT = 5000000
)
RETURNS TABLE
AS
RETURN (
    WITH N0 AS (
        SELECT 1 AS n
        UNION ALL
        SELECT 1
    ),
    N1 AS (
        SELECT 1 AS n
        FROM N0 t1, N0 t2
    ),
    N2 AS (
        SELECT 1 AS n
        FROM N1 t1, N1 t2
    ),
    N3 AS (
        SELECT 1 AS n
        FROM N2 t1, N2 t2
    ),
    nums AS (
        SELECT TOP (DATEDIFF(day, @Since, GETDATE()) + 1)
            ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS num
        FROM N3
    )
    SELECT
        DATEADD(day, num - 1, @Since) AS thedate,
        *
    FROM nums
    CROSS APPLY (
        SELECT
            count(*) AS objects,
            sum(totalamt) AS totalamt,
            sum(size) AS totalsize
        FROM dbo.SubQuery(DATEADD(day, num - 1, @Since), @EarliestDated, @LatestDated, @Number, @Year, @Substitution)
        GROUP BY Number, Year
    ) thetable
)
June 15, 2015 at 4:38 pm
Maybe this:
ALTER FUNCTION [dbo].[history] (
    @Since DATE,
    @EarliestDated DATE = '6/1/2014',
    @LatestDated DATE = NULL,
    @Number FLOAT,
    @Year INT,
    @Substitution INT = 5000000
)
RETURNS TABLE
AS
RETURN (
    WITH
    N0(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) dt (n)), -- 10 rows
    N1(n) AS (SELECT 1 FROM N0 a CROSS JOIN N0 b),                                    -- 100 rows
    N2(n) AS (SELECT 1 FROM N1 a CROSS JOIN N1 b),                                    -- 10,000 rows
    Dates(TheDate) AS (
        SELECT TOP (DATEDIFF(day, @Since, GETDATE()) + 1)
            DATEADD(day, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, @Since)
        FROM N2 a CROSS JOIN N1 b -- up to 1,000,000 dates (would probably overflow the DATE data type)
    )
    SELECT
        d.TheDate,
        thetable.*
    FROM
        Dates d
    CROSS APPLY (
        SELECT
            count(*) AS objects,
            sum(totalamt) AS totalamt,
            sum(size) AS totalsize
        FROM
            dbo.SubQuery(d.TheDate, @EarliestDated, @LatestDated, @Number, @Year, @Substitution)
        GROUP BY
            Number,
            Year
    ) thetable
);
I still think the subquery function needs to be rewritten, and perhaps some indexing added to support it.
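If it helps, here is a speculative sketch of the kind of supporting indexes I would start with. The column choices are based purely on the predicates in the SubQuery code above, so the names may not match your real schema; test against your own workload before creating anything:
-- Speculative indexes based on the filters and join in the posted SubQuery function.
-- Column names (DatedDate, Number, Date, FinalSize, Object, TypeIndicator, TransDate, Amt)
-- are taken from that code and may not match the actual tables.
CREATE NONCLUSTERED INDEX IX_Object_DatedDate_Number_Date
    ON dbo.Object (DatedDate, Number, Date)
    INCLUDE (FinalSize);

CREATE NONCLUSTERED INDEX IX_transactions_Object_TypeIndicator_TransDate
    ON dbo.transactions (Object, TypeIndicator, TransDate)
    INCLUDE (Amt);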
June 16, 2015 at 8:26 am
Lynn Pettis (6/15/2015)
Maybe this: ... I still think the subquery function needs to be rewritten, and perhaps some indexing added to support it.
Gave this a try, but it really didn't change anything. It seems like no matter what I do, when I call the function from a parameterized query it takes dramatically longer. I added some indexes (eliminating that 80% key lookup from the query plan), which dropped the static call to 1 second consistently, but they had no impact on the parameterized call. I'd expect the two calls to perform the same or close to it, especially with the same query plan?
All of these changes have had some impact on the static call, but the only one that really helped the parameterized call was changing that "where" clause to "top".
I'm running 2008 R2 RTM and wondering whether that could have something to do with it. At this point I'm ready to write my application to just make the static call, but that will make things much more difficult when I eventually add a TVP for additional filtering.
June 16, 2015 at 12:10 pm
So it seems this is definitely an issue of parameter sniffing.
While OPTION (RECOMPILE) did nothing, OPTION (OPTIMIZE FOR UNKNOWN) brought the dynamic query down to 3 seconds. Still slower, but I can live with this.
Is there a more correct way to fix this, or would simply leaving that hint as part of the query be the best solution?
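For anyone finding this later, the hint just gets appended to the statement my application sends; a minimal sketch, assuming the parameterized form looks roughly like this (parameter names are illustrative):
-- The hint goes at the end of the parameterized statement. OPTIMIZE FOR UNKNOWN makes the
-- optimizer use average density statistics instead of the sniffed parameter values;
-- OPTION (RECOMPILE) would instead compile a fresh plan for each execution.
SELECT h.*
FROM dbo.history(@since, DEFAULT, DEFAULT, @number, @year, DEFAULT) AS h
OPTION (OPTIMIZE FOR UNKNOWN);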
July 7, 2015 at 3:25 pm
phosplait (6/16/2015)
So it seems this is definitely an issue of parameter sniffing. While OPTION (RECOMPILE) did nothing, OPTION (OPTIMIZE FOR UNKNOWN) brought the dynamic query down to 3 seconds. Still slower, but I can live with this.
Is there some more correct way to fix this, or would simply leaving that as part of the query be the best solution?
Just curious if you were to use a couple CTEs that create virtual tables with all the parameter values as records, whether that might eliminate the sniffing problem. It would add a join to every query where a parameter appears. It would go something like this:
WITH DATE_PARMS AS (
SELECT 'Since' AS PARM_NAME, @Since AS DATE_VALUE
UNION ALL
SELECT 'EarliestDated', @EarliestDated
UNION ALL
SELECT 'LatestDated', @LatestDated
),
FLOAT_PARMS AS (
SELECT 'Number' AS PARM_NAME, @Number AS FLOAT_VALUE
),
INT_PARMS AS (
SELECT 'Year' AS PARM_NAME, @Year AS INT_VALUE
UNION ALL
SELECT 'Substitution', @Substitution
)
...
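To show the join part, a hypothetical continuation of that WITH block might look like the snippet below; the Object table and DatedDate column are borrowed from the SubQuery code earlier in the thread, purely to illustrate the shape:
-- Hypothetical use of one of the parameter CTEs: each reference to a parameter becomes
-- a join against the CTE instead of a direct reference to the variable.
SELECT o.*
FROM Object o
CROSS JOIN (SELECT DATE_VALUE FROM DATE_PARMS WHERE PARM_NAME = 'EarliestDated') ep
WHERE o.DatedDate >= ep.DATE_VALUE;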
Steve (aka sgmunson)
July 8, 2015 at 4:33 am
phosplait (6/16/2015)
So it seems this is definitely an issue of parameter sniffing. While OPTION (RECOMPILE) did nothing, OPTION (OPTIMIZE FOR UNKNOWN) brought the dynamic query down to 3 seconds. Still slower, but I can live with this.
Is there some more correct way to fix this, or would simply leaving that as part of the query be the best solution?
The first posting of the History function looks like this:
CREATE FUNCTION [dbo].[history] (
@First DATE,
@DatedStart DATE = '6/1/2014',
@DatedEnd DATE = NULL,
@Number FLOAT,
@Year INT,
@Sub INT = 5000000
)
...
SELECT DATEADD(day, num - 1, @Sub) AS thedate,
which of course throws an error. I might be wrong, but this suggests that you're in the middle of designing the process rather than tuning a query that has already been identified as a bottleneck. If that's the case, you might want to reconsider the design: configure the inner iTVF as a bog-standard query and reconfigure the outer iTVF as a new function returning a bunch of dates. I'm confident you'd have a far less messy structure to work with. Rejigging your inner iTVF results in something like this:
SELECT
[Year] = YEAR(o.Date),
o.Object,
Amt = SUM(x.ModifiedSize)
FROM [Object] o
INNER JOIN transactions t
ON t.object = o.object
CROSS APPLY (
SELECT ModifiedSize = CASE t.TypeIndicator
WHEN 'P' THEN isnull(Amt, @Substitution) * - 1
WHEN 'S' THEN isnull(Amt, @Substitution)
ELSE 0 END
) x
WHERE o.DatedDate >= @EarliestDated
AND (@LatestDated IS NULL OR o.DatedDate <= @LatestDated)
AND (@Number IS NULL OR o.Number = @Number)
AND (@Year IS NULL OR YEAR(o.Date) = @Year)
AND o.FinalSize IS NOT NULL
AND t.TypeIndicator != 'D'
AND (@OnDate IS NULL OR t.TransDate <= @OnDate)
GROUP BY
YEAR(o.Date),
o.Object
which is simple, easy to adapt, and easy to mark as a problem catch-all query to which the usual caveats and cures apply. The point to remember is this: if you run the process with a single date and then with the widest date range which makes sense with your data, the two execution plans are unlikely to be the same.
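In case it helps, a minimal sketch of the standalone dates-returning iTVF I mentioned; the function name and the size of the tally are illustrative, so size it to the longest range you actually need:
-- Sketch of a separate inline TVF that only returns dates; name and row cap are illustrative.
CREATE FUNCTION dbo.DateRange (@StartDate DATE, @EndDate DATE)
RETURNS TABLE
AS
RETURN (
    WITH
    N0(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t (n)), -- 10 rows
    N1(n) AS (SELECT 1 FROM N0 a CROSS JOIN N0 b),                                   -- 100 rows
    N2(n) AS (SELECT 1 FROM N1 a CROSS JOIN N1 b)                                    -- 10,000 rows, roughly 27 years of days
    SELECT TOP (DATEDIFF(day, @StartDate, @EndDate) + 1)
           TheDate = DATEADD(day, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, @StartDate)
    FROM N2
);
-- The outer query then becomes a plain CROSS APPLY over this function, e.g.
-- SELECT d.TheDate, x.* FROM dbo.DateRange(@Since, GETDATE()) d CROSS APPLY (...) x;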
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden