February 24, 2009 at 12:36 pm
Hi I am trying to use Cross Apply with 2 UDF's. The first returns time slices over a period (ie. all the begin and end dates for each month in a period). The second uses these begin and end dates to do some calculations, a perfect match for Cross Apply (you would think). But alas, the query is dog slow, I can generate the result set by looping in a mere fraction of the time. If anyone has any ideas, that would be great or could it be that cross apply is not all that it is cracked up to be?
DECLARE @periodStart SMALLDATETIME,
@periodEnd SMALLDATETIME,
@portfolio varchar(MAX),
@subbusunit varchar(MAX),
@interval VARCHAR(2)
SET @periodStart = '2009-01-01'
SET @periodEnd = '2009-02-23'
SET @portfolio = 'aof, dof, lyx, off, rof, sof, wtl'
SET @subbusunit = 'ARB1,ARBZ,ARIC,ARIZ,BARB,BARZ,ARS1,ARSZ'
SET @interval = 'wk'
SELECT *
FROM perf.udf_generatetimeseries(@periodStart, @periodEnd, @interval) AS t
CROSS APPLY perf.udf_portfolio_indicators(t.begin_date, t.end_date, @portfolio, @subbusunit)
Thanks:D
February 24, 2009 at 1:09 pm
CROSS APPLY works. I've even seen it increase performance. Have you looked at the execution plans? Are those inline or multi-statement UDF's?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 24, 2009 at 1:45 pm
Can you provide the code for the two UDFs?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 24, 2009 at 2:27 pm
Thanks for the reply Grant,
udf_GenerateTimeSeries is a multi-statement function, and udf_Portfolio_Indicators is an inline function that has nested function calls. There doesn't seem to be anything out of the ordinary looking over the execution plan. I should note that individually both of these functions return in under a second. When looping through the values provided in the first to run the second the result set also returns relatively quickly.
~Jeff
February 24, 2009 at 2:40 pm
GSquared,
Here they are:)
CREATE FUNCTION [Perf].[udf_GenerateTimeSeries](
@periodStart SMALLDATETIME,
@periodEnd SMALLDATETIME,
@interval VARCHAR(2)
)
RETURNS @tbl TABLE
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
Begin_Date SMALLDATETIME,
End_Date SMALLDATETIME
)
AS
BEGIN
DECLARE @start SMALLDATETIME,
@end SMALLDATETIME
IF @interval ='d'
BEGIN
INSERT INTO @tbl
SELECT a.dte Begin_Date, b.dte End_Date
FROM(
SELECT DATEADD(d, 1, DATEADD(d, -1*(N+1), @periodEnd)) dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(d, N, @periodStart)) ID
FROM tools.dbo.tally
WHERE N <=DATEDIFF(d, @periodStart, @periodEnd)
) a
INNER JOIN (
SELECT DATEADD(d, -1*N, @periodEnd) dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(d, N, @periodStart)) ID
FROM tools.dbo.tally
WHERE N <=DATEDIFF(d, @periodStart, @periodEnd)
) b ON a.ID = b.ID
END
ELSE IF @interval = 'wk'
BEGIN
SELECT @start = DATEADD(wk, -1, DATEADD(d, 7-DATEPART(dw, @periodStart), @periodStart))
SELECT @end = DATEADD(d, 6-DATEPART(dw, @periodEnd), @periodEnd)
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(wk, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(wk, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(wk, N, @start)) ID
FROM tools.dbo.tally
WHERE N < DATEDIFF(wk, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(wk, -1*N, @end) > @periodEnd THEN @periodEnd ELSE DATEADD(wk, -1*N, @end) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(wk, -1*N, @end)) ID
FROM tools.dbo.tally
WHERE N < DATEDIFF(wk, @Start, @End)
)b ON a.ID = b.ID
END
ELSE IF @interval = 'm'
BEGIN
SELECT @start = DATEADD(m, DATEDIFF(m, 0, @periodStart),0)
SELECT @end = DATEADD(d, -1, DATEADD(m, DATEDIFF(m, 0, @periodEnd)+1,0))
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(m, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(m, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(m, N, @start)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(m, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(m, -1*N, @end) > @periodEnd THEN @periodEnd ELSE DATEADD(m, -1*N, @end) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(m, -1*N, @end)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(m, @Start, @End)
)b ON a.ID = b.ID
END
ELSE IF @interval = 'q'
BEGIN
SELECT @start = DATEADD(q, DATEDIFF(q, 0, @periodStart),0)
SELECT @end = DATEADD(q, DATEDIFF(q, 0, @periodEnd)+1,0)--)
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(q, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(q, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(q, N, @start)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(q, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(d,-1,DATEADD(q, -1*N, @end)) > @periodEnd THEN @periodEnd ELSE DATEADD(d,-1,DATEADD(q, -1*N, @end)) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(d,-1,DATEADD(q, -1*N, @end))) ID
FROM tools.dbo.tally
WHERE N < DATEDIFF(q, @Start, @End)
)b ON a.ID = b.ID
END
ELSE IF @interval = 'yy'
BEGIN
SELECT @start = DATEADD(yy, DATEDIFF(yy, 0, @periodStart),0)
SELECT @end = DATEADD(d, -1, DATEADD(yy, DATEDIFF(yy, 0, @periodEnd)+1,0))
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(yy, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(yy, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(yy, N, @start)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(yy, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(yy, -1*N, @end) > @periodEnd THEN @periodEnd ELSE DATEADD(yy, -1*N, @end) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(yy, -1*N, @end)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(yy, @Start, @End)
)b ON a.ID = b.ID
END
RETURN
END
Whew~! And if that wasn't enough... here's the second
Alter FUNCTION perf.udf_Portfolio_Indicators(
@periodStart SMALLDATETIME,
@periodEnd SMALLDATETIME,
@portfolio varchar(MAX),
@subbusunit varchar(MAX)
)
RETURNS TABLE
AS RETURN
(
SELECT p.PNL, p.LMV, p.PNL/p.LMV ROLMV,
((CAST(p.Total AS DECIMAL(10,5))/p.Avg_Num)-1) * (365.0/(DATEDIFF(d, @periodStart, @periodEnd) + 1)) Turnover,
CAST(p.Winners AS DECIMAL(10,5))/CAST(p.Total AS DECIMAL(10,5)) BA,
p.Win_PNL/p.Win_LMV ROLMV_W,
p.Los_PNL/p.Los_LMV ROLMV_L,
p.[>50%], p.[40% to 50%], p.[30% to 40%], p.[20% to 30%], p.[10% to 20%], p.[0% to 10%],
p.[-10% to 0%], p.[-20% to -10%], p.[-30% to -20%], p.[-40% to -30%], p.[-50% to -40%], p.[< -50%]
FROM (
SELECT SUM(PNL) PNL,
SUM(Long_MV)/(DATEDIFF(d, MIN(begin_date), MAX(end_date))+1) LMV,
SUM(Total) Total,
SUM(Winner) Winners,
SUM(CASE WHEN winner = 1 THEN PNL ELSE 0 END) Win_PNL,
SUM(CASE WHEN winner = 1 THEN Long_MV ELSE 0 END)/(DATEDIFF(d, MIN(begin_date), MAX(end_date))+1) Win_LMV,
SUM(Loser) Losers,
SUM(CASE WHEN loser = 1 THEN PNL ELSE 0 END) Los_PNL,
SUM(CASE WHEN loser = 1 THEN Long_MV ELSE 0 END)/(DATEDIFF(d, MIN(begin_date), MAX(end_date))+1) Los_LMV,
DATEDIFF(d, MIN(begin_date), MAX(end_date))+1 days,
SUM(CASE WHEN Tier = 0 THEN 1 ELSE 0 END) [9998],
SUM(CASE WHEN Tier = 1 THEN 1 ELSE 0 END) [>50%],
SUM(CASE WHEN Tier = 2 THEN 1 ELSE 0 END) [40% to 50%],
SUM(CASE WHEN Tier = 3 THEN 1 ELSE 0 END) [30% to 40%],
SUM(CASE WHEN Tier = 4 THEN 1 ELSE 0 END) [20% to 30%],
SUM(CASE WHEN Tier = 5 THEN 1 ELSE 0 END) [10% to 20%],
SUM(CASE WHEN Tier = 6 THEN 1 ELSE 0 END) [0% to 10%],
SUM(CASE WHEN Tier = 7 THEN 1 ELSE 0 END) [-10% to 0%],
SUM(CASE WHEN Tier = 8 THEN 1 ELSE 0 END) [-20% to -10%],
SUM(CASE WHEN Tier = 9 THEN 1 ELSE 0 END) [-30% to -20%],
SUM(CASE WHEN Tier = 10 THEN 1 ELSE 0 END)[-40% to -30%],
SUM(CASE WHEN Tier = 11THEN 1 ELSE 0 END) [-50% to -40%],
SUM(CASE WHEN Tier = 12 THEN 1 ELSE 0 END)[< -50%],
SUM(CASE WHEN Total = 1 THEN Strategy_Days/(DATEDIFF(d, @periodStart, @periodEnd)+1) ELSE 0 End) Avg_Num
FROM perf.udf_portfolio_indicators_strategy (@periodStart, @periodEnd, @portfolio) p
INNER JOIN tools.dbo.udf_parselist(@subbusunit, ',') list ON p.subbus_unit_code = list.token
) p
CROSS JOIN (
SELECT Sum(Capital)/COUNT(DISTINCT rundate) Capital
FROM perf.Allocated_Capital_View ac
INNER JOIN tools.dbo.udf_parselist(@portfolio, ',') p ON ac.portfolio = p.token
INNER JOIN tools.dbo.udf_parselist(@subbusunit, ',') s ON ac.subbusunit = s.token
WHERE Rundate BETWEEN @periodStart AND @periodEnd
) ac
)
Thanks for taking the time to take a look!
~Jeff
February 24, 2009 at 3:07 pm
You don't get the speed benefits of cross apply unless the UDF is a an inline table valued function. Because an inline table valued function is a single query, the optimizer can roll it into the execution plan like a view or cte. Multi-statement functions wind up being RBAR.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 24, 2009 at 3:08 pm
I can't help but thinking that cross applying an aggregate query has got to be murder (especially an aggregate query CROSS JOINED to yet another aggregate query, and then pollenated by a table variable to pull out various subsets (thereby running those aggregates multiple times)).
You have to remember that these functions are putting out table variables, which will essentially act as completely unindexed heaps. The fact that you're using functions to return tables within functions that return tables is going to essentially kill any hope for decent perf you might have....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
February 24, 2009 at 3:48 pm
JeffJ (2/24/2009)
GSquared,Here they are:)
CREATE FUNCTION [Perf].[udf_GenerateTimeSeries](
@periodStart SMALLDATETIME,
@periodEnd SMALLDATETIME,
@interval VARCHAR(2)
)
RETURNS @tbl TABLE
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
Begin_Date SMALLDATETIME,
End_Date SMALLDATETIME
)
AS
BEGIN
DECLARE @start SMALLDATETIME,
@end SMALLDATETIME
IF @interval ='d'
BEGIN
INSERT INTO @tbl
SELECT a.dte Begin_Date, b.dte End_Date
FROM(
SELECT DATEADD(d, 1, DATEADD(d, -1*(N+1), @periodEnd)) dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(d, N, @periodStart)) ID
FROM tools.dbo.tally
WHERE N <=DATEDIFF(d, @periodStart, @periodEnd)
) a
INNER JOIN (
SELECT DATEADD(d, -1*N, @periodEnd) dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(d, N, @periodStart)) ID
FROM tools.dbo.tally
WHERE N <=DATEDIFF(d, @periodStart, @periodEnd)
) b ON a.ID = b.ID
END
ELSE IF @interval = 'wk'
BEGIN
SELECT @start = DATEADD(wk, -1, DATEADD(d, 7-DATEPART(dw, @periodStart), @periodStart))
SELECT @end = DATEADD(d, 6-DATEPART(dw, @periodEnd), @periodEnd)
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(wk, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(wk, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(wk, N, @start)) ID
FROM tools.dbo.tally
WHERE N < DATEDIFF(wk, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(wk, -1*N, @end) > @periodEnd THEN @periodEnd ELSE DATEADD(wk, -1*N, @end) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(wk, -1*N, @end)) ID
FROM tools.dbo.tally
WHERE N < DATEDIFF(wk, @Start, @End)
)b ON a.ID = b.ID
END
ELSE IF @interval = 'm'
BEGIN
SELECT @start = DATEADD(m, DATEDIFF(m, 0, @periodStart),0)
SELECT @end = DATEADD(d, -1, DATEADD(m, DATEDIFF(m, 0, @periodEnd)+1,0))
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(m, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(m, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(m, N, @start)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(m, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(m, -1*N, @end) > @periodEnd THEN @periodEnd ELSE DATEADD(m, -1*N, @end) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(m, -1*N, @end)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(m, @Start, @End)
)b ON a.ID = b.ID
END
ELSE IF @interval = 'q'
BEGIN
SELECT @start = DATEADD(q, DATEDIFF(q, 0, @periodStart),0)
SELECT @end = DATEADD(q, DATEDIFF(q, 0, @periodEnd)+1,0)--)
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(q, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(q, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(q, N, @start)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(q, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(d,-1,DATEADD(q, -1*N, @end)) > @periodEnd THEN @periodEnd ELSE DATEADD(d,-1,DATEADD(q, -1*N, @end)) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(d,-1,DATEADD(q, -1*N, @end))) ID
FROM tools.dbo.tally
WHERE N < DATEDIFF(q, @Start, @End)
)b ON a.ID = b.ID
END
ELSE IF @interval = 'yy'
BEGIN
SELECT @start = DATEADD(yy, DATEDIFF(yy, 0, @periodStart),0)
SELECT @end = DATEADD(d, -1, DATEADD(yy, DATEDIFF(yy, 0, @periodEnd)+1,0))
INSERT INTO @tbl
SELECT a.dte, b.dte
FROM (
SELECT CASE WHEN DATEADD(yy, N, @start) < @periodStart THEN @periodStart ELSE DATEADD(yy, N, @start) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(yy, N, @start)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(yy, @Start, @End)
) a
INNER JOIN (
SELECT CASE WHEN DATEADD(yy, -1*N, @end) > @periodEnd THEN @periodEnd ELSE DATEADD(yy, -1*N, @end) END dte,
RANK() OVER (PARTITION BY 1 ORDER BY DATEADD(yy, -1*N, @end)) ID
FROM tools.dbo.tally
WHERE N <= DATEDIFF(yy, @Start, @End)
)b ON a.ID = b.ID
END
RETURN
END
Whew~! And if that wasn't enough... here's the second
Alter FUNCTION perf.udf_Portfolio_Indicators(
@periodStart SMALLDATETIME,
@periodEnd SMALLDATETIME,
@portfolio varchar(MAX),
@subbusunit varchar(MAX)
)
RETURNS TABLE
AS RETURN
(
SELECT p.PNL, p.LMV, p.PNL/p.LMV ROLMV,
((CAST(p.Total AS DECIMAL(10,5))/p.Avg_Num)-1) * (365.0/(DATEDIFF(d, @periodStart, @periodEnd) + 1)) Turnover,
CAST(p.Winners AS DECIMAL(10,5))/CAST(p.Total AS DECIMAL(10,5)) BA,
p.Win_PNL/p.Win_LMV ROLMV_W,
p.Los_PNL/p.Los_LMV ROLMV_L,
p.[>50%], p.[40% to 50%], p.[30% to 40%], p.[20% to 30%], p.[10% to 20%], p.[0% to 10%],
p.[-10% to 0%], p.[-20% to -10%], p.[-30% to -20%], p.[-40% to -30%], p.[-50% to -40%], p.[< -50%]
FROM (
SELECT SUM(PNL) PNL,
SUM(Long_MV)/(DATEDIFF(d, MIN(begin_date), MAX(end_date))+1) LMV,
SUM(Total) Total,
SUM(Winner) Winners,
SUM(CASE WHEN winner = 1 THEN PNL ELSE 0 END) Win_PNL,
SUM(CASE WHEN winner = 1 THEN Long_MV ELSE 0 END)/(DATEDIFF(d, MIN(begin_date), MAX(end_date))+1) Win_LMV,
SUM(Loser) Losers,
SUM(CASE WHEN loser = 1 THEN PNL ELSE 0 END) Los_PNL,
SUM(CASE WHEN loser = 1 THEN Long_MV ELSE 0 END)/(DATEDIFF(d, MIN(begin_date), MAX(end_date))+1) Los_LMV,
DATEDIFF(d, MIN(begin_date), MAX(end_date))+1 days,
SUM(CASE WHEN Tier = 0 THEN 1 ELSE 0 END) [9998],
SUM(CASE WHEN Tier = 1 THEN 1 ELSE 0 END) [>50%],
SUM(CASE WHEN Tier = 2 THEN 1 ELSE 0 END) [40% to 50%],
SUM(CASE WHEN Tier = 3 THEN 1 ELSE 0 END) [30% to 40%],
SUM(CASE WHEN Tier = 4 THEN 1 ELSE 0 END) [20% to 30%],
SUM(CASE WHEN Tier = 5 THEN 1 ELSE 0 END) [10% to 20%],
SUM(CASE WHEN Tier = 6 THEN 1 ELSE 0 END) [0% to 10%],
SUM(CASE WHEN Tier = 7 THEN 1 ELSE 0 END) [-10% to 0%],
SUM(CASE WHEN Tier = 8 THEN 1 ELSE 0 END) [-20% to -10%],
SUM(CASE WHEN Tier = 9 THEN 1 ELSE 0 END) [-30% to -20%],
SUM(CASE WHEN Tier = 10 THEN 1 ELSE 0 END)[-40% to -30%],
SUM(CASE WHEN Tier = 11THEN 1 ELSE 0 END) [-50% to -40%],
SUM(CASE WHEN Tier = 12 THEN 1 ELSE 0 END)[< -50%],
SUM(CASE WHEN Total = 1 THEN Strategy_Days/(DATEDIFF(d, @periodStart, @periodEnd)+1) ELSE 0 End) Avg_Num
FROM perf.udf_portfolio_indicators_strategy (@periodStart, @periodEnd, @portfolio) p
INNER JOIN tools.dbo.udf_parselist(@subbusunit, ',') list ON p.subbus_unit_code = list.token
) p
CROSS JOIN (
SELECT Sum(Capital)/COUNT(DISTINCT rundate) Capital
FROM perf.Allocated_Capital_View ac
INNER JOIN tools.dbo.udf_parselist(@portfolio, ',') p ON ac.portfolio = p.token
INNER JOIN tools.dbo.udf_parselist(@subbusunit, ',') s ON ac.subbusunit = s.token
WHERE Rundate BETWEEN @periodStart AND @periodEnd
) ac
)
Thanks for taking the time to take a look!
~Jeff
You should seriously need to look into a "CALENDAR" Table!
* Noel
February 24, 2009 at 4:05 pm
Everyone else has defined it already. You're getting RBAR because of the multi-statement UDF, not to mention that when you cross apply this query, it has no stats, so it's treated a single row statement even though it's returning, I assume, many rows (you mention looping). Multi-statement UDF's are notorious.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
February 25, 2009 at 5:56 am
Thank you to everyone who replied. I will definitely look into creating a calendar table and rewok the functions to be inline. I was building on some previously built code but it looks like its time to revisit it.
~Jeff
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply