February 2, 2010 at 10:52 am
I am looking at a stored procedure that uses a lot of functions. When I look at the DMVs and DMFs, I see astronomical numbers in total_worker_time, total_logical_reads, and total_elapsed_time. I am not very familiar with functions, so can someone please explain why functions (and not all functions) would have performance issues?
February 2, 2010 at 10:59 am
Depends on the type of function, but a common error is to do data access inside a scalar function (for example). The idea sounds reasonable until you realise that the function is called once per matching row of the input, which may be quite large. The data access inside the function therefore also executes once per row, quite separately, and without the benefits of the set-based processing SQL Server is so good at.
In effect, it creates a socking great cursor, where the only available join back to the driving data set is a loop join. It is unutterably horrible.
Functions should be thought of as the mathematical type - not the programming language type. I can't think of one valid reason to do data access from inside one - and come to think of it, I can't think of a case where a T-SQL function would match the performance of a SQLCLR function either.
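To illustrate the once-per-row data access described above, here is a minimal sketch, assuming the AdventureWorks sample database. The function name dbo.GetProductName is hypothetical and exists only for this illustration; it is not from any real code base.

```sql
-- Hypothetical scalar function that does data access (illustration only)
CREATE FUNCTION dbo.GetProductName(@ProductID INT)
RETURNS NVARCHAR(50)
AS
BEGIN
    RETURN
    (
        SELECT P.Name
        FROM Production.Product AS P
        WHERE P.ProductID = @ProductID
    );
END;
GO
-- The lookup inside the function runs once for every row of SalesOrderDetail
SELECT SOD.SalesOrderID,
       ProductName = dbo.GetProductName(SOD.ProductID)
FROM Sales.SalesOrderDetail AS SOD;
GO
-- Set-based equivalent: a plain join the optimizer can process as a whole
SELECT SOD.SalesOrderID,
       ProductName = P.Name
FROM Sales.SalesOrderDetail AS SOD
JOIN Production.Product AS P
    ON P.ProductID = SOD.ProductID;
GO
-- Tidy up
DROP FUNCTION dbo.GetProductName;
GO
```

Comparing the two SELECT statements in sys.dm_exec_query_stats should show where the extra worker time and logical reads come from.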
Paul
February 2, 2010 at 10:51 pm
Paul,
I assume you are talking about data access within scalar functions, not inline table-valued functions. Data access through ITVFs can be ruthlessly efficient (best I could do to match "unutterably horrible").
Best regards,
Bob
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
February 2, 2010 at 11:05 pm
The Dixie Flatline (2/2/2010)
I assume you are talking about data access within scalar functions, not inline table-valued functions. Data access through ITVFs can be ruthlessly efficient (best I could do to match "unutterably horrible").
Hi Bob,
Yes - talking about scalar functions that do data access (I did mention the word in the first sentence, but not subsequently). In-line TVFs are quite different - the ITVF query plan is incorporated directly into the overall plan, and the whole thing can be optimized in the usual way. Nice.
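As a rough sketch of the in-line case, assuming the AdventureWorks sample database, the same kind of lookup can be written as an in-line TVF and joined with APPLY. The name dbo.ProductNameInline is hypothetical, used only for illustration.

```sql
-- Hypothetical in-line TVF that does data access (illustration only)
CREATE FUNCTION dbo.ProductNameInline(@ProductID INT)
RETURNS TABLE
AS
RETURN
    SELECT P.Name AS ProductName
    FROM Production.Product AS P
    WHERE P.ProductID = @ProductID;
GO
-- The function body is expanded into this query before optimization,
-- so the lookup can be planned like an ordinary join
SELECT SOD.SalesOrderID,
       PN.ProductName
FROM Sales.SalesOrderDetail AS SOD
CROSS APPLY dbo.ProductNameInline(SOD.ProductID) AS PN;
GO
-- Tidy up
DROP FUNCTION dbo.ProductNameInline;
GO
```

Looking at the actual execution plan should show a join between the two tables rather than a separate function call per row.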
Paul
February 2, 2010 at 11:13 pm
Curiosity question. Have you seen any documentation about the performance of scalar functions which do NOT do data access compared to inline table-valued functions which produce the same results? If not, I must do some experimenting.
February 3, 2010 at 1:26 am
The Dixie Flatline (2/2/2010)
Curiosity question. Have you seen any documentation about the performance of scalar functions which do NOT do data access compared to inline table-valued functions which produce the same results? If not, I must do some experimenting.
I blogged about this a little while back:
http://sqlblogcasts.com/blogs/sqlandthelike/archive/2009/10/15/udf-overhead-a-simple-example.aspx
It is important to note this too, though:
http://sqlblogcasts.com/blogs/sqlandthelike/archive/2009/11/24/the-observer-effect-in-action.aspx
February 3, 2010 at 4:39 am
The Dixie Flatline (2/2/2010)
Curiosity question. Have you seen any documentation about the performance of scalar functions which do NOT do data access compared to inline table-valued functions which produce the same results? If not, I must do some experimenting.
This is a very interesting question. As Mr. Ballantyne's tests show, the in-line TVF can produce the fastest possible plan. Some results from my machine, based very much on Dave's blog entries, but including a CLR scalar function too:
In-line T-SQL TVF: 85ms
CLR scalar function: 238ms
T-SQL scalar function: 447ms
(results are worker times - run the full script for more detail)
-- You need this sample database to run these tests
USE AdventureWorks;
GO
-- Turn off stuff we don't want to affect the results
SET NOCOUNT ON;
SET STATISTICS IO, TIME OFF;
GO
-- Reset the system
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREESYSTEMCACHE('ALL');
GO
-- Warm the data cache
DECLARE @m MONEY;
SELECT @m = UnitPrice
FROM Sales.SalesOrderDetail;
GO
-- CLR functionality required
IF NOT EXISTS(SELECT * FROM sys.configurations WHERE name = N'clr enabled' AND value_in_use = 1)
BEGIN
EXECUTE sp_configure 'clr enabled', 1;
RECONFIGURE;
END;
GO
-- Scalar function
CREATE FUNCTION Sales.CalcCommission(@Price MONEY)
RETURNS MONEY
WITH SCHEMABINDING
AS
BEGIN
RETURN (@Price/$100.00) * $5;
END;
GO
-- Inline TVF
CREATE FUNCTION Sales.InlineCalcCommission(@Price MONEY)
RETURNS TABLE
AS
RETURN SELECT (@Price/$100.00) * $5 AS Commission;
GO
-- CLR assembly
CREATE ASSEMBLY [Test]
AUTHORIZATION [dbo]
FROM 
WITH PERMISSION_SET = SAFE;
/*
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.None,SystemDataAccess = SystemDataAccessKind.None,IsDeterministic = true,IsPrecise = true)]
[return: SqlFacet(IsNullable = false)]
public static SqlMoney clrCalcCommission([SqlFacet(IsNullable = false)] SqlMoney Price)
{ return new SqlMoney(Price.Value * 0.05M); }
};
*/
GO
-- CLR scalar function
CREATE FUNCTION dbo.clrCalcCommission(@Price MONEY)
RETURNS MONEY
WITH RETURNS NULL ON NULL INPUT
EXTERNAL NAME Test.UserDefinedFunctions.clrCalcCommission;
GO
-- T-SQL Scalar function
DECLARE @Bitbucket MONEY;
SELECT @Bitbucket =
Sales.CalcCommission(UnitPrice)
FROM Sales.SalesOrderDetail;
GO
-- T-SQL inline TVF
DECLARE @Bitbucket MONEY;
SELECT @Bitbucket =
Commission
FROM Sales.SalesOrderDetail
CROSS
APPLY Sales.InlineCalcCommission(UnitPrice);
GO
-- CLR scalar function
DECLARE @Bitbucket MONEY;
SELECT @Bitbucket =
dbo.clrCalcCommission(UnitPrice)
FROM Sales.SalesOrderDetail;
GO
-- Results
SELECT [rank] = RANK() OVER (ORDER BY QS.total_elapsed_time ASC),
ST.text,
QS.execution_count,
elapsed_time_ms = QS.total_elapsed_time / 1000,
logical_reads = QS.total_logical_reads,
cpu_time_ms = QS.total_worker_time / 1000
FROM sys.dm_exec_query_stats QS
CROSS
APPLY sys.dm_exec_sql_text (QS.sql_handle) ST
WHERE ST.text LIKE '%@Bitbucket%' -- match the variable name exactly, so this also works under a case-sensitive collation
AND ST.text NOT LIKE '%sys.dm_exec_query_stats%'
ORDER BY
[rank] ASC;
GO
-- Tidy up
DROP FUNCTION Sales.CalcCommission;
DROP FUNCTION Sales.InlineCalcCommission;
DROP FUNCTION dbo.clrCalcCommission;
DROP ASSEMBLY Test;
-- End
So, the ITVF is fastest by quite some margin here. The reason, of course, is that the optimizer is able to completely remove the APPLY operation and place the ITVF computation directly in a single Compute Scalar in the final plan.
Whether this will always occur for more complex requirements is hard to say. I guess it depends on the optimizer - if it is able to omit the APPLY operation completely and represent the computation efficiently then it's hard to see how to beat the ITVF. I suppose I should also mention that ITVF solutions are required to express all the computational logic in a single SELECT statement.
If I get a minute, I might try some more complex examples and post them here if they are interesting.
Paul
February 3, 2010 at 6:45 pm
Thanks, Dave. Exactly the kind of information I was looking for.
February 3, 2010 at 8:23 pm
Thanks to you as well, Emperor Paulpatine.
I'm not surprised that the ITVFs are faster than the scalar functions, but I am surprised by the percentage difference for functions that don't access data themselves. I was also surprised by the CLR performance. I would have expected the overhead to be greater for calling CLR routines than for internal user functions, but obviously my expectations were misplaced. If you find anything interesting please share.
February 3, 2010 at 11:42 pm
The Dixie Flatline (2/3/2010)
Thanks to you as well, Emperor Paulpatine. I'm not surprised that the ITVFs are faster than the scalar functions, but I am surprised by the percentage difference for functions that don't access data themselves. I was also surprised by the CLR performance. I would have expected the overhead to be greater for calling CLR routines than for internal user functions, but obviously my expectations were misplaced. If you find anything interesting please share.
Cheers Bob - I will. On the subject of SQLCLR overhead for scalar functions, BOL says (under Performance of CLR Integration):
"CLR functions benefit from a quicker invocation path than that of Transact-SQL user-defined functions. Additionally, managed code has a decisive performance advantage over Transact-SQL in terms of procedural code, computation, and string manipulation. CLR functions that are computing-intensive and that do not perform data access are better written in managed code. Transact-SQL functions do, however, perform data access more efficiently than CLR integration."
February 4, 2010 at 7:22 am
Thanks Paul. I really do have to get on the CLR train, don't I? I will get to it right after I rewrite all of our scalar functions to be inline table-valued functions. I'm getting a lot of practice using CTEs to mimic variables.
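For anyone following along, here is a minimal sketch of what using CTEs to mimic variables might look like, assuming the AdventureWorks sample database. Both function names and the calculation itself are hypothetical, made up for illustration; they are not the functions being rewritten.

```sql
-- Hypothetical multi-step scalar function that uses variables
CREATE FUNCTION dbo.NetLineTotal(@Price MONEY, @Qty SMALLINT, @Discount MONEY)
RETURNS MONEY
AS
BEGIN
    DECLARE @Gross MONEY;
    DECLARE @Net MONEY;
    SET @Gross = @Price * @Qty;
    SET @Net = @Gross * ($1 - @Discount);
    RETURN @Net;
END;
GO
-- In-line equivalent: each CTE stands in for one variable assignment,
-- so the whole computation is still a single SELECT statement
CREATE FUNCTION dbo.NetLineTotalInline(@Price MONEY, @Qty SMALLINT, @Discount MONEY)
RETURNS TABLE
AS
RETURN
    WITH Gross AS
    (
        SELECT GrossAmount = @Price * @Qty
    ),
    Net AS
    (
        SELECT NetAmount = GrossAmount * ($1 - @Discount)
        FROM Gross
    )
    SELECT NetAmount
    FROM Net;
GO
-- Usage: the in-line version is called with APPLY rather than in the SELECT list
SELECT SOD.SalesOrderDetailID,
       N.NetAmount
FROM Sales.SalesOrderDetail AS SOD
CROSS APPLY dbo.NetLineTotalInline(SOD.UnitPrice, SOD.OrderQty, SOD.UnitPriceDiscount) AS N;
GO
-- Tidy up
DROP FUNCTION dbo.NetLineTotal;
DROP FUNCTION dbo.NetLineTotalInline;
GO
```

Chaining the steps as CTEs keeps each intermediate value named, much as the variables did, while leaving the optimizer free to fold everything into the calling query.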
February 4, 2010 at 7:38 am
The Dixie Flatline (2/4/2010)
Thanks Paul. I really do have to get on the CLR train, don't I?
No - my future consulting daily rates depend on the majority of skilled SQL Server people ignoring SQLCLR 😀
The Dixie Flatline (2/4/2010)
I will get to it right after I rewrite all of our scalar functions to be inline table-valued functions. I'm getting a lot of practice using CTEs to mimic variables.
Console yourself with the fact that your current work sounds a good deal more interesting than mine!
Luckily I work harder on 'fun stuff' like SSC than anything else - just don't tell anyone...
February 4, 2010 at 2:40 pm
Mum's the word.
February 4, 2010 at 2:51 pm
I think the snake's out of the bag on that one 🙂
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR