April 1, 2008 at 12:02 pm
If anything, this discussion has whetted my appetite to learn more about this subject. I'm not convinced, but if, in the process, I become a condemner of SQL CLR, having started as a supporter, then so be it.
Heh, we are all here to learn, and sometimes heated exchanges like this get the search for knowledge going.
So, no hard feelings... 🙂
And yes, I do want to post any benchmark tests I do in this forum, and I'm looking forward to that debate! 🙂
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
April 1, 2008 at 12:08 pm
Look at that - I just got kicked upstairs to referee...hehe.
Not a problem...Just let me know when to go dust off the black and white checkered polo shirt and the whistle!!!!!
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 1, 2008 at 12:17 pm
Absolutely no hard feelings here, my friend! 🙂 Good heated debates like this either bring out the best in folks or the worst. You've easily been in the "best" category throughout this whole thing. Thank you!
Looking forward to trying to beat your production CLR... it'll be a hell of a race one way or another. If you want to try writing a CLR to beat some of my production T-SQL, I can be taught new tricks as well. Here's a place where I posted an example of my production T-SQL, a million row test data generator, table creation statements, and run results from my humble desktop box.
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
Matt, you won't have to screen scrape pieces of the code... I'll post a single code window for you if you agree to running the tests. I'm not even set up for CLRs and, besides, if I ran them, it would be like the mouse guarding the cheese! 😛
--Jeff Moden
Change is inevitable... Change for the better is not.
April 1, 2008 at 7:39 pm
Marios,
Some explorer once set off to prove the world was round. Zillions of people said they were crazy, that the world obviously was flat. No matter what the explorer said, they said "no, no, NO!", and we all know saying something three times makes it true.
So, calculating the volume of a sphere seemed like a good candidate to use for comparison since there's no existing T-SQL function that does that and T-SQL offers the same functions to compute intermediate values as .NET.
Both versions of the function were run multiple times against a table with 250,000 rows and the results are below.
So it looks like the CLR was over 4 times faster than the T-SQL equivalent in this example. At least it was on the three different servers I ran it on.
--T-SQL User Defined Function
SELECT col1, dbo.ufn_SphereVolume_v1(col1) AS Volume FROM Table1
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 19890 ms, elapsed time = 22596 ms.
--CLR User Defined Function
SELECT col1, dbo.ufn_SphereVolume_v2(col1) AS Volume FROM Table1
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 141 ms, elapsed time = 6320 ms.
Test setup
[code]
CREATE TABLE Table1 (col1 FLOAT NOT NULL)
SET NOCOUNT ON
DECLARE @i INT
SET @i = 250000
WHILE @i > 0
BEGIN
DECLARE @myval AS FLOAT
SET @myval = RAND() * 10000
INSERT INTO Table1 (col1) VALUES (@myval)
SET @i = @i - 1
END
-- Test Statement
SET STATISTICS TIME ON
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SELECT col1, dbo.ufn_SphereVolume_v1(col1) AS Volume FROM Table1
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SELECT col1, dbo.ufn_SphereVolume_v2(col1) AS Volume FROM Table1
[/code]
Volume of a Sphere : T-SQL User Defined Function
CREATE FUNCTION dbo.ufn_SphereVolume_v1(@radius FLOAT)
RETURNS FLOAT
AS
BEGIN
DECLARE @result FLOAT
SET @result = 4.0/3 * PI() * POWER(@radius,3)
RETURN @result
END
Volume of a Sphere : CLR User Defined Function
Imports System
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server

Partial Public Class UserDefinedFunctions
    ' NOTE: attribute line reconstructed; the original post showed only a stray line-continuation "_"
    <SqlFunction()> _
    Public Shared Function ufn_SphereVolume_v2(ByVal myRadius As SqlDouble) As SqlDouble
        Dim myValue As New SqlDouble(4.0 / 3 * Math.PI * Math.Pow(CType(myRadius, Double), 3))
        Return myValue
    End Function
End Class
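For completeness, the thread never shows how the VB function actually gets wired into SQL Server. A minimal deployment sketch would look like the following; the assembly name and file path here are assumptions, not something Todd posted.

```sql
-- Hypothetical deployment: assembly name and DLL path are placeholders
CREATE ASSEMBLY SphereVolume
FROM 'C:\assemblies\SphereVolume.dll'
WITH PERMISSION_SET = SAFE
GO
-- Bind a T-SQL function name to the compiled method
CREATE FUNCTION dbo.ufn_SphereVolume_v2 (@radius FLOAT)
RETURNS FLOAT
AS EXTERNAL NAME SphereVolume.UserDefinedFunctions.ufn_SphereVolume_v2
GO
```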
April 1, 2008 at 8:51 pm
Todd Engen (4/1/2008)
Marios,
Some explorer once set off to prove the world was round. Zillions of people said they were crazy, that the world obviously was flat. No matter what the explorer said, they said "no, no, NO!", and we all know saying something three times makes it true.
So, calculating the volume of a sphere seemed like a good candidate to use for comparison since there's no existing T-SQL function that does that and T-SQL offers the same functions to compute intermediate values as .NET.
Interesting - but sometimes the easiest answer is the best.
Add the sphere calculation as a direct calc, and "go for the million"
==================================================
CLR function
==================================================
SQL Server Execution Times:
CPU time = 1653 ms, elapsed time = 1759 ms.
==================================================
t-SQL function
==================================================
SQL Server Execution Times:
CPU time = 92750 ms, elapsed time = 182882 ms.
==================================================
t-SQL select with built-in functions
==================================================
SQL Server Execution Times:
CPU time = 1406 ms, elapsed time = 1527 ms.
==================================================
t-SQL select with no functions
==================================================
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 906 ms, elapsed time = 1144 ms.
So - function against function, sure. But taking out the penalties of firing functions 1M times will do the trick....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 1, 2008 at 10:41 pm
Thank you, Matt! You beat me to it!
--Jeff Moden
Change is inevitable... Change for the better is not.
April 2, 2008 at 7:10 am
Add the sphere calculation as a direct calc, and "go for the million"
Hi Matt,
Can you post the script you ran to get these results?
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
April 2, 2008 at 7:16 am
So - function against function, sure.
So are you guys conceding that a CLR UDF can be much faster than a T-SQL UDF, when it comes to a complex math calculation? Because this is what the results show.
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
April 2, 2008 at 7:26 am
Jeff Moden (4/1/2008)
Thank you, Matt! You beat me to it!
No problem - but I guess I need to stay impartial in my "referee" role. That was your "free pass"...:) You're "on your own" from now on in this thread (you called the challenge after all :))
==========================
The interesting thing I saw was that in .NET - PI() is a constant, whereas PI() in T-SQL is a non-precise function, and POWER() seems to suck as a system function. By being a non-precise function, it seems that PI() is eval'ed once per row in T-SQL (strange indeed).
That being said - Todd - for a function that was actually impressive. For something returning in <2s for a 1M rowset (I'm turning off output, so that time doesn't include actually returning the stuff to the screen in either case), that's fairly wicked fast.
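Taking Matt's observation at face value, the obvious follow-up experiment would be a revised T-SQL scalar function that bakes pi in as a literal and replaces POWER() with plain multiplication. This is a sketch under the assumption that PI() and POWER() are where the T-SQL version loses time; the name ufn_SphereVolume_v3 is hypothetical, not something posted in this thread.

```sql
-- Hypothetical v3: pi as a literal, POWER() replaced by plain multiplication
CREATE FUNCTION dbo.ufn_SphereVolume_v3 (@radius FLOAT)
RETURNS FLOAT
WITH SCHEMABINDING  -- lets SQL Server treat the function as deterministic
AS
BEGIN
    RETURN 4.0 / 3 * 3.14159265358979 * @radius * @radius * @radius
END
```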
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 2, 2008 at 7:31 am
The interesting thing I saw was that in .NET - PI() is a constant, whereas PI() in T-SQL is a non-precise function, and POWER() seems to suck as a system function. By being a non-precise function, it seems that PI() is eval'ed once per row in T-SQL (strange indeed).
So the results may be due to current limitations of the T-SQL math functions. Something for MS to improve on in future SQL Server versions...
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
April 2, 2008 at 7:40 am
Marios Philippopoulos (4/2/2008)
The interesting thing I saw was that in .NET - PI() is a constant, whereas PI() in T-SQL is a non-precise function, and POWER() seems to suck as a system function. By being a non-precise function, it seems that PI() is eval'ed once per row in T-SQL (strange indeed).
So the results may be due to current limitations of the T-SQL math functions. Something for MS to improve on in future SQL Server versions...
That's ultimately the challenge, isn't it? The function handling (be they built-in or user-defined) seems to be a challenge when running the outer query. Sure you can often achieve gains (like Todd demonstrated) by switching to something compiled rather than interpreted, but they're going to slow the execution down in other ways, just by the fact that it's a function.
So - it ultimately comes down to - how do I make something so much more efficient that I can overcome the baggage that comes with using it?
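One era-appropriate way to shed most of that baggage, for what it's worth, is an inline table-valued function: the optimizer expands it into the calling query, so there is no per-row scalar invocation at all. A sketch, with a hypothetical function name not taken from this thread:

```sql
-- Hypothetical inline TVF: expanded into the outer query plan, so no
-- per-row function call overhead
CREATE FUNCTION dbo.itvf_SphereVolume (@radius FLOAT)
RETURNS TABLE
AS
RETURN SELECT 4.0 / 3 * PI() * @radius * @radius * @radius AS Volume
GO
-- Usage: CROSS APPLY folds the calculation into the main plan
SELECT t.col1, v.Volume
FROM Table1 t
CROSS APPLY dbo.itvf_SphereVolume(t.col1) v
```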
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 2, 2008 at 7:59 am
Marios Philippopoulos (4/2/2008)
Add the sphere calculation as a direct calc, and "go for the million"
Hi Matt,
Can you post the script you ran to get these results?
Here you go:
SET NOCOUNT ON
--=====Set the data up
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].Table1') AND type in (N'U'))
drop table table1
CREATE TABLE Table1 (col1 FLOAT NOT NULL)
go
insert table1(col1)
select top 1000000 rand(checksum(newid()))
from sys.all_columns sc1, sys.all_columns sc2
go
-- Test Statement
declare @pi float(53)
declare @Tmp float(53)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET STATISTICS TIME ON
print replicate('=',50)
print 'T-SQL function'
print replicate('=',50)
--just do the assignment so we can see how long it takes to process
SELECT @Tmp=dbo.ufn_SphereVolume_v1(col1) FROM Table1
SET STATISTICS TIME OFF
print replicate('=',50)
print 'CLR function'
print replicate('=',50)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET STATISTICS TIME ON
--just do the assignment so we can see how long it takes to process
SELECT @Tmp=dbo.ufn_SphereVolume_v2(col1) FROM Table1
SET STATISTICS TIME OFF
print replicate('=',50)
print 'T-SQL select with built-in functions'
print replicate('=',50)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET STATISTICS TIME ON
--just do the assignment so we can see how long it takes to process
SELECT @Tmp=4.0/3 * PI() * POWER(col1,3) FROM Table1
SET STATISTICS TIME OFF
print replicate('=',50)
print 'T-SQL select with no built-in functions' --none executed 1M times that is
print replicate('=',50)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
SET STATISTICS TIME ON
--just do the assignment so we can see how long it takes to process
select @pi=pi()
SELECT @Tmp=4.0/3 * @pi * col1*col1*col1 FROM Table1
SET STATISTICS TIME OFF
Set NOCOUNT OFF
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
April 2, 2008 at 2:29 pm
Thanks Matt.
Here is another interesting test from http://microsoft.apress.com/asptodayarchive/74019/sql-clr-best-practices:
Comparing Performance
Gert now presented some performance comparisons. He took two custom UDFs, one of which summed over a long series of numbers, while the other simply returned NULL. Each function was coded separately in both SQL-CLR and T-SQL. He ran T-SQL code which timed various loops, each of which invoked one of these functions a large number of times. The results were surprising:
...
...
for the numeric-intensive function, SQL-CLR is far quicker.
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables
Persisting SQL Server Index-Usage Statistics with MERGE
Turbocharge Your Database Maintenance With Service Broker: Part 2
April 2, 2008 at 2:49 pm
All the code blocks in that link show up as little red x's for me...
The thing I'm concerned about is that the word "loops" was mentioned in the test... testing using a loop is like throwing gas on a fire to see if it's wet.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 2, 2008 at 2:51 pm
Marios Philippopoulos (4/2/2008)
Thanks Matt.
Here is another interesting test from http://microsoft.apress.com/asptodayarchive/74019/sql-clr-best-practices:
Comparing Performance
Gert now presented some performance comparisons. He took two custom UDFs, one of which summed over a long series of numbers, while the other simply returned NULL. Each function was coded separately in both SQL-CLR and T-SQL. He ran T-SQL code which timed various loops, each of which invoked one of these functions a large number of times. The results were surprising:
...
...
for the numeric-intensive function, SQL-CLR is far quicker.
Yup - that's been my observation too. But - as Jeff has often demonstrated - getting rid of all functions (or as many as you can) is almost always the fastest way to get it done.
When you compare functions against functions, compiled should often win over interpreted. Of course, to temper that - CLR tends to be far more memory-intensive, so it can't do everything in every scenario. I've managed to get the "this query has aborted, because of a resource limitation (memory)" message several times on SQLCLR, even after installing SP2 (it was just a memory leak, period, before that). So - CLR is not the panacea.
That being said - it seems you have to find areas where T-SQL is just plain "not very good", or something not efficiently done in T-SQL. Like string manipulation. It's not great at "math" things, but honestly - in most cases the formula is too "simple" a task to get past the function penalty (even after you mark the functions as deterministic and/or precise as appropriate).
So - something INTENSE and in some area T-SQL is so-so at, maybe; otherwise, not likely. You might get fairly close, though.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?