April 28, 2008 at 11:57 pm
Comments posted to this topic are about the item Usage of Functions in Stored Procedures
April 29, 2008 at 1:01 am
The article was nice. I hope the code is right, but when I tried to run it I got an error on the INSERT statement.
BEGIN
INSERT INTO CaseDetail VALUES(200,300,1)
SET @count = @count -1
END
I don't have time to check the error right now, but do we need to specify the column names in the INSERT syntax? I may have made some silly mistake as well.
April 29, 2008 at 2:11 am
The final select statement where you use a local variable is different from the pure select statement; it only works because you've got all the values exactly the same. So the examples you give are broken because you've been lazy and used exactly the same values.
If your data really is like that something's wrong with the data structure.
All this article shows is how you should never use functions unless you know exactly what they do, precisely the opposite of what you say at the beginning. In this case the function is so useless you might as well throw it away.
My attitude with functions is to avoid them like the plague, as they break the rule of thinking in sets while programming SQL.
April 29, 2008 at 2:43 am
Hi Paul,
The following chunk of code worked fine for me.
CREATE TABLE CaseDetail (CaseId INT IDENTITY(1,1), QTY NUMERIC(14,4), weightedQty NUMERIC(14,4), ShipmentId INT)
-- We assume that these cases belong to the shipment with id 1
DECLARE @count INT
SET @count =15000
WHILE (@count >0)
BEGIN
INSERT INTO CaseDetail VALUES(200,300,1)
SET @count = @count -1
END
select count(caseid) from CaseDetail
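For what it's worth, the same 15,000 rows can also be loaded without the loop. This is just a sketch (the cross join of sys.all_objects is used purely as a row generator, and it assumes SQL Server 2005 or later):
-- Set-based alternative to the WHILE loop: one INSERT for all 15,000 rows
INSERT INTO CaseDetail (QTY, weightedQty, ShipmentId)
SELECT TOP (15000) 200, 300, 1
FROM sys.all_objects a
CROSS JOIN sys.all_objects b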
Thank you and have a nice day,
Sreeju
April 29, 2008 at 5:09 am
There may be a few instances where functions can speed things along, but I've found that most of the time they hinder performance rather than help it.
The times they've been useful to me are when I inherit something that's been designed poorly and there's no $$$ to redesign it, but I have to make something work.
April 29, 2008 at 5:12 am
Hi Sreeju
I may be wrong; I may have typed it wrongly. I will check it again. Thanks for the info.
April 29, 2008 at 5:37 am
I've been burned way too many times by poor use of functions to be any kind of advocate for them. However, if you really want to drill down on them, then I recommend Andy Novick's book. He covers appropriate use of functions very well.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 29, 2008 at 7:34 am
My own rule of thumb is to avoid functions that contain queries. Yes, they make for easy reading, but it's far too likely that they will get reused in SELECT queries. RBAR is bad.
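Just to illustrate what that ends up looking like, here's a sketch using the CaseDetail table from earlier in this thread; the function itself is made up:
-- Hypothetical scalar function that hides a query (sketch only)
CREATE FUNCTION dbo.fn_GetShipmentTotal (@ShipmentId INT)
RETURNS NUMERIC(14,4)
AS
BEGIN
    DECLARE @Total NUMERIC(14,4)
    SELECT @Total = SUM(QTY)
    FROM dbo.CaseDetail
    WHERE ShipmentId = @ShipmentId
    RETURN @Total
END
GO
-- Reused in a SELECT, the hidden query runs once for every row processed (RBAR)
SELECT DISTINCT cd.ShipmentId, dbo.fn_GetShipmentTotal(cd.ShipmentId) AS Total
FROM dbo.CaseDetail cd
GO
-- The set-based equivalent touches the table once
SELECT cd.ShipmentId, SUM(cd.QTY) AS Total
FROM dbo.CaseDetail cd
GROUP BY cd.ShipmentId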
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
April 29, 2008 at 8:06 am
What about matrix type calculations?
How can they be accelerated?
What I mean is, you have a bunch (about a dozen) of tables with charts of tax rates, freight rates, etc.
Each row will have different values from each table, and then they get computed for the final result.
We have functions for various lookups and calculations; some will be repeated, some won't.
How can you improve this type of process?
I can see your technique for the 1 function, but I'm not sure how it will scale in a large stored proc with lots of function calls and computations.
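To give a feel for it, our current process looks roughly like this; the table, column, and function names here are made up for illustration, not our real ones:
-- Current pattern: one scalar lookup function call per rate, per row
SELECT o.OrderId,
       o.Amount,
       o.Amount * dbo.fn_GetTaxRate(o.State)      AS Tax,
       o.Weight * dbo.fn_GetFreightRate(o.Zone)   AS Freight
FROM dbo.Orders o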
April 29, 2008 at 8:08 am
I don't know.... I've run both your inline and variable procedures and on my machine they both take about 470ms, give or take a few ms. In fact, if anything I've found the variable method slightly slower by up to 50ms.
Did you run DBCC DROPCLEANBUFFERS before running each procedure? If not, you might find that the cached results are interfering with your numbers.
Also, if you check out the io statistics, more logical reads are performed with the variable method.
I'm not sure I see the benefit in running the exact same query twice, as you're doing in the variable method. Unless I'm missing something, which is quite possible 😉
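For reference, the reset I run between tests looks something like this (on a test server only; it clears the data and plan caches so each timing starts cold):
CHECKPOINT                  -- flush dirty pages so the clean buffers can be dropped
DBCC DROPCLEANBUFFERS       -- empty the buffer pool (data cache)
DBCC FREEPROCCACHE          -- throw away cached execution plans
SET STATISTICS IO ON
SET STATISTICS TIME ON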
April 29, 2008 at 8:11 am
I came up with a different plan from yours for "best solution":
set nocount on
go
CREATE TABLE CaseDetails
(CaseId INT IDENTITY(1,1) primary key,
QTY NUMERIC(14,4),
weightedQty NUMERIC(14,4),
ShipmentId INT,
Total as Qty + WeightedQty)
go
raiserror('First Insert', 10, 1) with nowait
go
insert into dbo.CaseDetails(qty, weightedqty, shipmentid)
select 200, 300, 1
from common.dbo.bignumbers
where number between 1 and 15000
go
raiserror('Second Insert', 10, 1) with nowait
go
insert into dbo.CaseDetails(qty, weightedqty, shipmentid)
select 200, 300, numbers.number
from common.dbo.numbers
cross join common.dbo.numbers n2
where numbers.number between 2 and 10000
go
raiserror('Index', 10, 1) with nowait
go
create index IDX_CaseDetails_TotalQty on dbo.casedetails(shipmentid)
include (qty, weightedqty, total)
go
set statistics io on
set statistics time on
print 'First Method'
select qty, weightedqty, qty + weightedqty
from dbo.casedetails
where shipmentid = 1
/*
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'CaseDetails'. Scan count 1, logical reads 82,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 231 ms.
*/
print 'Second Method'
declare @Total int
select @total = qty + weightedqty
from dbo.casedetails
where shipmentid = 1
select qty, weightedqty, @total
from dbo.casedetails
where shipmentid = 1
/*
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'CaseDetails'. Scan count 1, logical reads 82,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 8 ms.
Table 'CaseDetails'. Scan count 1, logical reads 82,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 230 ms.
*/
print 'Third Method'
select qty, weightedqty, total
from dbo.casedetails
where shipmentid = 1
/*
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
Table 'CaseDetails'. Scan count 1, logical reads 82,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 226 ms.
*/
As you'll see, I added a calculated column to the test table. Then I inserted 15,000 rows for shipping ID 1, and 100-million rows of data for 10,000 other shipping IDs. (I prefer tests in bigger tables.) Then I indexed the table.
On my computer, populating the tables with over 100-million rows of data took quite a while, but by using a Numbers table solution it wasn't too bad (under an hour). That might have been overkill, I have to admit. If you decide to test this, I recommend about 10-million rows. It will be easier to deal with and perfectly valid for testing.
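If anyone wants to reproduce this and doesn't have a Numbers table handy, something like the following builds one. It's only a sketch, and it assumes the common database from my script above exists (otherwise just build the table in your test database):
-- Build a 1-to-1,000,000 Numbers table using system views as a row source
SELECT TOP (1000000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
INTO   common.dbo.numbers
FROM   sys.all_objects a
CROSS JOIN sys.all_objects b
CROSS JOIN sys.all_objects c

CREATE UNIQUE CLUSTERED INDEX IX_numbers ON common.dbo.numbers (number)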
I didn't test the inline function, since I already know that will fail horribly.
With the calculated column and the right index, the CPU time on these queries goes WAY down.
The first test was a straight-up calculation, without the calculated column. It used the index, and took 16 milliseconds of CPU time and 1 table scan.
Second test used the "assign the number to a variable" method, as outlined in the article. That used the index, took 31 milliseconds, and took 2 table scans (1 to assign the variable, 1 to select the data).
The final test used the calculated column. That took under 1 millisecond of CPU time (too fast to measure) and 1 table scan.
(All of these "table scans" are just the IO stats. They're actually index seeks in the execution plan.)
Rather than using an inline function for this kind of calculation, I highly recommend non-persisted, indexed, calculated columns. I have used them for many things, and they are very fast and don't slow down insert/update/delete by too much (I guess if the calculation is complex enough and the insert/update is big enough, it might slow things down measurably, but I haven't seen that yet). Obviously, they do have some cost for those operations, and they do add to the size of the index, but the cost is small compared to the benefit.
This way, you don't have to count on the data being the same in all rows, you cut IO in half, and you get your select done so fast that "statistics time" can't really measure it. And that's in a table with 100-million rows.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 29, 2008 at 8:56 am
There is a big difference between writing:
>> Unoptimised SP method for lookup using local var
create procedure mysum (@myid int)
as
declare @myint int
set @myint= @myid
select x,y,z from mytable
where mylookupfilter = @myint
and:
>>Optimised using input variable allows select statement
>> to be properly cached
create procedure mysum (@myid int)
as
select x,y,z from mytable
where mylookupfilter = @myid
To test the true performance, you would need to structure the SP to use the input parameter in the WHERE clause filter; otherwise the query is not cached the same way. I suspect you will find the performance of the set-based SQL procedure improves even further.
My mantra has always been: "functions, like cursors, should be avoided if at all possible".
I have certainly seen a good benefit from removing functions and converting to pure, optimised, set-based SQL solutions.
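When the function can't simply be deleted, one conversion that has worked well for me is rewriting the scalar UDF as an inline table-valued function and calling it with CROSS APPLY, so the optimiser can fold it into the main query. A sketch only, using the CaseDetail table from earlier in this thread and a made-up function name (needs SQL Server 2005 or later):
-- Inline table-valued function: the body is a single SELECT, so the
-- optimiser expands it into the calling query instead of running it per row
CREATE FUNCTION dbo.tfn_WeightedTotal (@Qty NUMERIC(14,4), @WeightedQty NUMERIC(14,4))
RETURNS TABLE
AS
RETURN (SELECT @Qty + @WeightedQty AS Total)
GO
SELECT cd.QTY, cd.weightedQty, t.Total
FROM dbo.CaseDetail cd
CROSS APPLY dbo.tfn_WeightedTotal(cd.QTY, cd.weightedQty) AS t
WHERE cd.ShipmentId = 1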
April 29, 2008 at 9:05 am
d_sysuk (4/29/2008)
There is a big difference between writing:
>> Unoptimised SP method for lookup using local var
create procedure mysum (@myid int)
as
declare @myint int
set @myint= @myid
select x,y,z from mytable
where mylookupfilter = @myint
and:
>>Optimised using input variable allows select statement
>> to be properly cached
create procedure mysum (@myid int)
as
select x,y,z from mytable
where mylookupfilter = @myid
The query plan on those two will be exactly the same. The first one just adds a fraction of a microsecond for allocating some memory for a variable and assigning a value to it, which the second one doesn't have to do. The actual query is the same.
Maybe I'm misunderstanding what you're writing.
Edit: Just to be sure, I just tested both of these concepts on my 100-million row table. Two procs, exact same execution plan, exact same execution time, IO, etc. First run on each took a few seconds, second and subsequent runs on each took too little time to measure.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 29, 2008 at 9:37 am
There is a more subtle point I'm making about the query plan optimisation that is saved.
A greater man than I has explained this quite nicely; have a quick look at the following (a Google search should pull up the resources):
MSDN I smell a parameter. Tips, Tricks, and Advice from the SQL Server Query Optimization Team
MSDN Ken Henderson plan stability in SQL Server 2000
It goes into a lot more detail than I can in this forum.
The example posted was quite cut down. In the given example, a name lookup is used to get the PK, which is then used to get the result set back: two queries onto the source table, when this should really be a single set-based query using the name, as in the poster's article, to get the desired result set, without the need to look up the ID value, assign it to a local variable, and then use the local variable as a lookup.
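In other words, the difference between the two-step pattern and the single set-based query looks roughly like this (the dbo.Shipments table and ShipmentName column are hypothetical, just to show the shape of it):
-- Two queries onto the source tables: look up the PK, then use it
DECLARE @ShipmentId INT
SELECT @ShipmentId = ShipmentId
FROM dbo.Shipments
WHERE ShipmentName = 'Shipment 1'

SELECT QTY, weightedQty
FROM dbo.CaseDetail
WHERE ShipmentId = @ShipmentId

-- Single set-based query: let the join resolve the name
SELECT cd.QTY, cd.weightedQty
FROM dbo.CaseDetail cd
JOIN dbo.Shipments s ON s.ShipmentId = cd.ShipmentId
WHERE s.ShipmentName = 'Shipment 1'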
April 29, 2008 at 11:13 am
d_sysuk (4/29/2008)
There is a more subtle point I'm making about the query plan optimisation that is saved. A greater man than I has explained this quite nicely; have a quick look at the following (a Google search should pull up the resources):
MSDN I smell a parameter. Tips, Tricks, and Advice from the SQL Server Query Optimization Team
MSDN Ken Henderson plan stability in SQL Server 2000
It goes into a lot more detail than I can in this forum.
The example posted was quite cut down. In the given example, a name lookup is used to get the PK, which is then used to get the result set back: two queries onto the source table, when this should really be a single set-based query using the name, as in the poster's article, to get the desired result set, without the need to look up the ID value, assign it to a local variable, and then use the local variable as a lookup.
I think you may have partially misinterpreted the articles you cite. I just read them, and what you outlined in your example isn't actually what's gone over in the articles.
Yes, setting a default can override certain aspects of query optimization. My preferred method for doing this is creating separate procs called by a "meta-proc".
What you have in your example doesn't actually accomplish this. I tested it, and the execution plans are identical, except for the name of the variable in the scalar operator in the seek predicates.
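For what it's worth, by "meta-proc" I mean a thin wrapper that branches to specialised procedures, so each branch compiles and caches its own plan. A rough sketch with made-up names:
-- Wrapper ("meta-proc") picks a specialised proc; each inner proc gets
-- its own cached plan suited to the rows it typically handles
CREATE PROCEDURE dbo.GetCaseDetails (@ShipmentId INT)
AS
BEGIN
    IF @ShipmentId = 1           -- the one very large shipment
        EXEC dbo.GetCaseDetails_BigShipment @ShipmentId = @ShipmentId
    ELSE
        EXEC dbo.GetCaseDetails_Typical @ShipmentId = @ShipmentId
END
The inner procs (also hypothetical) each contain the same SELECT, but because they are separate procedures they don't share a plan.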
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon