UDF's vs Procs

  • Crappy,

    I am really puzzled by the findings. Because in real life systems I have found many performance problems in UDF's....

    It is my job to solve those problems....

    Many of those problems are caused by functions. Furthermore if you have a performance bottleneck you cannot profile or trace the functions. That would be the main reason for me to stay away from them.

    I do agree with Antares to always go with the fastest solution. But for some reason feel that when dat amount of data increases

    the UDF looses it's speed.

    And Furthermore.... I haven't tested this but Crappy uses the UDF a little bit as a kind of view. Wouldn't an indexed view on SQL Advanced server be faster ?????

    Antares you don't have one of those advanced systems around....... ????

  • ok.....

    Now I have a bit of time to play around with the differences.

    When I use the UDF, it does a Constant Scan. I told it to use the Clustered index on the column but to no avail. What exactly is a constant scan? Good or Evil?

    I then told it to use a no-clustered index on a different column which I thought would slow it down as it has nothing to do with the WHERE but it still faster. There was a small drop in speed. So forcing SQL to use an index does have an effect but why does the Execution Plan not show the index being used?

    Code: ....From CCMakeModel (INDEX = IX_CCMakeModel) Where MakeCode = @MakeCode.....

    The table I am selecting from has 22858 rows in it. Small by most standards 🙂

    The Max(MakeCode) is 920. This is the column I am searching on. I ran my loop from 1 <= 920. This ensured SQL actually search for the data and found it (All results are different and correct)

    So far, two people have said that UDF's are bad, [Well] gave gave a reason the other was as usefull as ice to an eskimo. As is the rule, I'm gonna go for the fastest option...

    Lets hope and prey.

    My Question: Does anyone know how to force a UDF to use an index? Is a Constant scan bad? Seems fast enough.

    Cheers,

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • Well,

    I think your thinking of SQL Developer Edition or SQL Enterprise Edition with regards to Indexed views. Yes, I have a copy of each we either got thru purchasing directly or thru MSDN (worth the cost for those who put off) that we have setup for testing and playing around.

    Actually, after playing with a bit more if you go back to my post on why you will find it always does a Constant Scan when deterministic. It only runs once then stores the value for use later. Still trying to figure out the mitigating (probably mispelled) circumstances for changes in data and how it is effected but as Crappy stated he reuses the object in a loop (as in the expample) so it works quite nicely since the value returned is constant in relative terms anyway.

    I am pretty sure the UDF may suffer a performance lag as table size increases but not sure the Constant Scan is what is going on within the UDF. If so then even an index will not help as it will not use the index, so a definit exponential lag as the size increases. But still if you are looping tru a set as in the example enough times the lag may still not offset the fact the Proc executes each time. That is why I suggest run once and store the data as a variable even the data submitted to the SP. Then if the data changes you know to resubmit to the SP and get a new value for use. This would allow reuse but would take a bit of proper planning and care in checking the code that you do not step on your variables and are in the right scope (not complicated just have to keep in mind).

    Crappy try doing a test with the SP versus UDF like this and see what happens. Do UDF as is now. However delcare the value for you output to be received from yuort variable and do something like so.

    DECLARE @out VARCHAR(100)

    EXEC Test_CP @MakeCode = 'Value', @MakeDesc = @out OUTPUT

    Then to your while test loop like so.

    WHILE conditiontobemetforend

    BEGIN

    SELECT @out AS output_val

    END

    If I am right should do the same thing but actually outperform the UDF. But again if the Constant Scan listed in the execution plan from the UDF is not what is happening within the UDF itself and it does use an index it won't be any different. If it is correct ay still not be any different due to small size of table. But there has to be a break point somewhere, I am just curious at what point would one break away from the other and what conditions should make considering do either, lean one way or the other.

  • Hi,

    The problem is I can't "reuse" the variable.

    What I have is a loop getting vehicles, this function returns the vehicle type to me.

    If the table where normal, I could just join to it but alas.....

    Currently, the proc being execed sits in the loop as well. It get's passed two params, A quoteID and SeqNo (Vehicle number (Not an identity)) and returns the data. This is what is slowing the proc down. Having about 10 similiar procs in the loop. 'tis the reason I want to try UDF's.

    When you say it's caching the data, how can it as whenever I pass it a different MakeCode (@vCount - the loop's counter) it returns the correct make description. If it were caching the data, would it not return the first make description from the batch?

    Busy converting a proc now to USE UDF's....

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • The Constant Scan logical and physical operator introduces a constant row into a query. Meaning the value does not change.

    However if you change the input value it should change as the conditions for it have altered. So thus you data keeps in line.

    Don't have to test right now, but I would be curious to see what would happen if you run the same query with the same UDF parameters with no stopping point. Then while running change the underlying data that it would be hitting to see if it keeps up then. If what I do believe is going on and how it works is in line with my thoughts the value should remain the same until you change the input parameter or the query ends and is started again. Try and let me know what happens. Also, may have something else to track data changes while in loop so doesn't mean it won't change but I am curious if it will.

  • he he, the data changed... 🙂

    UDFCrispin

    UDFCrispin

    UDFCrispin

    >>> Point at which I updated it...

    UDFCrispin22

    UDFCrispin22

    UDFCrispin22

    The cool thing about the UDF is while 1 < 2 SQL is LOCKED!

    Someone want to sponser me a faster machine?

    k, so we now see that the data does actually change during the select (With the same param being passed (999))

    More ideas?

    Cheers,

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • Well nice to know they thought of that at least. Bet they are using data pointers thou so the referenced value is a memory location tied to the data pulled into the buffers. Then when changed it kept up.

    Sure would be nice if the underlying factors in the UDF query were exposed. I don't think the Constant Scan is the only thing going on personally. But once the data is found it keeps a reference to it somewhere. I will check with a friend of mine who is a C++ programmer to see what he thinks or if he can give me an idea of what exactly is going on in memory. Probably will just tell me to use the memory tools with VC++ to watch it myself. I will post if I find any additional data.

  • With regards to the Constant Scan, could that not be the actual call to the function and not what the function is doing?

    Example:

    If dbo.GetVehicleMakeByCode('410') <> ''

    Print 'Yes'

    Else

    Print 'No'

    The execution plan shows nothing.

    It would look as if you cannot see what SQL is doing in a function. Just trust MS that it works.... 🙂 RUN AWAY!

    Let's see what you mate says...

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • You see what i mean crappy.....

    Maybe you don't have a perf problem right now but if you get one later and you use functions, you cannot tell where the performance problems are....

    Then you have to rewrite a lot of stuff ( back to a sproc) to be able to find where the problem is.....

    That is why you should avoid them...

    Antares you were right about the edition....

    And crappy have you tried the view option ????

    Edited by - well0549 on 02/10/2003 05:25:39 AM

  • hmmm, I see what you mean. Yes, that could be a problem.

    But, I think I could life with it considering the increase (Apparent) that I am getting.

    What I am busy doing now is changing a proc (The slowest, around 4 seconds) to use functions. I'll leave this in production for a while and see how it goes.

    The table(s) it queries change in size constantly so any problems will show up soon...

    Maybe I'll just email Bill (BGates@Microsoft.com ?) and ask him 🙂

    Cheers,

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • Sorry, to answer your second question, No, I have not tried views yet. Will get to that later on today.

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • I was having a look at another thread Use of functions Do or Don't http://www.sqlservercentral.com/forum/link.asp?TOPIC_ID=9683.

    Well0549, you said there that you found UDFs slow. Antares, you I thing, also added some code.

    The one thing that differs from other UDFs and mine is mine is a simple query. Select * From xxx.

    Does the fact that SQL does not store an execution plan make a difference with a simple select?

    Is that maybe where the huge difference is coming from? A simple query vs a complex one with many join where SQL actually has to think?

    I'll try this theory and see... Yes, I'm grasping at straws but as much as I want to belive my results, I am still sceptical. 🙂

    Cheers,

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • Sounds great. Let us know your results. It would be nice to definitively say under X condition using a UDF has the following bennifit, under Y condition SPs are better suited.

  • O well, there's that theroy down the drain.....

    Still faster. Not as fast as a simple query though but still twice as fast.

    All the tables are large(ish)

    When running the proc and UDF only once, sometimes the proc and UDF are bother zero, sometimes the proc is around 13-16ms.

    Here is something strange, The proc exponentially slows down while the UDF maintains a almost constant speed.

    Bear in mind, the input param is changing again (@vCount) so it can't be pulling data from cache.

    Times Run % Faster

    100 771

    200 976

    300 952

    400 1066

    500 1128

    Fresh out of ideas.....

    Cheers,

    Crispin

    StmtText

    -----------------------

    SET SHOWPLAN_TEXT ON

    (1 row(s) affected)

    StmtText

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Declare

    @vCountInt,

    @vStart1DateTime,

    @vStart2DateTime,

    @vEnd1DateTime,

    @vEnd2DateTime,

    @vQuestionTextVarchar(100),

    @vQuoteIDInt

    Set NoCount On

    Set

    @vCount = 1

    Set

    @vStart1 = GetDate()

    While @vCount < 2

    Begin

    Set @vQuoteID = 1000 + @vCount

    Exec Test_1 @vQuoteID, @vQuestionText OUTPUT

    CREATE Procedure Test_1

    @QuoteIDInt,

    @QuestionTextVarchar(100) OUTPUT

    as

    Set NoCount On

    Select Distinct @QuestionText = U.QuestionText from

    Quotes Q JOIN lnkQUoteQuestions L ON Q.QuoteID = L.QUoteID

    Join Client C ON Q.CLientID = C.ClientID

    JOIN Question U ON L.QuestionID = U.QuestionID

    JOIN Answer A ON U.QuestionID = A.QuestionI

    (8 row(s) affected)

    StmtText

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    |--Sort(DISTINCT ORDER BY:([Expr1005] ASC))

    |--Compute Scalar(DEFINE:([Expr1005]=Convert(.[QuestionText])))

    |--Hash Match(Inner Join, HASH:(.[QuestionID])=([A].[QuestionID]))

    |--Nested Loops(Inner Join, OUTER REFERENCES:([L].[QuestionID]))

    | |--Nested Loops(Inner Join, OUTER REFERENCES:(

    .[QuoteID]) WITH PREFETCH)

    | | |--Nested Loops(Inner Join, OUTER REFERENCES:(

    .[ClientID]))

    | | | |--Clustered Index Seek(OBJECT:([PUMA].[dbo].[Quotes].[PK_Quotes] AS

    ), SEEK:(

    .[QuoteID] >= 900 AND

    .[QuoteID] <= 1200), WHERE:(

    .[SchemeID]=2) ORDERED FORWARD)

    | | | |--Clustered Index Seek(OBJECT:([PUMA].[dbo].[Client].[PK_Client] AS [C]), SEEK:([C].[ClientID]=

    .[ClientID]) ORDERED FORWARD)

    | | |--Clustered Index Seek(OBJECT:([PUMA].[dbo].[lnkQuoteQuestions].[IX_lnkQuoteQuestions] AS [L]), SEEK:([L].[QuoteID]=

    .[QuoteID] AND [L].[ShortName]='VehReg') ORDERED FORWARD)

    | |--Clustered Index Seek(OBJECT:([PUMA].[dbo].[Question].[PK_Question] AS ), SEEK:(.[QuestionID]=[L].[QuestionID]) ORDERED FORWARD)

    |--Clustered Index Scan(OBJECT:([PUMA].[dbo].[Answer].[PK_Answer] AS [A]))

    (11 row(s) affected)

    StmtText

    -----------------------------------------------------------------------------------------------------------------

    Print 'Proc:' + @vQuestionText

    Set

    @vCount = @vCount + 1

    End

    Set @vEnd1 = GetDate()

    Set

    @vCount = 1

    Set

    @vStart2 = GetDate()

    While @vCount < 2

    Begin

    Set @vQuoteID = 1000 + @vCount

    Print 'UDF:' + dbo.Test1(@vQuoteID)

    Set

    @vCount = @vCount + 1

    End

    Set @vEnd2 = GetDate()

    Select

    DateDiff(millisecond, @vStart1, @vEnd1) AS Pro,

    DateDiff(millisecond, @vStart2, @vEnd2) AS UDF

    (11 row(s) affected)

    StmtText

    ---------------------------------------------------------------------------------------------------------------------------------------------

    |--Compute Scalar(DEFINE:([Expr1000]=datediff(millisecond, [@vStart1], [@vEnd1]), [Expr1001]=datediff(millisecond, [@vStart2], [@vEnd2])))

    |--Constant Scan

    (2 row(s) affected)

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • ahhhhhh, Found the Problem. (This only affected the last post!)

    The UDF had an unfare advantage. The proc was doing a between in the last query where the UDF was doing an =.

    After correcting that, the UDF is on avarage 0-20ms SLOWER than the proc.... 🙁

    After running them both 1000 times in the loop, the proc took 390ms and the UDF 406.

    More realistic results! Thanks for help/comment all.

    Cheers,

    Crispin

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

Viewing 15 posts - 16 through 30 (of 37 total)

You must be logged in to reply to this topic. Login to reply