How to Make Scalar UDFs Run Faster (SQL Spackle)

  • Comments posted to this topic are about the item How to Make Scalar UDFs Run Faster (SQL Spackle)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Great article. I hate it when someone says, "Never...." You've proved that there is always more to the picture than what the eye can see (or the statistics tell us). Thanks.

  • Hmm, it sounds like: "Never use Scalar UDF, unless you're doing iterative string manipulation".

    In which case, I recommend the CLR :o)

    Also, thanks very much for the information regarding "SET STATISTICS ON"! That was extremely illuminating. Does the "Include Client Statistics" feature have an implicit STATISTICS TIME ON?

    It's prompted me to have a deeper look into what this button actually does, and similarly the "Show Actual Execution Plan" button.

    A bit of a tangent, but I remember a number of queries for which response times exploded when trying to view the Execution Plan.

  • Jeff Moden,

    Wonderful note about set statistics time! Never heard of it.

    "Try to beat it using any form of "all in one query" code."

    Here you go!

    Sample Data (one million rows of strings, 1 to 10 word, 2-16 word length)

    use tempdb;

    go

    if object_id('dbo.TestTable') is not null drop table dbo.TestTable;

    go

    select top(1000000) t.rnd_txt as s

    into dbo.TestTable

    from

    (select char(convert(int,rand(checksum(newid()))*26+97))) c (rnd_char)

    cross apply (select convert(int,rand(checksum(newid()))*15+2)) l(rnd_len)

    cross apply (select replicate(c.rnd_char,l.rnd_len)+' ') w(rnd_word)

    cross apply (select convert(int,rand(checksum(newid()))*10+1)) wc(rnd_wcount)

    cross apply (select replicate(w.rnd_word,wc.rnd_wcount)+' ') t(rnd_txt)

    cross join sys.all_columns o1

    cross join sys.all_columns o2

    ;

    here is function:

    create FUNCTION dbo.InitialCapFaster(@String VARCHAR(8000))

    RETURNS table

    AS

    return

    with

    c as

    (

    select

    String = STUFF(LOWER(@String),1,1,UPPER(LEFT(@String,1))) COLLATE Latin1_General_Bin,

    Position = PATINDEX('%[^A-Za-z''][a-z]%',@String COLLATE Latin1_General_Bin),

    Step = 1

    union all

    select

    String = STUFF(c.String,c.Position,2,UPPER(SUBSTRING(c.String,c.Position,2))) COLLATE Latin1_General_Bin,

    Position = PATINDEX('%[^A-Za-z''][a-z]%',c.String COLLATE Latin1_General_Bin),

    Step = c.Step+1

    from c

    where c.Position > 0

    )

    select top(1) String from c order by Step desc

    go

    Here is test

    declare @s-2 varchar(8000);

    declare @ds datetime = getdate();

    select @s-2 = dbo.InitialCap(s) from testtable;

    print 'InitialCap: ' + convert(varchar(10),datediff(ms,@ds,getdate())) + ' ms'

    go

    declare @s-2 varchar(8000);

    declare @ds datetime = getdate();

    select @s-2 = f.String from dbo.testtable cross apply dbo.InitialCapFaster(s) f;

    print 'InitialCapFaster: ' + convert(varchar(10),datediff(ms,@ds,getdate())) + ' ms'

    go

    print @@version

    Results

    InitialCap: 25533 ms

    InitialCapFaster: 12856 ms

    Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (Intel X86)

    Apr 2 2010 15:53:02

    Copyright (c) Microsoft Corporation

    Enterprise Edition on Windows NT 5.1 <X86> (Build 2600: Service Pack 3)

    Almost 50% faster. I ran several tests, playing with test data.

    Depending on row count, words count and words length I got 30%-60% percents faster.

    But if there are not so many rows, 10 000 for example, than InitialCap wins, about 250 ms vs 600 ms in my experiments.

    And 35 000 rows id the threshold where both functions are almost equal InitialCap: 893 ms vs InitialCapFaster: 860 ms

    Try it on your data.

    All the tests were made on my local machine with 2008R2 sql.


    I am really sorry for my poor gramma. And I hope that value of my answers will outweigh the harm for your eyes.
    Blog: http://somewheresomehow.ru[/url]
    Twitter: @SomewereSomehow

  • Interesting, well done.

    Very concerned about that stats time issue.

    Have you tried looking at these with a server side trace? I'd be very curious to know whether that can have a similar problem.

    Equally - are you sure it hadn't just cached a dodgy query plan for the changed session settings?

  • Nice find about the time measurement method so strongly affecting the results. I honestly never took much notice to this, even while aware of better ways of measuring IO statistics then "set statistics io on" and knowing that when adding query plan output, things really slowed down.

    As for the concept of iSF...i think it does not exist and that the highlighted line in books online is just wrong.

    Here is why:

    According to the definition provided by books online, the type TABLE without definition of its contents, is what is defined as the return type in our fucntion. This is just like with all iTVFs and on odds with being scalar. Exposing all of the inner workings by using a single statement makes it thus an inline table valued function (iTVF) and I think so far you will agree with me.

    What I do not see as anything special is the "presence" of just only one column in the returned TABLE. I dont think this makes anything about it scalar! It is still a table type and that means non-scalar by definition.

    Things to consider:

    1. You have to use the function not as a scalar function, but as a table function.

    2. In the case we have two or more columns in the return table of a iTVF and just one is used by the invoking query, the performance should be proved to be different from one with just one column in it. Without such a difference (and i expect none) there is no distinction between the two!

    3. iTVF have as alternate name "parameterised views" for a good reason. They both have their inner logic exposed as a single statement that can be merged with the SQL code that uses them. The difference being that iTVF can accept parameters, and views cannot. Being essentially views, columns that are not used are simply optimized away from the resulting execution plan.

    Considering this, what then makes an iSF, what really sets it apart and makes it scalar or being processed as such?

  • While every testing method negatively affects the performance of the thing being tested by making the tester part of the experiment, some are obviously worse than others. I'd be interested to hear from some of the SQL Internals authors if they have some insight into how SET STATISTICS TIME ON works and how its use may or may not affect other situations where it is used. Paul? Kalen? Are you thinking about this yet?

    ------------
    Buy the ticket, take the ride. -- Hunter S. Thompson

  • Thanks for the detailed and METHODICAL analysis. I once was troubleshooting a performance problem in a large, complicated application and it came down to a "set quoted identifer on" statement in a scalar function. The statement was not needed, was probably generated by a developer who used a right-click-on-object to generate the "create" statement. The function was being called massive number of times so even the slightest performance difference was hugely magnified. I've been wary ever since but know that knowledge is power, so thanks for the article.

  • Not to say that this was not a good article, which it was, it seems we are not covering the killer scalar UDF issue. a scalar function is reasonably fast EXCEPT when you put some type of data access within the function. When this happens it will kill performance. I havent tried your method yet involving that scenario but I will today:) that would be for a calendar function that accesses a calendar table.

    When writing a small function that accesses a table to get one value, I find it better to just put the logic of the function directly in the calling select statement and not use a function at all and, even though it is basically the same code, it will run much faster. but then again you would lose the benefits of actually having a function that gives one place to manage the code from.

    Michael B
    Data Architect
    MCT, MCITP(SQL DBA 2005/2008),MCTS, MCP,MCDBA

  • Nice article Jeff. And you're right... it is a long spackle article 😀 - but sometimes you need that to fully explain things.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Nice article. Very informative.

    Interestingly enough, I just found out a couple of months ago in reading the articles on set-based programming versus procedural programming on Code Project http://www.codeproject.com/Articles/34142/Understanding-Set-based-and-Procedural-approaches and Subho on .Net http://www.shubho.net/2009/03/understanding-set-based-and-procedural_09.html that the use of a function in a select clause results in RBAR processing. They pointed that in a million row table you would be calling the function a million times. It makes perfect sense, though I had never thought about it that way, even while avoiding Cursors like the plague. They recommended that the use of correllated subqueries and JOINS and even While loops would avoid this performance hit. Now I see that using a table valued function with a cross apply will work as well. So I will be adding this to my list of tools.

    Thank you also for pointing out the issues with set statistics time ON. I will be certain not to use it from now on.

  • Awesome article, Jeff, thanks! There are so many great takeaways here. Very interesting comments in the forum so far as well; I'm looking forward to following this one.

    G Bryant McClellan (8/1/2012)


    While every testing method negatively affects the performance of the thing being tested by making the tester part of the experiment, some are obviously worse than others. I'd be interested to hear from some of the SQL Internals authors if they have some insight into how SET STATISTICS TIME ON works and how its use may or may not affect other situations where it is used. Paul? Kalen? Are you thinking about this yet?

    I'd like to echo this. I'm really interested in better understanding what's happening here with SET STATISTICS TIME ON, and when it should/shouldn't be used. That could be a spackle article in itself.

  • Excellent article. Best read of the week. 🙂

  • As always, Great Article Jeff! :-D:-D:-D

    You know, I think that I noted this basic problem with SET STATISTICS (that it adds a per-statement overhead) way back in my "15 Ways To Lose Your Cursors Article, Part 2"[/url], making it inappropriate for measuring Cursor performance, but it never occurred to me that the same problem would apply to measuring UDF's. Duh!!! :w00t:

    Anyway, great catch on that Jeff.

    However, there is a mysterious coda to your background on scalar UDF's. It turns out that there really is an Inline Scalar UDF object type in SQL Server, as described here in the 2005 documentation. It says:

    User-defined scalar functions return a single data value of the type defined in the RETURNS clause. For an inline scalar function, there is no function body; the scalar value is the result of a single statement.

    (emphasis mine)

    You can confirm this with the following query:

    SELECT name

    FROM master..spt_values

    WHERE type = 'O9T'

    AND name LIKE '%function%'

    So as far back as SQL 2005, they actually had the object-type for it, however, AFAIK, none exist (not even in the hidden [mssqlsystemresource] database) and there's no syntax to create one. So it appears that this is something that Microsoft must have anticipated in the run-up to SQL Server 2005 by adding a type for it (and doc!), but never actually implemented for some reason. Even odder, they never took it out of the doc, it's still there in the 2012 version.(!)

    Though it is one of the single most requested features for all of Ms Sql Server. Primarily because the default UDF's are still slower (even with Jeff's measurement corrections) and we end up having to back-end ITVF's to get the same effect. (clumsy, but it works).

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Here's a function I tested, and the test I used:

    USE [ProofOfConcept]

    GO

    /****** Object: UserDefinedFunction [dbo].[InitialCapTest1] Script Date: 08/01/2012 11:20:42 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER FUNCTION [dbo].[InitialCapTest1] (@String VARCHAR(8000))

    RETURNS VARCHAR(8000)

    AS

    BEGIN

    DECLARE @Output VARCHAR(8000);

    WITH Seeds(Seed)

    AS (SELECT Val

    FROM ( VALUES ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1) ) AS Vals (Val)),

    Numbers(Number)

    AS (SELECT TOP (8000)

    ROW_NUMBER() OVER (ORDER BY S10.Seed)

    FROM Seeds AS S10

    CROSS JOIN Seeds AS S100

    CROSS JOIN Seeds AS S1000

    CROSS JOIN Seeds AS S10000)

    SELECT @Output = ( SELECT CASE WHEN SUBSTRING(@String, Number - 1, 1) NOT LIKE '[a-z]'

    THEN UPPER(SUBSTRING(@String, Number, 1))

    ELSE LOWER(SUBSTRING(@String, Number, 1))

    END

    FROM Numbers

    WHERE Number <= LEN(@String)

    ORDER BY Number

    FOR XML PATH(''),

    TYPE).value('.[1]', 'varchar(8000)');

    RETURN @Output;

    END;

    I created that, and the function referenced in the article, then populated a test environment as follows:

    CREATE TABLE ##T

    (ID INT IDENTITY

    PRIMARY KEY,

    Col1 VARCHAR(8000));

    INSERT INTO ##T

    (Col1)

    SELECT 'Scalar UDFs are generally accepted as one of the worst things you can do performance-wise in T-SQL. However, they may have earned a reputation that they simply don''t deserve because a common method of measuring performance (SET STATISTICS TIME ON) injects a major performance problem of its own when used to measure the performance of Scalar UDFs.

    That, notwithstanding, some Scalar UDFs really are a lot slower than running direct code but direct code means there is no encapsulation and may not be as easy, quick, or consistent to support development efforts. Even Scalar UDFs have the advantage of getting everyone to do things the same way. Of course, everyone doing all the same thing the slow way isn''t an advantage at all.

    In some cases and if the Scalar UDF code can be written as a single query, there can be a highly significant performance gain by converting (or writing originally) the Scalar UDF as an "Inline Scalar Function" (iSF), which is the same as writing an "Inline Table Valued Function" (iTVF) that returns only one value.

    Doing so is not a performance panacea, though. Although it hasn''t been proven within the confines of this article, it''s a well known fact that "all in one query" solutions are frequently performance problems themselves. To be sure of any benefit in converting Scalar UDFs to iSFs, you must test and it must be with a lot of rows to really do it right.

    Last and certainly not least, understand that anything you measure will be changed especially when it comes to SET STATISTICS TIME ON. Although I''ll personally continue to use it on forums to conveniently show performance differences between coding methods, I''ll do so only after I''ve made absolutely sure that it''s not injecting major but artificial performance problems itself. '

    FROM dbo.Numbers;

    (You may recognize the text.) I wanted to test on a reasonable number of rows, on bigger strings. Perhaps a weird test, but it's what I wanted to check. 10,001 rows in ##T from that based on my Numbers table.

    The tests:

    DECLARE @T TABLE (Col1 VARCHAR(8000));

    DECLARE @Start DATETIME = GETDATE();

    INSERT INTO @T

    (Col1)

    SELECT dbo.InitialCap(Col1)

    FROM ##T;

    SELECT DATEDIFF(millisecond, @Start, GETDATE());

    DECLARE @T TABLE (Col1 VARCHAR(8000));

    DECLARE @Start DATETIME = GETDATE();

    INSERT INTO @T

    (Col1)

    SELECT dbo.InitialCapTest1(Col1)

    FROM ##T;

    SELECT DATEDIFF(millisecond, @Start, GETDATE());

    InitialCap runtime, per DateDiff = 182,526 ms

    InitialCapTest1 = 21,673

    "Single-query" version, even without being iSF, is about 9X faster, per this test.

    Tested again, with "Bill O'Malley" as the string, 10k rows.

    TRUNCATE TABLE ##T;

    INSERT INTO ##T (Col1)

    SELECT 'Bill O''Malley'

    FROM dbo.Numbers;

    (I picked that string, because InitialCaps handled it "incorrectly". It's per-spec, but not "human right".)

    Same tests as above.

    InitialCaps runtime = 100 ms

    InitialCapsTest1 runtime = 9836 ms

    So, on a short string, the InitialCaps version is much faster, while on a long string, the InitialCapsTest1 version is much faster.

    Summary: Know your data patterns. Picking which function to use depends a lot on what you'll be running it on, not just on "one test said".

    Post-script: Name-case coding will be wrong much of the time. Bill O'Malley will likely end up as Bill O'malley, or Bob's will end up as Bob'S, unless you code some seriously smart algorithms into it. Any code that handles MacDonald correctly will probably mangle Macy or Macey, and vice-versa. The more "correct" you make it, the more complex it will be, and the slower it will run, and (likely) the buggier it will get.

    Definitely an interesting point about checking timing with Stats On. Traces are usually better, in my experience, and I've used the GetDate() trick for years. The GetDate() one is especially useful if you need to test multiple statements in a single script.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 1 through 15 (of 102 total)

You must be logged in to reply to this topic. Login to reply