T-SQL Stored Proc calling T-SQL Function

  • The following is our T-SQL function that is being called for all VarChar,Char,NChar and NVarChar fields in our SELECT statement in a stored proc. There is a lot of fields selected over a lot of rows on a lot of tables which all use the FOR XML statement. The users are cutting and pasting duff chars into fields.

    We cannot stop this at source, hence the conversion Function.

    Does anyone have any thoughts or advice on speeding up / making more efficient this Function.

    Any input appreciated.

    ALTER FUNCTION dbo.fn_RemoveInvalidCharacters(@Temp NVarChar(max))

    RETURNS NVarChar(max)

    AS

    BEGIN

    DECLARE @regex VarChar(100)

    SET @regex = '%[^a-zA-Z0-9 !"#%&,/:;<=>@_`{}~%\$\(\)\*\+\.\?\[\^\|\u0009\u000A\u000D\u20AC\u005D-]%'

    -- Remove any character in the input parameter if it is ASCII code

    -- that is not compatible with XML i.e. a non-displable char

    WHILE PatIndex(@regex, @Temp) > 0

    SET @Temp = Stuff(@Temp, PatIndex(@regex, @Temp), 1, '')

    RETURN @Temp

    END

  • You could always do this on the application side after the user input and prior to the db call.

  • The user data is input using a different application over which I have no jurisdiction. This is the reason we have gone this way. I could do inline code in the Stored Procedure but the SP code would get uber-bloated so that isn't going to happen either.

  • mark.veitch (2/1/2010)


    The user data is input using a different application over which I have no jurisdiction. This is the reason we have gone this way. I could do inline code in the Stored Procedure but the SP code would get uber-bloated so that isn't going to happen either.

    Is this a 3rd party application, or in house? If it's "in house" you may be able to address this issue to the application side and let them know it's causing performance problems. If not, it seems the choices are limited.

    Function that you're currently doing. This is nice in that it's all in one place so mainting the code is easy.

    or

    Snippets of TSQL in each sproc, but as you alluded to, can really get ugly.

  • I can't say for sure but I'm fairly certain that even if it's in-house then getting it fixed at source won't be an option. :angry:

    There is the possibility that it might even be a mixture of in-house and third party. Looks like the clients may have to accept bad response times even though they are a vast improvement on what the previously had.

  • mark i believe you can also change the function to return a table value function, and use CROSS APPLY to apply it against your data for a big performance gain.

    Lynn Pettis gave a great example of it in this thread a little while ago:

    http://www.sqlservercentral.com/Forums/Topic842953-360-1.aspx?Highlight=fnGICSPFParseNumericForCobol

    He(Lynn) took the inline function, changed it to a table value, and the original poster saw a huge performance gain:

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • If you do not have access to the application code, you could look at using a trigger to clean the data as it was being saved.

  • Thank you very much Lowell. Just had a quick scan and although this is new to me I'm thinking it might be a possibility. About to sign off so I'll look more indepth tomorrow.

    Once again, thanks for the input guys.

  • Just to add a little more, you may want to take a look my blog post Comparing Hardcoded functions, In-Line TVF's, and Scalar Functions

    Lowell, thank you for reference. Much appreciated! 🙂

  • Converting your existing function into an Inline TVF is not going to be easy and it may not solve your problem. One reason I say that, NVARCHAR(MAX)/VARCHAR(MAX) fields. I am curious, how large is the longest field you have currently encountered? This may actually be one of those cases where a CLR function may be benefit, but don't ask me how to write it as I haven't had the chance to dive into that area yet.

    The biggest problem you are facing right now in your current function is the RBAR, the WHILE loop. This is what is keeping the function from scaling well at the moment.

  • I have some code for you, it requires that you have a Tally table to work. You should look at the fourth article I reference below in my signature block for more about Tally tables.

    create function dbo.ValidString (

    @pInputString nvarchar(max),

    @pValidChars varchar(128) = '[^a-zA-Z0-9 !"#%&,/:;<=>@_`{}~%\$\(\)\*\+\.\?\[\^\|\u0009\u000A\u000D\u20AC\u005D-]'

    )

    returns table

    as

    return(

    with ValidChars as (

    select

    substring(@pInputString, N, 1) as ValidChar

    from

    dbo.Tally

    where

    N <= len(@pInputString)

    and substring(@pInputString, N, 1) not like '%' + @pValidChars + '%' -- '%[^a-zA-Z0-9 !"#%&,/:;<=>@_`{}~%\$\(\)\*\+\.\?\[\^\|\u0009\u000A\u000D\u20AC\u005D-]%'

    )

    select

    replace((select ',' + ValidChar from ValidChars for xml path('')),',','') as ValidString

    );

    select * from dbo.ValidString('This is a Test String 1.',default)

    select * from dbo.ValidString('This is a Test String 1.','[^a-zA-Z .]')

  • mark.veitch (2/1/2010)


    The following is our T-SQL function that is being called for all VarChar,Char,NChar and NVarChar fields in our SELECT statement in a stored proc. There is a lot of fields selected over a lot of rows on a lot of tables which all use the FOR XML statement. The users are cutting and pasting duff chars into fields.

    We cannot stop this at source, hence the conversion Function.

    Does anyone have any thoughts or advice on speeding up / making more efficient this Function.

    Any input appreciated.

    ALTER FUNCTION dbo.fn_RemoveInvalidCharacters(@Temp NVarChar(max))

    RETURNS NVarChar(max)

    AS

    BEGIN

    DECLARE @regex VarChar(100)

    SET @regex = '%[^a-zA-Z0-9 !"#%&,/:;<=>@_`{}~%\$\(\)\*\+\.\?\[\^\|\u0009\u000A\u000D\u20AC\u005D-]%'

    -- Remove any character in the input parameter if it is ASCII code

    -- that is not compatible with XML i.e. a non-displable char

    WHILE PatIndex(@regex, @Temp) > 0

    SET @Temp = Stuff(@Temp, PatIndex(@regex, @Temp), 1, '')

    RETURN @Temp

    END

    A part of the performance problem is that you're using PATINDEX twice for each character removal. Try something like the following...

    CREATE FUNCTION dbo.AlphaNumeric

    (@String VARCHAR(MAX))

    RETURNS VARCHAR(MAX) AS

    BEGIN

    DECLARE @IncorrectCharLoc SMALLINT

    SELECT @IncorrectCharLoc = PATINDEX('%[^0-9A-Za-z]%', @String)

    WHILE @IncorrectCharLoc > 0

    SELECT @string = STUFF(@String, @IncorrectCharLoc, 1, ''),

    @IncorrectCharLoc = PATINDEX('%[^0-9A-Za-z]%', @String)

    RETURN @string

    END

    Of course, you'll need to use your own pattern string...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mark.veitch (2/1/2010)


    Thank you very much Lowell. Just had a quick scan and although this is new to me I'm thinking it might be a possibility. About to sign off so I'll look more indepth tomorrow.

    Once again, thanks for the input guys.

    Okay, it has been a couple of days. What's happening?

  • Lynn Pettis (2/3/2010)


    mark.veitch (2/1/2010)


    Thank you very much Lowell. Just had a quick scan and although this is new to me I'm thinking it might be a possibility. About to sign off so I'll look more indepth tomorrow.

    Once again, thanks for the input guys.

    Okay, it has been a couple of days. What's happening?

    Heh... it snowed here today. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/3/2010)


    Lynn Pettis (2/3/2010)


    mark.veitch (2/1/2010)


    Thank you very much Lowell. Just had a quick scan and although this is new to me I'm thinking it might be a possibility. About to sign off so I'll look more indepth tomorrow.

    Once again, thanks for the input guys.

    Okay, it has been a couple of days. What's happening?

    Heh... it snowed here today. 😛

    Good. It's trying to here at the moment, but I know we won't get a snow day tomorrow.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply