## Tables in a concurrent environment

  • Jeff Moden (2/14/2011)


    The reason for DATALENGTH in the function posted on this thread is so that you can use a space as a delimiter. It's only important if you have trailing delimiters with nothing in between, but I wanted to make sure it was handled.

    The code also works fine (with some modification for DATALENGTH()/2) and fast for NVARCHAR(4000). I've only just started testing against the MAX datatypes. Typically, just using one of the MAX datatypes in a JOIN, as is usually done with splitters of this nature, immediately makes the code run twice as slowly, allowing even (ugh!) recursive CTEs to outstrip joined splitters. However, this runs so bloody fast now (thanks to there being NO concatenation) that I'm hoping it'll keep up with a rock-solid, well-written WHILE loop on the MAX datatypes. I just haven't had the time to do that testing yet.
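    To make the earlier point about space delimiters concrete: LEN() silently drops trailing spaces while DATALENGTH() counts them, so a split driven by LEN() would lose a trailing space delimiter. A quick check (the sample value is arbitrary) shows the difference:

    -- LEN() ignores trailing spaces; DATALENGTH() does not, which is why the
    -- function uses DATALENGTH() so a space still works as a delimiter.
    SELECT LEN('one two ')        AS LenResult,         -- 7: trailing space dropped
           DATALENGTH('one two ') AS DataLengthResult;  -- 8: trailing space counted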

    Fair enough. You've commented twice that the lack of concatenation has really sped this up. I like the ISNULL(NULLIF()) bit you built in; it looks slick. It's surprising that it's that powerful, though. I'll have to go check this out.
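    The ISNULL(NULLIF()) construct being praised here is, presumably, the usual trick of turning CHARINDEX()'s "not found" value of 0 into an end-of-string position, so the last element can be extracted without concatenating an extra delimiter onto the input. A minimal sketch of that idiom (the variable names are illustrative, not the actual function's code):

    -- Hypothetical illustration: extract the element that starts at position @N.
    -- CHARINDEX() returns 0 when no further delimiter exists; NULLIF() turns that
    -- 0 into NULL, and ISNULL() substitutes a position just past the end of the
    -- string, so nothing ever has to be appended to @String for the final element.
    DECLARE @String VARCHAR(8000) = 'a,b,c', @Delim CHAR(1) = ',', @N INT = 5;
    SELECT SUBSTRING(@String, @N,
                     ISNULL(NULLIF(CHARINDEX(@Delim, @String, @N), 0),
                            DATALENGTH(@String) + 1) - @N) AS Element;  -- 'c'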

    Thank you, as always.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Anamika (2/14/2011)


    Thanks,

    It is working fine. I wonder why I went for global temporary tables in that scenario. 🙁

    Thanks,

    Ami

    +1 on sending a normal #temp table into dynamic SQL!

    I've used global temp tables when I'm dumping SQL out to bcp to export to a delimited file; in those cases I use entirely dynamic SQL and name the ##temp table something like:

    SET @tmpGlobalTempTable = '##tmpStoredProcedureName_' + CAST(@@spid AS VARCHAR(6)) + '_' + CAST(FLOOR(rand()*100000) AS VARCHAR(6))
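    A minimal sketch of how a name built that way might then be used, assuming a placeholder query and export step (none of this is the original poster's code):

    -- Hypothetical sketch: build a per-session name, create and fill the global temp
    -- table from dynamic SQL, hand its name to the bcp/export step, then clean up.
    DECLARE @tmpGlobalTempTable SYSNAME, @sql NVARCHAR(MAX);

    SET @tmpGlobalTempTable = '##tmpStoredProcedureName_' + CAST(@@SPID AS VARCHAR(6))
                            + '_' + CAST(FLOOR(RAND()*100000) AS VARCHAR(6));

    SET @sql = N'SELECT name, object_id INTO ' + @tmpGlobalTempTable
             + N' FROM sys.objects;';   -- placeholder query
    EXEC sys.sp_executesql @sql;

    -- ...pass @tmpGlobalTempTable to the bcp command line / export step here...

    SET @sql = N'DROP TABLE ' + @tmpGlobalTempTable + N';';
    EXEC sys.sp_executesql @sql;

    Because the table is global, it survives the sp_executesql scope that created it, which is exactly what a plain #temp table created inside dynamic SQL would not do.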

  • wolfkillj (2/14/2011)


    Jeff Moden (2/14/2011)


    Ok... the following function will pretty much blow the doors off other splitter functions, including some of my old ones.

    WOW!!! That crash you just heard was the sound of the doors being blown off the splitter functions we currently use. This is going into production TODAY! Thanks, Jeff!

    I'm looking forward to an updated tally table article - it took me a while to grasp the workings of this function, but I'm very curious about how I could use tally table logic in other ways, too.

    Jason

    Jason, I wonder if you'd do me a favor. Would it be possible for you to post the functions you replaced? I'd like the code for two reasons: 1) to do my own testing against it, and 2) to make sure that my code is actually a correct replacement for yours.

    Thanks, Jason.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sure. My multistatement table-valued function (below) is a WHILE loop that repeats as long as the string has any characters in it, finding the first occurrence of the delimiter, inserting the LEFT() of the string before the first delimiter into a table variable, snipping off that substring plus the delimiter, then repeating on the remaining string until there's nothing left of it. I didn't write this myself, but cribbed it from another website. For short comma-delimited strings, the performance difference between your code and mine seems to be minimal (in very unstructured tests), but since my code requires an INSERT operation for every value in the comma-delimited string, the execution time starts to mount as the string gets longer. We sometimes send comma-delimited strings with hundreds of values to return data for our automated document generation processes, so your function seems to perform better in those cases. I'd be interested to see if rigorous testing bears this out. Thanks!

    CREATE FUNCTION [dbo].[Split_Value](@Value NVARCHAR(MAX), @delimiter CHAR(1))
    RETURNS @ValueTable TABLE (Value NVARCHAR(MAX))
    AS
    BEGIN
        DECLARE @tempValueList NVARCHAR(MAX)
        DECLARE @i INT
        DECLARE @ValueVal NVARCHAR(MAX)

        SET @tempValueList = @Value
        SET @i = CHARINDEX(@delimiter, @tempValueList)

        WHILE (LEN(@tempValueList) > 0)
        BEGIN
            -- Take everything up to the next delimiter, or the whole remainder if
            -- no delimiter is left.
            IF @i = 0
                SET @ValueVal = @tempValueList
            ELSE
                SET @ValueVal = LEFT(@tempValueList, @i - 1)

            INSERT INTO @ValueTable(Value) VALUES(@ValueVal)

            -- "Nibble" the extracted element and its delimiter off the front of the
            -- string, then look for the next delimiter.
            IF @i = 0
                SET @tempValueList = ''
            ELSE
                SET @tempValueList = RIGHT(@tempValueList, LEN(@tempValueList) - @i)

            SET @i = CHARINDEX(@delimiter, @tempValueList)
        END

        RETURN   -- required: a multistatement TVF must end with a RETURN statement
    END
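    For reference, a call along these lines (the input string is arbitrary) returns one row per element:

    SELECT Value
    FROM dbo.Split_Value(N'alpha,beta,gamma', ',');
    -- Value
    -- -----
    -- alpha
    -- beta
    -- gamma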

    Jason Wolfkill

  • wolfkillj (2/16/2011)


    Sure. My multistatement table-valued function (below) is a WHILE loop that repeats as long as the string has any characters in it, finding the first occurrence of the delimiter, inserting the LEFT() of the string before the first delimiter into a table variable, snipping off that substring plus the delimiter, then repeating on the remaining string until there's nothing left of it. I didn't write this myself, but cribbed it from another website. For short comma-delimited strings, the performance difference between your code and mine seems to be minimal (in very unstructured tests), but since my code requires an INSERT operation for every value in the comma-delimited string, the execution time starts to mount as the string gets longer. We sometimes send comma-delimited strings with hundreds of values to return data for our automated document generation processes, so your function seems to perform better in those cases. I'd be interested to see if rigorous testing bears this out. Thanks!

    CREATE FUNCTION [dbo].[Split_Value](@Value NVARCHAR(MAX), @delimiter CHAR(1))
    RETURNS @ValueTable TABLE (Value NVARCHAR(MAX))
    AS
    BEGIN
        DECLARE @tempValueList NVARCHAR(MAX)
        DECLARE @i INT
        DECLARE @ValueVal NVARCHAR(MAX)

        SET @tempValueList = @Value
        SET @i = CHARINDEX(@delimiter, @tempValueList)

        WHILE (LEN(@tempValueList) > 0)
        BEGIN
            -- Take everything up to the next delimiter, or the whole remainder if
            -- no delimiter is left.
            IF @i = 0
                SET @ValueVal = @tempValueList
            ELSE
                SET @ValueVal = LEFT(@tempValueList, @i - 1)

            INSERT INTO @ValueTable(Value) VALUES(@ValueVal)

            -- "Nibble" the extracted element and its delimiter off the front of the
            -- string, then look for the next delimiter.
            IF @i = 0
                SET @tempValueList = ''
            ELSE
                SET @tempValueList = RIGHT(@tempValueList, LEN(@tempValueList) - @i)

            SET @i = CHARINDEX(@delimiter, @tempValueList)
        END

        RETURN   -- required: a multistatement TVF must end with a RETURN statement
    END

    Very cool. Thanks for the code. I've seen this "nibbler" type of function before and I have a pretty cool CSV generator I built to test it with. I'll let you know.

    As a sidebar, if your CSVs will fit in a VARCHAR(8000) or NVARCHAR(4000), you can double the performance of just about any splitter code by not using one of the MAX datatypes. "Right sizing" of datatypes works very well in this case. In fact, I have two splitter functions... one for VARCHAR(8000) and one for VARCHAR(MAX), just so I can get the extra speed out of things that don't need the girth of VARCHAR(MAX). And, yes... I'm also working on improvements for VARCHAR(MAX) splitters. Right now, the VARCHAR(8000) version of the splitter I made is about 3 to 4 times faster (depending) than the nearly identical VARCHAR(MAX) splitter. The ONLY difference between the two is the datatype of the input string variable.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Thanks for the tip on the (max) vs. (8000)/(4000) nvarchar/varchar data types in these functions. When I implemented your new splitter function, I used nvarchar(4000), which covers our needs to at least a "five nines" percentage. I'll probably put an nvarchar(max) version in place, too, just to cover any unanticipated needs.

    Jason

    Jason Wolfkill

  • wolfkillj (2/17/2011)


    Hi Jeff,

    Thanks for the tip on the (max) vs. (8000)/(4000) nvarchar/varchar data types in these functions. When I implemented your new splitter function, I used nvarchar(4000), which covers our needs to at least a "five nines" percentage. I'll probably put an nvarchar(max) version in place, too, just to cover any unanticipated needs.

    Jason

    It won't hurt anything except slow things down a bit, because the cteTally will always "overshoot" the length of an NVARCHAR(4000) string, but we really should change most instances of DATALENGTH() to DATALENGTH()/2 if you're going to use NVARCHAR(). Sorry I didn't give you a heads-up on that before.
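    The reason for the /2, for anyone following along: DATALENGTH() returns bytes, and NVARCHAR stores two bytes per character, so the tally generates twice as many rows as there are characters unless the value is halved. The results stay correct; the extra rows are just wasted work. A quick illustration (the sample string is arbitrary):

    DECLARE @s NVARCHAR(4000) = N'abc,def';
    SELECT DATALENGTH(@s)     AS Bytes,        -- 14: rows the tally would generate
           DATALENGTH(@s) / 2 AS Characters;   --  7: rows actually needed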

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
