a ITVF string concatenation without "FOR_XML"

  • Hello every1.

    I have a problem with the following UDF.

    "create function [dbo].[_some_function]

    (@poolint,

    @Eventint,

    @delimiter varchar(100))

    returns table

    return

    with result (polje) as

    (select stuff((

    select@delimiter + T1.object_value

    fromtable1 T1

    inner join table2 T2 on T2.[pool] = T1.[pool] and T2.event= T1.object_value

    where

    T2.[pool] = @pool and

    T2.event= @eventNo

    order by

    T2.object_value

    FOR XML PATH(''),TYPE).value('.','varchar(MAX)'),1,1,''))

    select polje from result"

    It works fine and is also very fast in terms of performance.

    It is, however useless and i have to rewrite it. Why?

    After having deployed it into the production environment the function kept returning some error (I cant' remember exactly what it was; someting regarding the qouted_identifier setting).

    Fine, I said to myself. I'll simply change the setting and it will work just fine. But there is always a but. The "test-to-production" scenario includes more than 1000 database objects and is performed through a .net program.

    The problem is, that the UDF is accessed via the stored procedure, whose quoted_identifier setting is set to off when being created (in a drop => create manner)

    and the database admin doesn't care much about my problem. It's an extremely delicate operation, annoying customer, etc, bla bla...

    In a word, I cannot influent the setting during the test-to-production phase.

    After doing some research I figured out, that the problem lies within the for_xml phrase.

    Does anyone have the idea how to rewrite the upper query by eliminating the "for_xml" phrase without changing the "nature" of the UDF. It has to remain the inline one.

    To my experience, any other option will result in a huge performance degradation. The UDF is also accessed via the sql view (outer apply) with approximately 90K rows.

    Let me stress, that the UDF always returns exactly 1 row.

    What does it do?

    It concatenates string data from a table, delimiting the data with a selected delimiter. A typical output would be something like:

    select * from [dbo].[_some_function](17,245,'#')

    result: John Doe#Jane Doe#Someone Else

    Any help will be very much appreciated

  • All other ways of doing this that I know of are either much slower, or require using CLR objects in the database (which would also require the DBA allowing that).

    You might want to point out to that DBA that setting Quoted Identifiers to Off is a violation of ISO-92 SQL standards, and that MS has it On as a default for a reason.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Agreed, that FOR XML PATH is quickest.

    Here is an alternative that is definitely slower and has it's own limitation.

    CREATE FUNCTION [dbo].[_some_function]

    (

    @poolint,

    @Eventint,

    @delimiter varchar(100)

    )

    RETURNS TABLE

    AS

    RETURN

    (

    with base(value,line) as

    (

    SELECT T1.object_value

    , row_number() OVER (ORDER BY T2.object_value DESC)

    FROM

    table1 T1

    INNER JOIN table2 T2 ON T2.[pool] = T1.[pool] AND T2.event = T1.object_value

    WHERE

    T2.[pool] = @pool

    AND

    T2.event = @eventNo

    ),

    results(result,line) as

    (

    SELECT TOP 1 CAST(RTRIM(value) AS VARCHAR(MAX))

    , line

    FROM

    base

    ORDER BY

    line

    UNION ALL

    SELECT RTRIM(base.VALUE) + @delimiter + result,base.line

    FROM base

    JOIN results

    ON base.line = results.line+1

    )

    SELECT TOP 1 result

    FROM

    results

    ORDER BY

    line DESC

    )

    The limitation here is that it will only cope with up to 100 iterations before you will have to include an OPTION(MAXRECURSION xx)

    statement in the calling query.

    Sample call:

    select *

    from dbo.[_some_function](1,1,'#')

    option (MAXRECURSION 0)

    Why do I build the concatenated string in reverse?

    A: Because in my experience it can be much faster when working with MAX datatypes to do it that way.

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Squared & Magoo

    Thanks both for your replies.

    I did find something myself.

    Here is the code:

    ";with aa as

    (select * from __xxx where idd=1),

    cte ( idd, ime, [ime_list], length, stev )

    AS (SELECT idd,

    CAST('' AS VARCHAR(8000)),

    CAST('' AS VARCHAR(8000)),

    0,

    0

    FROM aa

    GROUP BY idd

    UNION ALL

    SELECT p.idd,

    CAST(p.ime AS VARCHAR(8000)),

    CAST([ime_List] + CASE WHEN length = 0 THEN '' ELSE ', ' END + p.[ime] AS VARCHAR(8000)),

    length + 1,

    p.stev

    FROM cte c

    INNER JOIN aa p ON c.[idd] = p.[idd]

    WHERE p.[stev] > c.[stev])

    SELECT [idd],[ime_List]

    FROM (SELECT [idd],

    [ime_List],

    Rank() OVER ( PARTITION BY [idd] ORDER BY length DESC),

    stev

    FROM cte) d ( [idd], [ime_List], rank, stev)

    WHERE rank = 1 "

    and it works fine. I tried it out in the testing environment and is 100% correct.

    I'm still checking the performance.

    Magoo's solution seems far more simple. I'll check it out tomorrow to find out whether it is a better one in terms of performance.

    Thanks again for assistance. I'll let you know about the performance test.

  • The option you posted, Marko, is what's called a "triangular join", and it will work just fine on very small datasets, and will absolutely die on large datasets. So, make sure you test it on a large, well-fragmented table (like real life data).

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for letting me know Squared.

    I checked the UDF's performance on 100K records (via a view) and managed to retreive data in 50s. Id does seem a lot, but in real life the app usually accesses 50 records at a time so it shouldn't be a problem. The function itself usually deals with up to three records per every call.

    What scares me more is the fact, that the previous UDF practically stopped working after every SQL server restart. As if, in some way, the optimizer failed to obtain index and statistics data when accessing the UDF. I've read about it all over the net and people seem to have similar problems.

    My fear is related to the fact that there are many other UDF's used by my app working in a similar way. I'll probably have to consider rewriting them.

    Today I'll check out the Magoo's solution.

    Thx again

  • After you restart the SQL Server service (either just the service, or rebooting the server), the database engine has to recache all its execution plans, recache data, etc. Pretty much every query will be slower the first time it's used after a service restart. That's usual.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Viewing 7 posts - 1 through 6 (of 6 total)

    You must be logged in to reply to this topic. Login to reply