a ITVF string concatenation without "FOR_XML"

Question

a ITVF string concatenation without "FOR_XML"

marko.celarc

Mr or Mrs. 500

Points: 525
More actions
November 2, 2011 at 6:42 am

#251054

Hello every1.
I have a problem with the following UDF.
"create function [dbo].[_some_function]
(@poolint,
@Eventint,
@delimiter varchar(100))
returns table
return
with result (polje) as
(select stuff((
select@delimiter + T1.object_value
fromtable1 T1
inner join table2 T2 on T2.[pool] = T1.[pool] and T2.event= T1.object_value
where
T2.[pool] = @pool and
T2.event= @eventNo
order by
T2.object_value
FOR XML PATH(''),TYPE).value('.','varchar(MAX)'),1,1,''))
select polje from result"
It works fine and is also very fast in terms of performance.
It is, however useless and i have to rewrite it. Why?
After having deployed it into the production environment the function kept returning some error (I cant' remember exactly what it was; someting regarding the qouted_identifier setting).
Fine, I said to myself. I'll simply change the setting and it will work just fine. But there is always a but. The "test-to-production" scenario includes more than 1000 database objects and is performed through a .net program.
The problem is, that the UDF is accessed via the stored procedure, whose quoted_identifier setting is set to off when being created (in a drop => create manner)
and the database admin doesn't care much about my problem. It's an extremely delicate operation, annoying customer, etc, bla bla...
In a word, I cannot influent the setting during the test-to-production phase.
After doing some research I figured out, that the problem lies within the for_xml phrase.
Does anyone have the idea how to rewrite the upper query by eliminating the "for_xml" phrase without changing the "nature" of the UDF. It has to remain the inline one.
To my experience, any other option will result in a huge performance degradation. The UDF is also accessed via the sql view (outer apply) with approximately 90K rows.
Let me stress, that the UDF always returns exactly 1 row.
What does it do?
It concatenates string data from a table, delimiting the data with a selected delimiter. A typical output would be something like:
select * from [dbo].[_some_function](17,245,'#')
result: John Doe#Jane Doe#Someone Else
Any help will be very much appreciated
Forum Etiquette: How to post Reporting Services problems

[/url]

Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
[/url]

How to Post Performance Problems - by Gail Shaw
[/url]

GSquared SSC Guru Points: 260824 More actions · Answer 1

All other ways of doing this that I know of are either much slower, or require using CLR objects in the database (which would also require the DBA allowing that).

You might want to point out to that DBA that setting Quoted Identifiers to Off is a violation of ISO-92 SQL standards, and that MS has it On as a default for a reason.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

mister.magoo SSC-Forever Points: 47174 More actions · Answer 2

Agreed, that FOR XML PATH is quickest.

Here is an alternative that is definitely slower and has it's own limitation.

CREATE FUNCTION [dbo].[_some_function]

(

@poolint,

@Eventint,

@delimiter varchar(100)

)

RETURNS TABLE

AS

RETURN

(

with base(value,line) as

(

SELECT T1.object_value

, row_number() OVER (ORDER BY T2.object_value DESC)

FROM

table1 T1

INNER JOIN table2 T2 ON T2.[pool] = T1.[pool] AND T2.event = T1.object_value

WHERE

T2.[pool] = @pool

AND

T2.event = @eventNo

),

results(result,line) as

(

SELECT TOP 1 CAST(RTRIM(value) AS VARCHAR(MAX))

, line

FROM

base

ORDER BY

line

UNION ALL

SELECT RTRIM(base.VALUE) + @delimiter + result,base.line

FROM base

JOIN results

ON base.line = results.line+1

)

SELECT TOP 1 result

FROM

results

ORDER BY

line DESC

)

The limitation here is that it will only cope with up to 100 iterations before you will have to include an OPTION(MAXRECURSION xx)

statement in the calling query.

Sample call:

select *

from dbo.[_some_function](1,1,'#')

option (MAXRECURSION 0)

Why do I build the concatenated string in reverse?

A: Because in my experience it can be much faster when working with MAX datatypes to do it that way.

MM

select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

marko.celarc Mr or Mrs. 500 Points: 525 More actions · Answer 3

Squared & Magoo

Thanks both for your replies.

I did find something myself.

Here is the code:

";with aa as

(select * from __xxx where idd=1),

cte ( idd, ime, [ime_list], length, stev )

AS (SELECT idd,

CAST('' AS VARCHAR(8000)),

0,

0

FROM aa

GROUP BY idd

UNION ALL

SELECT p.idd,

CAST(p.ime AS VARCHAR(8000)),

CAST([ime_List] + CASE WHEN length = 0 THEN '' ELSE ', ' END + p.[ime] AS VARCHAR(8000)),

length + 1,

p.stev

FROM cte c

INNER JOIN aa p ON c.[idd] = p.[idd]

WHERE p.[stev] > c.[stev])

SELECT [idd],[ime_List]

FROM (SELECT [idd],

[ime_List],

Rank() OVER ( PARTITION BY [idd] ORDER BY length DESC),

stev

FROM cte) d ( [idd], [ime_List], rank, stev)

WHERE rank = 1 "

and it works fine. I tried it out in the testing environment and is 100% correct.

I'm still checking the performance.

Magoo's solution seems far more simple. I'll check it out tomorrow to find out whether it is a better one in terms of performance.

Thanks again for assistance. I'll let you know about the performance test.

GSquared SSC Guru Points: 260824 More actions · Answer 4

The option you posted, Marko, is what's called a "triangular join", and it will work just fine on very small datasets, and will absolutely die on large datasets. So, make sure you test it on a large, well-fragmented table (like real life data).

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

marko.celarc Mr or Mrs. 500 Points: 525 More actions · Answer 5

Thanks for letting me know Squared.

I checked the UDF's performance on 100K records (via a view) and managed to retreive data in 50s. Id does seem a lot, but in real life the app usually accesses 50 records at a time so it shouldn't be a problem. The function itself usually deals with up to three records per every call.

What scares me more is the fact, that the previous UDF practically stopped working after every SQL server restart. As if, in some way, the optimizer failed to obtain index and statistics data when accessing the UDF. I've read about it all over the net and people seem to have similar problems.

My fear is related to the fact that there are many other UDF's used by my app working in a similar way. I'll probably have to consider rewriting them.

Today I'll check out the Magoo's solution.

Thx again

GSquared SSC Guru Points: 260824 More actions · Answer 6

After you restart the SQL Server service (either just the service, or rebooting the server), the database engine has to recache all its execution plans, recache data, etc. Pretty much every query will be slower the first time it's used after a service restart. That's usual.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon