May 6, 2011 at 9:32 am
I'm trying to figure out how to concatenate the first letter of each word within a given string. It's bit like parsing a delimited list, I'd imagine. Nonetheless, I find an elegant solution a tad elusive.
Here's an example:
String = "COMMUNITY GENERAL HOSPITAL"
I'd like to return "CGH"
I've seen various posts/articles that discuss parsing a delimited list into separat rows, but I haven't quite figured out how to return a single row as a concatenated string.
Thanks in advance for assistance.
--Pete
May 6, 2011 at 9:48 am
It appears I've found one solution which I should be able to put into a udf:
DECLARE @STRING VARCHAR(8000)
SET @STRING = 'COMMUNITY GENERAL HOSPITAL'
DECLARE @TEXTXML XML
SELECT @textXML = cast('<d>' + replace(@string, ' ', '</d><d>') + '</d>' as xml)
DECLARE @result VARCHAR(8000)
SET @result = ''
SELECT @result = @result + left(T.split.value('.', 'nvarchar(max)'), 1)
FROM @textXML.nodes('/d') T ( split )
SELECT @result
Came across the code from http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/d7ce8bf9-e6fe-4712-aa8d-b89eddc56453/[/url]
Feel free to share other techniques.
May 8, 2011 at 2:07 pm
This handles different separator characters (not just space),
handles multiplication of the separators (e.g. double space or space at the beggining of the string),
and works from SQL 2000 and up:
DECLARE @string VARCHAR(8000); SET @string = ' COMMUNITY ... GENERAL, HOSPITAL 1973'
DECLARE @allowed varchar(100); SET @allowed = 'A-Z0-9' -- characters allowed in the result
DECLARE @i INT; SET @i = 0
DECLARE @result varchar(8000)
WHILE @i is not null
BEGIN
SET @result = ISNULL(@result,'')+ISNULL(SUBSTRING(@string,@i+1,1),'')
SET @i = @i + NULLIF(PATINDEX('%[^('+@allowed+')]['+@allowed+']%',SUBSTRING(@string,@i+1,8000)),0)
END
SELECT @string, @result
Output: "CGH1"
You can easily modify the output by setting @allowed characters to e.g. 'a-z' to get result without numbers.
May 8, 2011 at 2:58 pm
Try this:
DECLARE @STRING VARCHAR(8000);
SET @STRING = 'COMMUNITY GENERAL HOSPITAL';
SELECT (SELECT LEFT(Item,1)
FROM dbo.DelimitedSplit8K(@String, ' ')
ORDER BY ItemNumber
FOR XML PATH(''),TYPE).value('.','varchar(max)');
For the DelimitedSplit8K function, see the link in my signature for Splitting Delimited Strings.
For an explanation of what the rest of the query is doing, see this article: Creating a comma-separated list (SQL Spackle)[/url]
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 8, 2011 at 6:29 pm
Vedran Kesegic (5/8/2011)
This handles different separator characters (not just space),handles multiplication of the separators (e.g. double space or space at the beggining of the string),
and works from SQL 2000 and up:
DECLARE @string VARCHAR(8000); SET @string = ' COMMUNITY ... GENERAL, HOSPITAL 1973'
DECLARE @allowed varchar(100); SET @allowed = 'A-Z0-9' -- characters allowed in the result
DECLARE @i INT; SET @i = 0
DECLARE @result varchar(8000)
WHILE @i is not null
BEGIN
SET @result = ISNULL(@result,'')+ISNULL(SUBSTRING(@string,@i+1,1),'')
SET @i = @i + NULLIF(PATINDEX('%[^('+@allowed+')]['+@allowed+']%',SUBSTRING(@string,@i+1,8000)),0)
END
SELECT @string, @result
Output: "CGH1"
You can easily modify the output by setting @allowed characters to e.g. 'a-z' to get result without numbers.
But it's RBAR. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
May 8, 2011 at 6:31 pm
peterzeke (5/6/2011)
It appears I've found one solution which I should be able to put into a udf:
DECLARE @STRING VARCHAR(8000)
SET @STRING = 'COMMUNITY GENERAL HOSPITAL'
DECLARE @TEXTXML XML
SELECT @textXML = cast('<d>' + replace(@string, ' ', '</d><d>') + '</d>' as xml)
DECLARE @result VARCHAR(8000)
SET @result = ''
SELECT @result = @result + left(T.split.value('.', 'nvarchar(max)'), 1)
FROM @textXML.nodes('/d') T ( split )
SELECT @result
Came across the code from http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/d7ce8bf9-e6fe-4712-aa8d-b89eddc56453/[/url]
Feel free to share other techniques.
See the post that Wayne posted above. XML splitters are a bit tough on performance.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 9, 2011 at 7:11 am
Jeff/Wayne:
Many thanks for your replies! Hearing from the two of you is an honor. I had pondered using SQL from Jeff's tally-based splitting functions, but couldn't figure out how to concantenate the values, elegantly, once they've been split into rows. (In fact, I had seen Jeff's recent "Tally Oh!" article, but hadn't had time to digest it.) So, rather than I putting a hodge-podge of sql together, I thought I'd check with the experts in the SQL community. In my quest, I did come across quite a few discussions with performance test results, often provided by Jeff, that clearly spell out the limitations of XML splitting.
But, I am a little confused by Wayne's choice to use XML to create a delimited list. Is XML better at concantenation than splitting? or is it that no solution presents itself signicantly better than XML in terms of performance so XML is the tool of choice for simplifying the SQL statement? In general, I've seen thorough analyses of splitting arrays/delimited lists, but I don't recall seeing performance evaluations regarding concantenation. If performance assessments of concantenation have been posted in previous articles/discussions, please feel free to provide a link. I'm eager to learn.
By the way, when it comes to performance, I'm guessing the phrase "good enough" just isn't, right?
Thanks again,
Pete
May 9, 2011 at 7:33 am
Thanks for the kudo, Pete.
Yes... XML is VERY effecient for concatenation of strings like the ones you have. Wayne has written a nice "SQL Spackle" article on the subject and there's an enhancement in the discussion. Here's the link.
http://www.sqlservercentral.com/articles/comma+separated+list/71700/
It's been awhile since I wrote about tuning concatenation problems but here's my old article.
http://www.sqlservercentral.com/articles/Test+Data/61572/
--Jeff Moden
Change is inevitable... Change for the better is not.
May 9, 2011 at 7:56 am
Excellent citations!
Jeff, your "older" article about concantenation has the following comment regarding the final XML concantenation solution:
I normally warn against using correlated sub-queries, but the following just absolutely flies (no UDF overhead)...
The part about "no UDF overhead" -- would putting Wayne's XML solution (combined with your recent DelimitedSplit8K function) into a udf be pointless or defeat performance?
Example 1:
/****************************************************************/
/*FUNCTION TO CREATE ABBREVIATIONS*/
/*5/9/2011*/
/**/
/*Uses Jeff Moden's DelimitedSplit8K split function and*/
/*Wayne Sheffield'S SQL Spackle to concantenate the left most characters of each word*/
/*to form the actual abbreviation.*/
/*Tally Table required.*/
/**/
/****************************************************************/
ALTER FUNCTION dbo.udf_AbbreviateV1
--===== Define I/O parameters
(@String VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT ABBRV = (SELECT LEFT(Item,1)
FROM dbo.udf_DelimitedSplit8K(@String, ' ')
ORDER BY ItemNumber
FOR XML PATH(''),TYPE).value('.','varchar(max)');
Example 2:
/****************************************************************/
/*FUNCTION TO CREATE ABBREVIATIONS*/
/*5/9/2011*/
/**/
/*Uses Jeff Moden's DelimitedSplit8K split function and*/
/*Wayne Sheffield'S SQL Spackle to concantenate the left most characters of each word*/
/*to form the actual abbreviation.*/
/*Tally Table required.*/
/**/
/****************************************************************/
CREATE FUNCTION dbo.udf_AbbreviateV2
--===== Define I/O parameters
(@String VARCHAR(8000))
Returns VARCHAR(8000)
AS
BEGIN
RETURN (
SELECT (SELECT LEFT(Item,1)
FROM dbo.udf_DelimitedSplit8K(@String, ' ')
ORDER BY ItemNumber
FOR XML PATH(''),TYPE).value('.','varchar(max)')
)
END;
May 9, 2011 at 10:24 am
peterzeke (5/9/2011)
The part about "no UDF overhead" -- would putting Wayne's XML solution (combined with your recent DelimitedSplit8K function) into a udf be pointless or defeat performance?
It actually would have an impact on performance if it were put into a scalar type of UDF like you have. If you want the convenience of a UDF, I recommend you convert your UDF to an iTVF (INLINE Table Valued Function) and use it in a Cross Apply.
Truth be told, though, the code is simple enough where I don't see an advantage to having it in a UDF no matter which type of UDF you settle on.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 9, 2011 at 10:30 am
peterzeke (5/9/2011)
Jeff/Wayne:Many thanks for your replies! Hearing from the two of you is an honor.
:blush: Thanks Pete - and I hope I can live up to that.
But, I am a little confused by Wayne's choice to use XML to create a delimited list. Is XML better at concantenation than splitting?
Are you familiar with the movie "Grease"? Well, think of the XML concatenation with "FOR XML PATH" as "Greased Lightning". (Bonus points for having the same intonation as John Travolta did.) Or you could just think of it as being wickedly fast. But it comes down to - yes, this is a proved, very high speed method for concatenating strings together.
By the way, when it comes to performance, I'm guessing the phrase "good enough" just isn't, right?
It depends... 😀
For scalability, "good enough" is not good enough for me.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 9, 2011 at 10:33 am
Jeff Moden (5/8/2011)
peterzeke (5/6/2011)
It appears I've found one solution which I should be able to put into a udf:
DECLARE @STRING VARCHAR(8000)
SET @STRING = 'COMMUNITY GENERAL HOSPITAL'
DECLARE @TEXTXML XML
SELECT @textXML = cast('<d>' + replace(@string, ' ', '</d><d>') + '</d>' as xml)
DECLARE @result VARCHAR(8000)
SET @result = ''
SELECT @result = @result + left(T.split.value('.', 'nvarchar(max)'), 1)
FROM @textXML.nodes('/d') T ( split )
SELECT @result
Came across the code from http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/d7ce8bf9-e6fe-4712-aa8d-b89eddc56453/[/url]
Feel free to share other techniques.
See the post that Wayne posted above. XML splitters are a bit tough on performance.
I used to use an XML splitter all the time... that is, until Jeff showed me the light. There is a thread on here somewhere that discusses it, but it's a couple of years old. I'll try to find it.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 9, 2011 at 10:48 am
WayneS (5/9/2011)
I used to use an XML splitter all the time... that is, until Jeff showed me the light. There is a thread on here somewhere that discusses it, but it's a couple of years old. I'll try to find it.
I think you're talking about this article by Divya Agrawal: "Split String Using XML" (6/25/2009) Link = http://www.sqlservercentral.com/articles/XML/66932/[/url] It's a neat article in terms of providing first time exposure to XML & SQL -- but the subsequent discussions clearly challenge its usefulness in terms of performance.
Within the discussion thread, Jeff casually steps in on page 2, and on page 3 he lays out all the evidence for disposing xml splitting in favor of a tally table, especially the inline split function.
May 9, 2011 at 10:59 am
peterzeke (5/9/2011)
WayneS (5/9/2011)
I used to use an XML splitter all the time... that is, until Jeff showed me the light. There is a thread on here somewhere that discusses it, but it's a couple of years old. I'll try to find it.
I think you're talking about this article by Divya Agrawal: "Split String Using XML" (6/25/2009) Link = http://www.sqlservercentral.com/articles/XML/66932/[/url] It's a neat article in terms of providing first time exposure to XML & SQL -- but the subsequent discussions clearly challenge its usefulness in terms of performance.
Within the discussion thread, Jeff casually steps in on page 2, and on page 3 he lays out all the evidence for disposing xml splitting in favor of a tally table, especially the inline split function.
I don't know if you've seen the new article...
http://www.sqlservercentral.com/articles/Tally+Table/72993/
{EDIT}... heh... I'm apparently going blind. You actually cited the same article.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 9, 2011 at 11:58 am
peterzeke (5/9/2011)
WayneS (5/9/2011)
I used to use an XML splitter all the time... that is, until Jeff showed me the light. There is a thread on here somewhere that discusses it, but it's a couple of years old. I'll try to find it.
I think you're talking about this article by Divya Agrawal: "Split String Using XML" (6/25/2009) Link = http://www.sqlservercentral.com/articles/XML/66932/[/url] It's a neat article in terms of providing first time exposure to XML & SQL -- but the subsequent discussions clearly challenge its usefulness in terms of performance.
Within the discussion thread, Jeff casually steps in on page 2, and on page 3 he lays out all the evidence for disposing xml splitting in favor of a tally table, especially the inline split function.
No, I hadn't seen that article/discussion thread. It was in the forums... I had posted an XML splitter for a posters issue, and Jeff came in and nicely smoked it. I've been converted ever since.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
Viewing 15 posts - 1 through 15 (of 56 total)
You must be logged in to reply to this topic. Login to reply