June 1, 2016 at 2:14 pm
I have a snippet of T-SQL code that I use as a function. All it does is, take a string that has several phrases separated by a predefined separator. The goal is to parse each of the phrases out and insert them into a table as separate rows. The table also has an ID column, which is an identity.
BEGIN
DECLARE @UnparsedString VARCHAR(MAX) = 'This|is|a|short|sentence'
, @Separator VARCHAR(1) = '|'
DECLARE @ParsedDataSet TABLE([PhraseID] INT IDENTITY, [Phrase] VARCHAR(MAX))
DECLARE @XMLString XML
SELECT @XMLString = CONVERT(XML, SQL_TEXT)
FROM (
SELECT '<root><item>'
+ REPLACE(@UnparsedString, @Separator, '</item><item>')
+ '</item></root>' AS SQL_TEXT
) DerTab
INSERT INTO @ParsedDataSet([Phrase])
SELECT t.col.query('.').value('.', 'VARCHAR(1000)') AS [Phrase]
FROM @XMLString.nodes('root/item') t(col)
SELECT * FROM @ParsedDataSet
END
My code works absolutely fine as long as I don't use any special characters. When I try to add a phrase that has an ampersand, it throws an error.
When an ampersand is used without any spaces around, say for e.g.
DECLARE @UnparsedString VARCHAR(MAX) = 'This|is|a|short&Sweet|sentence')
I get the following error...
Msg 9411, Level 16, State 1, Line 10
XML parsing: line 1, character 70, semicolon expected
When an ampersand is used with spaces around, say for e.g.
DECLARE @UnparsedString VARCHAR(MAX) = 'This|is|a|short & Sweet|sentence')
I get the following error...
Msg 9421, Level 16, State 1, Line 10
XML parsing: line 1, character 66, illegal name character
How can I avoid both these situations and parse my input string? Your help and advise is grately appreciated.
SQLCurious
June 1, 2016 at 2:24 pm
Try this
Edits: Had to use a screen shot to get the coded ampersand to render as text!
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
June 1, 2016 at 2:26 pm
You may want to read this article: http://www.sqlservercentral.com/articles/Tally+Table/72993/
June 1, 2016 at 3:41 pm
Thanks, @Phil Parkin
June 1, 2016 at 6:57 pm
SQLCurious (6/1/2016)
When an ampersand is used without any spaces around, say for e.g.
DECLARE @UnparsedString VARCHAR(MAX) = 'This|is|a|short&Sweet|sentence')
I get the following error...
Msg 9411, Level 16, State 1, Line 10
XML parsing: line 1, character 70, semicolon expected
It's invalid XML.
Therefore you get an error trying to parse it.
You'll get an error too if you try to parse this string:
DECLARE @UnparsedString VARCHAR(MAX) = 'Numbera|compared:|2>1')
Error message will be different, but the reason the same - incorrectly formed XML.
_____________
Code for TallyGenerator
June 2, 2016 at 4:30 am
You should be very careful using xml for splitting or merging texts as there are many exceptions to deal with in xml. Some of the problems can be avoided using escape sequences like the &, > and < sequences. But if your texts can contain low or high ascii/unicode values (for example non-printable characters with ascii codes below 32) you're in for a lot more fun: these characters can't be escaped in (MS SQL's) xml. Your connection will be doomed if you try to convert any text with such characters in it into an xml-type.
Better use the string splitting algorithms provided in Jeff's articles pointed to in earlier responses, as these do not require any 'magic characters' that need escaping.
By the way, a way to get any characters escaped easier than using replace() is to have sql server do it for you like this:
declare @s-2 nvarchar(1000) = '< > &';
select (select @s-2 as [text()] for xml path(''))
The result is:
< > &
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply