CDATA In TSQL Script

  • I'm having trouble using CDATA in TSQL. SQL Server is complaining that it cannot parse my CDATA item. Any thoughts on a better way to do this?

    (My CDATA @FaceTemplateData will not parse)

    DECLARE @FaceCenterX int

    ,@FaceCenterY int

    ,@FaceWidth int

    ,@FaceAngle decimal(18,2)

    ,@FaceTemplateData varbinary(max)

    ,@FaceFeatures varchar(2048)

    DECLARE @FaceCenterXStr varchar(10)

    ,@FaceCenterYStr varchar(10)

    ,@FaceWidthStr varchar(10)

    ,@FaceAngleStr varchar(10)

    ,@FaceTemplateDataStr varchar(max)

    SELECT @FaceCenterX = 150

    ,@FaceCenterY = 200

    ,@FaceWidth = 250

    ,@FaceAngle = 9

    ,@FaceTemplateData = 0x3A38635661813D6A5B435D5D6

    ,@FaceFeatures = '276=166|356=159|317=201|291=245|352=240'

    SELECT

    @FaceCenterXStr = CAST(@FaceCenterX AS varchar(10))

    ,@FaceCenterYStr = CAST(@FaceCenterY AS varchar(10))

    ,@FaceWidthStr = CAST(@FaceWidth AS varchar(10))

    ,@FaceAngleStr = CAST(@FaceAngle AS varchar(12))

    ,@FaceTemplateDataStr = CAST(@FaceTemplateData AS varchar(max))

    SELECT @FaceCenterXStr AS FaceCenterX

    ,@FaceCenterYStr AS FaceCenterY

    ,@FaceWidthStr AS FaceWidth

    ,@FaceAngleStr AS FaceAngle

    ,@FaceTemplateDataStr AS FaceTemplateData

    ,@FaceFeatures AS FaceFeatures

    SELECT CAST(

    '<?xml version="1.0" encoding="utf-8" ?>' +

    '<FacialRecognition>' +

    '<Templates>' +

    '<Template Name="T1">' +

    '<FaceCenterX>' + @FaceCenterXStr + '</FaceCenterX>' +

    '<FaceCenterY>' + @FaceCenterYStr + '</FaceCenterY>' +

    '<FaceWidth>' + @FaceWidthStr + '</FaceWidth>' +

    '<FaceAngle>' + @FaceAngleStr + '</FaceAngle>' +

    '<![CDATA[<FaceTemplate>' + @FaceTemplateDataStr + '</FaceTemplate>]]>' +

    '<FaceFeatures>' + @FaceFeatures + '</FaceFeatures>' +

    '</Template>' +

    '</Templates>' +

    '</FacialRecognition>' AS XML) AS UserDataXML

    .

  • XML in SQL Server only support UTF-16LE. Two things, change all instances of VARCHAR in your script to NVARCHAR and change utf-8 to utf-16 in your xml declaration.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Oh, a third thing...prefix all your string literals with N to denote they should be treated as Unicode strings (i.e. they contain 'N'ational characters).

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Works perfectly, thanks much!

    .

  • Hmm, not so fast, my code seems to work against simple test data, but when I try to run it against some real production data, seems to be having some trouble parsing that varbinary(max) column. Here's the error I get:

    XML parsing: line 1, character 5076, illegal xml character

    The real data is too big to post really. Below is a fragment that simulates how I'm handling that varbinary/cdata column. Maybe somebody can spot what I'm doing wrong:

    declare @TestXml xml, @TestData varbinary(max)

    select @TestData = 0x012883198723a123

    select @TestXml = cast(

    N'<?xml version="1.0" encoding="utf-16" ?>' +

    N'<MyRootTag>' +

    N'<![CDATA[<FaceTemplate>' + CAST(@TestData AS nvarchar(max)) + N'</FaceTemplate>]]>' +

    N'</MyRootTag>' AS xml)

    select @TestXml

    .

  • BSavoie (3/17/2011)


    XML parsing: line 1, character 5076, illegal xml character

    Can you do a substring on your xml and pull characters 5050-5100? Need to figure out what it's cranky about, might just be a bad entry in the original data layer that's been transmitted into the xml.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Here's what I seeing in that substring: select substring(cast(@MyData as varchar(max)), 5050, 50)

    rb_OELM{€B=_OJCt|SCHHK\j}S8bNRKp|a=UHQD{tHBR_JKmt[

    .

  • Not all characters are valid in XML. Your code looks fine. You have some bad data in the mix that you'll need to cleanse...either that or escape it before casting to XML.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Got it. Thanks again! I thought that was the point of the "cdata". I thought that was essentially "escaping" it for me.

    .

  • Good deal!

    As a side note, you can drop this piece of code. It just adds unnecessary overhead because that declaration is implied by SQL Server, notice how it is not returned in when you select your XML variable:

    N'<?xml version="1.0" encoding="utf-16" ?>' +

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Should I just base64 encode the data before I string it out then?

    .

  • I was worried the consuming applications might not like the xml without that header. Maybe that's not an issue.

    I've also pretty much decided I need to write this utility function in c# instead of tsql. I'm just not sure how to do the base64 encoding with tsql. I found a couple base64 encoder/decoder routines on Google, but it seems folks were having some problems with them. I guess c# is MY best bet since it is my native tounge, and .NetFX makes this so much easier. Surprisingly, it seems like base64 encoding in tsql is not real common. I did not find a lot of good answers on Google.

    Thanks for the help. I appreciate it!

    .

  • The purpose of CDATA is to instruct the XML interpreter that everything within it should be treated as character data, and not markup. The CDATA tag still must contain legal XML characters and also may not contain the string

    ]]>

    (the "triad") as it creates a parsing ambiguity. Other than the triad and illegal XML characters you can put whatever you want in a CDATA tag and it should work provided the interpreter is compliant with the XML 1.0 spec.

    DECLARE @nvarchar_max NVARCHAR(MAX) ;

    SET @nvarchar_max = N'

    <root>

    <![CDATA[

    3 < 5 Text you want to escape goes here...

    ]]>

    </root>

    '

    SELECT CAST(@nvarchar_max AS XML) AS works ;

    SET @nvarchar_max = N'

    <root>

    <![CDATA[

    ...a NULL char is not legal in XML, even in a CDATA field' + NCHAR(0)

    + N'

    ]]>

    </root>

    '

    SELECT CAST(@nvarchar_max AS XML) AS [does not work] ;

    edit: added code samples; changed wording "valid" to "legal"

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Worked extremely well for me!!..

  • So if anyone else is interested. Here's the end product, and it seems to be working well.

    First of all, I found what seems to be simple, reliable tsql base64 encoding/decoding that suits my needs. Thanks to "CVMichael" over at VBForums.com. Link to the article is posted at the end.

    CREATE FUNCTION dbo.BinaryToBase64

    (

    @bin VARBINARY(MAX)

    )

    RETURNS VARCHAR(MAX)

    AS

    BEGIN

    DECLARE @Base64 VARCHAR(MAX)

    SET @Base64 = CAST(N'' AS XML).value('xs:base64Binary(xs:hexBinary(sql:variable("@bin")))', 'VARCHAR(MAX)')

    RETURN @Base64

    END

    GO

    CREATE FUNCTION dbo.Base64ToBinary

    (

    @Base64 VARCHAR(MAX)

    )

    RETURNS VARBINARY(MAX)

    AS

    BEGIN

    DECLARE @Bin VARBINARY(MAX)

    SET @Bin = CAST(N'' AS XML).value('xs:base64Binary(sql:variable("@Base64"))', 'VARBINARY(MAX)')

    RETURN @Bin

    END

    Armed with these bad boys, the update I needed to write became quite simple:

    UPDATE Photo

    SET UserDataXML =

    CAST(

    N'<FacialRecognition>'

    + N'<Templates>'

    + N'<Template>'

    + N'<FaceCenterX>' + CAST(FaceCenterX AS varchar) + N'</FaceCenterX>'

    + N'<FaceCenterY>' + CAST(FaceCenterY AS varchar) + N'</FaceCenterY>'

    + N'<FaceWidth>' + CAST(FaceWidth AS varchar) + N'</FaceWidth>'

    + N'<FaceAngle>' + CAST(FaceAngle AS varchar) + N'</FaceAngle>'

    + N'<![CDATA[<FaceTemplate>' + dbo.BinaryToBase64(FaceTemplate) + N'</FaceTemplate>]]>'

    + N'<FaceFeatures>' + CAST(FaceFeatures AS varchar) + N'</FaceFeatures>'

    + N'</Template>'

    + N'</Templates>'

    + N'</FacialRecognition>'

    AS xml)

    FROM Photo

    I'm not sure if I need to wrap the binary data in CDATA AND base64 encode it, but I guess it can't hurt to do both. Hope this helps someone else!

    Here's where I found the tsql base64 encoder. I think it's cheating a little bit, but that's perfectly acceptable in my situation.

    Link: VBForums base64 encoding article

    .

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply