xml encoding utf-16 not working in sqlserver

  • hi,

    i get xml in following formate

    select cast('<?xml version="1.0" encoding="utf-8"?>

    <test>d</test>' as xml)

    this formate works in sqlserver but following is not working what should i do.

    select cast('<?xml version="1.0" encoding="utf-16"?>

    <test>d</test>' as xml)

    yours sincerey

  • rajemessage 14195 (10/11/2016)


    hi,

    i get xml in following formate

    select cast('<?xml version="1.0" encoding="utf-8"?>

    <test>d</test>' as xml)

    this formate works in sqlserver but following is not working what should i do.

    select cast('<?xml version="1.0" encoding="utf-16"?>

    <test>d</test>' as xml)

    yours sincerey

    You have to use nvarchar for utf-16

    😎

    This works

    select cast(N'<?xml version="1.0" encoding="utf-16"?>

    <test>d</test>' as xml)

  • hi,

    thank u,

    i know about capital N , but i wanted to understand , if i have not used any char of Unicode in my xml, then why

    capital N is required. whole string has varchar type of character. and ecoding is xmls encoding. so xml should read it.

    like following does not need capital N

    SELECT CAST('<?xml version="1.0" encoding="windows-1256"?>

    <MyColumn>MY DATA</MyColumn>' AS xml)

    there is an other example

    create table #t (tid int, i image)

    insert into #t(tid,i) values(1,'<?xml version="1.0" encoding="windows-1256"?><MyColumn>MY DATA</MyColumn>')

    insert into #t(tid,i) values(2,'<?xml version="1.0" encoding="utf-8"?><MyColumn>MY DATA</MyColumn>')

    insert into #t(tid,i) values(3,'<?xml version="1.0" encoding="utf-16"?><MyColumn>MY DATA</MyColumn>')

    SELECT cast(CAST(i as varbinary(max)) AS xml) from #t

    note: i know that xml data type should be used but there are people who are using it.

    yours sincerley.

  • This is well documented on BOL: Using XML Data Types

    😎

  • my question was some thing else , that is with out N it is working.

    now u have written N which is also good , but i am trying to understand that why it is not working with out N though i have not used any character of Unicode , and by putting windows-1256 it is working with out N. so i just wanted to understand , that what should be the correct way.

    because i can not ask every body out there to put N who ever is using my xml. and i can not ask

    others who are making xml , to not to put encoding in it.

    because the xml made by me is used by others and xml made by others are used by us. so there must be some universal way to handle the situation.

    yours sincerly.

  • rajemessage 14195 (10/15/2016)


    my question was some thing else , that is with out N it is working.

    now u have written N which is also good , but i am trying to understand that why it is not working with out N though i have not used any character of Unicode , and by putting windows-1256 it is working with out N. so i just wanted to understand , that what should be the correct way.

    because i can not ask every body out there to put N who ever is using my xml. and i can not ask

    others who are making xml , to not to put encoding in it.

    because the xml made by me is used by others and xml made by others are used by us. so there must be some universal way to handle the situation.

    yours sincerly.

    If the XML contains specified encoding then there are limitations, the BOM and the encoding must conform. You will have to provide more information on what you are trying to do here, not just simple naive conversion examples, i.e. where does the XML come from, how is it generated, how is it loaded into SQL Server table, what are the consumers of that XML etc.

    😎

    Question, why are you using IMAGE data type in your example?

    Further on the BOM, a quote from the BOL:

    The XML standard describes how an XML processor can detect the encoding used

    for a document by examining the first few bytes of the document. There are

    opportunities for the encoding specified by the application to conflict with the

    encoding specified by the document. For documents passed as bound

    parameters, XML is treated as binary data by SQL Server, so no conversions

    are made and the XML parser can use the encoding specified within the

    document without problems. However, for XML data that is bound as WSTR,

    then the application must ensure that the document is encoded as Unicode.

    This may entail loading the document into a DOM, changing the encoding to

    Unicode and serializing the document. If this is not done, data conversions

    may occur, resulting in invalid or corrupt XML.

    There is also potential for conflict when XML is specified in literals.

    For example the following are invalid:

    INSERT INTO xmltable(xmlcol) VALUES('<?xml version="1.0" encoding="UTF-16"?><doc/>')

    INSERT INTO xmltable(xmlcol) VALUES(N'<?xml version="1.0" encoding="UTF-8"?><doc/>')

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply