November 15, 2013 at 8:30 am
Historically entries have been stored in a DB with web styling encoding and are similar to these (i know bad practice):
select 'Α&ampNu;ΤΩΝΙΟΥ'
UNION ALL select 'ΑΡΒΑΝΙΤΗΣ'
UNION ALL select 'Αρβανιτάκη'
UNION ALL select 'ΑσημÎνιου'
Are there any functions in SQL that can decode this or will i need to call a web service from SSIS to get to HTML.decode?
They should be stored as follows:
ΑΝΤΩΝΙΟΥ
ΑΡΒΑΝΙΤΗΣ
Αρβανιτάκη
ΑσημÎνιου
Thanks
B
November 15, 2013 at 10:24 am
This works for me:
DECLARE @XmlEntities as NVarchar(MAX) = N'<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE schema [
<!ENTITY Alpha "?">
<!ENTITY ampNu "?">
<!ENTITY Nu "?">
<!ENTITY Tau "T">
<!ENTITY Omega "O">
<!ENTITY Omicron "?">
<!ENTITY Iota "?">
<!ENTITY Upsilon "?">
<!ENTITY Rho "?">
<!ENTITY Beta "?">
<!ENTITY Eta "?">
<!ENTITY Sigma "S">
<!ENTITY alpha "a">
<!ENTITY nu "?">
<!ENTITY tau "t">
<!ENTITY omega "O">
<!ENTITY omicron "?">
<!ENTITY iota "?">
<!ENTITY upsilon "?">
<!ENTITY rho "?">
<!ENTITY beta "ß">
<!ENTITY eta "?">
<!ENTITY sigma "s">
<!ENTITY Kappa "K">
<!ENTITY kappa "?">
<!ENTITY mu "µ">
]>
';
WITH cte as
(
select '&Alpha;&ampNu;&Tau;&Omega;&Nu;&Iota;&Omicron;&Upsilon;' As htxt
UNION ALL select '&Alpha;&Rho;&Beta;&Alpha;&Nu;&Iota;&Tau;&Eta;&Sigma;'
UNION ALL select '&Alpha;&rho;&beta;&alpha;&nu;&iota;&tau;ά&kappa;&eta;'
UNION ALL select '&Alpha;&sigma;&eta;&mu;Î&nu;&iota;&omicron;&upsilon;'
)
, cteWrapped As
(
SELECT CONVERT(XML, @XmlEntities+N'<txt>'+htxt+N'</txt>', 2) as xWrapped
FROM cte
)
SELECT T.c.value('.','nvarchar(80)')
FROM cteWrapped As w
CROSS APPLY w.xWrapped.nodes('/txt') T(c)
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
November 17, 2013 at 8:06 am
Thanks Barry, thats pretty awesome. Is there anywhere I can a full list of characters its not just Greek specific.
D
November 20, 2013 at 11:00 am
bugg (11/17/2013)
Thanks Barry, thats pretty awesome. Is there anywhere I can a full list of characters its not just Greek specific.D
Sure, here: http://www.fileformat.info/info/charset/UTF-16/list.htm, but you're looking at encoding all of Unicode/UTF-16. I think you'd be better off taking another approach (i.e., using SQLCLR, or fixing it before it gets into SQL).
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
November 20, 2013 at 11:08 am
Hi Barry
Thanks for that list, this is going to be a one off job to migrate some data, so still toying with using SQLCLR to do this.
Cheers
D
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply