April 29, 2005 at 5:01 pm
I'm having trouble converting Japanese and Chinese double-byte characters to Unicode.
My Chinese characters are in this format:
GB2312
http://www.iana.org/assignments/charset-reg/GBK
Microsoft Code Page 936
I've had success manually converting them using these two methods:
Unifier from http://www.melody-soft.com
http://www.geocities.com/herong_yang/gb2312/ java converter
The converted characters look great with these tools, and I can successfully load them into a SQL Server Unicode column that uses a Latin1_General collation.
This page from Microsoft says SQL 2000 does support code page 936:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_ca-co_2e95.asp
How do I use SQL's Chinese_PRC_CI_AI collation to convert GB2312 to Unicode, though? I'd like to do the conversion with the built-in SQL collations rather than with the external tools and mapping lists that worked.
For Japanese, I'm using Shift-JIS and got it working with these resources:
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/SHIFTJIS.TXT
Unifier from http://www.melody-soft.com
Can someone send me a code sample that accepts 2 bytes of varchar in either GB2312 or Shift-JIS format, then uses SQL's collation to convert them to a single (double-byte) Unicode nvarchar character?
Thanks,
Ray Metz
April 29, 2005 at 6:44 pm
My associate at work showed me how to do it...
SET NOCOUNT ON
-- This is your table with GB2312 coded cities
CREATE TABLE #GB2312
(CITY VARCHAR(30) COLLATE Latin1_General_CI_AI)
INSERT INTO #GB2312 VALUES('»´±±')
INSERT INTO #GB2312 VALUES('»´ÄÏ')
INSERT INTO #GB2312 VALUES('»ÆÉ½ÊÐ')
-- Start conversion
-- Create a temporary table to hold your non-unicode data
CREATE TABLE #NonUnicode
(CITY VARCHAR(30) COLLATE Chinese_PRC_CI_AS)
-- Populate your non-unicode temp table with varbinary data selected from your GB2312 coded table
INSERT INTO #NonUnicode(CITY)
SELECT CONVERT(VARBINARY(8000),CONVERT(VARCHAR(30),CITY COLLATE Latin1_General_CI_AI))
FROM #GB2312
-- Create a table to store your converted data
CREATE TABLE #Unicode
(City nvarchar(15) COLLATE Latin1_General_CI_AI)
-- Populate your unicode table
INSERT INTO #Unicode
SELECT CITY COLLATE Chinese_PRC_CI_AS
FROM #NonUnicode
-- View your unicode output
SELECT CITY
FROM #Unicode
DROP TABLE #NonUnicode
DROP TABLE #Unicode
DROP TABLE #GB2312
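For the original request (2 bytes in, 1 Unicode character out), the same idea can probably be squeezed into a table variable. This is only a rough sketch and I have not run it on every version: the names @Bytes, @Staging, Ch and UnicodeChar are made up for illustration, and Japanese_CI_AS is my assumption for the Shift-JIS (code page 932) case. The point is that the raw bytes have to land in a column that carries the target collation; as far as I know, putting COLLATE on a varchar expression converts between code pages instead of reinterpreting the bytes.

```sql
-- Sketch only: single-character variant of the temp-table approach above.
-- All names here are hypothetical.
DECLARE @Bytes VARBINARY(2)
SET @Bytes = 0xBBB4   -- two GB2312 bytes (first character of the sample data)

-- The column collation tells SQL Server which code page the bytes belong to.
-- For Shift-JIS input, Japanese_CI_AS (assumed, code page 932) would go here instead.
DECLARE @Staging TABLE (Ch VARCHAR(2) COLLATE Chinese_PRC_CI_AS)

-- varbinary -> varchar keeps the bytes as they are
INSERT INTO @Staging (Ch) VALUES (@Bytes)

-- varchar -> nvarchar goes through the column's code page (936)
SELECT CONVERT(NVARCHAR(1), Ch) AS UnicodeChar
FROM @Staging
```

If the input starts out as varchar in a Latin1 column, as in the script above, CONVERT(VARBINARY(2), ...) on that value should give you the bytes to feed into @Bytes.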