Collation for English-Chinese Data Warehouse

  • Hi there

    We are implementing a data warehouse for the Chinese subsidiary of a multi-national company.

    The ERP (Microsoft Dynamics SQL Database) uses a collation of Chinese_PRC_CI_AS. Most of the data is in English, but Chinese characters are stored in some fields.

    I do not know much about collation settings. I would assume that for the data warehouse we should also use Chinese_PRC_CI_AS to allow English as well as Chinese characters to be displayed?

    Any ideas?

    Many thanks!!

    Regards

    Chris

  • Colleagues of mine once did a DWH project with Chinese data mixed with English, and I believe they just used Unicode everywhere, but I'm not 100% sure.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I can verify that it does have to be Unicode, having done a worldwide DW myself. You're lucky you are building from scratch and not having to switch over like I did - it was a nightmare.


    I'm on LinkedIn

  • Thanks for your feedback.

    I have just been reading up on the meaning of Collation and Unicode.

    From what I gather, your choice of datatype is driven by whether Unicode is required. So in my case, if I want to include English as well as Chinese characters, I should choose nvarchar instead of varchar. No problems here.

    However, collation is not related to this, from what I can gather. So my question remains - what collation should I use - Latin1_General_CI_AS or the Chinese collation I mentioned in my first post??

    Many thanks!!!

    Regards

    Chris
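
  • To make the varchar/nvarchar point above concrete, here is a rough sketch in Python. It is only a model, not SQL Server itself. It assumes that a varchar column is stored in the code page tied to its collation (1252 for Latin1_General_CI_AS, 936 for Chinese_PRC_CI_AS), that characters outside that code page come back as "?", and that nvarchar is stored as UTF-16 whatever the collation. The helper names store_varchar and store_nvarchar are made up for illustration.

```python
# Illustrative model only: Python codecs standing in for SQL Server storage.
# Assumption: each collation maps to the Windows code page used for varchar data.
CODE_PAGES = {
    "Latin1_General_CI_AS": "cp1252",
    "Chinese_PRC_CI_AS": "cp936",
}


def store_varchar(text, collation):
    """Round-trip text through the collation's code page.

    Characters the code page cannot represent are replaced with '?',
    mimicking the data loss seen with non-Unicode columns.
    """
    codec = CODE_PAGES[collation]
    return text.encode(codec, errors="replace").decode(codec)


def store_nvarchar(text):
    """Round-trip text through UTF-16; the collation plays no part here."""
    return text.encode("utf-16-le").decode("utf-16-le")


sample = "Invoice 发票"

results = {
    "varchar / Latin1_General_CI_AS": store_varchar(sample, "Latin1_General_CI_AS"),
    "varchar / Chinese_PRC_CI_AS": store_varchar(sample, "Chinese_PRC_CI_AS"),
    "nvarchar (any collation)": store_nvarchar(sample),
}

for column_type, stored in results.items():
    print(f"{column_type}: {stored}")
```

    In this model the Chinese text survives in varchar only under the Chinese code page, while the nvarchar round trip keeps it in both cases. That matches the reading in the post above: the datatype decides whether the characters can be stored, and for nvarchar columns the collation is left to govern how values are compared and sorted.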
