Converting UTF8 from MySQL to SQL Server 2008

  • Hello,

    I have a question about UTF-8 (code page).

    I have to migrate data from a MySQL database (utf8) to a SQL Server 2008 instance (Microsoft SQL Server 2008 (SP1) - 10.0.2531.0 (X64)), and I don't know how to approach it.

    1. What should I do to migrate (convert) data from MySQL to SQL Server and preserve the data as it comes from the source?

    2. If this is possible, what should I check after the data is migrated?

    3. What is the difference between UTF-8 and UTF-16?

    If there's anything else I should know, I would appreciate it if you told me.

    Thanks and regards,

    JMSM 😉

  • 1. What should I do to migrate (convert) data from MySQL to SQL Server and preserve the data as it comes from the source?

    Make sure the data type for the columns is Nvarchar because in 2008 the bytes are the same with UTF16. You may also need to set a collation that matches the language of the data you are storing in MySQL.
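
    As a rough illustration of what the conversion involves at the byte level, here is a minimal Python sketch. It assumes the MySQL data arrives as valid UTF-8 bytes and that nvarchar holds 2-byte little-endian code units; the function names are hypothetical and not part of any migration tool.

```python
def utf8_to_utf16le(raw_utf8: bytes) -> bytes:
    """Re-encode UTF-8 bytes as UTF-16LE (assumed nvarchar byte layout)."""
    # decode() raises UnicodeDecodeError if the input is not valid UTF-8,
    # which is a useful early signal that the source data is damaged.
    text = raw_utf8.decode("utf-8")
    return text.encode("utf-16-le")


def survives_round_trip(raw_utf8: bytes) -> bool:
    """True if converting to UTF-16LE and back reproduces the original bytes."""
    utf16 = utf8_to_utf16le(raw_utf8)
    return utf16.decode("utf-16-le").encode("utf-8") == raw_utf8


sample = "Convers\u00e3o".encode("utf-8")  # "Conversão" as MySQL utf8 bytes
print(len(sample), len(utf8_to_utf16le(sample)), survives_round_trip(sample))
# prints: 10 18 True
```

    The same text takes a different number of bytes in each encoding, but the characters themselves are unchanged, which is what "preserving the data" means here.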

    2. If this is possible, what should I check after the data is migrated?

    I don't understand what you mean by that.

    3. What is the difference between UTF-8 and UTF-16?

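    UTF8 is single byte while UTF16 is double byte, and SQL Server does not use UTF16; it uses UCS-2, which is similar to UTF16 but still different. Strictly speaking, UTF-8 takes 1 to 4 bytes per character and UTF-16 takes 2 or 4, so "single byte" and "double byte" describe only the smallest unit of each encoding.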
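
    The byte counts can be checked directly. The following is only a small Python sketch; the helper name is hypothetical and the sample characters are chosen for illustration.

```python
def encoded_sizes(ch: str) -> tuple:
    """Return (bytes in UTF-8, bytes in UTF-16LE) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


samples = [
    "A",           # U+0041, basic Latin letter
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F609",  # U+1F609, winking face emoji
]

for ch in samples:
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
# prints:
# U+0041: UTF-8 = 1 bytes, UTF-16 = 2 bytes
# U+00E9: UTF-8 = 2 bytes, UTF-16 = 2 bytes
# U+20AC: UTF-8 = 3 bytes, UTF-16 = 2 bytes
# U+1F609: UTF-8 = 4 bytes, UTF-16 = 4 bytes
```

    Only the first character fits the "single byte" description in UTF-8, and the last one needs more than two bytes in UTF-16.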

    Kind regards,
    Gift Peddie

  • Thanks, Gift.

    But when you say

    'Make sure the data type for the columns is Nvarchar because in 2008 the bytes are the same with UTF16'

    what do you mean by '...the bytes are the same with UTF16'?

    And then when you say

    'UTF8 is single byte while UTF16 is double byte, and SQL Server does not use UTF16; it uses UCS-2, which is similar to UTF16 but still different.'

    you state that 'SQL Server does not use UTF16; it uses UCS-2, which is similar to UTF16 but still different'. How do these two statements fit together?

    I'm sorry for these kinds of questions, but I suddenly feel like I don't know anything.

    😊

    Regards,

    JMSM 😉

    In SQL Server 2005 and earlier, there was no data type with the same bytes as the .NET Char, which is an unsigned 16-bit integer created to represent character data. That changed in SQL Server 2008: nvarchar and nchar map to the same bytes as the .NET Char, an unsigned 16-bit integer, which is the code unit of the Unicode standard's UTF-16. However, SQL Server and some other RDBMSs use UCS-2, which is similar to UTF-16 but comes with some character rendering limitations in some languages.

    http://msdn.microsoft.com/en-us/library/ms131092.aspx
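
    To make the UCS-2 limitation more concrete, here is a small Python sketch with hypothetical function names. It counts 16-bit code units and flags characters outside the Basic Multilingual Plane, which UTF-16 represents as a surrogate pair and which UCS-2 cannot treat as a single character. The assumption is that SQL Server 2008 handles each 2-byte unit as one character, so such characters can be stored but are counted and indexed as two.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units the text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def has_supplementary(text: str) -> bool:
    """True if any character lies above U+FFFF and so needs a surrogate pair."""
    return any(ord(ch) > 0xFFFF for ch in text)


value = "JMSM \U0001F609"  # five BMP characters plus one emoji
print(len(value), utf16_code_units(value), has_supplementary(value))
# prints: 6 7 True
```

    A check like `has_supplementary` could be run over the source data before migration to see whether the difference between UCS-2 and UTF-16 matters for a given data set.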

    Kind regards,
    Gift Peddie
