UTF8 fixed width import with European and Unicode characters

  • I have a fixed width file in UTF-8 format that contains some strings such as Kjöpmannskjär, plus some true Unicode strings (㈱マツモト交商), which I'm trying to import into a table.

    In the Connection Manager, if I define the Code Page as 65001 (UTF-8), the file Preview shows that data after a "foreign" character is offset to the right by the number of "foreign" characters encountered. The same offset carries through to the database table.
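
    The offset described above is consistent with a parser that counts bytes while the file's columns are laid out by character count, since UTF-8 stores ö and ä as two bytes each. A minimal Python sketch of that effect, assuming a hypothetical two-column layout (the widths and the names `make_record`, `split_by_bytes` and `split_by_chars` are illustrative, not anything taken from SSIS):

```python
# Hypothetical layout: a 15-character name column followed by a 5-character code.
NAME_WIDTH = 15
CODE_WIDTH = 5


def make_record(name, code):
    """Build one record padded by character count, then encoded as UTF-8."""
    return (name.ljust(NAME_WIDTH) + code.ljust(CODE_WIDTH)).encode("utf-8")


def split_by_bytes(record):
    """Slice at byte offsets, as a byte-counting parser would."""
    name = record[:NAME_WIDTH].decode("utf-8", errors="replace")
    code = record[NAME_WIDTH:NAME_WIDTH + CODE_WIDTH].decode("utf-8", errors="replace")
    return name, code


def split_by_chars(record):
    """Decode first, then slice at character offsets."""
    text = record.decode("utf-8")
    return text[:NAME_WIDTH], text[NAME_WIDTH:NAME_WIDTH + CODE_WIDTH]


record = make_record("Kjöpmannskjär", "AB123")
print(split_by_bytes(record))  # code column starts two positions late
print(split_by_chars(record))  # code column is intact
```

    For this record, the byte-based split returns the code as two spaces followed by AB1, shifted right by the two accented characters, while the character-based split returns AB123.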

    In Connection Manager I've tried defining the file as Fixed Width or Ragged Right, but can't get anything to work.

    Also in Connection Manager, I've tried checking the Unicode box, but then I get a "The file format of UTF-8 is not supported as Unicode" error message when I try to run it.

    I know that if I first convert the fixed width file to a CSV file, it will then import just fine, but I'm trying to avoid this additional step.
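
    For reference, the conversion step mentioned above could be sketched in Python roughly as follows, slicing by characters after decoding rather than by bytes. The column widths and the `fixed_width_to_csv` name are assumptions for illustration; the real layout would come from the file specification.

```python
import csv
import io


def fixed_width_to_csv(text, widths):
    """Convert decoded fixed-width text to CSV, using character-based widths."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        fields = []
        pos = 0
        for width in widths:
            # Slice by character position and drop the right-hand padding.
            fields.append(line[pos:pos + width].rstrip())
            pos += width
        writer.writerow(fields)
    return out.getvalue()


# Hypothetical sample: a 15-character name column and a 5-character code column.
sample = "Kjöpmannskjär".ljust(15) + "AB123\n" + "㈱マツモト交商".ljust(15) + "XY987\n"
print(fixed_width_to_csv(sample, [15, 5]))
```

    When reading from disk, opening the file with `encoding="utf-8"` gives decoded text, so the slices count characters rather than bytes.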

    At this stage any ideas would be gratefully received.


    Tim

  • The "Unicode" checkbox means UCS-2 (16-bit Unicode), not UTF-8, which is why a UTF-8 file is rejected when the box is checked.
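
    If the flat file source does expect 16-bit Unicode when that box is checked, one option would be to re-encode the file before import. A hedged Python sketch, assuming the target is UTF-16 little-endian with a byte order mark (an assumption worth verifying against the SSIS version in use; `utf8_to_utf16le` is a hypothetical helper):

```python
import codecs


def utf8_to_utf16le(data):
    """Re-encode UTF-8 bytes as UTF-16 LE, prefixed with a byte order mark."""
    # "utf-8-sig" tolerates an optional UTF-8 BOM at the start of the input.
    text = data.decode("utf-8-sig")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


converted = utf8_to_utf16le("Kjöpmannskjär".encode("utf-8"))
# Every character in this sample occupies exactly two bytes after the BOM.
print(len(converted))
```

    In this encoding each of the sample characters takes exactly two bytes, so byte-based column widths stay proportional to character counts.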

    Can you attach an example file with a line or two to the thread? The DDL for your destination table and your SSIS and SQL Server product versions would help too.

