Internationalization: Handling Lookup Data in Multiple Languages

  • I have searched high and low and the best I could find was this SSC post from a while back: http://www.sqlservercentral.com/Forums/Topic205432-230-4.aspx#bm1000310

    Essentially, we have a tables with lookup data that we need to present in multiple languages. For instance, a ContactType table:

    ContactTypeID ContactTypeDescription

    1 Customer

    2 Vendor

    We need "Customer" and "Vendor" to be displayed in English or Spanish or German, depending on the users preference. So I am considering the following options:

    1) Add a LanguageID to the ContactType table and make a composite key with the existing ContactTypeID. Then everywhere we have a FK using ContactTypeID, we'll have to also include LanguageID. This gets messy quickly.

    2) Create a MasterTranslation table. Anywhere we reference ContactTypeDescription, we would instead use a UDF and pass the table name, column name, entity ID, and language ID.

    e.g. SELECT @ContactType = dbo.fnTranslate('ContactType', 'ContactTypeDescription', 2, 29) if 29 was the desired LanguageSID

    2a) According to the post I linked to above, we could break out each language into its own MasterTranslation table to improve perfomance and allow different collation for each table.

    2b) Also according to the post I linked to above, we could use a single MasterTranslation table, and have a separate column for each language (theoretically applying a different collation to each column)

    Here's where I am getting stuck -

    • Wouldn't we be incurring a lot of overhead if we're pulling thousands of rows and have to call the translation UDF for various columns in each row? Compared to just joining against the parent table...
    • In options 2a and 2b, what's the best way to specify which language-specific table or column we need (within the UDF)? simplified examples:

      e.g. JOIN MasterTransalation_Spanish ON {conditions} AND @LanguageSID = 29 --we'd have one join for each supported language

      e.g. SELECT CASE @LanguageSID WHEN 29 THEN SpanishData WHEN 10 THEN EnglishData END FROM MasterTranslation

    Is there any internationalization/globalization database design best practice documentation out there?

  • Depends on what kind of scale of data you're looking at.

    One of the easiest ways to do it, in the database, is to add columns for each language. Keep the ID value the same, make the front-end know which column to pull depending on which language it's dealing with. Has serious scalability and development costs if you're doing this in a lot of places.

    Another is to have a composite key and have different rows for different languages of the same value. The problem here is making sure they stay that way. That you don't end up with:

    ID Lang Value

    1 ENG Man

    1 SPA Mesa

    It's easy to end up with that kind of thing when the rows are separated throughout a table. (For anyone who cares and doesn't know, "Man" is not the English word for "Mesa" in Spanish. "Table" would be the correct English translation, or perhaps "Desk".)

    That kind of design works pretty well, but maintenance of the data is critical.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply