Spanish characters to English conversion

  • Hi All,

    Here is my situation:

    My current SQL server 2000 database takes first name, last name, address, etc from ASP front end in english. Now we want to have a separate ASP front end for Spanish, which takes first name, last name, address, etc in spanish characters to hit the same Table as previous one. Now Spanish and english are staying together fine with the language column (ENG for enlish, SPA for Spanish).

    All the data from this table is moved to a different application by a stored procedure (business logic is embedded). Now the requirement is I have to convert all the spanish characters to English before sending it to this different application.

    Can we convert Names & Address from Spanish to English? How do we do it using SQL server Functions?

    I ideally want a User defined Function like fn_convert_spanish_to_english(@spanish_in) Which returns @english_out

    Finally I want something Like:

    Set @english_first_name = fn_convert_spanish_to_english(@spanish_first_name)

    Set @english_last_name = fn_convert_spanish_to_english(@spanish_last_name)

    Set @english_address1 = fn_convert_spanish_to_english(@spanish_address1)

    Set @english_address2 = fn_convert_spanish_to_english(@spanish_address2)

    Set @english_city = fn_convert_spanish_to_english(@spanish_city)

    Is it possible? If so any leads?

    Many Thanks

    Ram.

     

     

     

  • Hi Ramesh

    Each character has it's own charater code whether the language is in en or es. There's no automatic translation between the characters, SQL Server isn't a translation engine.

    What I suggest is to create a table of Spanish characters or phrases with their English translation and use the REPLACE function to "convert" the Spanish text where found into the English translation.

    The context of a charater or phrase within a sentence or word does not automatically have a easily recognisable translation. Words with singular and clearly defined translations, cities for example, would be easy to automatically replace.

    Whether this type of approach will maintain the integrity of your data is questionable, hence the plethora of well paid translation professionals.

    Here's the URL on how to use Unicode data in Books Online: http://msdn2.microsoft.com/en-us/library/ms187828.aspx

     

    Max

  • I would offer a simple solution just by converting the very few characters in spanish that don't exist in english: á, é, í, ó, ú, ñ, Ñ to their corresponding: a, e, i, o, u, n, N. Just use REPLACE in your functions to translate first, last names, addresses and use a mapping table for cities.

    The actual names cannot be translated beyond the conversion of the spanish characters and the cities can be mapped.

  • I would be careful about trying to convert characters that exist in Spanish but not in English to English.

    In fact, logically it's impossible to do because La Niña is not the same as La Nina.  There simply isn't an English equivalent of ñ.

    It feels to me like this requirement has not been described correctly or the business is asking you for a translation program that will covert Spanish words to their English counterparts (for ex. La Niña becomes "The Girl").  If that's what they're asking for, then you'll have to build that type of program yourself (or select a 3rd party vendor that sells that type of software) because I do not believe that there is anything in SQL Server that will do this for you.

     

     

  • why not store the characters in spanish?

    use nvarchar as your data type

  • Rob,

    I also was wondering if this was a space issue, and if they were trying to regain space by using non-unicode data types after they 'converted' the Spanish characters to English.

    However, it's a common misunderstanding to believe that all foreign languages require unicode data types (i.e., nvarchar, ntext, etc.).  Although Microsoft's official recommendation is to always use unicode if you are supporting more than one language,  if you are really pushed for space then you can get away with using non-unicode data types for Latin based character languages in the West (e.g., English, Spanish, French, etc.) even if they have accents and/or letters that don't exist in English.  In general, it's the non-Latin based character languages (e.g., Chinese, Arabic, Cyrillic, etc.) that absolutely require unicode.

    In other words, if the international scope of his project is to support data from North America and Western Europe, then he can use varchar.  However, if the scope is to support data from North America and the Pacific Rim, then he should use nvarchar.

    There are ways to fool SQL Server into permitting non-Latin based characters to be stored in non-unicode data types, however you'll be flirting with disaster.

     

     

  • If the scope of this project is limited to spanish, there is no need to use nvarchar.

  • Max

  • Thanks for all your responses.

    It turned out to be we dont want any spanish characters stored in database, so convert only non-english spanish characters to its equivalent english ones.

    That would be those special a,e,i,o,u,n, etc (with tilde and other single quote on its top) to regular a,e,i,o,u,n

    Thanks!

     

     

  • How will the users feel about this decision?

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply