arabic digits stored in varchar column with Arabic_* collation

  • Hello,

    2021-10-22_10-04-41

    2021-10-22_09-50-24

    FYI Database collation is Arabic_100_CS_AS

    Why do Arabic characters in a varchar column (collated as Arabic_100_CS_AS) get handled correctly (inserted and displayed) while digits get transformed into their Latin equivalent? We know that NVARCHAR is able to handle this correctly but we still want to know what goes wrong with our varchar column in that case.

    Thank you,

    Mark

    • This topic was modified 3 years, 2 months ago by  Mark T..
  • Removed

    • This reply was modified 3 years, 2 months ago by  Phil Parkin.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Screen shot of SQL code :

    2021-10-22_10-04-41

  • Let me start out by saying I'm not 100% sure of everything that's going on here. I'm honestly a little bit surprised that the non-digit characters get preserved in the VARCHAR field. But anyway...

    NVARCHAR uses 2 bytes to store a character, and VARCHAR uses 1 byte to store a character. But I'm not that familiar with the code mappings and how things show up based on computer settings and collation.

    The real reason I bring this up though, is because I do know that when SQL Server takes NVARCHAR characters and tries to store it in a VARCHAR field, sometimes it does a mapping or a translation to get an "equivalent" character. This is important, because there's a clever hack for SQL injection.

    There are some programmers who will do string concatenations, and try to prevent SQL injection attacks by disallowing certain characters in their input, like a single quote. However, there's an NVARCHAR character (that I can no longer remember what it is) that SQL Server will convert to a CHAR single quote. So you can type in this character, SQL Server converts to to a single quote, and that opens up the SQL injection that they were trying to prevent.

    Given that scenario, I wouldn't be surprised if there are other equivalency translations that SQL Server does, and these Arabic digits are going through the same process.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply