charindex failure - SQL Server 2008 bug?

  • I am parsing the contents of a text field that looks like this:

    ####:<lname, fname>:####

    So I am using charindex(':', text_value) to find the colon in order to separate the numbers from the name. It works fine, with one exception. One of the names starts with a special character (Unicode decimal value 700). When the colon is followed by that character, SQL server simply can't see it.

    Here is some sample data for the id (int) and text_value (ntext) fields:

    753456:Jones, Mark:976521

    427654:Smith, John:23456

    98876:ʼÐiên, Biêx:34567

    To demonstrate the problem, this SQL query returns the contents of the text_value field, the field length, and the positions of the first and second colon in the field.

    select

    id

    , text_value

    , datalength(text_value) FieldLength

    , charindex(':', text_value) Colon1Pos

    , charindex(':', text_value, charindex(':', text_value) + 1) Colon2Pos

    from metadatavalue mv where id in (75, 42, 98)

    The query returns:

    idtext_valueFieldLengthColon1PosColon2Pos

    753456:Jones, Mark:97652146517

    427654:Smith, John:2345644517

    98876:ʼÐiên, Biêx:3456742160

    Note that for the third item, SQL cannot see the first colon, so it returns the postion of the second colon first and returns 0 for the postion of the second colon. Removing the diacritic (Unicode decimal value 700) from the beginning of the name demonstrates that that is what causes the problem. Is this a known problem? Is there a workaround for using charindex in this situation? (I have already solved the problem a different way using ISNUMERIC and a bunch of CASE statements, but that is tedious and I'd like the code to be more efficient.)

  • I do not know if this will work for you but if you change the datatype of your text_value column in the query to non unicode the correct values are returned.

    create table #tmp(id int, text_value ntext)

    insert into #tmp

    values(75,N'3456:Jones, Mark:976521'),

    (42,N'7654:Smith, John:23456'),

    (98,N'876:'Ðiên, Biêx:34567')

    select

    id

    , text_value

    , datalength(text_value) FieldLength

    , charindex(':', cast(text_value as varchar(50))) Colon1Pos

    , charindex(':', cast(text_value as varchar(50)), charindex(':', cast(text_value as varchar(50))) + 1) Colon2Pos

    from #tmp mv where id in (75, 42, 98)

    drop table #tmp

    idtext_value FieldLengthColon1PosColon2Pos

    753456:Jones, Mark:97652146 5 17

    427654:Smith, John:2345644 5 17

    98876:'Ðiên, Biêx:3456742 4 16

    Bob
    -----------------------------------------------------------------------------
    How to post to get the best help[/url]

  • Very Nice! Thanks!

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply