Finding Characters Patterns

  • I'm trying to search for names which include latin characters. I therefore thought the following would do the job:

    select count(*) from Proposal where Surname like '%['+char(127)+'-'+char(255)+']%'

    Which works a treat when the char values are less than 127, but not in this case.

    Is there a way to make this work, or is there some other way to find out this information.

    Thanks,

  • what is the datatype of  the "Surname" column ?

     


    * Noel

  • I need to be able to do this on varchar, char and text columns.

  • Sorry to keep asking but: what collation are you using ?

     


    * Noel

  • Noeld will probabely have a better solution but this works (even if it's a little slow) :

    Select distinct name from dbo.SysObjects O inner join dbo.Numbers N on len(name) <= PkNumber and ascii(substring(name, PkNumber, 1)) between 127 and 255

    --I use this table for many other string operations as well

    CREATE TABLE [Numbers] (

    [PkNumber] [int] IDENTITY (1, 1) NOT NULL ,

    CONSTRAINT [Pk_Number] PRIMARY KEY CLUSTERED

    (

    [PkNumber]

    ) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    Declare @i as int

    set @i = 0

    while @i < 8000

    begin

    Insert into dbo.Numbers Default values

    set @i = @i + 1

    end

    GO

  • Latin1_General_CI_AS - thanks

  • I tried the following on an NTEXT column

    SET NOCOUNT ON

    DECLARE @sPattern VARCHAR(50), @lValue TinyInt ,@lPos Int

    SET @lValue=127

    WHILE @lValue0

    SELECT @lValue , @lPos, char(@lValue)

    SET @lValue=@lvalue+1

    END

    WHEN I did SELECT SUBSTRING(Trace5,,1) I got a completely different character than I was expecting. When I did a PATINDEX on the single character it confirmed that the character had made the pattern match.

    Weird

  • Remi's Solution works for me -- not for text columns though

    DECLARE @Proposal TABLE

    (

    Surname VARCHAR(100)

    )

    INSERT @Proposal VALUES ('ABCD')

    INSERT @Proposal VALUES ('ABCD' + CHAR(128) + 'EFGH')

    INSERT @Proposal VALUES ('ABCD' + CHAR(200) + 'EFGH')

    INSERT @Proposal VALUES ('ABCD' + CHAR(240) + 'EFGH')

    INSERT @Proposal VALUES ('ABCDEFGH')

    INSERT @Proposal VALUES ('IJKLMNOP' + CHAR(240))

    INSERT @Proposal VALUES (CHAR(199) + 'QRST')

    --SELECT * FROM @Proposal

    DECLARE @ASCII TABLE (

     [PkNumber] [int] IDENTITY (1, 1) NOT NULL)

    DECLARE @i as int

    SET @i = 0

    WHILE @i < 100 -- Number should be the lenth of datatype of column on question

    BEGIN

     INSERT INTO @ASCII DEFAULT VALUES

    SET @i = @i + 1

    END

    --SELECT * FROM @ASCII

    SELECT DISTINCT Surname

    FROM

     @Proposal

    INNER JOIN @ASCII 

     ON PkNumber <= LEN(Surname)   and

     ASCII(SUBSTRING(Surname, PkNumber, 1)) between 127 and 255

    SELECT COUNT(DISTINCT Surname )

    FROM

     @Proposal

    INNER JOIN @ASCII 

     ON PkNumber <= LEN(Surname)   and

     ASCII(SUBSTRING(Surname, PkNumber, 1)) between 127 and 255

    Regards,
    gova

  • Sorry for the delay I had to do some work

    The reason for the missmatch you were getting is that AS collations consider accented caracters like "é" the same as "e" and some of the same happen for greek symbols, to avoid confusion is safer to use a binary comparison using either Remis approach or the Following :

     

    declare @data varchar(200)

    select @data = '%[' + char(127) +'-'+ char(255) + ']%' 

     

    select data from myTable where data collate Latin1_General_BIN

     like @data collate Latin1_General_BIN

     

    hth

     


    * Noel

  • DOH! Of course that collation works that way...I even pointed it out to someone the other day

    Nice, very clean solution. I knew there had to be a way.

    Thanks for all the help.

  • Hey noeld, do you have an short exemple that makes that select work?

  • Remi,

    Sorry for the delay -- I have to get some work done , you know

     A quick example is :

    create table myTable (data varchar(20))

    insert into myTable ( data ) values ( '5ñ')

    insert into myTable ( data ) values ( '5n')

    Now, Run the following queries:

    select * from myTable where data Like '%n' collate Latin1_General_CI_AS

    select * from myTable where data Like '%n' collate Latin1_General_CI_AI

     

    Enjoy!

    hth


    * Noel

  • Oh, and if you need to double check that my select works try:

    create table myTable (data varchar(20))

    insert into myTable ( data ) values ( 'normal' )

    insert into myTable ( data ) values ( '1'+ char(128) + '1' )

    insert into myTable ( data ) values ( '2'+ char(158) + '2' )

    insert into myTable ( data ) values ( '3'+ char(168) + '3' )

    insert into myTable ( data ) values ( '4'+ char(200) + '4' )

    declare @data varchar(200)

    select @data = '%[' + char(127)+'-'+ char(255) + ']%'

    -- You can change the first char(127) to char(160) and verify that the

    -- number of rows returned vary as expected ( only 2 rows)

    select data from myTable where data collate Latin1_General_BIN

     like @data collate Latin1_General_BIN

     

    hth


    * Noel

  • wow.. thanx for the trick, really gonna be usefull one of these days.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply