Strange comparison results ³ seems to equal 3

  • I'm hoping someone can explain this bit of weirdness.

    I have a function that parses a string and keeps only a set of listed characters.

    The strange thing is that comparison of ³ WITH 3 shows true.

    Therefore if I have a string like 'HTH123³Def' I want it to return only the numbers, for this example 123. It returns 123³ which won't then convert to an integer.

    If I run this simple statement the result really surprised me:

    SELECT case when '³' = '3' then 'yep' else 'no' end

    Any guidance to help my sanity appreciated 🙂

    ---------------------------------------------
    If data can screw you, it will; never assume the data are correct.

  • Best bet is probably to use a CLR, but you could use the ASCII function.

    e.g.

    DECLARE @input VARCHAR(100), @output VARCHAR(100)

    SET @input = 'HTH123³Def'

    --Tally table would be better, for testing purposes I've included one here "on the fly"

    ;WITH t1 AS (SELECT 1 N UNION ALL SELECT 1 N),

    t2 AS (SELECT 1 N FROM t1 x, t1 y),

    t3 AS (SELECT 1 N FROM t2 x, t2 y),

    t4 AS (SELECT 1 N FROM t3 x, t3 y),

    tally AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS num

    FROM t4 x, t4 y)

    SELECT @output = COALESCE(@output, '') + CHAR(ASCII(SUBSTRING(@input, num, 1)))

    FROM tally

    WHERE num <= LEN(@input) AND

    (ASCII(SUBSTRING(@input, num, 1)) >= 48 AND

    ASCII(SUBSTRING(@input, num, 1)) <= 57)

    SELECT @output


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Its down to collation , try this

    create table #x(c1 char,c2 char)

    insert into #x values('³','3')

    select * from #x where c1= c2

    select * from #x where c1= c2

    COLLATE Latin1_General_100_BIN



    Clear Sky SQL
    My Blog[/url]

  • Thanks for the answers, the collation sounds good.

    Is there a reason why Latin collation show both of these value as equal or is it 'just because'? 😉

    ---------------------------------------------
    If data can screw you, it will; never assume the data are correct.

  • Just because.

    It appears to be seeing it as a lower case 3.

    Its probably documented *somewhere*.

    select * from #x where c1= c2 COLLATE Latin1_General_CI_AI

    select * from #x where c1= c2 COLLATE Latin1_General_CI_AS

    select * from #x where c1= c2 COLLATE Latin1_General_CS_AS

    select * from #x where c1= c2 COLLATE Latin1_General_CS_AI



    Clear Sky SQL
    My Blog[/url]

  • Some more testing.

    SELECT case when '²' COLLATE SQL_Latin1_General_CP1_CS_AS = '2' COLLATE SQL_Latin1_General_CP1_CS_AS then 'yep' else 'no' end

    SELECT case when '²' COLLATE SQL_Latin1_General_CP1_CI_AS = '2' COLLATE SQL_Latin1_General_CP1_CI_AS then 'yep' else 'no' end

    SELECT case when '²' COLLATE Latin1_General_CS_AS = '2' COLLATE Latin1_General_CS_AS then 'yep' else 'no' end

    SELECT case when '²' COLLATE Latin1_General_CI_AS = '2' COLLATE Latin1_General_CI_AS then 'yep' else 'no' end

    Doesnt affect the CP1 codepage whether case sensitive or insensitive but does affect the last query that is case-insensitive.

    ---------------------------------------------
    If data can screw you, it will; never assume the data are correct.

  • pretty interesting; for one specific collation, i ran this query and got a lot of interesting results...A=À with an accent, for exmaple. I'd seen this before, good to be aware of again.

    /*

    N1 TheChar N2 TheMatch MatchResults

    49 1 185 ¹ Match

    50 2 178 ² Match

    51 3 179 ³ Match

    65 A 97 a Match

    65 A 170 ª Match

    65 A 192 À Match

    65 A 193 Á Match

    65 A 194 Â Match

    65 A 195 Ã Match

    65 A 196 Ä Match

    65 A 197 Å Match

    */

    WITH myCTE

    AS (SELECT

    TOP 255

    ROW_NUMBER() OVER (ORDER BY name) AS N

    FROM

    sys.columns)

    SELECT

    T1.N As N1,

    CHAR(t1.N) AS TheChar,

    T2.N As N2,

    CHAR(T2.N) AS TheMatch,

    CASE WHEN CHAR(t1.N) COLLATE Latin1_General_CI_AI

    = CHAR(T2.N) COLLATE Latin1_General_CI_AI

    THEN 'Match' Else '' END As MatchResults

    FROM

    myCTE T1

    CROSS JOIN myCTE T2

    WHERE CASE WHEN CHAR(t1.N) COLLATE Latin1_General_CI_AI

    = CHAR(T2.N) COLLATE Latin1_General_CI_AI

    THEN 'Match' Else '' END = 'Match'

    AND T1.N <> T2.N

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply