Ignore ANSI Characters in data comparison

  • I am comparing two fields one from our legacy table and one in our new table structure that should have identical text data. The new field has an assortment of ANSI characters where the legacy data did not have these. Is there anything I can do that will ignore all ansi character differences? The only route I can think of is just do a replace on each ANSI type on the new column but there are quite a few character types and I was looking for a more elegant solution.

  • I assume you mean non-printable characters like Carriage Return or Tab?

  • Yes you are correct.

  • Something like this could work:

    DECLARE @string VARCHAR(1000) = 'this is a string with some non-printable characters at the end '

    /* 67 characters including spaces at the end */ ;

    SELECT @string, DATALENGTH(@string);

    /* add 200 characters of non-printable characters */

    SET @string = @string + REPLICATE(CHAR(10), 100) + REPLICATE(CHAR(7), 100);

    SELECT @string, DATALENGTH(@string);

    WITH L0

    AS (

    SELECT

    1 AS c

    UNION ALL

    SELECT

    1

    ),

    L1

    AS (

    SELECT

    1 AS c

    FROM

    L0 AS A

    CROSS JOIN L0 AS B

    ),

    L2

    AS (

    SELECT

    1 AS c

    FROM

    L1 AS A

    CROSS JOIN L1 AS B

    ),

    L3

    AS (

    SELECT

    1 AS c

    FROM

    L2 AS A

    CROSS JOIN L2 AS B

    ),

    L4

    AS (

    SELECT

    1 AS c

    FROM

    L3 AS A

    CROSS JOIN L3 AS B

    ),

    L5

    AS (

    SELECT

    1 AS c

    FROM

    L4 AS A

    CROSS JOIN L4 AS B

    ),

    Nums

    AS (

    SELECT

    ROW_NUMBER() OVER (ORDER BY (

    SELECT

    NULL

    )) AS n

    FROM

    L5

    )

    /* The RTRIM removes trailing spaces. The REPLACE replaces all

    non-printable ASCII characters (0-31) */

    SELECT

    @string = RTRIM(REPLACE(@string, CHAR(n-1), ''))

    FROM

    Nums

    WHERE

    n < 32

    SELECT

    @string,

    DATALENGTH(@string);

    The problem right now is that the REPLACE function replaces the characters found anywhere in the text. If you want to just remove the ending characters you'd have to adapt the code to find the first one and use substring to just remove the characters from that point to the end.

  • This worked perfectly! Thanks so much for your help.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply