Remove non alphabetic characters but retain (space, single quote and dash)

  • Hi,

    This isn't my function but I've been playing around with it so it does what the title above says!

    Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))

    Returns VarChar(1000)

    AS

    Begin

    While PatIndex('%[^a-z - '']%', @Temp) > 0

    Set @Temp = Stuff(@Temp, PatIndex('%[^a-z - '']%', @Temp), 1, '')

    Return @Temp

    End

    go

    Tried it with the following:

    select [dbo].[RemoveNonAlphaCharacters](' O''Neil ??/ OMG!!!!! %$See<> No,,,,, See.... see-his ')

    But this gives back: O'Neil OMG See No See seehis

    I've retained the space and the single quote but I can't keep the dash!

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • Abu Dina (8/10/2012)


    Hi,

    This isn't my function but I've been playing around with it so it does what the title above says!

    Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))

    Returns VarChar(1000)

    AS

    Begin

    While PatIndex('%[^a-z - '']%', @Temp) > 0

    Set @Temp = Stuff(@Temp, PatIndex('%[^a-z - '']%', @Temp), 1, '')

    Return @Temp

    End

    go

    Tried it with the following:

    select [dbo].[RemoveNonAlphaCharacters](' O''Neil ??/ OMG!!!!! %$See<> No,,,,, See.... see-his ')

    But this gives back: O'Neil OMG See No See seehis

    I've retained the space and the single quote but I can't keep the dash!

    What's your question?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • I don't want the function to remove spaces, single quotes and dashes but I still want it to remove non characters like [, {, #

    It's almost working except for the dashes. It's still removing them.

    :blush:

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media lens[/url]

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • here's one way;

    the nice thing about this technique, is it makes it very easy to change/adapt to include/exclude other special characters...at least for me it's obvious, that is.

    additionally, depending on your collation, A is the same as Â; your original function would miss that if it needed to be excluded.

    going after the ascii code eliminates that hiccup.

    /*--results:

    O'Neil -OMG- See No

    */

    select [dbo].RemoveNonAlphaCharacters(' O''Neil ??/ -OMG-!!!!! %$See<> No,,,,, ')

    ALTER FUNCTION RemoveNonAlphaCharacters(@OriginalText VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    DECLARE @CleanedText VARCHAR(8000)

    ;WITH E01(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1), -- 10 or 10E01 rows

    E02(N) AS (SELECT 1 FROM E01 a, E01 b), -- 100 or 10E02 rows

    E04(N) AS (SELECT 1 FROM E02 a, E02 b), -- 10,000 or 10E04 rows

    Tally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E04)

    SELECT @CleanedText = ISNULL(@CleanedText,'') +

    CASE

    --ascii numbers are 48(for '0') thru 57 (for '9')

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 48 AND 57

    --THEN SUBSTRING(@OriginalText,Tally.N,1)

    THEN ''

    --ascii upper case letters A-Z is 65 thru 90

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 65 AND 90

    THEN SUBSTRING(@OriginalText,Tally.N,1)

    --ascii lower case letters a-z is 97 thru 122

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 97 AND 122

    THEN SUBSTRING(@OriginalText,Tally.N,1)

    --some exceptions: space(32),singlequote(39),dashes(45)

    WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1))IN(32,45,39)

    THEN SUBSTRING(@OriginalText,Tally.N,1)

    ELSE '' END

    FROM tally WHERE Tally.N <= LEN(@OriginalText)

    RETURN @CleanedText

    END

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Use this:

    '%[^a-z ''^-]%'

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply