Verifying Data against a list of symbol characters

  • I have a database that is fed from a secondary DB. AT times the secondary database allows users to enter symbols in the name field. I am trying to isolate or list the names that have other than standard name symbols - ' examples, and letters.

  • You might want to look into the wildcards used in a LIKE query.

    For details please see BOL, section "LIKE", examples.

    Another option would be to check if the string contains an ASCII() value outside a given range.

    It depends what exactly you're looking for...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • If you do not have a listing of the ASCII character sets go to:

    http://msdn.microsoft.com/en-us/library/4z4t9ed1.aspx

    with these you can compose a T-SQL statement using either the CHAR and/or ASCI functions to test each character in the name to identify those that you wish to replace.

    If you would post some sampke data (not actual names) some one would be able to assist you in composing the necessary T-SQL statement.

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Most commonly I will get data like this

    LastName

    SmithJohn

  • I usually get data like this in error

    LastName FirstName

    Smith John

    S#ith Jane

    \ Jim

    Anderson Ken

    Smith-Barny Andre'

    What I am trying to get is comes way to find those records that have invalid symbols. As the last record denotes a name that has valid characters - and ' while lines 2 and 4 have invalid characters # and \ , the other lines having standard names.

  • Robert.Weeden (5/18/2010)


    Most commonly I will get data like this

    LastName

    SmithJohn

    ???

    Seems like the page formatting eat up all your "special" character... Maybe you could provide a (small) screen shot as an image?



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Actually I had an overly "helpful" coworker attack my cube as I was typing... 🙂

  • Would something like the following help?

    SELECT * FROM

    (

    SELECT 'Smith John' AS lastname UNION ALL

    SELECT 'S#ith Jane' UNION ALL

    SELECT '\ Jim' UNION ALL

    SELECT 'Anderson Ken' UNION ALL

    SELECT 'Smith-Barny Andre'''

    ) a

    WHERE lastname LIKE '%[^a-Z /-'']%' ESCAPE '/'

    It will check for rows not containing characters from a to Z, blanks, - and '.

    Please note that I had to use the ESCAPE keyword to allow - since it's usually considered a special character as shown in the a-Z syntax.... Also, I had to use a double ' to mark it as a specific character and not as the end of the character expression itself. Looks weird, but it seems to work... 😉

    If you need more character to be excluded you need to expand the LIKE statement.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • The root problem I am having is trying to find any instance of the incorrect symbols amongst a myriad of combinations. I need to be able to find any time that the LastName field contains ANY invalid Characters. The only ones we currently accept are A-Z, a-z, - , and ' OR 27,65-90,95,97-122 Dec.

  • Robert.Weeden (5/18/2010)


    The root problem I am having is trying to find any instance of the incorrect symbols amongst a myriad of combinations. I need to be able to find any time that the LastName field contains ANY invalid Characters. The only ones we currently accept are A-Z, a-z, - , and ' OR 27,65-90,95,97-122 Dec.

    Actually, that's what the code from above does. It will only allow the values stored in the LIKE condition. The ^ sign at the beginning of the wildcard search makes the search list a "white list" instead of a "black list", so those characters are allowed and all others will cause the value to be displayed.

    Actually, I had a reverse phrase in my prev. post:

    If you need more character to be included you need to expand the LIKE statement (= "white list").



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Yep and it is just what I was looking for. Thank you very much,

    Bob 🙂

  • Glad I could help 🙂



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • lmu92 (5/18/2010)


    Would something like the following help?

    SELECT * FROM

    (

    SELECT 'Smith John' AS lastname UNION ALL

    SELECT 'S#ith Jane' UNION ALL

    SELECT '\ Jim' UNION ALL

    SELECT 'Anderson Ken' UNION ALL

    SELECT 'Smith-Barny Andre'''

    ) a

    WHERE lastname LIKE '%[^a-Z /-'']%' ESCAPE '/'

    It will check for rows not containing characters from a to Z, blanks, - and '.

    Please note that I had to use the ESCAPE keyword to allow - since it's usually considered a special character as shown in the a-Z syntax.... Also, I had to use a double ' to mark it as a specific character and not as the end of the character expression itself. Looks weird, but it seems to work... 😉

    If you need more character to be excluded you need to expand the LIKE statement.

    Nicely done. Here's an additional hint. Put the dash right after the ^ and you won't need to escape it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/18/2010)


    lmu92 (5/18/2010)


    Would something like the following help?

    SELECT * FROM

    (

    SELECT 'Smith John' AS lastname UNION ALL

    SELECT 'S#ith Jane' UNION ALL

    SELECT '\ Jim' UNION ALL

    SELECT 'Anderson Ken' UNION ALL

    SELECT 'Smith-Barny Andre'''

    ) a

    WHERE lastname LIKE '%[^a-Z /-'']%' ESCAPE '/'

    It will check for rows not containing characters from a to Z, blanks, - and '.

    Please note that I had to use the ESCAPE keyword to allow - since it's usually considered a special character as shown in the a-Z syntax.... Also, I had to use a double ' to mark it as a specific character and not as the end of the character expression itself. Looks weird, but it seems to work... 😉

    If you need more character to be excluded you need to expand the LIKE statement.

    Nicely done. Here's an additional hint. Put the dash right after the ^ and you won't need to escape it.

    Good point!

    (Why does it remind me of "The Clever Little Tailor" story, where the tailor seems to be able to squeeze water from a stone based on just pure knowlege/wisdom? 😀 )



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply