number field check

  • Hi all,

    How do I find out if the field value is of a number that has the same digit.

    For instance I need to find all rows where column_1 is all 0s or all 1s or all 9s as 00000, 111, 999999999999

    Here is some value that are in the table right now:

    000

    0000000

    00000000000

    000001

    0001000

    00-0000000-01

    Thanks,

  • Does this do what you want?

    SELECT my_column

    FROM dbo.my_table myt

    WHERE my_column not like '%[^0]%'

    OR my_column not like '%[^1]%'

    OR my_column not like '%[^9]%'

    If not, can you post back an example of where it falls down?

    Cheers

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • You mean to say you want to count the occurence of that number right?

    If so then below is the query for that:

    SELECT (LEN(columnname) - LEN(REPLACE(columnname, FindSubString, '' ))) Cntrepeated

    :-)I am a newbie to this site:-)

  • Here's one approach. You didn't provide ddl for your test data so I made some assumptions about data types. This approach is based on using the first character to define what "all the same" looks like. It does not account for punctuation so it may not be handling the last example correctly.

    Create Table #test

    (

    num varchar(20)

    )

    ;

    Insert #test (num)

    Values

    ('000'),

    ('0000000'),

    ('00000000000'),

    ('000001'),

    ('0001000'),

    ('00-0000000-01'),

    ('00-0000000-00')

    ;

    Select

    num,

    Case When Replicate(LEFT(num,1),LEN(num)) = num Then 'all the same' Else 'different' End As compare

    From

    #test

    ;

    Drop table #test;

  • I know I didn't ask for it in the original post, but is there a way to account for any digit?

    This work fine, just looking for a shortcut.

    SELECT *

    FROM tab

    WHERE col not like '%[^0]%'

    OR col not like '%[^1]%'

    OR col not like '%[^2]%'

    OR col not like '%[^3]%'

    OR col not like '%[^4]%'

    OR col not like '%[^5]%'

    OR col not like '%[^6]%'

    OR col not like '%[^7]%'

    OR col not like '%[^8]%'

    OR col not like '%[^9]%'

    something like SELECT * FROM tab WHERE col not like '%[^0 - ^9]%'

    Thanks,

  • If I understand your requirements, the approach I suggested will work for any number. I placed the comparison test in a CTE and then queried for the final results. You'd still have to parse out punctuation if punctuation should be ignored in the comparison. Using an expanded set of data:

    Create Table #test

    (

    num varchar(20)

    )

    ;

    Insert #test (num)

    Values

    ('000'),

    ('0000000'),

    ('00000000000'),

    ('000001'),

    ('0001000'),

    ('00-0000000-01'),

    ('00-0000000-00'),

    ('88848'),

    ('44444444'),

    ('62135444'),

    ('222222222'),

    ('55')

    ;

    With same_check

    As

    (

    Select

    num,

    Case When Replicate(LEFT(num,1),LEN(num)) = num Then 'same' Else 'different' End As compare

    From

    #test

    )

    Select num

    From same_check

    Where compare = 'same'

    ;

    Drop table #test;

  • Thanks you all for your replies. I have some options now.

  • Be good to know what the underlying business need is. Maybe there's a solution we haven't given you because we don't have enough information about the underlying problem. What's the significance of '00000' or '111' or '999999999999999999999' or '9'? What do they mean, and why do you need to filter them out?

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • Could you do something like this?

    ;WITH Strings (s) AS (

    SELECT '000' UNION ALL SELECT '0000000' UNION ALL SELECT '00000000000'

    UNION ALL SELECT '000001' UNION ALL SELECT '0001000' UNION ALL SELECT '00-0000000-01'

    UNION ALL SELECT '00-0000000-00' UNION ALL SELECT '88848' UNION ALL SELECT '44444444'

    UNION ALL SELECT '62135444' UNION ALL SELECT '222222222' UNION ALL SELECT '55')

    SELECT s

    FROM Strings

    WHERE PATINDEX('%[^' + LEFT(s, 1) + ']%', s) = 0


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I was asked to clean up data. The column that I am looking at (nvarchar(20)) contains tax id values.

    I needed to identify rows where tax id value contains same digits (as of all zeros, threes, nines ..... or others).

    I really liked SSC-Enthusiastic's approach as it helps me to identify values of the same characters (....., $$$$$, 1111, ## and so on), not just numbers.

    As it appears, the data that we've got has a lot of junk in it.

  • eugene.pipko (1/3/2013)


    I was asked to clean up data. The column that I am looking at (nvarchar(20)) contains tax id values.

    I needed to identify rows where tax id value contains same digits (as of all zeros, threes, nines ..... or others).

    I really liked SSC-Enthusiastic's approach as it helps me to identify values of the same characters (....., $$$$$, 1111, ## and so on), not just numbers.

    As it appears, the data that we've got has a lot of junk in it.

    Hmmm... thought mine did that too.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Sorry Dwain for not mentioning you. I just didn't test yours yet.

  • eugene.pipko (1/3/2013)


    Sorry Dwain for not mentioning you. I just didn't test yours yet.

    No problem... I'm used to toiling in obscurity. :hehe:


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • In terms of finding rows for your cleanup, Dwain's approach and mine will give you the same result so it comes down to speed. How many rows are you looking at cleaning up? I didn't do any performance testing so I'm not sure which approach is faster, REPLICATE or PATINDEX. If it's only a few rows, it probably doesn't matter. If you're talking about millions of rows, performance testing might be helpful.

    I haven't used PATINDEX before--thanks for the new thought Dwain! I suspect that in performance testing that will be the faster approach but testing would show for sure.

  • I wonder whether your friendly tax office (yah right!) might be able to supply you with some business rules around validating "tax ids". For instance can it be only X characters wide? Do any of the digits have intrinsic meaning? Some of the government identifiers I've dealt with in the past have had all sorts of little easter eggs embedded in them that help you weed out the junk. Some have had a particular "check digit" that is based on the values of the other digits. In some the first digit could only be a number between 1 and 6... and so on. Just thinking that if your ultimate aim is to identify junk, there are going to be lots of other type of junk than just consecutive digits.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply