number field check

Question

Post reply

number field check

RightOnTarget

Hall of Fame

Points: 3041
More actions
January 3, 2013 at 1:19 pm

#272424

Hi all,
How do I find out if the field value is of a number that has the same digit.
For instance I need to find all rows where column_1 is all 0s or all 1s or all 9s as 00000, 111, 999999999999
Here is some value that are in the table right now:
000
0000000
00000000000
000001
0001000
00-0000000-01
Thanks,

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

GPO SSCarpal Tunnel Points: 4574 More actions · Answer 1

Does this do what you want?

SELECT my_column

FROM dbo.my_table myt

WHERE my_column not like '%[^0]%'

OR my_column not like '%[^1]%'

OR my_column not like '%[^9]%'

If not, can you post back an example of where it falls down?

Cheers

...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

sugat11 SSC Rookie Points: 32 More actions · Answer 2

You mean to say you want to count the occurence of that number right?

If so then below is the query for that:

SELECT (LEN(columnname) - LEN(REPLACE(columnname, FindSubString, '' ))) Cntrepeated

:-)I am a newbie to this site:-)

kl25 Hall of Fame Points: 3752 More actions · Answer 3

Here's one approach. You didn't provide ddl for your test data so I made some assumptions about data types. This approach is based on using the first character to define what "all the same" looks like. It does not account for punctuation so it may not be handling the last example correctly.

Create Table #test

(

num varchar(20)

)

;

Insert #test (num)

Values

('000'),

('0000000'),

('00000000000'),

('000001'),

('0001000'),

('00-0000000-01'),

('00-0000000-00')

;

Select

num,

Case When Replicate(LEFT(num,1),LEN(num)) = num Then 'all the same' Else 'different' End As compare

From

#test

;

Drop table #test;

RightOnTarget Hall of Fame Points: 3041 More actions · Answer 4

I know I didn't ask for it in the original post, but is there a way to account for any digit?

This work fine, just looking for a shortcut.

SELECT *

FROM tab

WHERE col not like '%[^0]%'

OR col not like '%[^1]%'

OR col not like '%[^2]%'

OR col not like '%[^3]%'

OR col not like '%[^4]%'

OR col not like '%[^5]%'

OR col not like '%[^6]%'

OR col not like '%[^7]%'

OR col not like '%[^8]%'

OR col not like '%[^9]%'

something like SELECT * FROM tab WHERE col not like '%[^0 - ^9]%'

Thanks,

kl25 Hall of Fame Points: 3752 More actions · Answer 5

If I understand your requirements, the approach I suggested will work for any number. I placed the comparison test in a CTE and then queried for the final results. You'd still have to parse out punctuation if punctuation should be ignored in the comparison. Using an expanded set of data:

Create Table #test

(

num varchar(20)

)

;

Insert #test (num)

Values

('000'),

('0000000'),

('00000000000'),

('000001'),

('0001000'),

('00-0000000-01'),

('00-0000000-00'),

('88848'),

('44444444'),

('62135444'),

('222222222'),

('55')

;

With same_check

As

(

Select

num,

Case When Replicate(LEFT(num,1),LEN(num)) = num Then 'same' Else 'different' End As compare

From

#test

)

Select num

From same_check

Where compare = 'same'

;

Drop table #test;

RightOnTarget Hall of Fame Points: 3041 More actions · Answer 6

Thanks you all for your replies. I have some options now.

GPO SSCarpal Tunnel Points: 4574 More actions · Answer 7

Be good to know what the underlying business need is. Maybe there's a solution we haven't given you because we don't have enough information about the underlying problem. What's the significance of '00000' or '111' or '999999999999999999999' or '9'? What do they mean, and why do you need to filter them out?

...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

Dwain Camps SSC Guru Points: 86948 More actions · Answer 8

Could you do something like this?

;WITH Strings (s) AS (

SELECT '000' UNION ALL SELECT '0000000' UNION ALL SELECT '00000000000'

UNION ALL SELECT '000001' UNION ALL SELECT '0001000' UNION ALL SELECT '00-0000000-01'

UNION ALL SELECT '00-0000000-00' UNION ALL SELECT '88848' UNION ALL SELECT '44444444'

UNION ALL SELECT '62135444' UNION ALL SELECT '222222222' UNION ALL SELECT '55')

SELECT s

FROM Strings

WHERE PATINDEX('%[^' + LEFT(s, 1) + ']%', s) = 0

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

RightOnTarget Hall of Fame Points: 3041 More actions · Answer 9

I was asked to clean up data. The column that I am looking at (nvarchar(20)) contains tax id values.

I needed to identify rows where tax id value contains same digits (as of all zeros, threes, nines ..... or others).

I really liked SSC-Enthusiastic's approach as it helps me to identify values of the same characters (....., $$$$$, 1111, ## and so on), not just numbers.

As it appears, the data that we've got has a lot of junk in it.

Dwain Camps SSC Guru Points: 86948 More actions · Answer 10

eugene.pipko (1/3/2013)
I was asked to clean up data. The column that I am looking at (nvarchar(20)) contains tax id values.
I needed to identify rows where tax id value contains same digits (as of all zeros, threes, nines ..... or others).
I really liked SSC-Enthusiastic's approach as it helps me to identify values of the same characters (....., $$$$$, 1111, ## and so on), not just numbers.
As it appears, the data that we've got has a lot of junk in it.

Hmmm... thought mine did that too.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

RightOnTarget Hall of Fame Points: 3041 More actions · Answer 11

Sorry Dwain for not mentioning you. I just didn't test yours yet.

Dwain Camps SSC Guru Points: 86948 More actions · Answer 12

eugene.pipko (1/3/2013)
Sorry Dwain for not mentioning you. I just didn't test yours yet.

No problem... I'm used to toiling in obscurity. :hehe:

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

kl25 Hall of Fame Points: 3752 More actions · Answer 13

In terms of finding rows for your cleanup, Dwain's approach and mine will give you the same result so it comes down to speed. How many rows are you looking at cleaning up? I didn't do any performance testing so I'm not sure which approach is faster, REPLICATE or PATINDEX. If it's only a few rows, it probably doesn't matter. If you're talking about millions of rows, performance testing might be helpful.

I haven't used PATINDEX before--thanks for the new thought Dwain! I suspect that in performance testing that will be the faster approach but testing would show for sure.

GPO SSCarpal Tunnel Points: 4574 More actions · Answer 14

I wonder whether your friendly tax office (yah right!) might be able to supply you with some business rules around validating "tax ids". For instance can it be only X characters wide? Do any of the digits have intrinsic meaning? Some of the government identifiers I've dealt with in the past have had all sorts of little easter eggs embedded in them that help you weed out the junk. Some have had a particular "check digit" that is based on the values of the other digits. In some the first digit could only be a number between 1 and 6... and so on. Just thinking that if your ultimate aim is to identify junk, there are going to be lots of other type of junk than just consecutive digits.

...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell