Data Extract

Question

Data Extract

ishaan99

SSCertifiable

Points: 6261
More actions
March 26, 2012 at 8:18 am

#253718

I have a table witha column as clientid. the client id datais as follows
'55566',
'00055566'
'abcd'
'ijkl'
'1234'
'8890'
'05566'
'008890'
'8765'
'9021'
IN the above list 55566 and 8890 is a client which gets repeated a few times with superceding 0's.is there a way we extract ony thoose clientids and see if these exists as substring data in another record.if there a straight sql which could give me the results set as
'55566'
'00055566'
'05566'
'8890'
'008890'
ANy help on this wil lbe greatly appreciated. TIA

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323518 More actions · Answer 1

where the sample values 'abcd' and 'ijkl' there to show that the clientid is alphanumeric, or are all the id's really numeric?

if they are numeric, you can convert them to INT, for example.

converting to int will strip preceeding zeros...

WHERE CONVERT(int,client_id) IN(55556,8890)

if they are numeric and have to be varchars, recovert again to varchar:

WHERE CONVERT(varchar,CONVERT(int,client_id)) IN('55556','8890')

if the values are sometimes alphanumeric, you can use a case statement:

WHERE

CASE

WHEN ISNUMERIC(client_id) = 1

THEN CONVERT(varchar,CONVERT(int,client_id))

ELSE client_id

END IN('55556','8890')

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

GSquared SSC Guru Points: 260824 More actions · Answer 2

I'm not entirely clear on what you want here. Do you want the string-literal, or do you want the numeric equivalent so you can tie them together?

If the numeric equivalent, then insert into a temp table using "Where PatIndex '%[^0-9]%' = 0" (that will get you the ones that are strictly numeric), and the convert the values in the temp table to integers. That will give you the ones that can be converted to integers, and you can then pull equivalances from there.

Something like:

create table #T (

RawVal varchar(100),

NumVal int);

insert into #T (RawVal)

select MyColumn

from dbo.MyTable

where PatIndex '%[^0-9]%' = 0;

update #T

set NumVal = RawVal;

select *

from #T as T1

inner join #T as T2

on T1.RawVal != T2.RawVal

and T1.NumVal = T2.NumVale;

That will give you the original values that would be equivalent if they were integers, like "8890" and "0008890". You can join that back to the original table to get rows that have "the same value" numerically but different values as strings.

Is that what you need? Or am I misreading something?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Dwain Camps SSC Guru Points: 86978 More actions · Answer 3

You can try something like this and but it will also match ABC in 0ABC (if this is what you want):

DECLARE @clients TABLE (clientno VARCHAR(20))

INSERT INTO @clients

SELECT '55566' As clientno

UNION ALL SELECT '00055566'

UNION ALL SELECT 'abcd'

UNION ALL SELECT 'ijkl'

UNION ALL SELECT '1234'

UNION ALL SELECT '8890'

UNION ALL SELECT '05566'

UNION ALL SELECT '008890'

UNION ALL SELECT '8765'

UNION ALL SELECT '9021'

UNION ALL SELECT '0ABC'

UNION ALL SELECT 'ABC'

SELECT DISTINCT x.clientno

FROM @clients c

CROSS APPLY (

SELECT clientno, REPLACE(LTRIM(REPLACE(clientno, '0', ' ')),' ', '0') as clientno1

FROM @clients) x

WHERE CHARINDEX(x.clientno1, c.clientno) > 0 and x.clientno <> c.clientno

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

GSquared SSC Guru Points: 260824 More actions · Answer 4

IsNumeric won't necessarily do the job.

Try this to see why:

SELECT ISNUMERIC('1d3');

SELECT CAST('1d3' AS INT);

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

ishaan99 SSCertifiable Points: 6261 More actions · Answer 5

Sorry i should have mentioned it before . its a varchar column which can have client ids from [0-9] or [a-z].

drew.allen SSC Guru Points: 77004 More actions · Answer 6

dwain.c (3/26/2012)
You can try something like this and but it will also match ABC in 0ABC (if this is what you want):
DECLARE @clients TABLE (clientno VARCHAR(20))
INSERT INTO @clients
SELECT '55566' As clientno
UNION ALL SELECT '00055566'
UNION ALL SELECT 'abcd'
UNION ALL SELECT 'ijkl'
UNION ALL SELECT '1234'
UNION ALL SELECT '8890'
UNION ALL SELECT '05566'
UNION ALL SELECT '008890'
UNION ALL SELECT '8765'
UNION ALL SELECT '9021'
UNION ALL SELECT '0ABC'
UNION ALL SELECT 'ABC'
SELECT DISTINCT x.clientno
FROM @clients c
CROSS APPLY (
SELECT clientno, REPLACE(LTRIM(REPLACE(clientno, '0', ' ')),' ', '0') as clientno1
FROM @clients) x
WHERE CHARINDEX(x.clientno1, c.clientno) > 0 and x.clientno <> c.clientno

Despite what your signature says, this has hidden RBAR. If you take out the DISTINCT, you'll see that your table with 12 records produces 144 rows. You're essentially doing a cross join to itself and producing the square of the number of rows. Needless to say, this will bog down a server very quickly. This can be easily fixed by removing the FROM clause from inside the CROSS APPLY.

While your formula works, I think the following formula more closely matches the description of the problem (and is therefore easier to follow) and I was unable to find any performance difference between the two approaches. (Of course, I didn't test on Jeff's million row table, either.)

SELECT c.clientno, x.ClientNo1

FROM @clients c

CROSS APPLY (

SELECT SUBSTRING(c.ClientNo, PATINDEX('%[^0]%', c.ClientNo), Len(c.ClientNo)) AS ClientNo1

) AS x

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Dwain Camps SSC Guru Points: 86978 More actions · Answer 7

Drew - Yes I am aware of hidden RBAR gotchas and if I can I usually try to remove them. In this case, I didn't see that way but your post made me see the light.

Good solution.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St