More than 1 alphanumeric chars in a string

Question

More than 1 alphanumeric chars in a string

Viewing 8 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Dwain Camps SSC Guru Points: 86978 More actions · Answer 1

Luis Cazares (3/12/2014)
I might be trying to kill a fly with a cannon but this could be an option. It could be turned into an inLine Table-valued Function if needed to avoid writing everything if this code is needed somewhere else.
CREATE TABLE #employee(Surname varchar(50), GivenName varchar(50)) ;
WITH Employee as
(
select 'ABC' as Surname, 'x.yz' as GivenName union all
select 'ABC', 'X.*YZ' union all
select 'A*.BC', 'xyz' union all
select 'ABc', 'x.y.z'
)
INSERT INTO #employee
SELECT *
FROM Employee;
WITH E1(N) AS(
SELECT N FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(N)
),
E2(N) AS(
SELECT a.N FROM E1 a, E1 b
),
E4(N) AS(
SELECT a.N FROM E2 a, E2 b
)
SELECT e.*
FROM #employee e
CROSS APPLY( SELECT TOP( SELECT MAX(strLen)
FROM( VALUES(LEN( e.Surname)), (LEN( e.Surname))) x(strLen))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM E4) Tally(N)
WHERE (SUBSTRING( e.Surname, N, 1) LIKE '[^0-9a-zA-z]'
AND SUBSTRING( e.Surname, N + 1, 1) LIKE '[^0-9a-zA-z]')
OR (SUBSTRING( e.GivenName, N, 1) LIKE '[^0-9a-zA-z]'
AND SUBSTRING( e.GivenName, N + 1, 1) LIKE '[^0-9a-zA-z]') ;
GO
DROP TABLE #employee;

CROSS APPLY was my first thought too but it doesn't seem to work for this case:

with E1(N) AS(

SELECT N FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(N)

),

E2(N) AS(

SELECT a.N FROM E1 a, E1 b

),

E4(N) AS(

SELECT a.N FROM E2 a, E2 b

),

Employee as

(

select 'ABC' as Surname, 'x.yz' as GivenName union all

select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

)

SELECT e.*

FROM Employee e

CROSS APPLY( SELECT TOP( SELECT MAX(strLen)

FROM( VALUES(LEN( e.Surname)), (LEN( e.Surname))) x(strLen))

ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

FROM E4) Tally(N)

WHERE (SUBSTRING( e.Surname, N, 1) LIKE '[^0-9a-zA-z]'

AND SUBSTRING( e.Surname, N + 1, 1) LIKE '[^0-9a-zA-z]')

OR (SUBSTRING( e.GivenName, N, 1) LIKE '[^0-9a-zA-z]'

AND SUBSTRING( e.GivenName, N + 1, 1) LIKE '[^0-9a-zA-z]') ;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 2

This also works, using the PatternSplitCM FUNCTION that Chris Morris contributed to the 4th article in my signature links.

with Employee as

(

select 'ABC' as Surname, 'x.yz' as GivenName union all

select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

)

SELECT a.Surname, a.GivenName

FROM Employee a

WHERE EXISTS

(

SELECT 1

FROM dbo.PatternSplitCM(a.Surname, '[^a-zA-Z0-9]')

WHERE [Matched]=1 AND LEN(Item) > 1

)

OR EXISTS

(

SELECT 1

FROM dbo.PatternSplitCM(a.GivenName, '[^a-zA-Z0-9]')

WHERE [Matched]=1 AND LEN(Item) > 1

);

Timing between my two submissions also looks pretty close:

SET STATISTICS TIME ON;

with Employee as

(

select 'ABC' as Surname, 'x.yz' as GivenName union all

select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

),

Tally (n) AS

(

SELECT TOP (SELECT CASE WHEN MAX(LEN(Surname)) > MAX(LEN(GivenName)) THEN MAX(LEN(Surname)) ELSE MAX(LEN(GivenName)) END FROM Employee)

ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

FROM sys.all_columns

)

SELECT a.Surname, a.GivenName

FROM Employee a, Employee c, Employee d, Employee e, Employee f, Employee g, Employee h, Employee i, Employee j

WHERE EXISTS

(

SELECT 1

FROM Tally

WHERE n < LEN(a.Surname) - 1 AND

PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.Surname, n, 1)) > 0 AND

PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.Surname, n+1, 1)) > 0

)

OR EXISTS

(

SELECT 1

FROM Tally

WHERE n < LEN(a.GivenName) - 1 AND

PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.GivenName, n, 1)) > 0 AND

PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(a.GivenName, n+1, 1)) > 0

);

SET STATISTICS TIME OFF;

SET STATISTICS TIME ON;

with Employee as

(

select 'ABC' as Surname, 'x.yz' as GivenName union all

select 'A.BC' as Surname, 'X.*YZ' as GivenName union all

select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all

select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName

)

SELECT a.Surname, a.GivenName

FROM Employee a, Employee c, Employee d, Employee e, Employee f, Employee g, Employee h, Employee i, Employee j

WHERE EXISTS

(

SELECT 1

FROM dbo.PatternSplitCM(a.Surname, '[^a-zA-Z0-9]')

WHERE [Matched]=1 AND LEN(Item) > 1

)

OR EXISTS

(

SELECT 1

FROM dbo.PatternSplitCM(a.GivenName, '[^a-zA-Z0-9]')

WHERE [Matched]=1 AND LEN(Item) > 1

);

SET STATISTICS TIME OFF;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Mitesh Oswal SSCrazy Points: 2798 More actions · Answer 3

DECLARE @tbl TABLE

(

ID INT IDENTITY(1,1),

Char1 NVARCHAR(100)

)

INSERT INTO @tbl

select 'ABC x.yz'

UNION

select 'A.BC X.*YZ'

UNION

select 'A.BC *X.YZ'

DECLARE @tblNumber TABLE

(

Number INT

)

INSERT INTO @tblNumber

SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY S.Object_id )

from sys.objects s,sys.objects s1

select Char1 from @tbl,@tblNumber where PATINDEX('%[^A-Z0-9 ]%',SUBSTRING(Char1,Number,1))>0 AND Number<=LEN(Char1)

GROUP BY Char1

HAVING COUNT(Distinct Number)>1

Regards,
Mitesh OSwal
+918698619998

Sean Lange SSC Guru Points: 286609 More actions · Answer 4

Mitesh Oswal (3/13/2014)
DECLARE @tbl TABLE ( ID INT IDENTITY(1,1), Char1 NVARCHAR(100) ) INSERT INTO @tbl select 'ABC x.yz' UNION select 'A.BC X.*YZ' UNION select 'A.BC *X.YZ' DECLARE @tblNumber TABLE ( Number INT ) INSERT INTO @tblNumber SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY S.Object_id ) from sys.objects s,sys.objects s1 select Char1 from @tbl,@tblNumber where PATINDEX('%[^A-Z0-9 ]%',SUBSTRING(Char1,Number,1))>0 AND Number<=LEN(Char1) GROUP BY Char1 HAVING COUNT(Distinct Number)>1

Unfortunately this does not meet the requirements. The OP is trying to find when there are 2 nonalphanumeric characters in a row. The excellent code you posted will only find those that have 2 non alphanumeric characters.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Mitesh Oswal SSCrazy Points: 2798 More actions · Answer 5

DECLARE @tbl TABLE

(

ID INT IDENTITY(1,1),

Char1 NVARCHAR(100)

)

INSERT INTO @tbl

select 'ABC x.yz'

UNION

select 'A.BC X.*YZ'

UNION

select 'A.BC *X.YZ'

DECLARE @tblNumber TABLE

(

Number INT

)

INSERT INTO @tblNumber

SELECT TOP 1000 ROW_NUMBER() OVER(ORDER BY S.Object_id )

from sys.objects s,sys.objects s1

;with cte as

(

select Char1,Number,ROW_NUMBER() OVER(PARTITION BY CHAR1 ORDER BY Number)RN from @tbl,@tblNumber where PATINDEX('%[^A-Z0-9 ]%',SUBSTRING(Char1,Number,1))>0 AND Number<=LEN(Char1)

)

select Distinct C.Char1 from cte c INNER JOIN Cte C2 ON C.Char1 = C2.Char1 AND C.RN+1 = C2.RN AND C.NUmber+1 = C2.NUmber

Regards,
Mitesh OSwal
+918698619998

Hunchback SSCrazy Points: 2649 More actions · Answer 6

Try:

SELECT

*

FROM

#employee

WHERE

Surname LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%'

OR GivenName LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%';

GO

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 7

Crud post on my part... took it down. :blush:

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004685 More actions · Answer 8

hunchback (4/8/2014)
Try:
SELECT
*
FROM
#employee
WHERE
Surname LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%'
OR GivenName LIKE '%[^0-9a-zA-Z][^0-9a-zA-Z]%';
GO

Heh... simplicity at it's best. The only thing you need to add is a % between the calculated characters and a COLLATE clause on each line. Nicely done and good reminder to keep it simple. Thanks.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)