March 12, 2014 at 8:25 am
I have an employee table
Surname GivenName
ABC x.yz
A.BC X.*YZ
A*.BC xyz
The query needs to get me the surname and givenname which have more than 1 non alphanumeric characters. In this case the output will be
a.bc x.*yz
a*.BC xyz
I know how to find non alphanumeric chars in a string but not sure if I can how to find the strings with minimum 2 non alphanumeric chars?
March 12, 2014 at 8:57 am
Now look for one of the SQL Ninja's on the forum to give you something cleaner and better, but this should work for now.with Employee as
(
select 'ABC' as Surname, 'x.yz' as GivenName union all
select 'A.BC' as Surname, 'X.*YZ' as GivenName union all
select 'A*.BC' as Surname, 'xyz' as GivenName
)
SELECT * FROM Employee
WHERE substring(Surname, patindex('%[^a-zA-Z0-9]%',Surname)+1,len(Surname)+1) like '%[^a-zA-Z0-9]%'
or substring(GivenName, patindex('%[^a-zA-Z0-9]%',GivenName)+1,len(GivenName)+1) like '%[^a-zA-Z0-9]%'
March 12, 2014 at 8:57 am
shan-422658 (3/12/2014)
I have an employee tableSurname GivenName
ABC x.yz
A.BC X.*YZ
A*.BC xyz
The query needs to get me the surname and givenname which have more than 1 non alphanumeric characters. In this case the output will be
a.bc x.*yz
a*.BC xyz
I know how to find non alphanumeric chars in a string but not sure if I can how to find the strings with minimum 2 non alphanumeric chars?
Hi and welcome to the forums. This is a rather strange requirement for an employee table. It would seem to me that you should find any rows that have characters that don't make sense for a name. The only values that make sense in a name are any alphanumeric values (that depends on if you allow accent characters), a space, hyphen and apostrophe.
If you really need to allow characters like * in your employee name and you want to find any where there are more than 1 of those you could use a nested replace. It looks ugly but it should be really fast.
with employee as
(
select 'ABC' as Surname, 'x.yz' as GivenName union all
select 'A.BC', 'X.*YZ' union all
select 'A*.BC', 'xyz'
)
, AlphaRemoved as
(
select *,
len(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
Surname, 'A', ''),
'B', ''),
'C', ''),
'D', ''),
'E', ''),
'F', ''),
'G', ''),
'H', ''),
'I', ''),
'J', ''),
'K', ''),
'L', ''),
'M', ''),
'N', ''),
'O', ''),
'P', ''),
'Q', ''),
'R', ''),
'S', ''),
'T', ''),
'U', ''),
'V', ''),
'W', ''),
'X', ''),
'Y', ''),
'Z', '')
) as NonAlphaCountSurname,
len(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
GivenName, 'A', ''),
'B', ''),
'C', ''),
'D', ''),
'E', ''),
'F', ''),
'G', ''),
'H', ''),
'I', ''),
'J', ''),
'K', ''),
'L', ''),
'M', ''),
'N', ''),
'O', ''),
'P', ''),
'Q', ''),
'R', ''),
'S', ''),
'T', ''),
'U', ''),
'V', ''),
'W', ''),
'X', ''),
'Y', ''),
'Z', '')
) as NonAlphaCountGivenName
from employee
)
select Surname, GivenName
from AlphaRemoved
where NonAlphaCountSurname > 1
or NonAlphaCountGivenName > 1
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 12, 2014 at 9:02 am
March 12, 2014 at 9:19 am
Keith Tate (3/12/2014)
BTW Shan, Sean is one of the SQL Ninjas π
Thanks Keith. I would not consider myself a SQL Ninja by any means but thanks for the compliment. I am truly honored and humbled.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 12, 2014 at 10:14 am
Sean Lange (3/12/2014)
Keith Tate (3/12/2014)
BTW Shan, Sean is one of the SQL Ninjas πThanks Keith. I would not consider myself a SQL Ninja by any means but thanks for the compliment. I am truly honored and humbled.
and well known for his love of a good nested replace π
March 12, 2014 at 11:08 am
In an effort to evaluate which of the two posted methods would perform better I put together a quick test with 1000 random names. It is kind of nasty to post all the details here but I will certainly put it together if anybody wants to see what I did. I have a couple of random name tables that I used for this. With 1,000 rows the two performed so close that there is no clear winner. When I used a cross join between my two tables (which means we are now hitting 2 million rows) the nested replace has a VERY slight edge on performance. They are both actually quite fast with even 2 million rows.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 12, 2014 at 11:54 am
Thanks for your reply.
Just a version 2 of the above queryβ¦
Surname GivenName
ABC x.yz
ABcx.y.z
ABC X.*YZ
A*.BC xyz
The query needs to get me the surname and givenname which have more than 1 non alphanumeric characters together(one after the other). In this case the output will be
abc x.*yz
a*.BC xyz
How would I find just the names that have 2 non alphanumeric chars together(one after the other)
March 12, 2014 at 12:22 pm
Dohsan (3/12/2014)
Sean Lange (3/12/2014)
Keith Tate (3/12/2014)
BTW Shan, Sean is one of the SQL Ninjas πThanks Keith. I would not consider myself a SQL Ninja by any means but thanks for the compliment. I am truly honored and humbled.
and well known for his love of a good nested replace π
π They are certainly ugly to look at but they are also usually super fast.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 12, 2014 at 12:30 pm
shan-422658 (3/12/2014)
Thanks for your reply.Just a version 2 of the above queryβ¦
Surname GivenName
ABC x.yz
ABcx.y.z
ABC X.*YZ
A*.BC xyz
The query needs to get me the surname and givenname which have more than 1 non alphanumeric characters together(one after the other). In this case the output will be
abc x.*yz
a*.BC xyz
How would I find just the names that have 2 non alphanumeric chars together(one after the other)
What are you really after? It seems like you are doing some analysis on your data. Is this something you are going to be doing routinely? Trying to find only those values where there are at least 2 non alphanumeric characters in a row is going to suck rotten tomatoes. You are going to have to look at each and every character in every value for all rows in the entire table. This takes RBAR to a new level. It is like RBAR cross join CBAC (character by agonizing character).
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 12, 2014 at 12:37 pm
I might be trying to kill a fly with a cannon but this could be an option. It could be turned into an inLine Table-valued Function if needed to avoid writing everything if this code is needed somewhere else.
CREATE TABLE #employee(Surname varchar(50), GivenName varchar(50)) ;
WITH Employee as
(
select 'ABC' as Surname, 'x.yz' as GivenName union all
select 'ABC', 'X.*YZ' union all
select 'A*.BC', 'xyz' union all
select 'ABc', 'x.y.z'
)
INSERT INTO #employee
SELECT *
FROM Employee;
WITH E1(N) AS(
SELECT N FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(N)
),
E2(N) AS(
SELECT a.N FROM E1 a, E1 b
),
E4(N) AS(
SELECT a.N FROM E2 a, E2 b
)
SELECT e.*
FROM #employee e
CROSS APPLY( SELECT TOP( SELECT MAX(strLen)
FROM( VALUES(LEN( e.Surname)), (LEN( e.Surname))) x(strLen))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM E4) Tally(N)
WHERE (SUBSTRING( e.Surname, N, 1) LIKE '[^0-9a-zA-z]'
AND SUBSTRING( e.Surname, N + 1, 1) LIKE '[^0-9a-zA-z]')
OR (SUBSTRING( e.GivenName, N, 1) LIKE '[^0-9a-zA-z]'
AND SUBSTRING( e.GivenName, N + 1, 1) LIKE '[^0-9a-zA-z]') ;
GO
DROP TABLE #employee;
March 12, 2014 at 12:39 pm
Luis Cazares (3/12/2014)
I might be trying to kill a fly with a sledgehammer but this could be an option. It could be turned into an inLine Table-valued Function if needed to avoid writing everything if this code is needed somewhere else.
CREATE TABLE #employee(Surname varchar(50), GivenName varchar(50)) ;
WITH Employee as
(
select 'ABC' as Surname, 'x.yz' as GivenName union all
select 'ABC', 'X.*YZ' union all
select 'A*.BC', 'xyz' union all
select 'ABc', 'x.y.z'
)
INSERT INTO #employee
SELECT *
FROM Employee;
WITH E1(N) AS(
SELECT N FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(N)
),
E2(N) AS(
SELECT a.N FROM E1 a, E1 b
),
E4(N) AS(
SELECT a.N FROM E2 a, E2 b
)
SELECT e.*
FROM #employee e
CROSS APPLY( SELECT TOP( SELECT MAX(strLen)
FROM( VALUES(LEN( e.Surname)), (LEN( e.Surname))) x(strLen))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM E4) Tally(N)
WHERE (SUBSTRING( e.Surname, N, 1) LIKE '[^0-9a-zA-z]'
AND SUBSTRING( e.Surname, N + 1, 1) LIKE '[^0-9a-zA-z]')
OR (SUBSTRING( e.GivenName, N, 1) LIKE '[^0-9a-zA-z]'
AND SUBSTRING( e.GivenName, N + 1, 1) LIKE '[^0-9a-zA-z]') ;
GO
DROP TABLE #employee;
That is pretty slick Luis.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 12, 2014 at 12:43 pm
Sean Lange (3/12/2014)
That is pretty slick Luis.
Thanks Sean, it came after my third cup of coffee, I might need more now. π
March 12, 2014 at 1:49 pm
Sean Lange (3/12/2014)
In an effort to evaluate which of the two posted methods would perform better I put together a quick test with 1000 random names. It is kind of nasty to post all the details here but I will certainly put it together if anybody wants to see what I did. I have a couple of random name tables that I used for this. With 1,000 rows the two performed so close that there is no clear winner. When I used a cross join between my two tables (which means we are now hitting 2 million rows) the nested replace has a VERY slight edge on performance. They are both actually quite fast with even 2 million rows.
Thanks for doing that Sean very interesting results.
March 13, 2014 at 8:14 pm
Sean Lange (3/12/2014)
shan-422658 (3/12/2014)
Thanks for your reply.Just a version 2 of the above queryβ¦
Surname GivenName
ABC x.yz
ABcx.y.z
ABC X.*YZ
A*.BC xyz
The query needs to get me the surname and givenname which have more than 1 non alphanumeric characters together(one after the other). In this case the output will be
abc x.*yz
a*.BC xyz
How would I find just the names that have 2 non alphanumeric chars together(one after the other)
What are you really after? It seems like you are doing some analysis on your data. Is this something you are going to be doing routinely? Trying to find only those values where there are at least 2 non alphanumeric characters in a row is going to suck rotten tomatoes. You are going to have to look at each and every character in every value for all rows in the entire table. This takes RBAR to a new level. It is like RBAR cross join CBAC (character by agonizing character).
I have to agree with Sean that what you're asking for sounds rather nasty. But then again I like a good challenge:
with Employee as
(
select 'ABC' as Surname, 'x.yz' as GivenName union all
select 'A.BC' as Surname, 'X.*YZ' as GivenName union all
select 'A.D.*C' as Surname, 'X.*YZ' as GivenName union all
select 'A*.BC.*.%.%.%' as Surname, 'xyz' as GivenName
),
Tally (n) AS
(
SELECT TOP (SELECT CASE WHEN MAX(LEN(Surname)) > MAX(LEN(GivenName)) THEN MAX(LEN(Surname)) ELSE MAX(LEN(GivenName)) END FROM Employee)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns
)
SELECT a.Surname, a.GivenName
FROM Employee a
WHERE EXISTS
(
SELECT 1
FROM Tally
WHERE n < LEN(Surname) - 1 AND
PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(Surname, n, 1)) > 0 AND
PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(Surname, n+1, 1)) > 0
)
OR EXISTS
(
SELECT 1
FROM Tally
WHERE n < LEN(GivenName) - 1 AND
PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(GivenName, n, 1)) > 0 AND
PATINDEX('%[^a-zA-Z0-9]%', SUBSTRING(GivenName, n+1, 1)) > 0
);
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 1 through 15 (of 22 total)
You must be logged in to reply to this topic. Login to reply