March 18, 2016 at 12:05 am
I have a question about SQL Server.
Table Emp:
Id | Name
1 |test“te d'abc
2 |anr¤a
3 | gs-re-C“te d'ab
4 |M‚fe, DF
5 |R™temd
6 |™jad”ji
7 |Cje y ret¢n
8 |J™kl™balu
9 |le“ne-iokd
10 |liode-Pyr‚n‚ie
11 |V„s G”ta
12 |Sƒo Paulo
13 |vAstra gAtaland
14 |¥uble / Bio-Bio
15 |U“pl™n/ds VAsb-y
I need to replace the special characters with empty values.
Based on the above table, I want to output something like below. I tried like this:
select
REPLACE(state, SUBSTRING(state, PATINDEX('%[^"a-z"]%', state), 1), '') as name,
id
from emp
This query doesn't return the expected result.
Output is:
Id | Name
1 |testtedabc
2 |anra
3 |gsreCedab
4 |MfeDF
5 |Rtemd
6 |jadji
7 |Cjeyretn
8 |Jkbalu
9 |eneiokd
10 |iodePyrnie
11 |VsGta
12 |SoPaulo
13 |vAstragAtaand
14 |ubleBioBio
15 |UpndsVAsby
Please tell me how to write a query to achieve this task in SQL Server.
March 18, 2016 at 1:54 am
Why is the output wrong. What is the expected output?
-- Gianluca Sartori
March 18, 2016 at 2:14 am
we need replace bad character with empty string and expected output like below.
id | Name
1 |testtedabc
2 |anra
3 |gsreCedab
4 |MfeDF
5 |Rtemd
6 |jadji
7 |Cjeyretn
8 |Jkbalu
9 |eneiokd
10 |iodePyrnie
11 |VsGta
12 |SoPaulo
13 |vAstragAtaand
14 |ubleBioBio
15 |UpndsVAsby
Please tell me how to write a query to achieve this task in SQL Server.
March 18, 2016 at 3:48 am
This should do the trick:
DECLARE @sampleData TABLE (
Id int,
Name nvarchar(50)
)
INSERT INTO @sampleData VALUES
(1 ,'test“te d''abc')
,(2 ,'anr¤a')
,(3 ,' gs-re-C“te d''ab')
,(4 ,'M‚fe, DF')
,(5 ,'R™temd')
,(6 ,'™jad”ji')
,(7 ,'Cje y ret¢n')
,(8 ,'J™kl™balu')
,(9 ,'le“ne-iokd')
,(10 ,'liode-Pyr‚n‚ie')
,(11 ,'V„s G”ta')
,(12 ,'Sƒo Paulo')
,(13 ,'vAstra gAtaland')
,(14 ,'¥uble / Bio-Bio')
,(15 ,'U“pl™n/ds VAsb-y');
SELECT *
FROM @sampleData AS s
CROSS APPLY (
SELECT Name = (
SELECT SUBSTRING(s.name, number, 1) AS c
FROM master.dbo.spt_values
WHERE type = 'P'
AND number > 0
AND (
ASCII(SUBSTRING(s.name, number, 1)) BETWEEN 97 AND 122
OR
ASCII(SUBSTRING(s.name, number, 1)) BETWEEN 65 AND 90
)
ORDER BY number
FOR XML PATH(''), TYPE
).value('.','nvarchar(50)')
) AS t
-- Gianluca Sartori
March 18, 2016 at 5:06 am
If this is something you need to do frequently, you can create an inline table-valued function to do the cleaning for you. Here's one:
CREATE FUNCTION dbo.CleanAlphaNumeric(@OriginalText Varchar(8000)) RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
WITH
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
Tally(N) AS (
SELECT TOP (ISNULL(DATALENGTH(@OriginalText), 0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteCleaned(CleanText) AS (
SELECT (SELECT SUBSTRING(@OriginalText, t.N, 1)
FROM Tally t
WHERE ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 48 AND 57 --0-9
OR ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 65 AND 90 --A-Z
OR ASCII(SUBSTRING(@OriginalText, t.N, 1)) BETWEEN 97 AND 122 --a-z
OR ASCII(SUBSTRING(@OriginalText, t.N, 1)) = 32 --space
FOR XML PATH(''), TYPE).value('.','varchar(8000)')
)
SELECT CleanText
FROM cteCleaned;
Then all you have to do is use the function to clean the undesirable characters from your column. For more information on how a tally table works, see http://www.sqlservercentral.com/articles/T-SQL/62867/.
March 18, 2016 at 5:08 am
Hi except one case everything is working fine.if numbers avilable in state column then we need numbers also in the output.
example: State column have value : abc123&~ then we need output is abc123.
I added extract code ASCII(SUBSTRING(s.state, number, 1)) BETWEEN 48 AND 57 in the above query but numbers are missing. if available in state then number required in the output.please tell me where we need modify in the provide code.
March 18, 2016 at 5:21 am
SELECT *
FROM @sampleData AS s
CROSS APPLY (
SELECT Name = (
SELECT SUBSTRING(s.name, number, 1) AS c
FROM master.dbo.spt_values
WHERE type = 'P'
AND number > 0
AND (
ASCII(SUBSTRING(s.name, number, 1)) BETWEEN 97 AND 122
OR
ASCII(SUBSTRING(s.name, number, 1)) BETWEEN 65 AND 90
OR
ASCII(SUBSTRING(s.name, number, 1)) BETWEEN 48 AND 57
)
ORDER BY number
FOR XML PATH(''), TYPE
).value('.','nvarchar(50)')
) AS t
You can do the same thing on Ed's excellent answer.
-- Gianluca Sartori
March 18, 2016 at 6:59 am
I prefer to use a nice pattern instead of learning ASCII values. But that's just me. 😀
SELECT *
FROM @sampleData AS s
CROSS APPLY (
SELECT Name = (
SELECT SUBSTRING(s.name, number, 1) AS c
FROM master.dbo.spt_values
WHERE type = 'P'
AND number > 0
AND SUBSTRING(s.name, number, 1) LIKE '[a-zA-Z0-9 ]' COLLATE Latin1_General_Bin
ORDER BY number
FOR XML PATH(''), TYPE
).value('.','nvarchar(50)')
) AS t
EDIT: Added collation
March 18, 2016 at 7:51 am
Luis Cazares (3/18/2016)
I prefer to use a nice pattern instead of learning ASCII values. But that's just me. 😀
Me too, but for some reason it didn't catch all the "bad" characters in the test data, so I decided to use ASCII.
Apparently, SQL Server thinks that "™" matches the [A-Z] pattern.
-- Gianluca Sartori
March 18, 2016 at 8:01 am
spaghettidba (3/18/2016)
Luis Cazares (3/18/2016)
I prefer to use a nice pattern instead of learning ASCII values. But that's just me. 😀Me too, but for some reason it didn't catch all the "bad" characters in the test data, so I decided to use ASCII.
Apparently, SQL Server thinks that "™" matches the [A-Z] pattern.
I forgot to include the collation. Using a binary collation won't have that problem. I fixed my previous code.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply