February 2, 2017 at 10:55 am
I have a function that I'm using that is working with one glitch. It is designed to change non-English characters to their closest equivalent. Problem is it is always returning Upper Case. Any ideas:
FUNCTION:
USE [NBCC_Search]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Create Function [dbo].[fn_ReplaceSpecialCharacters] (@TempString AS VARCHAR(100))
RETURNS VARCHAR(100)
AS
BEGIN
RETURN replace(replace(
replace(replace(@TempString
,'Ș', 'S')
,'ș', 's')
,'Ț', 'T')
,'ț', 't')
END
GO
I call the above within a select:
USE [BCC_Search]
GO
SELECT [ContactId],[FirstName],[LastName]
,[dbo].[fn_ReplaceSpecialCharacters]([LastName]) AS 'NewName'
FROM [dbo].[Professional] where country = 'Romania' order by [LastName] desc
GO
But the word "Buchareșt" comes back as "BuchareSt"
February 2, 2017 at 11:09 am
your database is case-insensitive, so ?,S,? and s (four different versions of S)are found in the last replace.
you want to modify the proc to use binary collation insteadALTER Function [dbo].[fn_ReplaceSpecialCharacters] (@TempString AS VARCHAR(100))
RETURNS VARCHAR(100)
AS
BEGIN
RETURN replace(replace(
replace(replace(@TempString COLLATE Latin1_General_BIN
,'?', 'S')
,'?', 's')
,'?', 'T')
,'?', 't')
END
GO
SELECT [dbo].[fn_ReplaceSpecialCharacters]('ÀAlbèert ËEîin?s?têeiìn ÌInstìitúutëe')
Lowell
February 2, 2017 at 11:22 am
Hmm... Problem persists... even the test comes back as ÀAlbèert ËEîinSsStêeiìn ÌInstìitúutëe
February 2, 2017 at 11:49 am
Converting those values to varchar is likely what's getting you here. Those characters get turned into a CHAR(63) in VARCHAR.
Change the parameter and output types to NVARCHAR, and see if that helps.
Cheers!
February 2, 2017 at 11:52 am
Still same problem with nvarchar
February 2, 2017 at 11:58 am
try this:
this is explicitly finding the ascii value, and replacing them; it's a modification of a strip high asciii that i had made previously.
note there are some characters you have to decide to fiddle with yourself.,is the times indicator an x or an asterisk? is theta a zero or an "oh"? use master;
GO
IF OBJECT_ID('[dbo].[StripReplaceASCII_ITVF]') IS NOT NULL
DROP FUNCTION [dbo].[StripReplaceASCII_ITVF]
GO
--#################################################################################################
--StripReplace_ASCII_ITVF removes specific characters from a string.
--usage for ITVF requires cross apply or cross join
--ie SELECT TOP 100 fn.CleanedText,MRNumber From EDLogDetail CROSS APPLY dbo.StripReplace_ASCII_ITVF(MRNumber) fn WHERE MRNumber IS NOT NULL
--#################################################################################################
CREATE FUNCTION dbo.StripReplaceASCII_ITVF(@OriginalText VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
Tally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),HighAsciiReplaced
AS
(
select STUFF(
(
SELECT
CASE
--WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) IN (9,10,13) THEN SUBSTRING(@OriginalText,Tally.N,1) --tab,lf,cr
--WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 32 AND 127 THEN SUBSTRING(@OriginalText,Tally.N,1)
--WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) <= 32 THEN ''
--WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) >= 128 THEN ''
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 192 AND 197 THEN 'A' -- ÀÁÂÃÄÅ -->A 192 to 197
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 198 THEN 'AE' --Æ -->AE 198
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 199 THEN 'C' --Ç -->C 199
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 200 AND 203 THEN 'E' --ÈÉÊË -->E 200 to 203
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 204 AND 207 THEN 'I' --ÌÍÎÏ -->I 204 to 207
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 208 THEN 'D' --Ð -->D 208
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 209 THEN 'N' --Ñ -->N 209
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 210 AND 214 THEN 'O' --ÒÓÔÕÖ -->O 210 to 214
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 215 THEN '*' --× -->x or * 215 (multiplication operator)
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 216 THEN '0' --Ø -->0 216 zero or capital "o"?
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 217 AND 220 THEN 'U' --ÙÚÛÜ -->U 217 to 220
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 221 THEN 'Y' --Ý -->Y 221
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 222 THEN ' ' --Þ --> 222
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 223 THEN ' ' --ß --> 223
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 224 AND 229 THEN 'a' --àáâãäå -->a 224 to 229
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 230 THEN 'ae' --æ -->ae 230
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 231 THEN 'c' --ç -->c 231
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 232 AND 235 THEN 'e' --èéêë -->e 232 to 235
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 236 AND 239 THEN 'i' --ìíîï -->i 236 to 239
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 240 THEN 'o' --ð --> 240
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 240 THEN 'n' --ñ -->n 241
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 242 AND 246 THEN 'o' --òóôõö -->o 242 to 246
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 247 THEN '/' --÷ -->/ 247 (division operator)
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 248 THEN '0' --ø -->0 248 zero or capital "o"?
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 249 AND 252 THEN 'u' --ùúûü -->u 249 to 252
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 253 THEN 'y' --ý -->y 253
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 254 THEN ' ' --þ --> 254
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) = 255 THEN 'y' --ÿ -->y 255
ELSE SUBSTRING(@OriginalText,Tally.N,1)
END
FROM Tally
WHERE Tally.N <= len(@OriginalText) -- added by ajb
FOR XML PATH(''), TYPE).value('.','varchar(max)'),1,1,'') as CleanedText
)
SELECT CleanedText FROM HighAsciiReplaced
GO
SELECT * FROM [dbo].[StripReplaceASCII_ITVF]('ÀAlbèert ËEîinstêeiìn ÌInstìitúutëe')
Lowell
February 2, 2017 at 12:01 pm
Exactly what changes did you make?
It works fine for me.
If you're using Lowell's example, be sure that the string literal is N'ÀAlbèert ËEîin?s?têeiìn ÌInstìitúutëe'.
Otherwise that starting string is VARCHAR.
Cheers!
February 2, 2017 at 12:04 pm
Also note that the input parameter to the function should be declared as NVARCHAR
February 2, 2017 at 12:07 pm
DesNorton - Thursday, February 2, 2017 12:04 PMAlso note that the input parameter to the function should be declared as NVARCHAR
Indeed, as I mentioned in my first post 😀
February 2, 2017 at 12:10 pm
i think we are just dealing with high ascii here, char 126-255, so switching to nvarchar shouldn't have an impact; i think it's just collation and case sensitivity.
Lowell
February 2, 2017 at 12:11 pm
Lowell - Thursday, February 2, 2017 12:10 PMi think we are just dealing with high ascii here, char 126-255, so switching to nvarchar shouldn't have an impact; i think it's just collation and case sensitivity.
I've tested it, and switching to NVARCHAR does.
Maybe there's something else going on too, but making it all unicode from start to finish fixes it.
Cheers!
February 2, 2017 at 12:17 pm
Here's what I used:
ALTER Function [dbo].[fn_ReplaceSpecialCharacters] (@TempString AS NVARCHAR(100))
RETURNS NVARCHAR(100)
AS
BEGIN
RETURN @TempString COLLATE Latin1_General_BIN
END
GO
USE [NBCC_Search]
GO
SELECT [ContactId]
,[FirstName]
,[LastName]
,[dbo].[fn_ReplaceSpecialCharacters]([LastName]) AS 'NewName'
FROM [dbo].[Professional] where country = 'Romania' order by [LastName] desc
GO --Does not work
SELECT [dbo].[fn_ReplaceSpecialCharacters](N'?') --Works
SELECT [dbo].[fn_ReplaceSpecialCharacters](N'ÀAlbèert ËEîin?s?têeiìn ÌInstìitúutëe') --Works
February 2, 2017 at 12:18 pm
Lowell - Thursday, February 2, 2017 12:10 PMi think we are just dealing with high ascii here, char 126-255, so switching to nvarchar shouldn't have an impact; i think it's just collation and case sensitivity.
I'm not so sure about that. Although, it might be the font of the forum that is messing with my code.
This code returns values of 536, 537, 538, 539
SELECT
UNICODE(N'Ș')
, UNICODE(N'È™')
, UNICODE(N'Èš')
, UNICODE(N'È›')
February 2, 2017 at 12:19 pm
briancampbellmcad - Thursday, February 2, 2017 12:17 PMHere's what I used:ALTER Function [dbo].[fn_ReplaceSpecialCharacters] (@TempString AS NVARCHAR(100))
RETURNS NVARCHAR(100)
AS
BEGIN
RETURN @TempString COLLATE Latin1_General_BIN
END
GOUSE [NBCC_Search]
GO
SELECT [ContactId]
,[FirstName]
,[LastName]
,[dbo].[fn_ReplaceSpecialCharacters]([LastName]) AS 'NewName'
FROM [dbo].[Professional] where country = 'Romania' order by [LastName] desc
GO --Does not workSELECT [dbo].[fn_ReplaceSpecialCharacters](N'È™') --Works
SELECT [dbo].[fn_ReplaceSpecialCharacters](N'ÀAlbèert ËEîinȘsștêeiìn ÌInstìitúutëe') --Works
Thanks for the update!
Could you post the DDL for the Professional table?
Cheers!
February 2, 2017 at 12:28 pm
DesNorton - Thursday, February 2, 2017 12:18 PMLowell - Thursday, February 2, 2017 12:10 PMi think we are just dealing with high ascii here, char 126-255, so switching to nvarchar shouldn't have an impact; i think it's just collation and case sensitivity.I'm not so sure about that. Although, it might be the font of the forum that is messing with my code.
This code returns values of 536, 537, 538, 539
SELECT
UNICODE(N'Ș')
, UNICODE(N'È™')
, UNICODE(N'Èš')
, UNICODE(N'È›')
ahh i see that now that you pasted the unicode;
yeah, with that in the mix, i agree it's gotta be nvarchar. i see that now, thanks Jacob.
Lowell
Viewing 15 posts - 1 through 15 (of 32 total)
You must be logged in to reply to this topic. Login to reply