March 25, 2015 at 11:08 am
The following was found as a solution to detecting non-printable characters in a string. I can find no explanation of how the COLLATE function is used.
Can someone explain how the COLLATE function affects the search string to find the non-printable characters?
patindex('%[^ !-~]%' COLLATE Latin1_General_BIN, titlel) as [Position]
Thanks,
April 11, 2015 at 1:32 pm
robertharrison (3/25/2015)
The following was found as a solution to detecting non-printable characters in a string. I can find no explanation of how the COLLATE function is used.Can someone explain how the COLLATE function affects the search string to find the non-printable characters?
patindex('%[^ !-~]%' COLLATE Latin1_General_BIN, titlel) as [Position]
Thanks,
The pattern you have above has nothing to do with non-printable characters. The code above finds the first character that is NOT a space, exclamation point, dash, or tilde.
{Edit} Please see Eirikur's note below.
The purpose of the COLLATE, in this particular case, is an attempt to improve the performance of the code by forcing it to do a straight forward binary comparison of characters due to the Latin1_General_BIN collation.
Still, the code above has nothing to do with finding non-printable characters.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 11, 2015 at 2:54 pm
Jeff Moden (4/11/2015)
robertharrison (3/25/2015)
The following was found as a solution to detecting non-printable characters in a string. I can find no explanation of how the COLLATE function is used.Can someone explain how the COLLATE function affects the search string to find the non-printable characters?
patindex('%[^ !-~]%' COLLATE Latin1_General_BIN, titlel) as [Position]
Thanks,
The pattern you have above has nothing to do with non-printable characters. The code above finds the first character that is NOT a space, exclamation point, dash, or tilde.The purpose of the COLLATE, in this particular case, is an attempt to improve the performance of the code by forcing it to do a straight forward binary comparison of characters due to the Latin1_General_BIN collation.
Still, the code above has nothing to do with finding non-printable characters.
Quick note, Jeff is almost right, the code will detect non lower ascii characters, that is characters not between ASCII 32 and 126, more than half of those are printable! Jeff, the dash in this case means "between";-)
I normally use something like this code
USE tempdb;
GO
SET NOCOUNT ON;
/* Sample string */
DECLARE @CRAPSTRING VARCHAR(100) =
CHAR(10) -- Line Feed
+ CHAR(12) -- Form Feed
+ CHAR(31) -- Unit Separator
+ CHAR(67) -- Uppercase C
+ CHAR(108) -- Lowercase l
+ CHAR(101) -- Lowercase e
+ CHAR(97) -- Lowercase a
+ CHAR(110) -- Lowercase n
+ CHAR(127) -- Delete
+ CHAR(17) -- Device Control 1 (oft. XON)
+ CHAR(101) -- Lowercase e
+ CHAR(15) -- Shift In / X-Off
+ CHAR(100) -- Lowercase d
;
;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
, NUMS(N) AS (SELECT TOP(LEN(@CRAPSTRING)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4,T T5,T T6,T T7,T T8,T T9)
SELECT
(SELECT
SUBSTRING(@CRAPSTRING,NM.N,1)
FROM NUMS NM
WHERE ASCII(SUBSTRING(@CRAPSTRING COLLATE Latin1_General_BIN,NM.N,1)) NOT BETWEEN 0 AND 31
AND ASCII(SUBSTRING(@CRAPSTRING COLLATE Latin1_General_BIN,NM.N,1)) <> 127
FOR XML PATH(''), TYPE).value('.[1]','VARCHAR(MAX)') AS CLEANED_OUTPUT;
April 11, 2015 at 9:36 pm
Ah crud. More coffee please. Thanks for the correction, Eirikur. You're absolutely correct about the dash.
That also adds more functionality to the COLLATE because it will find only the exact characters according to a binary match instead of possibly finding accented characters, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy