April 22, 2015 at 6:12 am
The weird think is that OP's untrimmable symbol has ASCII =32.
All untrimmable spaces demonstrated so far have other ASCII codes reported by SQL.
April 22, 2015 at 6:37 am
The Unicode space-characters 32 and 0x2000 to 0x2003 return an ASCII value of 32.
RTRIM only trims Unicode character 32.
If a string ends with NCHAR(0x2000) RTIM will not remove the last character, but the ASCII value of the last character shows 32.
You can test this by using UNICODE() instead of ASCII() to show the Unicode value of the character.
April 22, 2015 at 6:57 am
Sql 2008
select s, ASCII(s),UNICODE(s), dump =cast(s as varbinary(10))
from (values
(NCHAR(0x0020)),(NCHAR(0x0120)),(NCHAR(0x0220)),(NCHAR(0x0320))
,(NCHAR(0x2000)),(NCHAR(0x2001)),(NCHAR(0x2002)),(NCHAR(0x2003))
) t(s)
s(No column name)(No column name)dump
32320x2000
G712880x2001
?635440x2002
?638000x2003
6381920x0020
6381930x0120
6381940x0220
6381950x0320
April 22, 2015 at 6:58 am
Louis Hillebrand (4/22/2015)
In Unicode there are more Space-Characters than ASCII(32)You can use UNICODE(Char) to get the Unicode value for the space-character.
I wasn't aware of UNICODE as a function, but it certainly sounds like you've hit the nail on the head. Just need to await the OP now.
Just wanted to add my own little confirmation that it's probably a UNICODE character as Louis points out: -
SELECT a.X AS [Type],
ASCII(a.N) AS [Ascii Value],
UNICODE(a.N) AS [Unicode Value],
ASCII(RTRIM(a.N)) AS [Ascii Value after RTRIM],
UNICODE(RTRIM(a.N)) AS [Unicode Value after RTRIM]
FROM ( VALUES ( 'Not a space', NCHAR(0x2002)), ( 'A space', CHAR(32)) ) a ( X, N );
Type Ascii Value Unicode Value Ascii Value after RTRIM Unicode Value after RTRIM
----------- ----------- ------------- ----------------------- -------------------------
Not a space 32 8194 32 8194
A space 32 32 NULL NULL
So the Unicode value NCHAR(0x2002) will return ASCII 32 before and after an RTRIM (I'm sure there are other values that do the same, that was just the one I pulled out of the hat).
April 22, 2015 at 7:18 am
That is the result of the UNICODE,
I would like to take some time to dig deeper on this and summarize all the information.
Thanks all of you for the support!
April 22, 2015 at 7:19 am
SQL 2008R2 Collation SQL_Latin1_General_CP1_CI_AS
Could also be a difference in regional settings..
s(No column name)(No column name)dump
32320x2000
G712880x2001
?635440x2002
?638000x2003
3281920x0020
3281930x0120
3281940x0220
3281950x0320
April 22, 2015 at 7:24 am
Paul Hernández (4/22/2015)
That is the result of the UNICODE,I would like to take some time to dig deeper on this and summarize all the information.
Thanks all of you for the support!
A space that RTRIM can get rid of would have a UNICODE value of 32, so that's your issue 🙂
April 22, 2015 at 7:27 am
12288 is the Unicode character 'IDEOGRAPHIC SPACE' (0x3000),
Added to my list of Unicode-spaces... which I use in my front-end to cleanup any copy-paste strings..
April 22, 2015 at 7:28 am
NCHAR(0x0020) which is binary dumped as 0x2000 is the only symbol with ASCII =32 and it's trimmed OK. All the rest have other ASCII.
with tt as (
select top(256) n=row_number() over (order by (select null))-1
from sys.all_objects
), chars as (
select bin = cast(256*tt1.n + tt2.n as varbinary(2)) -- ==NCHAR(0x<tt2.n><tt1.n>), big/little-endian stuff
from tt tt1, tt tt2
)
select bin
, char = cast(bin as nvarchar(1))
, ascii = ASCII(cast(bin as nvarchar(1)))
, unc = UNICODE(cast(bin as nvarchar(1)))
from chars
--where ASCII(cast(bin as nvarchar(1))) = 32
order by UNICODE(cast(bin as nvarchar(1)))
April 22, 2015 at 7:41 am
serg-52 (4/22/2015)
NCHAR(0x0020) which is binary dumped as 0x2000 is the only symbol with ASCII =32 and it's trimmed OK. All the rest have other ASCII.
Must be collation based, e.g.
SELECT CHAR(34) + a.[STRING] + CHAR(34) AS [QUOTED STRING],
CHAR(34) + a.[TRIMMED STRING] + CHAR(34) AS [QUOTED TRIMMED STRING],
ASCII(RIGHT(a.[STRING], 1)) AS [ASCII CODE OF LAST CHARACTER],
ASCII(RIGHT(a.[TRIMMED STRING], 1)) AS [ASCII CODE OF LAST CHARACTER AFTER TRIM],
UNICODE(RIGHT(a.[STRING], 1)) AS [UNICODE CODE OF LAST CHARACTER],
UNICODE(RIGHT(a.[TRIMMED STRING], 1)) AS [UNICODE CODE OF LAST CHARACTER AFTER TRIM]
FROM ( SELECT a.N AS [STRING],
RTRIM(LTRIM(a.N)) AS [TRIMMED STRING]
FROM ( VALUES
( CONVERT(NVARCHAR(MAX), 0x540068006900730020006900730020006100200073007400720069006E0067000220) COLLATE SQL_Latin1_General_CP1_CI_AI),
( CONVERT(NVARCHAR(MAX), 0x540068006900730020006900730020006100200073007400720069006E0067002000) COLLATE SQL_Latin1_General_CP1_CI_AI),
( CONVERT(NVARCHAR(MAX), 0x540068006900730020006900730020006100200073007400720069006E0067000030) COLLATE SQL_Latin1_General_CP1_CI_AI) ) a ( N )
) a;
QUOTED STRING QUOTED TRIMMED STRING ASCII CODE OF LAST CHARACTER ASCII CODE OF LAST CHARACTER AFTER TRIM UNICODE CODE OF LAST CHARACTER UNICODE CODE OF LAST CHARACTER AFTER TRIM
---------------------- ---------------------- ---------------------------- --------------------------------------- ------------------------------ -----------------------------------------
"This is a string " "This is a string " 32 32 8194 8194
"This is a string " "This is a string" 32 103 32 103
"This is a string " "This is a string " 32 32 12288 12288
April 22, 2015 at 7:49 am
Yes, ASCII _is_ collation dependent.
select ASCII(s collate Chinese_Simplified_Pinyin_100_BIN)
,ASCII(s collate Greek_100_BIN)
,ASCII(s collate Latin1_General_100_BIN)
from (values
(NCHAR(0x0020)),(NCHAR(0x0120)),(NCHAR(0x0220)),(NCHAR(0x0320))
,(NCHAR(0x2000)),(NCHAR(0x2001)),(NCHAR(0x2002)),(NCHAR(0x2003))
) t(s)
I should check it first of all.
April 22, 2015 at 7:59 am
A Scaler function that trims leading and trailing spaces from a Unicode string.
dbo.Numbers is a Tally-table.
CREATE FUNCTION dbo.fn_NTRIM(@T nvarchar(4000))
RETURNS nvarchar(4000)
AS
BEGIN
DECLARE @r nvarchar(4000);
WITH cte_Space (UC) AS (
SELECT UC
FROM(
VALUES(0x20),--Space
(0xA0),--No-Break Space
(0x180E),--Mongolian Vowel Separator
(0x2000),--En Quad
(0x2001),--Em Quad
(0x2002),--En Space
(0x2003),--Em Space
(0x2004),--Three-Per-Em Space
(0x2005),--Four-Per-Em Space
(0x2006),--Six-Per-Em Space
(0x2007),--Figure Space
(0x2008),--Punctuation Space
(0x2009),--Thin Space
(0x200A),--Hair Space
(0x200B),--Zero Width Space
(0x202F),--Narrow No-Break Space
(0x205F),--Medium Mathematical Space
(0x3000), --Ideographic Space
(0xFEFF)--Zero Width No-Break Space
) AS t(UC)
)
SELECT @r = LTRIM(RTRIM(CAST(Data as nvarchar(4000)))) FROM
(SELECTCASE WHEN SC.UC is null THEN SUBSTRING(@T, num, 1) ELSE NCHAR(32) END
FROMdbo.Numbers N LEFT OUTER JOIN
cte_Space SC ON UNICODE(SUBSTRING(@T, num, 1)) = SC.UC
WHEREN.Num <= LEN(@T)
FORXML PATH (''), TYPE ) AS Z(Data);
RETURN @r
END
GO
--############################################################################################################
DECLARE @T nvarchar(100) = 'this is just a' + NCHAR(160) + 'text' + NCHAR (8195) + 'for test' + NCHAR(8195) + nchar(8196);
SELECT '>' + RTRIM (@T) + '<', '>' + dbo.fn_NTRIM (@T) + '<';
April 22, 2015 at 9:42 am
BEWARE the Scalar UDF though!! It has MANY downsides, including voiding the use of parallelism, preventing accurate estimates but the optimizer (with associated bad performance and bad concurrency), often forcing row-by-agonizing-row operations under the covers, etc.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
April 23, 2015 at 12:08 pm
If you only have alphanumeric values on your column you can use STUFF in combination with PATINDEX to get rid of the junk:
DECLARE @teststring nvarchar(20)
SET @teststring = 'test ' + CHAR(13)
SELECT REVERSE(STUFF(REVERSE(@teststring), 1, PATINDEX('%[a-z0-1]%', REVERSE(@teststring)) - 1, ''))
PATINDEX identifies the first valid character from the back using REVERSE (either letter or number) and removes the rest.
April 24, 2015 at 3:10 am
Paul Hernández (4/22/2015)
Hi Tom, thanks for your answer.Please forget my first post, I copied and pasted the query that SSAS issues, therefore contains this weird aliasing.
Here is a better code:
SELECT
DISTINCT
[dim_vArticles].[ArticleId],
'"'+rtrim([StyleDescription])+'"' AS StyleDescription
, ASCII(RIGHT([StyleDescription],1)) ASCIIChar
FROM [dim].[vArticles] AS [dim_vArticles]
where ArticleId = '82-4056204122396-129'
Here is the result:
I expect the blank space at the end to be removed, but as it can be seen is still there, so the question is, why the RTRIM function does not remove the space at the end of the string?
Thanks for your help!
In sql server you can't use alias in the same select, so "ASCII(RIGHT([StyleDescription],1)) ASCIIChar" fetches the original value, not the trimmed value from alias.
Try this:
SELECT
DISTINCT
[dim_vArticles].[ArticleId],
'"'+rtrim([StyleDescription])+'"' AS StyleDescription_alias
, ASCII(RIGHT([StyleDescription_alias],1)) ASCIIChar
FROM [dim].[vArticles] AS [dim_vArticles]
where ArticleId = '82-4056204122396-129'
This will return "invalid column name StyleDescription_alias".
Viewing 15 posts - 16 through 30 (of 30 total)
You must be logged in to reply to this topic. Login to reply