RTRIM is not working as expected

Question

Post reply

RTRIM is not working as expected

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply

serg-52 SSCrazy Eights Points: 9913 More actions · Answer 1

The weird think is that OP's untrimmable symbol has ASCII =32.

All untrimmable spaces demonstrated so far have other ASCII codes reported by SQL.

Louis Hillebrand SSCarpal Tunnel Points: 4982 More actions · Answer 2

The Unicode space-characters 32 and 0x2000 to 0x2003 return an ASCII value of 32.

RTRIM only trims Unicode character 32.

If a string ends with NCHAR(0x2000) RTIM will not remove the last character, but the ASCII value of the last character shows 32.

You can test this by using UNICODE() instead of ASCII() to show the Unicode value of the character.

serg-52 SSCrazy Eights Points: 9913 More actions · Answer 3

Sql 2008

select s, ASCII(s),UNICODE(s), dump =cast(s as varbinary(10))

from (values

(NCHAR(0x0020)),(NCHAR(0x0120)),(NCHAR(0x0220)),(NCHAR(0x0320))

,(NCHAR(0x2000)),(NCHAR(0x2001)),(NCHAR(0x2002)),(NCHAR(0x2003))

) t(s)

s(No column name)(No column name)dump

32320x2000

G712880x2001

?635440x2002

?638000x2003

6381920x0020

6381930x0120

6381940x0220

6381950x0320

Cadavre SSC-Forever Points: 41690 More actions · Answer 4

Louis Hillebrand (4/22/2015)
In Unicode there are more Space-Characters than ASCII(32)
You can use UNICODE(Char) to get the Unicode value for the space-character.

I wasn't aware of UNICODE as a function, but it certainly sounds like you've hit the nail on the head. Just need to await the OP now.

--EDIT--

Just wanted to add my own little confirmation that it's probably a UNICODE character as Louis points out: -

SELECT a.X AS [Type],

ASCII(a.N) AS [Ascii Value],

UNICODE(a.N) AS [Unicode Value],

ASCII(RTRIM(a.N)) AS [Ascii Value after RTRIM],

UNICODE(RTRIM(a.N)) AS [Unicode Value after RTRIM]

FROM ( VALUES ( 'Not a space', NCHAR(0x2002)), ( 'A space', CHAR(32)) ) a ( X, N );

Type Ascii Value Unicode Value Ascii Value after RTRIM Unicode Value after RTRIM

----------- ----------- ------------- ----------------------- -------------------------

Not a space 32 8194 32 8194

A space 32 32 NULL NULL

So the Unicode value NCHAR(0x2002) will return ASCII 32 before and after an RTRIM (I'm sure there are other values that do the same, that was just the one I pulled out of the hat).

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Paul Hernández SSCarpal Tunnel Points: 4961 More actions · Answer 5

That is the result of the UNICODE,

I would like to take some time to dig deeper on this and summarize all the information.

Thanks all of you for the support!

Paul Hernández

Louis Hillebrand SSCarpal Tunnel Points: 4982 More actions · Answer 6

SQL 2008R2 Collation SQL_Latin1_General_CP1_CI_AS

Could also be a difference in regional settings..

s(No column name)(No column name)dump

32320x2000

G712880x2001

?635440x2002

?638000x2003

3281920x0020

3281930x0120

3281940x0220

3281950x0320

Cadavre SSC-Forever Points: 41690 More actions · Answer 7

Paul Hernández (4/22/2015)
That is the result of the UNICODE,
I would like to take some time to dig deeper on this and summarize all the information.
Thanks all of you for the support!

A space that RTRIM can get rid of would have a UNICODE value of 32, so that's your issue 🙂

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Louis Hillebrand SSCarpal Tunnel Points: 4982 More actions · Answer 8

12288 is the Unicode character 'IDEOGRAPHIC SPACE' (0x3000),

Added to my list of Unicode-spaces... which I use in my front-end to cleanup any copy-paste strings..

serg-52 SSCrazy Eights Points: 9913 More actions · Answer 9

NCHAR(0x0020) which is binary dumped as 0x2000 is the only symbol with ASCII =32 and it's trimmed OK. All the rest have other ASCII.

with tt as (

select top(256) n=row_number() over (order by (select null))-1

from sys.all_objects

), chars as (

select bin = cast(256*tt1.n + tt2.n as varbinary(2)) -- ==NCHAR(0x<tt2.n><tt1.n>), big/little-endian stuff

from tt tt1, tt tt2

)

select bin

, char = cast(bin as nvarchar(1))

, ascii = ASCII(cast(bin as nvarchar(1)))

, unc = UNICODE(cast(bin as nvarchar(1)))

from chars

--where ASCII(cast(bin as nvarchar(1))) = 32

order by UNICODE(cast(bin as nvarchar(1)))

Cadavre SSC-Forever Points: 41690 More actions · Answer 10

serg-52 (4/22/2015)
NCHAR(0x0020) which is binary dumped as 0x2000 is the only symbol with ASCII =32 and it's trimmed OK. All the rest have other ASCII.

Must be collation based, e.g.

SELECT CHAR(34) + a.[STRING] + CHAR(34) AS [QUOTED STRING],

CHAR(34) + a.[TRIMMED STRING] + CHAR(34) AS [QUOTED TRIMMED STRING],

ASCII(RIGHT(a.[STRING], 1)) AS [ASCII CODE OF LAST CHARACTER],

ASCII(RIGHT(a.[TRIMMED STRING], 1)) AS [ASCII CODE OF LAST CHARACTER AFTER TRIM],

UNICODE(RIGHT(a.[STRING], 1)) AS [UNICODE CODE OF LAST CHARACTER],

UNICODE(RIGHT(a.[TRIMMED STRING], 1)) AS [UNICODE CODE OF LAST CHARACTER AFTER TRIM]

FROM ( SELECT a.N AS [STRING],

RTRIM(LTRIM(a.N)) AS [TRIMMED STRING]

FROM ( VALUES

( CONVERT(NVARCHAR(MAX), 0x540068006900730020006900730020006100200073007400720069006E0067000220) COLLATE SQL_Latin1_General_CP1_CI_AI),

( CONVERT(NVARCHAR(MAX), 0x540068006900730020006900730020006100200073007400720069006E0067002000) COLLATE SQL_Latin1_General_CP1_CI_AI),

( CONVERT(NVARCHAR(MAX), 0x540068006900730020006900730020006100200073007400720069006E0067000030) COLLATE SQL_Latin1_General_CP1_CI_AI) ) a ( N )

) a;

QUOTED STRING QUOTED TRIMMED STRING ASCII CODE OF LAST CHARACTER ASCII CODE OF LAST CHARACTER AFTER TRIM UNICODE CODE OF LAST CHARACTER UNICODE CODE OF LAST CHARACTER AFTER TRIM

---------------------- ---------------------- ---------------------------- --------------------------------------- ------------------------------ -----------------------------------------

"This is a string " "This is a string " 32 32 8194 8194

"This is a string " "This is a string" 32 103 32 103

"This is a string " "This is a string " 32 32 12288 12288

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

serg-52 SSCrazy Eights Points: 9913 More actions · Answer 11

Yes, ASCII _is_ collation dependent.

select ASCII(s collate Chinese_Simplified_Pinyin_100_BIN)

,ASCII(s collate Greek_100_BIN)

,ASCII(s collate Latin1_General_100_BIN)

from (values

(NCHAR(0x0020)),(NCHAR(0x0120)),(NCHAR(0x0220)),(NCHAR(0x0320))

,(NCHAR(0x2000)),(NCHAR(0x2001)),(NCHAR(0x2002)),(NCHAR(0x2003))

) t(s)

I should check it first of all.

Louis Hillebrand SSCarpal Tunnel Points: 4982 More actions · Answer 12

A Scaler function that trims leading and trailing spaces from a Unicode string.

dbo.Numbers is a Tally-table.

CREATE FUNCTION dbo.fn_NTRIM(@T nvarchar(4000))

RETURNS nvarchar(4000)

AS

BEGIN

DECLARE @R nvarchar(4000);

WITH cte_Space (UC) AS (

SELECT UC

FROM(

VALUES(0x20),--Space

(0xA0),--No-Break Space

(0x180E),--Mongolian Vowel Separator

(0x2000),--En Quad

(0x2001),--Em Quad

(0x2002),--En Space

(0x2003),--Em Space

(0x2004),--Three-Per-Em Space

(0x2005),--Four-Per-Em Space

(0x2006),--Six-Per-Em Space

(0x2007),--Figure Space

(0x2008),--Punctuation Space

(0x2009),--Thin Space

(0x200A),--Hair Space

(0x200B),--Zero Width Space

(0x202F),--Narrow No-Break Space

(0x205F),--Medium Mathematical Space

(0x3000), --Ideographic Space

(0xFEFF)--Zero Width No-Break Space

) AS t(UC)

)

SELECT @R = LTRIM(RTRIM(CAST(Data as nvarchar(4000)))) FROM

(SELECTCASE WHEN SC.UC is null THEN SUBSTRING(@T, num, 1) ELSE NCHAR(32) END

FROMdbo.Numbers N LEFT OUTER JOIN

cte_Space SC ON UNICODE(SUBSTRING(@T, num, 1)) = SC.UC

WHEREN.Num <= LEN(@T)

FORXML PATH (''), TYPE ) AS Z(Data);

RETURN @R

END

GO

--############################################################################################################

DECLARE @T nvarchar(100) = 'this is just a' + NCHAR(160) + 'text' + NCHAR (8195) + 'for test' + NCHAR(8195) + nchar(8196);

SELECT '>' + RTRIM (@T) + '<', '>' + dbo.fn_NTRIM (@T) + '<';

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 13

BEWARE the Scalar UDF though!! It has MANY downsides, including voiding the use of parallelism, preventing accurate estimates but the optimizer (with associated bad performance and bad concurrency), often forcing row-by-agonizing-row operations under the covers, etc.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Thomas Schutte SSChasing Mays Points: 646 More actions · Answer 14

If you only have alphanumeric values on your column you can use STUFF in combination with PATINDEX to get rid of the junk:

DECLARE @teststring nvarchar(20)
SET @teststring = 'test ' + CHAR(13)
SELECT REVERSE(STUFF(REVERSE(@teststring), 1, PATINDEX('%[a-z0-1]%', REVERSE(@teststring)) - 1, ''))

PATINDEX identifies the first valid character from the back using REVERSE (either letter or number) and removes the rest.

Robert-378556 SSCertifiable Points: 5976 More actions · Answer 15

Paul Hernández (4/22/2015)
Hi Tom, thanks for your answer.
Please forget my first post, I copied and pasted the query that SSAS issues, therefore contains this weird aliasing.
Here is a better code:
SELECT
DISTINCT
[dim_vArticles].[ArticleId],
'"'+rtrim([StyleDescription])+'"' AS StyleDescription
, ASCII(RIGHT([StyleDescription],1)) ASCIIChar
FROM [dim].[vArticles] AS [dim_vArticles]
where ArticleId = '82-4056204122396-129'
Here is the result:
I expect the blank space at the end to be removed, but as it can be seen is still there, so the question is, why the RTRIM function does not remove the space at the end of the string?
Thanks for your help!

In sql server you can't use alias in the same select, so "ASCII(RIGHT([StyleDescription],1)) ASCIIChar" fetches the original value, not the trimmed value from alias.

Try this:

SELECT

DISTINCT

[dim_vArticles].[ArticleId],

'"'+rtrim([StyleDescription])+'"' AS StyleDescription_alias

, ASCII(RIGHT([StyleDescription_alias],1)) ASCIIChar

FROM [dim].[vArticles] AS [dim_vArticles]

where ArticleId = '82-4056204122396-129'

This will return "invalid column name StyleDescription_alias".