March 8, 2010 at 11:35 am
Hi,
I want to check if a field contains Unicode characters or not. This is how I have been doing it:
SELECT *
FROM Tablename
WHERE CAST(Fieldname AS VARCHAR(MAX)) <> Fieldname
It *seems* to work when I do some basic tests, such as these below:
--Test 1:
DECLARE @text NVARCHAR(100)
SET @text = N'This is non-Unicode text, in Unicode'
IF CAST(@text AS VARCHAR(MAX)) <> @text
PRINT 'Contains Unicode characters'
ELSE
PRINT 'No Unicode characters'
GO
--Test 2:
DECLARE @text NVARCHAR(100)
SET @text = N'This is Unicode (字) text, in Unicode'
IF CAST(@text AS VARCHAR(MAX)) <> @text
PRINT 'Contains Unicode characters'
ELSE
PRINT 'No Unicode characters'
GO
Do you think this a safe method to use or not?
Thanks in advance.
March 9, 2010 at 12:39 am
Seems safe to me. I can't help but wonder why you need to know, though.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 10, 2010 at 1:45 am
Before I populate a VARCHAR field from a NVARCHAR field, I want to check if it is possible to do without raising an error.
March 10, 2010 at 1:49 am
xnl28-574517 (3/10/2010)
Before I populate a VARCHAR field from a NVARCHAR field, I want to check if it is possible to do without raising an error.
Converting Unicode to ASCII in SQL Server does not raise an error. Any characters that don't map, get replaced by question marks.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
March 10, 2010 at 4:36 am
You're right, the data gets clobbered.
What I meant to say was that I want to perform a check without clobbering the data.
July 24, 2013 at 2:36 pm
The type precidence between varchar and nvarchar on your code will upconvert the varchar to an nvarchar during the comparison. It will still work though. It's like you are converting and then immediately unconverting and seeing if they match.
create table #test (
uni nvarchar(100)
)
insert into #test (uni) values (N'????'),( 'asdf'), (N'Foo'), ('Bar')
select *
from #test
where uni <> convert(varchar(max), uni)
May 1, 2017 at 4:36 pm
I know, I know, this post is so old and why am I replying.
Because I need to do this exact same thing, and the "I can't help but wonder why" comment inspired me to tell you why I need to, FWIW:
Many of the tables I design were designed by former developers, using varchar(max). Considering that we're now encouraged to avoid max lengths when possible, I'm re-considering all data types rather than continue the old design.
While doing so, I'm also considering nvarchar. Even though I like nvarchar better (after reading a lot about codepages and opinions online), I'm having a hard time convincing our code approvers for me to use it.
I'm searching for any non-Ascii characters in my tables, sometimes, to see if I can prove to them that there may be a reason (other than codepages and applications of the future which may only accept Unicode) that the previous table was designed using nvarchar.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply