June 21, 2007 at 5:10 am
In the Latin1_General collation, ss is treated like ß (German sharp s) in comparisons:
'ss' = 'ß' --> true
Note that this is also true with LIKE:
'ss' LIKE 'ß' --> true
But if a '%' is part of the search pattern, things become complicated:
'ss' LIKE 'ß%' --> true
'ß' LIKE 'ss%' --> false (!!!)
Notably, the following are all true:
'ß' LIKE '%ss' --> true
'ß' LIKE '%ss%' --> true
'ß' LIKE 'ss%%' --> true
'ßa' LIKE 'ss%' --> true
Take a look at the tests with two ß:
'ßß' LIKE 'ssss%' --> false (guessed that)
'ßßa' LIKE 'ssss%' --> false (now that's a surprise)
'ßßaa' LIKE 'ssss%' --> true
Alright, there's a pattern in this behavior:
'ßßßaa' LIKE 'ssssss%' --> false
'ßßßaaa' LIKE 'ssssss%' --> true
My guess is that LIKE is optimized for the special case of patterns that end with '%' and contain no other wildcards. It seems that the comparison always fails if the searched text is shorter than the non-wildcard part of the pattern.
My questions are:
1. Is this behavior a bug or a feature?
2. Are there other subtleties in the behavior of LIKE?
3. Is the precise algorithm for LIKE comparison documented somewhere?
I need to rebuild the comparison routine in C as accurately as possible. I've tried SQL Server 2000 and 2005.
I appreciate any help on this.
August 25, 2008 at 2:50 am
Hi there...
I found an answer to your problem
Convert them to Nvarchar.
Ok ok, I knowusing CAST or CONVERT would add some more code... but you can do something like this
IF N'ßßa' LIKE N'ssss%'
SELECT 'TRUE'
or you could just put the N'' to only one of the two...
if you start a string with an N automatically converts it into an NVARCHAR Format
Correct me if I'm wrong... but I think its because of the difference in the bytes used... uhmmm like ASCII??? actually I'm not really sure of the reason 😛
:cool:well atleast it worked
_____________________________________________
[font="Comic Sans MS"]Quatrei Quorizawa[/font]
:):D:P;):w00t::cool::hehe:
MABUHAY PHILIPPINES!
August 26, 2008 at 7:25 am
Kirill Müller (6/21/2007)
In the Latin1_General collation, ss is treated like ß (German sharp s) in comparisons:
Apart from your specific problems regarding the LIKE operator, I'm very surprised that ss is treated like ß. Thus in is the sequence:
create table t(c varchar(10))
insert into t values('ß')
insert into t values('ss')
select *
from t
where c='ss'
select *
from t
where c='ß'
both records always get selected.
I was under the impression that ß can 'optionally' be written as 'ss' but this is not true the other way around. In any case, I just don't see the ability to make this distinction built into a collation sequence.
September 5, 2008 at 4:41 am
I was under the impression that ß can 'optionally' be written as 'ss' but this is not true the other way around.
If I remember correctly, under the German spelling reform in the 1990s, all occurrences of ß were changed to ss. This spelling reform had many problems (mostly a lack of popularity), and was officially rejected/changed a while later. So many words were changed back to being officially spelt with ß. Hard work for teachers. I do agree with the quote above, though.
September 5, 2008 at 4:46 am
No, actually, the ß is used after "long" vowels (like in "Fuß" which is pronounced like "fooss") and after diphthongs (like "außen"). All other instances of ß have been replaced by ss, the most popular word that has been changed is "dass" (translated: "so that").
Thanks for your replies!
September 8, 2008 at 6:01 am
Hi Kirill,
Thanks for the explanation; I am trying to learn German, but I am finding it hard work. Bring back Esperanto 🙂
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply