April 9, 2007 at 4:49 pm
Hi,
I am currently troubleshooting an issue with the Support team of our HR Software but I thought of posting it here to see if somebody ever saw that issue. The application has a key generator that is used throughout the application.
We are trying to find out why SQL is saying that string a is less than string b.
If I try this (collation is SQL_Latin1_General_CP1_CI_AS):
select
case when '###À#' > '###AÞ' then 'Greater' else 'Less' end
it returns 'Less'
but
select case when 'À' > 'A' then 'Greater' else 'Less' end
returns 'Greater'
We know that the 'Þ' character is greater than the 'À'.
Why if the 'À' is greater than the 'A', how come it is still considering as being less than? Does the 'Þ' character overides all the character in the string (seems that SQL is not doing a one by one comparison...)
Any thought on this would be greatly appreciated...
Thanks!
April 9, 2007 at 5:31 pm
Collation is set per DATABASE default, or that particular column?
N 56°04'39.16"
E 12°55'05.25"
April 9, 2007 at 5:58 pm
Problem is
select case when 'À#' > 'AÞ' then 'Greater' else 'Less' end
returns "Less".
Character "Þ" probably does not exist in this collation and makes correct matching not possible.
select case when N'À#' COLLATE Greek_BIN > N'AÞ' COLLATE Greek_BIN
then 'Greater' else 'Less' end
returns "Greater"
_____________
Code for TallyGenerator
April 9, 2007 at 6:01 pm
Actually using Latin1_General_BIN or any other "binary" collation brings you right result.
_____________
Code for TallyGenerator
April 9, 2007 at 6:55 pm
USE below to get the desired (Does not matter collation)
SELECT CASE WHEN BINARY_CHECKSUM('###À#') > BINARY_CHECKSUM('###AÞ') THEN 'Greater' ELSE 'Less' END
Ram
April 9, 2007 at 10:59 pm
Yikes! You can't rely on BINARY_CHECKSUM because it is not that unique you can expect!
See http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=70832
N 56°04'39.16"
E 12°55'05.25"
April 10, 2007 at 5:22 am
Peter,
Collation is Database default.
The character does exist.
The problem is more as to find out why does even the À is greater than A, as soon as we add another character, it becomes less? Just trying to understand the logic here.
The solution might be to switch to another collation...we will let the Support team come up with that. The key generator Stored Procedure is used throughout the application! They have a big problem! (and us too!)
April 10, 2007 at 6:25 pm
Yvan, who told you that this character does exist?
Yes, it exists somewhere, but why you so sure it does exist for chosen collation?
"CP1" in the name of that collation means "Code Page 1". Did you ever see set of characters available in that Code page?
That's why I suggested to use binary collations. Those ones use binary representations and don't fail on unknown symbols.
And about the logic. As soon as you add unknown character to the string the whole string becomes out of scope and cannot be compared to anything. Any string within the CodePage will be greater than that.
_____________
Code for TallyGenerator
April 11, 2007 at 5:56 am
Sergiy,
The Support team told me that the code page was 1252. BOL tells the same thing.
Transact-SQL Reference
A single string that specifies the collation name for a SQL collation.
< SQL_collation_name > :: =
SQL_SortRules[_Pref]_CPCodepage_<ComparisonStyle>
<ComparisonStyle> ::=
_CaseSensitivity_AccentSensitivity | _BIN
SortRules
A string identifying the alphabet or language whose sorting rules are applied when dictionary sorting is specified. Examples are Latin1_General or Polish.
Pref
Specifies uppercase preference.
Codepage
Specifies a one to four digit number identifying the code page used by the collation. CP1 specifies code page 1252, for all other code pages the complete code page number is specified. For example, CP1251 specifies code page 1251 and CP850 specifies code page 850.
If I check the content of code page 1252, it does contain all of the characters causing problems. http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx
Switching to Binary changes the whole sequence of sorting, and the key generates already existing keys in the table. It should have been like that from the beginning...
As a workaround, we removed a lot of characters from the list of possible characters used to generate the key. We tested by generating repetively keys in a table and we were up to 200000 records without any duplicates...way more that what we need.
Still don't know why they complicated it with this key generator...identity would have worked very nicely.
Hoping this solution will stand...
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply