IN Operator

  • Comments posted to this topic are about the item IN Operator

  • The explanation correctly states that the leading spaces cause a mismatch for the two entry codes "CH" and "DE", but doesn't really explain why SQL compares 'MU ' as equal to 'MU'.

    I suppose I've seen this behavior before, so got the answer right. But I would like to to know where MS documents the way two strings are compared as equal even when their length differs (due to the defined length or trailing spaces). The MSDN link provided with the explanation merely points to the

    syntax of the "IN" operator, a substiture for multiple "=" operators.

    I did look for it, and found some interesting tidbits about comparisons, but not what I was looking for.

  • john.arnott (10/1/2009)


    But I would like to to know where MS documents the way two strings are compared as equal even when their length differs (due to the defined length or trailing spaces).

    Good point, John!

    I went into Books Online, expecting to be able to find some reference within half an hour or so. But I had to give up. Like you, I found some interesting partial stuff, but never the full official rules.

    I do know that SQL Server adheres to the ANSI standards SQL-92 and above for string comparisons. That standard dictates that when strings of unequal length are compared, the shorter string is first padded with spaces until it matches the length of the longer string, and then a character-by-character comparison follows; the result of the comparison is determined by the result of the first inequal character.

    For example, if you compare 'Boo' to 'Book', the former string will be space-padded to read 'Boo '. Then the character-by-character comparison for the first three characters shows no difference, but for the fourth it does - and since 'k' is larger than ' ' (space) in my collation, 'Book' is considered to be larger than 'Boo'.

    And yes - you are right that this should be properly documented in Books Online, and if it already is it should be easier to find.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (10/2/2009)


    That standard dictates that when strings of unequal length are compared, the shorter string is first padded with spaces until it matches the length of the longer string, and then a character-by-character comparison follows; the result of the comparison is determined by the result of the first inequal character.

    The one that surprised me was this:

    select entryname from test2 where entrycode = 'BE '

    [with a space after BE] returns one row. But

    select entryname from test2 where entrycode like 'BE '

    returns no rows. Intuitively I'd have expected both to return the same thing.

  • Toreador (10/2/2009)


    Hugo Kornelis (10/2/2009)


    That standard dictates that when strings of unequal length are compared, the shorter string is first padded with spaces until it matches the length of the longer string, and then a character-by-character comparison follows; the result of the comparison is determined by the result of the first inequal character.

    The one that surprised me was this:

    select entryname from test2 where entrycode = 'BE '

    [with a space after BE] returns one row. But

    select entryname from test2 where entrycode like 'BE '

    returns no rows. Intuitively I'd have expected both to return the same thing.

    Correct. The former uses comparison. For the row with value 'BE', it is padded with one space to match the length of the 3-character constant in the query. After this padding, the strings are equal.

    The latter uses pattern matching. The pattern specified is "character B / character E / space character". A very unusual pattern, as LIKE is intended to be used with wildcard characters, but syntactically valid. The rules for pattern matching do not involve padding strings to match the length (as most patterns can match multiple length through the use of the % wildcard character). So the BE row does not match, as it misses the third (space) character.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Toreador (10/2/2009)


    Hugo Kornelis (10/2/2009)


    That standard dictates that when strings of unequal length are compared, the shorter string is first padded with spaces until it matches the length of the longer string, and then a character-by-character comparison follows; the result of the comparison is determined by the result of the first inequal character.

    The one that surprised me was this:

    select entryname from test2 where entrycode = 'BE '

    [with a space after BE] returns one row. But

    select entryname from test2 where entrycode like 'BE '

    returns no rows. Intuitively I'd have expected both to return the same thing.

    with LIKE operator you need to use wild card

    SELECT entryname FROM test2 WHERE entrycode LIKE 'BE%'

    this returns the result.

  • Hugo Kornelis (10/2/2009)


    A very unusual pattern, as LIKE is intended to be used with wildcard characters, but syntactically valid.

    We use it widely, where searches which can be either on exact strings or on wildcards. Something like

    ...where colname like @varname

    and @varname might be 'ABC', or 'ABC%', etc.

  • Bhavesh-1094084 (10/2/2009)


    Toreador (10/2/2009)


    Hugo Kornelis (10/2/2009)


    That standard dictates that when strings of unequal length are compared, the shorter string is first padded with spaces until it matches the length of the longer string, and then a character-by-character comparison follows; the result of the comparison is determined by the result of the first inequal character.

    The one that surprised me was this:

    select entryname from test2 where entrycode = 'BE '

    [with a space after BE] returns one row. But

    select entryname from test2 where entrycode like 'BE '

    returns no rows. Intuitively I'd have expected both to return the same thing.

    with LIKE operator you need to use wild card

    SELECT entryname FROM test2 WHERE entrycode LIKE 'BE%'

    this returns the result.

    No, you don't need to use a wildcard.

    SELECT entryname FROM test2 WHERE entrycode LIKE 'BE'; -- note no trailing space

    returns the result as well.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Toreador (10/2/2009)


    Hugo Kornelis (10/2/2009)


    A very unusual pattern, as LIKE is intended to be used with wildcard characters, but syntactically valid.

    We use it widely, where searches which can be either on exact strings or on wildcards. Something like

    ...where colname like @varname

    and @varname might be 'ABC', or 'ABC%', etc.

    Good point. When I called the pattern unusual, I was only thinking of cases like this where the search pattern is hardcoded in the query. In those situations, one would expect to see wildcard characters. You are quite right that the search pattern can be a variable, and in that case it's not unusual for some of the executions to be with no wildcard characters in the search pattern.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I did find something on MSDN, although it's a bit old (says "applies to SQL Server 2000 and 7.0"). As Hugo pointed out, SQL Server follows the ANSI/ISO spec by padding strings to equal length before comparing. Also, as has been discussed in this thread, strings are NOT padded to match the length of a "LIKE" predicate.

    http://support.microsoft.com/kb/316626

  • The explanation is a bit vague.

    " This is because the space is counted as a character that needs to be matched. "

    While I got the question right, it was because I assumed trailing spaces are not significant and that leading spaces are.

    That's not quite right.

    As pointed out above, this explains it thoroughly:

    http://support.microsoft.com/kb/316626"> http://support.microsoft.com/kb/316626

  • Just as a side note Len is similar in that trailing blanks are ignored

    select *, len(entrycode), datalength(entrycode) from test2

    identrycodeentrynameLenDataLenght

    1BEBENGALOORU22

    2CHCHENNAI33

    3DEDELHI33

    4MUMUMBAI23

  • Another thing to take into consideration is the setting of ANSI_PADDING at the time of table creation.

    If it is set to ON, the trailing space(s) are preserved when the rows are inserted, and the LENGTH and DATALENGTH functions will return different results.

    If it is set to OFF, the trailing (only) space(s) are stripped when the rows are inserted, and the LENGTH and DATALENGTH functions will return the same result.

  • Interestingly

    SELECT entryname FROM test2 WHERE entrycode LIKE 'MU' -- no trailing space

    returns the result MUMBAI. MSDN is illuminating on how leading and trailing spaces are treated by LIKE:

    http://msdn.microsoft.com/en-us/library/aa933232(SQL.80).aspx

    Forum Etiquette: How to post data/code on a forum to get the best helpMake sure that you include code in the appropriate IFCode tags, e.g. <your code here>. You can find the IFCode tags under the INSERT options when you are writing a post.

  • Didn't copy and paste the qry and just looked at it but somehow I realised where my mistake was. Nice one.

    What you don't know won't hurt you but what you know will make you plan to know better

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply