Why doesn’t ISNUMERIC work correctly? (SQL Spackle)

  • I love this article, but I'm bothered by the following scenario...

    declare @var varchar(7)

    set @var = ''

    select case when @var NOT LIKE '%[^0-9]%'

    then 'I am a number'

    else 'I am not a number'

    end as [result]

    And the result is

    result

    -----------------

    I am a number

    Not what I was expecting, but by wrapping the var in a NULLIF function, it works out correctly.

    declare @var varchar(7)

    set @var = ''

    select case when nullif(@var,'') NOT LIKE '%[^0-9]%'

    then 'I am a number'

    else 'I am not a number'

    end as [result]

    with the resultresult

    -----------------

    I am not a number

    Am I missing something, or is this the exception to the rule?

  • mark.gilbert (4/16/2013)


    I love this article, but I'm bothered by the following scenario...

    declare @var varchar(7)

    set @var = ''

    select case when @var NOT LIKE '%[^0-9]%'

    then 'I am a number'

    else 'I am not a number'

    end as [result]

    And the result is

    result

    -----------------

    I am a number

    Not what I was expecting, but by wrapping the var in a NULLIF function, it works out correctly.

    declare @var varchar(7)

    set @var = ''

    select case when nullif(@var,'') NOT LIKE '%[^0-9]%'

    then 'I am a number'

    else 'I am not a number'

    end as [result]

    with the resultresult

    -----------------

    I am not a number

    Am I missing something, or is this the exception to the rule?

    The empty string ('') is implicitly converted to a zero (0) which is why it returned 'I am a number'.

  • mark.gilbert (4/16/2013)


    Am I missing something, or is this the exception to the rule?

    What Lynn said. 🙂

    For ensuring that the empty string is not treated as numeric, I would personally not use NULLIF, but:

    SELECT CASE WHEN @var <> '' AND @var NOT LIKE '%[^0-9]%'

    THEN 'I am a number'

    ELSE 'I am not a number'

    END AS Result;


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • mark.gilbert (4/16/2013)


    I love this article, but I'm bothered by the following scenario...

    Do you think a zero length string contains a character which isn't in 0 to 9? Or that NULL can be like any string or pattern? It has to be one of the other for you to be unhappy, surely?

    Tom

  • Hugo Kornelis (4/16/2013)


    mark.gilbert (4/16/2013)


    Am I missing something, or is this the exception to the rule?

    What Lynn said. 🙂

    For ensuring that the empty string is not treated as numeric, I would personally not use NULLIF, but:

    SELECT CASE WHEN @var <> '' AND @var NOT LIKE '%[^0-9]%'

    THEN 'I am a number'

    ELSE 'I am not a number'

    END AS Result;

    Hello Mark, et al. In as respectful of a tone as possible (cuz I know how easy it is to sound harsh / sarcastic in writing), both Lynn (who said that the behavior was due to the empty string being converted to a zero) and Hugo are incorrect with respect to the cause of the issue, though Hugo is correct in terms of how to best account for what is really happening.

    The reason that conversion to a zero is not an issue here is that this is still a string comparison, even if using a numeric range in the form of [^0-9]. The real issue is the half-assed implementation of Regular Expressions that is provided by the LIKE operator; there is no way to specify how many of character range you are trying to match: it is always a single character position.

    In true Regular Expressions, any particular "thing" to match can have a modifier to indicate how many occurrences of a "thing" should match. The modifiers are:

    + = 1 or more (but has to be at least one)

    * = 0 or more

    ? = 0 or 1

    {X,Y} = between X and Y

    {X,} = at least X, but no maximum

    {,Y} = 0 up to Y

    In the case of the LIKE operator, the [] requires a single character. An empty string, by definition, does not have a single character, even one that would not match the "not 0 through 9". Hence, Hugo's suggestion is correct as it accounts for having 0 characters (which cannot be handled by the '%[]%' pattern that requires at least one character).

    To see these points in action:

    -- Yes, empty string does convert to zero when passed to a numeric datatype,

    -- but not in a string comparison

    SELECT CONVERT(INT, '') [ConvertedToZero],

    CASE

    WHEN '' LIKE '%[0-9]%' THEN 'Is a zero'

    ELSE 'Not a zero'

    END AS [NotConvertedToZero]

    -- Returns: 0, Not a zero

    -- Test empty string -> zero by changing pattern to match

    declare @var2 varchar(7)

    set @var2 = ''

    -- shows "I am a letter" due to no characters;

    -- implicit 0 would return "I am not a letter"

    select case when @var2 NOT LIKE '%[^a-z]%'

    then 'I am a letter'

    else 'I am not a letter'

    end as [result]

    -- explicit 0 shows "I am not a letter" as expected

    set @var2 = '0'

    select case when @var2 NOT LIKE '%[^a-z]%'

    then 'I am a letter'

    else 'I am not a letter'

    end as [result]

    Of course, this is most likely all a moot point given that TRY_PARSE and TRY_CONVERT do a better job (in most cases) of determining Number vs Not a Number ;-).

    Take care,

    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky (4/16/2013)


    Hugo Kornelis (4/16/2013)


    mark.gilbert (4/16/2013)


    Am I missing something, or is this the exception to the rule?

    What Lynn said. 🙂

    For ensuring that the empty string is not treated as numeric, I would personally not use NULLIF, but:

    SELECT CASE WHEN @var <> '' AND @var NOT LIKE '%[^0-9]%'

    THEN 'I am a number'

    ELSE 'I am not a number'

    END AS Result;

    Hello Mark, et al. In as respectful of a tone as possible (cuz I know how easy it is to sound harsh / sarcastic in writing), both Lynn (who said that the behavior was due to the empty string being converted to a zero) and Hugo are incorrect with respect to the cause of the issue, though Hugo is correct in terms of how to best account for what is really happening.

    The reason that conversion to a zero is not an issue here is that this is still a string comparison, even if using a numeric range in the form of [^0-9]. The real issue is the half-assed implementation of Regular Expressions that is provided by the LIKE operator; there is no way to specify how many of character range you are trying to match: it is always a single character position.

    In true Regular Expressions, any particular "thing" to match can have a modifier to indicate how many occurrences of a "thing" should match. The modifiers are:

    + = 1 or more (but has to be at least one)

    * = 0 or more

    ? = 0 or 1

    {X,Y} = between X and Y

    {X,} = at least X, but no maximum

    {,Y} = 0 up to Y

    In the case of the LIKE operator, the [] requires a single character. An empty string, by definition, does not have a single character, even one that would not match the "not 0 through 9". Hence, Hugo's suggestion is correct as it accounts for having 0 characters (which cannot be handled by the '%[]%' pattern that requires at least one character).

    To see these points in action:

    -- Yes, empty string does convert to zero when passed to a numeric datatype,

    -- but not in a string comparison

    SELECT CONVERT(INT, '') [ConvertedToZero],

    CASE

    WHEN '' LIKE '%[0-9]%' THEN 'Is a zero'

    ELSE 'Not a zero'

    END AS [NotConvertedToZero]

    -- Returns: 0, Not a zero

    -- Test empty string -> zero by changing pattern to match

    declare @var2 varchar(7)

    set @var2 = ''

    -- shows "I am a letter" due to no characters;

    -- implicit 0 would return "I am not a letter"

    select case when @var2 NOT LIKE '%[^a-z]%'

    then 'I am a letter'

    else 'I am not a letter'

    end as [result]

    -- explicit 0 shows "I am not a letter" as expected

    set @var2 = '0'

    select case when @var2 NOT LIKE '%[^a-z]%'

    then 'I am a letter'

    else 'I am not a letter'

    end as [result]

    Of course, this is most likely all a moot point given that TRY_PARSE and TRY_CONVERT do a better job (in most cases) of determining Number vs Not a Number ;-).

    Take care,

    Solomon...

    Yes, you are correct. However, the empty string will convert to a 0 and a 0 is not like '%[^0-9]%' so it does pass the test where something is a number if does not contain values other than 0-9.

  • Lynn Pettis (4/16/2013)


    Solomon Rutzky (4/16/2013)


    Hugo Kornelis (4/16/2013)


    mark.gilbert (4/16/2013)


    Am I missing something, or is this the exception to the rule?

    What Lynn said. 🙂

    For ensuring that the empty string is not treated as numeric, I would personally not use NULLIF, but:

    SELECT CASE WHEN @var <> '' AND @var NOT LIKE '%[^0-9]%'

    THEN 'I am a number'

    ELSE 'I am not a number'

    END AS Result;

    Hello Mark, et al. In as respectful of a tone as possible (cuz I know how easy it is to sound harsh / sarcastic in writing), both Lynn (who said that the behavior was due to the empty string being converted to a zero) and Hugo are incorrect with respect to the cause of the issue, though Hugo is correct in terms of how to best account for what is really happening.

    The reason that conversion to a zero is not an issue here is that this is still a string comparison, even if using a numeric range in the form of [^0-9]. The real issue is the half-assed implementation of Regular Expressions that is provided by the LIKE operator; there is no way to specify how many of character range you are trying to match: it is always a single character position.

    In true Regular Expressions, any particular "thing" to match can have a modifier to indicate how many occurrences of a "thing" should match. The modifiers are:

    + = 1 or more (but has to be at least one)

    * = 0 or more

    ? = 0 or 1

    {X,Y} = between X and Y

    {X,} = at least X, but no maximum

    {,Y} = 0 up to Y

    In the case of the LIKE operator, the [] requires a single character. An empty string, by definition, does not have a single character, even one that would not match the "not 0 through 9". Hence, Hugo's suggestion is correct as it accounts for having 0 characters (which cannot be handled by the '%[]%' pattern that requires at least one character).

    To see these points in action:

    -- Yes, empty string does convert to zero when passed to a numeric datatype,

    -- but not in a string comparison

    SELECT CONVERT(INT, '') [ConvertedToZero],

    CASE

    WHEN '' LIKE '%[0-9]%' THEN 'Is a zero'

    ELSE 'Not a zero'

    END AS [NotConvertedToZero]

    -- Returns: 0, Not a zero

    -- Test empty string -> zero by changing pattern to match

    declare @var2 varchar(7)

    set @var2 = ''

    -- shows "I am a letter" due to no characters;

    -- implicit 0 would return "I am not a letter"

    select case when @var2 NOT LIKE '%[^a-z]%'

    then 'I am a letter'

    else 'I am not a letter'

    end as [result]

    -- explicit 0 shows "I am not a letter" as expected

    set @var2 = '0'

    select case when @var2 NOT LIKE '%[^a-z]%'

    then 'I am a letter'

    else 'I am not a letter'

    end as [result]

    Of course, this is most likely all a moot point given that TRY_PARSE and TRY_CONVERT do a better job (in most cases) of determining Number vs Not a Number ;-).

    Take care,

    Solomon...

    Yes, you are correct. However, the empty string will convert to a 0 and a 0 is not like '%[^0-9]%' so it does pass the test where something is a number if does not contain values other than 0-9.

    Hi Lynn. My point was that an empty string does not actually convert to a 0 (in this case) because there is no conversion: it is a string-to-string comparison. The original SQL as given by Mark is not a good test as it masks what is happening because there is no way to distinguish a supplied 0 from an implicit conversion to 0. Yes, it appears to do what you are saying but that particular pattern and usage does not allow for testing an opposite case. That is why I changed the pattern to be [^a-z] in the last two examples. If it were true that the empty string was being converted to a zero, then it would display "I am not a letter" (in the 2nd to last query) but instead it displays "I am a letter". And I included the very last example to show that setting the string to an explicit zero, which should behave the same as an implicit conversion of an empty string into 0, does not behave the same as the empty string.

    Take care,

    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky (4/16/2013)


    Of course, this is most likely all a moot point given that TRY_PARSE and TRY_CONVERT do a better job (in most cases) of determining Number vs Not a Number ;-).

    BWAAA-HAAAA!!!!! It's definitely NOT a moot point for those still using something less than 2012. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • mark.gilbert (4/16/2013)


    I love this article, but I'm bothered by the following scenario...

    declare @var varchar(7)

    set @var = ''

    select case when @var NOT LIKE '%[^0-9]%'

    then 'I am a number'

    else 'I am not a number'

    end as [result]

    And the result is

    result

    -----------------

    I am a number

    Not what I was expecting, but by wrapping the var in a NULLIF function, it works out correctly.

    declare @var varchar(7)

    set @var = ''

    select case when nullif(@var,'') NOT LIKE '%[^0-9]%'

    then 'I am a number'

    else 'I am not a number'

    end as [result]

    with the resultresult

    -----------------

    I am not a number

    Am I missing something, or is this the exception to the rule?

    Thanks for the kudo and the code, Mark.

    Shifting gears, you used the word "number" in your code. Let me answer your good question by addressing the concerns of the folks that posted after you. Thanks again for your question.

    BWAAA-HAAAA!!!! OK, everyone. Before you continue the great debate as to whether or not an empty string is a "0" or not, let me suggest that you're making a mistake similar to those that have made the mistake of thinking that "IsNumeric" means "IsAllDigits".

    The article is NOT about making another version of ISNUMERIC. It's about making a formula to check if something [font="Arial Black"]IsAllDigits [/font] or, if you wish, about finding a string that has no non-digit characters in it (which an emptry string qualifies for). Empty strings are a special case that require a bit of equally special consideration. Do you want them to be treated as if they were "AllDigits" or not? Technically, empty strings meet the criteria of "IsAllDigits" because there is nothing in them to the contrary and they're not "unknown" as a NULL would be. On a more practical basis for some, they do, in fact, contain nothing, which also means it is known that they contain no digits which can also technically mean that they are not "IsAllDigits".

    It's up to the user to decide how to use them. If you want empty strings to be considered to be "IsAllDigits", then use the formula as is. If you want it to be treated as a "0" digit, then use ISNULL/NULLIF to make the translation (but that's a waste of time here because the original formula already treats is as "IsAllDigits"). If you want it (or any fully blank value) to be treated as a NULL, then use NULLIF. Technically (although I do agree that it would be easier in Regex), it's also NOT the fault of LIKE conditions because if you forget to accurately define the same problem in REGEX (what do YOU want an empty string, blank, or NULL to be?), you'll have the same problem! You just need to define the problem of what "IsAllDigits" actually means to you when you come across an empty string. 😉

    In any case and while I appreciate the zeal of some of the players on this thread, let's not turn this into an argument similar to what might occur about what a NULL means. Just define what an empty string means to you and, if necessary, modify the formula to accommodate it. It's that easy. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/16/2013)


    ...Technically (although I do agree that it would be easier in Regex), it's also NOT the fault of LIKE conditions because if you forget to accurately define the same problem in REGEX (what do YOU want an empty string, blank, or NULL to be?), you'll have the same problem!...

    In any case and while I appreciate the zeal of some of the players on this thread, let's not turn this into an argument similar to what might occur about what a NULL means...

    Hi Jeff. Agreed on both. NULL and empty-string are subject to interpretation, although passing back a NULL in both cases might be a way of side-stepping the issue ;-). And my intention was really just to clarify how the single-character range (i.e. [] within a LIKE or PATINDEX) works and to clear up any misunderstanding with regards to the behavior seen by Mark; understanding how [] works will help outside of this particular issue :-).

    Jeff Moden (4/16/2013)


    Solomon Rutzky (4/16/2013)


    Of course, this is most likely all a moot point given that TRY_PARSE and TRY_CONVERT do a better job (in most cases) of determining Number vs Not a Number ;-).

    BWAAA-HAAAA!!!!! It's definitely NOT a moot point for those still using something less than 2012. 😉

    A rather valid point again. So to that end I have just added a TryParseToInt function (for TINYINT, SMALLINT, INT, and BIGINT) to SQL# (will be in the next release) so that people using SQL Server 2005 and 2008(R2) will have essentially the same function as TRY_PARSE(). It does accept the Culture name so you can validate "123 456" as a valid INT in French but not English. So at least a bit closer, hopefully, to it all being a moot point ;-).

    Take care,

    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • I've just noticed that "IsAllDigits" includes fractions:

    -- True

    SELECT IsAllDigits

    FROM dbo.IsAllDigits('1¼');

    -- True

    SELECT IsAllDigits

    FROM dbo.IsAllDigits('¾5¼');

    Collation-related of course. There is more between 0 and 9 than just the integers. In the US default collation SQL_Latin1_General_CP1_CI_AI:

    ² (superscript 2)

    ³ (superscript 3)

    ¹ (superscript 1)

    ¼

    ½

    ¾

  • Paul White (11/26/2013)


    I've just noticed that "IsAllDigits" includes fractions:

    -- True

    SELECT IsAllDigits

    FROM dbo.IsAllDigits('1¼');

    -- True

    SELECT IsAllDigits

    FROM dbo.IsAllDigits('¾5¼');

    Collation-related of course. There is more between 0 and 9 than just the integers. In the US default collation SQL_Latin1_General_CP1_CI_AI:

    ² (superscript 2)

    ³ (superscript 3)

    ¹ (superscript 1)

    ¼

    ½

    ¾

    that could be the QotD mate 🙂

  • twin.devil (11/26/2013)


    that could be the QotD mate 🙂

    Ha! I missed an opportunity there.

  • Paul White (11/26/2013)


    I've just noticed that "IsAllDigits" includes fractions:

    -- True

    SELECT IsAllDigits

    FROM dbo.IsAllDigits('1¼');

    -- True

    SELECT IsAllDigits

    FROM dbo.IsAllDigits('¾5¼');

    Collation-related of course. There is more between 0 and 9 than just the integers. In the US default collation SQL_Latin1_General_CP1_CI_AI:

    ² (superscript 2)

    ³ (superscript 3)

    ¹ (superscript 1)

    ¼

    ½

    ¾

    Thanks for the headsup on this, Paul.

    When I installed 2005, I used the default US collation and it was SQL_Latin1_General_CP1_CI_AS. I just checked my 2008 installation where I also accepted the default and it's also SQL_Latin1_General_CP1_CI_AS. I'm pretty sure that's the actual US Default Collation.

    Also, when I run the code you posted, they both return "0" even if I force the collation to be SQL_Latin1_General_CP1_CI_AI using COLLATE both internal to the function and externallly. Is there something else or some other collation that you may have been using?

    What do you have for a default collation on the server that you tested the code with? I'd like to give it a try with that. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • In SQL 2012 the fractions are between 0 and 9 in Latin1_General_CI_AS but not in SQL_Latin1_General_CP1_CI_AS. It's the same in SQL 2008 R2, and unless my memory is broken also in SQL 2008. I imagine it would have been the same in SQL 2005 but I didn't use it enough to find out. I'm not certain what the situation was in SQL 2000, but I think it was probably the same - I used it enough to make it likely that I would have noticed if this had changed when I moved to SQL 2008 or 2008 R2.

    edit: Oops, I'd missed out CP1_ from the SQL collation name. So I've added it.

    Tom

Viewing 15 posts - 151 through 165 (of 168 total)

You must be logged in to reply to this topic. Login to reply