Is Division by Zero NULL?

  • Gregory Hart (9/8/2016)


    My understanding of databases and mathematics for NULL and infinity is as follows:

    Infinity represents a known value, but for which is infinite in size. Size can be either macro or micro. Examples would include the size of the universe or the size of a point. With these references to infinity, there is no number which could be substituted (i.e. ... If you wanted to talk about the size of the universe multiplied by 2, there is no number which could be used for the calculation).

    This is generally correct although the infinitesimally small numbers in the IEEE standard are considered +/-0

    You should also note that the sign of both infinity and infinitesimally small numbers is known hence you can have +infinity, -infinity, +0 and -0

    Gregory Hart (9/8/2016)


    Infinity can also represent a known value that is seemingly impossible to identify the exact value. An example of this would be Pie. It's known to be greater than 3 and less than 4. The difference here is that a number can be used to substitute for it and the accuracy of your results depends upon how many decimal places to carry it to when applying it in a formula.

    Numbers such as Pi are known as irrational - they have a known value which can be used in calculations although this can only ever be an approximation irrespective of their precision. They are not infinite although they do have an infinite number of digits to the right of the point.

    Note also some numbers can be represented exactly in some bases but not others.

    e.g. 1/3 is approximately 0.33333333333333.... in base 10 (decimal) but can be exactly represented in base 3, conversely 1/10 can be exactly represented in decimal but not in binary (and failing to understand this has resulted in huge numbers of software bugs).

    Gregory Hart (9/8/2016)


    A NULL value represents the lack of a number. There is no knowledge if it is bigger or smaller than anything. When a row is created and one of the columns is a numeric but the data to populate that column hasn't been input yet, then NULL identifies that the value could be anything or nothing. The only thing that can be said about the value is that it is unknown.

    This is part of the reason that NULL should not be used

    +infinity is always > any real number

    -infinity is always < any real number

    +0 > 0

    -0 < 0

    As a result of this you can perform operations on these values and get a meaningful result unlike NULL

    A better description of this is given relating to the IEEE standard here

    https://en.wikipedia.org/wiki/IEEE_754-1985

  • Steve Jones - SSC Editor (9/8/2016)


    I had a report about this thread. I think it's a valid discussion, and as long as everyone remains respectful, that's fine.

    There have been a few people saying NULL is the absence of a value, which isn't what I've learned in database theory. This is a common misconception, partially because of the use of NULL, partially a result of the semantics around how we speak of NULL values.

    However, read Gail's initial description. NULL is unknown. x/0 is not unknown. This is a known quantity, and happens to be something not allowed in mathematics.

    I really prefer a much more pragmatic description of NULL, that there has just been no value stored in that spot of memory that exists to have values stored there. It has nothing to do with whether we know about something or not, or whether anything is unknown, and I think trying to extend semantics beyond what causes a NULL to be just adds to the confusion.

    If an expression evaluates to NULL then the same deal, even with the propagation of NULL, one of the variables or columns or "placeholders" in the expression did not have a value stored to it.

    Whether you want x/0 to return null or an error, you have to handle the value either way. Any calculation looking for a division result that gets a NULL is likely to break as easily as if an error were thrown.

    Personally I prefer an error, but perhaps things are handled easier in some cases if a NULL is returned. I wouldn't be upset over such a setting as an option, and you should feel free to post this on Connect and lobby for votes. I suspect you will get few, but I could be wrong.

    The thing with division by zero, is that every placeholder for values in the expression that attempts to divide by zero DID have values assigned to them, so there is nothing there that should evaluate to NULL, given that the reason NULL occurs in the first place is that a placeholder for values (columns, variables, expressions) did not get a value stored to it.

    Divide by zero does not mean data has not been stored.

  • I've taken about an hour this morning to review my company's stored procedures and other queries to review where our function "dbo.Divide(@Numerator Float, @Denominator Float)" is used. It seems that the greater majority of the time it gets used is when a numeric field does not allow NULL values, and has been set to default to Zero. This is overwhelmingly common in our data sets, which consist of financial data for mortgages. The reason for this is to prevent errors and reduce error handling when performing a mass crunching of numbers. The addition of 2 numbers such as the value of the house and the value of a pool would be a disaster if either was NULL. It would also require a lot of additional error coding to protect against such errors in all the places mathematical operations are performed using these numbers.

    The above scenario would play out like this:

    What is the loan to equity value for the land?

    What is the loan to equity value for the land and the house?

    What is the loan to equity value for the land, house and all the extras?

    In real-world business usage of databases, which is the largest usage of them, division by Zero error messages are a complete pain in the derriere. In the end, I like the solution where there should be an option switch within SQL to turn on and off division by Zero errors.

  • Our Function for Division:

    CREATE FUNCTION dbo.Divide(@Numerator Real, @Denominator Real)

    RETURNS Real AS

    /*

    Purpose: Handle Division by Zero errors

    Description: User Defined Scalar Function

    Parameter(s): @Numerator and @Denominator

    Test it:

    SELECT 'Numerator = 0' Division, dbo.fn_CORP_Divide(0,16) Results

    UNION ALL

    SELECT 'Denominator = 0', dbo.fn_CORP_Divide(16,0)

    UNION ALL

    SELECT 'Numerator is NULL', dbo.fn_CORP_Divide(NULL,16)

    UNION ALL

    SELECT 'Denominator is NULL', dbo.fn_CORP_Divide(16,NULL)

    UNION ALL

    SELECT 'Numerator & Denominator is NULL', dbo.fn_CORP_Divide(NULL,NULL)

    UNION ALL

    SELECT 'Numerator & Denominator = 0', dbo.fn_CORP_Divide(0,0)

    UNION ALL

    SELECT '16 / 4', dbo.fn_CORP_Divide(16,4)

    UNION ALL

    SELECT '16 / 3', dbo.fn_CORP_Divide(16,3)

    */

    BEGIN

    RETURN

    CASE WHEN @Denominator = 0 THEN

    NULL

    ELSE

    @Numerator / @Denominator

    END

    END

    GO

  • I would conclude that division by zero is in fact the VERY definition of NULL, and such should resolve to that returned value. I most DEFINITELY would not suspect that division by zero would raise an error, thus requiring the programmer to create a specific function to handle such error. So, ... again, I reiterate, ... am I missing out on an understanding of the term "NULL"?

    Nope, division by zero is an error and always has been. We defined a null as a marker (not a value) for a missing value. SPARC actually defined twenty-something kinds of missing values (underflow, overflow, computational error, etc.)

    The first principle of the null in SQL is that it propagates. Computations that use the null, usually produce a null for the result. One way that SQL varies from a pure Dr. Codd relational model is that nulls to have a data type. We needed that to allocate storage, put things in tables, etc. in real hardware.

    Play with 0/0, null/0, 0/null and null/null results. And yes it is weird.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • Gregory Hart (9/8/2016)


    I've taken about an hour this morning to review my company's stored procedures and other queries to review where our function "dbo.Divide(@Numerator Float, @Denominator Float)" is used. It seems that the greater majority of the time it gets used is when a numeric field does not allow NULL values, and has been set to default to Zero. This is overwhelmingly common in our data sets, which consist of financial data for mortgages. The reason for this is to prevent errors and reduce error handling when performing a mass crunching of numbers. The addition of 2 numbers such as the value of the house and the value of a pool would be a disaster if either was NULL. It would also require a lot of additional error coding to protect against such errors in all the places mathematical operations are performed using these numbers.

    The above scenario would play out like this:

    What is the loan to equity value for the land?

    What is the loan to equity value for the land and the house?

    What is the loan to equity value for the land, house and all the extras?

    In real-world business usage of databases, which is the largest usage of them, division by Zero error messages are a complete pain in the derriere. In the end, I like the solution where there should be an option switch within SQL to turn on and off division by Zero errors.

    For this very reason you should want it to return an error. I don't know exactly what your calculations are but let's say for the sake of argument that the PurchasePrice of your mortgage does not allow NULL and defaults to 0. Using the logic you defined you would get NULL for a monthly payment because the PurchasePrice is 0 instead of NULL. I realize this example is not wholly accurate but you get the idea. This is an error and should be treated as such.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • If you think SQL's nulls are bad, then look at the IEEE floating-point standards which have NaN (not a number) configurations. The list includes +INF and -INF for positive and negative infinities.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • Creating a scalar UDF to handle division by zero is a bad idea. It will only slow down queries.

    What's the problem with using NULLIF?

    SELECT numerator / NULLIF( denominator, 0)

    You can even use ISNULL to produce different results (1, 0, numerator, 100) depending on business rules.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (9/8/2016)


    Creating a scalar UDF to handle division by zero is a bad idea. It will only slow down queries.

    What's the problem with using NULLIF?

    SELECT numerator / NULLIF( denominator, 0)

    You can even use ISNULL to produce different results (1, 0, numerator, 100) depending on business rules.

    Especially as the inline table-valued-function is actually a simpler bit of code!

    My take on the divide-by-zero topic is that if I try to do an invalid mathematical operation, I want it to throw an error. Doubly so if it's financial data.

  • We don't want an error, because; if the divisor is zero, it means we merely haven't collected that data. In the example I gave, most obviously some homes do not have a pool. We surely need to calculate costs and ratios, and addition of NULLs would be outlandish as would the additional code to handle NULLs. To return a NULL for division by Zero means that when it shows on a report, it is blank, not zero. This immediately flags us to know that there is no divisor, and that works perfectly.

    The reality is the largest usage of databases is within the business sector. For most business data management applications, the preference is to default numeric values to be zero and not allow NULLs. This practice is so because it prevents bad errors of addition and multiplication and subtraction. I have worked with many large clients (see my resume, ghart.net, to identify the large companies and the diversity of businesses and industries I have worked over the past 22+ years), and everyone of them has something in place to handle division by zero errors. This extends to Admin Assists working with Excel spreadsheets and every bit of coding done by our GUI developers.

    So, ... the point is that division by Zero errors may be appreciated MAYBE one in a million, while returning a value, preferably NULL (yes, some of my clients actually prefer to have Zero returned) seems to be the case 99.9999% of the time. Being that not one comment on this forum has produced a real life need to where an error for division by Zero is actually beneficial, it seems to be that it would be far more beneficial to us, the end users of the product SQL server, for SQL to cater to our REAL needs and not a technically mathematically correct need, or; at minimal, to give us a switch we can flip to provide us with an option to choose. And, ... PLEASE, PLEASE post a valid usage if you're aware of one in REAL business data management.

  • The reality is the largest usage of databases is within the business sector.

    Well yes it is. Are you asserting that that the other members of this forum are just hobbyists?

    For most business data management applications, the preference is to default numeric values to be zero and not allow NULLs.

    No that is your preference. Having zeroes instead of nulls screws up the calculation of averages and percentiles.

    everyone of them has something in place to handle division by zero errors.

    You have been given that answer here. Dividing by NULLIF([column],0) fixes your problem and gives you the behavior you want. Why you continue to ignore this solution mystifies me, unless I choose to think unkindly of your character.

    This extends to Admin Assists working with Excel spreadsheets and every bit of coding done by our GUI developers.... (some of my clients actually prefer to have Zero returned)

    It is the function of the client application to format the data returned from the database. You can replace nulls with zeros at the time its DISPLAYED. Our GUI people do it all the time without blinking. A single search/replace does it in Excel.

    Being that not one comment on this forum has produced a real life need to where an error for division by Zero is actually beneficial

    I beg to differ, you just aren't choosing to listen. Or are you attempting to set yourself up as the sole judge of the validity of arguments, because so few people are agreeing with yours. Nobody tries to divide by zero. Errors in calculations or problems with data are masked by returning a null.

    give us a switch we can flip to provide us with an option to choose.

    NULLIF()

    And, ... PLEASE, PLEASE post a valid usage if you're aware of one in REAL business data management.

    Averages, percentiles, counts of non-responses.

    it seems to be that it would be far more beneficial to us, the end users of the product SQL server, for SQL to cater to our REAL needs

    The rest of us are users, developers, administrators, and the majority of us seem to disagree with you about what the need is. Have the humility to at least admit you are the minority. Steve pointed you to where you can make the suggestion directly to Microsoft and try to rely support for it from other users.

    Good luck.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Gregory Hart (9/8/2016)


    For most business data management applications, the preference is to default numeric values to be zero and not allow NULLs.

    It must be "For lousy developers of business data management applications, the preference is to default numeric values to be zero and not allow NULLs".

    Not sure if the most of business data management applications have been developed by lousy developers, but if you say so...

    This practice is so because it prevents bad errors of addition and multiplication and subtraction.

    This practice actually CREATES bads errors with aggregations, averages, etc.

    Can't count how many times I was fixing reports skewed by zeros instead of NULLs.

    ... everyone of them has something in place to handle division by zero errors.

    Yes. Something.

    That "something" is based on the real meaning of zero value in each particular place.

    If you use zero instead of NULL - then it's obvious what do you need to do prior submitting that value into any calculation - restore the real value from the substitution.

    NULLIF(Amount, 0)

    Easy as that.

    It seems like the function you're asking for is developed quite some time ago, you just choose not to notice it.

    _____________
    Code for TallyGenerator

  • Why we use a UDF instead of NullIf:

    It's very simple, ... standards and code control.

    Also, for those who have a really strong understanding of SQL, they will know that many UDF's may cause performance issues, but not the way mine is written. It does not use variables and does not use TSQL. It merely returns a value based on the inputs, and uses only a simple "CASE" statement. Such use of a UDF is ideal, and will respond at least as efficiently as a standard SQL function. Check it out, ... do a query execution plan and any other test using the UDF I posted (dbo.Divide) compared to NullIf. You may find yourself VERY surprised (I won't brag and tell you the outcome, ... please do the tests, especially do a test of timing the executions using both methods and against a very large data set).

  • OK, ... let this post/thread be a lesson to all you "know-it-all" experts. Lessons learned:

    Not all UDF are bad, ... if correctly written, they can be tremendous for establishing and persisting in standards and also in code control when modifications occur to a process that is repeatedly used.

    When a software company truly listens to their users, they become successful and thus generate greater profits. Microsoft has persistently NOT listened to its users, hence why they could barely even give away the Windows 10 operating system. In contrast, their approach to SQL 2016 (and all other versions of SQL Enterprise that are Post Version 6.5) listens VERY CAREFULLY to their user base, and thus has attributed to the tremendous success of MS SQL products.

    Even though a behavior may be TECHNICALLY correct, that does not outweigh the practical application of the process. It may be technically correct and logically validated that Hitler was one of the greatest leaders in all recorded history, but; such does not validate the horrors of his rule. Similarly, an error output for division by Zero may be "technically" correct, but the derived benefit from such is far outweighed by its inconvenience. Thus, it would seem to be a great "selling point" for Microsoft to adopt this approach instead of fighting it.

    Why don't I post it on their forums?

    A working and in-place confidentially agreement that prohibits me.

  • Gregory Hart (9/8/2016)


    Such use of a UDF is ideal, and will respond at least as efficiently as a standard SQL function. Check it out, ... do a query execution plan and any other test using the UDF I posted (dbo.Divide) compared to NullIf. You may find yourself VERY surprised (I won't brag and tell you the outcome, ... please do the tests, especially do a test of timing the executions using both methods and against a very large data set).

    With pleasure.

    CREATE TABLE SomeTable (

    Numerator REAL, -- I ordinarily wouldn't use FLOAT/REAL in financial systems, but to avoid any suggestion that the function overhead is from data type conversions

    Divisor REAL

    )

    GO

    -- insert a million rows

    INSERT INTO dbo.SomeTable (Numerator, Divisor)

    SELECT TOP (1000000) RAND(CHECKSUM(NEWID()))*50000, RAND(CHECKSUM(NEWID()))*50000

    FROM msdb.sys.columns c CROSS JOIN msdb.sys.columns c2

    GO

    -- and set some to 0

    UPDATE dbo.SomeTable

    SET Divisor = 0

    WHERE Divisor IN (SELECT TOP(5) PERCENT Divisor FROM dbo.SomeTable ORDER BY NEWID())

    GO

    CREATE FUNCTION dbo.Divide(@Numerator Real, @Denominator Real)

    RETURNS Real AS

    /*

    Purpose: Handle Division by Zero errors

    Description: User Defined Scalar Function

    Parameter(s): @Numerator and @Denominator

    Test it:

    SELECT 'Numerator = 0' Division, dbo.fn_CORP_Divide(0,16) Results

    UNION ALL

    SELECT 'Denominator = 0', dbo.fn_CORP_Divide(16,0)

    UNION ALL

    SELECT 'Numerator is NULL', dbo.fn_CORP_Divide(NULL,16)

    UNION ALL

    SELECT 'Denominator is NULL', dbo.fn_CORP_Divide(16,NULL)

    UNION ALL

    SELECT 'Numerator & Denominator is NULL', dbo.fn_CORP_Divide(NULL,NULL)

    UNION ALL

    SELECT 'Numerator & Denominator = 0', dbo.fn_CORP_Divide(0,0)

    UNION ALL

    SELECT '16 / 4', dbo.fn_CORP_Divide(16,4)

    UNION ALL

    SELECT '16 / 3', dbo.fn_CORP_Divide(16,3)

    */

    BEGIN

    RETURN

    CASE WHEN @Denominator = 0 THEN

    NULL

    ELSE

    @Numerator / @Denominator

    END

    END

    GO

    Then with Stats Time ON.

    -- Insert into a table variable to remove the overhead of SSMS's row display.

    DECLARE @t TABLE (Result REAL)

    INSERT INTO @t (Result)

    SELECT dbo.Divide(Numerator, Divisor) FROM dbo.SomeTable

    GO

    SQL Server Execution Times:

    CPU time = 3328 ms, elapsed time = 6453 ms.

    DECLARE @t TABLE (Result REAL)

    INSERT INTO @t (Result)

    SELECT Numerator/NULLIF(Divisor,0) FROM dbo.SomeTable

    GO

    SQL Server Execution Times:

    CPU time = 219 ms, elapsed time = 212 ms.

    Nope, not particularly surprised (but maybe that's because I did this kind of test for a blog post earlier this year)

    But, you did say 'very large data set', and a million rows isn't very large, so let me re-test with 100 million rows (this might take a while), I'll post when it's done.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 31 through 45 (of 64 total)

You must be logged in to reply to this topic. Login to reply