Cast Varchar to Float Resulting In (Scientific) E Notation

  • waxingsatirical (8/25/2016)


    Well what is the "Why"? Why are floats not considered best practice by some?

    I don't know what unnecessary work later is incurred by using floats, but I know of lots that is incurred by using decimals. Such as, resizing columns when data sizes go up or accuracy changes, having to constantly avoid multiplication gotchas which cause tricky-to-spot rounding bugs.

    FLOAT definitely has its useful place in the world but shouldn't be used generically nor even for most things that require decimal math because of the very definition of the FLOAT datatype in SQL Server. From the following link...

    https://msdn.microsoft.com/en-us/library/ms173773.aspx

    ... we get the following definition of what the FLOAT datatype is. The emphasis is mine...

    [font="Arial Black"]Approximate-number [/font]data types for use with floating point numeric data. [font="Arial Black"]Floating point data is approximate[/font]; therefore, [font="Arial Black"]not all values in the data type range can be represented exactly[/font].

    A great example came up in a post just two days ago. It does the simple calculation of...

    A B

    --- - ---

    C C

    Assigning some simple values, as below, figure out the answer in your head or using your favorite 4 function calculator and you find that it works out to be ...

    A B 7 2

    --- - --- = --- - --- = .7 - .2 = .5

    C C 10 10

    Now, let's put that to code using the FLOAT datatype. Looking at the code below and knowing that the correct answer to the problem is .5 and so the result of the code should be "GE.5", explain why it isn't.

    --===== Create variables and assign values

    DECLARE @a FLOAT = 7

    ,@B FLOAT = 2

    ,@C FLOAT = 10

    ;

    --===== Solve the problem as stated in the previously mentioned formula

    -- and use CASE to determine if the answer is >= .5 or less than .5.

    SELECT CASE

    WHEN @a/@C - @b-2/@C >= 0.5 THEN 'GE.5'

    ELSE 'LT.5'

    END

    ;

    What's really frustrating is that the following code "proves" that it should all work out just fine.

    --===== Create variables and assign values

    DECLARE @a FLOAT = 7

    ,@B FLOAT = 2

    ,@C FLOAT = 10

    ;

    SELECT @a/@C, @b-2/@C, @a/@C - @b-2/@C, CAST(.5 AS FLOAT)

    ;

    Results:

    ---------------------- ---------------------- ---------------------- ----------------------

    0.7 0.2 0.5 0.5

    (1 row(s) affected)

    Microsoft did a pretty good job of masking the "problem" by "correcting" the display but that doesn't correct the actual results. You can see the "problem" that occurs during the subtraction using the following code, which does the FLOAT math and then displays the results as DECIMAL so that you can actually see why the CASE example failed to provide the correct answer. (I have "problem" in quotes because its not a problem if you understand BINARY floating point math and, ironically, how to round to get the actual correct answer as a true value instead of just a displayed value).

    --===== Create variables and assign values

    DECLARE @a FLOAT = 7

    ,@B FLOAT = 2

    ,@C FLOAT = 10

    ;

    SELECT CAST(@A/@C AS DECIMAL(38,28))

    ,CAST(@B/@C AS DECIMAL(38,28))

    ,CAST(@A/@C - @b-2/@C AS DECIMAL(38,28))

    ;

    Results:

    --------------------------------------- --------------------------------------- ---------------------------------------

    0.7000000000000000000000000000 0.2000000000000000000000000000 0.4999999999999999400000000000

    (1 row(s) affected)

    You see, FLOAT in SQL Server is NOT actually DECIMAL math behind the scenes. It's BINARY math and, with the limited precision of "only" 15 digits, not all DECIMAL numbers can successful be converted to BINARY with 100% accuracy. Hence the term "imprecise" in most peoples definition of FLOAT and "approximate" in the MS definition. Here's some more "fun". WYSI[font="Arial Black"]N[/font]WYG 😉 The column on the left is what is displayed and the column on the right is the number actually used.

    --===== Create variables and assign values

    DECLARE @a FLOAT = 7

    ,@B FLOAT = 2

    ,@C FLOAT = 10

    ;

    SELECT LooksLike = CAST(number AS FLOAT)/10

    ,ActuallyIs = CAST(CAST(number AS FLOAT)/10 AS DECIMAL(38,28))

    FROM master.dbo.spt_values

    WHERE type = 'P'

    AND number BETWEEN 0 AND 20

    ;

    Results:

    LooksLike ActuallyIs

    ---------------------- ---------------------------------------

    0 0.0000000000000000000000000000

    0.1 0.1000000000000000000000000000

    0.2 0.2000000000000000000000000000

    0.3 0.3000000000000000000000000000

    0.4 0.4000000000000000000000000000

    0.5 0.5000000000000000000000000000

    0.6 0.6000000000000000000000000000

    0.7 0.7000000000000000000000000000

    0.8 0.8000000000000000000000000000

    0.9 0.9000000000000000000000000000

    1 1.0000000000000000000000000000

    1.1 1.1000000000000001000000000000

    1.2 1.2000000000000000000000000000

    1.3 1.3000000000000001000000000000

    1.4 1.3999999999999999000000000000

    1.5 1.5000000000000000000000000000

    1.6 1.6000000000000001000000000000

    1.7 1.7000000000000000000000000000

    1.8 1.8000000000000000000000000000

    1.9 1.8999999999999999000000000000

    2 2.0000000000000000000000000000

    (21 row(s) affected)

    And you say FLOAT avoids the "tricky" problem of rounding? 😀 Check out what MS says about FLOAT and rounding at the following URL...

    https://msdn.microsoft.com/en-us/library/ms190476.aspx

    ... where it clearly states that FLOAT has a rounding "problem" (emphasis is mine).

    The float and real data types are known as approximate data types. The behavior of float and real follows the IEEE 754 specification on approximate numeric data types.

    [font="Arial Black"]Approximate numeric data types do not store the exact values specified for many numbers[/font]; they store an extremely close approximation of the value. For many applications, the tiny difference between the specified value and the stored approximation is not noticeable. At times, though, the difference becomes noticeable. [font="Arial Black"]Because of the approximate nature of the float and real data types, do not use these data types when exact numeric behavior is required, such as in financial applications, [/font]in operations involving rounding, [font="Arial Black"]or in equality checks. Instead, use the integer, decimal, money, or smallmoney data types.[/font]

    Avoid using float or real columns in WHERE clause search conditions, especially the = and <> operators. It is best to limit float and real columns to > or < comparisons.

    The IEEE 754 specification provides four rounding modes: round to nearest, round up, round down, and round to zero. Microsoft SQL Server uses round up. All are accurate to the guaranteed precision but can result in slightly different floating-point values. [font="Arial Black"]Because the binary representation of a floating-point number may use one of many legal rounding schemes, it is impossible to reliably quantify a floating-point value.[/font]

    Sounds like a "Best Practice" to me.

    Also remember that FLOAT has a fixed precision of 15 digits with a floating decimal point. As the number of digits on the left side of the decimal points goes up, you lose scale... which is another kind of (sort of) "rounding" problem... the very problem that you're trying to avoid.

    This is also why Luis said "it's lazy" of developers to use FLOAT for everything (or most anything) and I agreed. If they had taken the time to look all that stuff up, they wouldn't be so adamant about using FLOAT to make things "easy". Using the correct datatypes for the precision and scale expected is not a form of pre-optimization... and it's not just a "Best Practice". It's absolutely necessary for coming up with the right bloody answer. 😉

    P.S. You should also avoid the use of any of the "MONEY" datatypes and small scale DECIMAL types for any work with any operators other than simple addition and subtraction.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok, I must admit, I've not seen such am alarming example of float mathematics going wrong before. This is kinda scary.

    While I have your attention though... how do you get around the decimal rounding problem:

    declare @a decimal (38,28) = 1.123456789

    declare @b-2 decimal (38,28) = 1.123456789

    declare @C decimal (38,28) = 1.123456789

    select @a * @b-2 * @C

    ?

  • Start by identifying how many digits of precision you need.

    Because, even if we had infinite precision calculations, when displaying the data back to users, we're not going to be displaying 28 decimal places (unless you're doing sub-atomic physics maybe), so we will be rounding at some point for display

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Jeff Moden (8/25/2016)


    P.S. You should also avoid the use of any of the "MONEY" datatypes and small scale DECIMAL types for any work with any operators other than simple addition and subtraction.

    I've been bitten by that problem too. I couldn't get to match the average until I discovered that the money data type was being used.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • GilaMonster (8/26/2016)


    Start by identifying how many digits of precision you need.

    Because, even if we had infinite precision calculations, when displaying the data back to users, we're not going to be displaying 28 decimal places (unless you're doing sub-atomic physics maybe), so we will be rounding at some point for display

    Absolutely agreed. I typically do the calculations with the maximum scale (NOT implying always using DECIMAL(38,28). "It Depends") I can get and then do the proper rounding for display. If it's important to maintain the precision and scale of the answers in a table, I have no qualms about doing so. For example, the "hidden" balance remaining in mortgage tables just so Granny doesn't beat me to death with her calculator (she DOES mind the pennies). 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • waxingsatirical (8/26/2016)


    While I have your attention though... how do you get around the decimal rounding problem:

    declare @a decimal (38,28) = 1.123456789

    declare @b-2 decimal (38,28) = 1.123456789

    declare @C decimal (38,28) = 1.123456789

    select @a * @b-2 * @C

    ?

    That's exactly where what Gail said comes into play and what I mean by "using the MAX scale I can get". If you look again at the following URL...

    https://msdn.microsoft.com/en-us/library/ms190476.aspx

    ... you'll see the precision/scale rules for multiplication.

    Operation Result precision Result scale*

    --------- ---------------- -------------

    e1 * e2 p1 + p2 + 1 s1 + s2

    Ah... but then there's that bloody asterisk, which means we have a note to read. That note states that...

    * The result precision and scale have an absolute maximum of 38. When a result precision is greater than 38, the corresponding scale is reduced to prevent the integral part of a result from being truncated.

    Why they decided to do such a thing instead of displaying the answer with the max scale available, I'll never know. According to the other "Result precision" and "Result scale" formulas on that page, the scale of "6" frequently appears and, although not stated in the note above, appears to be the scale that the results are usually reduced to. It's a real pain in the butt because even a 4 function calculator is better than that.

    To get around it, you have to do like Gail said. If you want to maintain the "same math" that any cheap calculator would do, you have to make sure that you don't violate a precision of 38 (FLOAT is variable and its scale is only 15 max when there's nothing to the left of the decimal point). In Luis' terms, that means that you can't be lazy about picking your datatypes. The following, which is re-datatyped from your example, gives the answer (well, up to about 13 digits of scale anyway) that a 4 function calculator bought at a dollar store would correctly give without so much worry.

    declare @a decimal (12,9) = 1.123456789

    declare @b-2 decimal (12,9) = 1.123456789

    declare @C decimal (12,9) = 1.123456789

    select @a * @b-2 * @C

    Result:

    1.417976779622360717860897069

    So, you are absolutely correct in saying that there are "rounding" problems (truncation, really) but FLOAT won't fix those. The only way to fix the problem is to do the "hard" thing and set the precision and scale of your datatypes correctly or pay the price in truncated results, which can lead to incorrect answers especially if used in "cascading" formulas.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It would be much better to convince the "developers" that FLOAT is the wrong thing to use unless the scale of the number varies in an astronomical way.

    I found one way to convince developers is to go to their manager and show him the GAAP or EU regulations concerning rounding and the number of decimal places. In the US. It used to be for places (I do not know what it is today) and in the EU they had triangulation rules for currency conversion and carry things out the five places. Again, I do not know what the current rules are.

    But the argument is now very simple! ROI now means "risk of incarceration" and the boss understands that. :w00t: If he does not, the accounting department will explain it to him, or worse yet, the legal department will explain.

    Books in Celko Series for Morgan-Kaufmann Publishing
    Analytics and OLAP in SQL
    Data and Databases: Concepts in Practice
    Data, Measurements and Standards in SQL
    SQL for Smarties
    SQL Programming Style
    SQL Puzzles and Answers
    Thinking in Sets
    Trees and Hierarchies in SQL

  • CELKO (8/26/2016)


    It would be much better to convince the "developers" that FLOAT is the wrong thing to use unless the scale of the number varies in an astronomical way.

    I found one way to convince developers is to go to their manager and show him the GAAP or EU regulations concerning rounding and the number of decimal places. In the US. It used to be for places (I do not know what it is today) and in the EU they had triangulation rules for currency conversion and carry things out the five places. Again, I do not know what the current rules are.

    But the argument is now very simple! ROI now means "risk of incarceration" and the boss understands that. :w00t: If he does not, the accounting department will explain it to him, or worse yet, the legal department will explain.

    Heh... incarceration would be an almost welcome vacation from the madness. 😛 I'm more afraid of Granny and her 20 pound purse and umbrella. 🙂

    It would be wonderful if we could get MS to realize that there's no need for truncated answers and to use the full precision available to things like the DECIMAL datatype at all times for all calculations and let the user decide what the display scale/rounding should be. Even the calculator in Windows does that. Certainly, most spreadsheets are capable of the same thing. Why not SQL Server and T-SQL.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Each of the following could serve your purpose. As others in this post have illustrated, you need to define the precision that will work for you and apply it to the decimal data type. I suggest reading more about decimal at books online.

    select cast('0.00004' as decimal(18,05))

    select cast('0.00004' as decimal(18,10))

    ----------------------------------------------------

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply