October 15, 2007 at 7:46 am
I've moved a databse from prod to dev and a calculation on a sum of a flaot column gives a different value. There are approx 16k rows and here are the results ( so you can see the difference )
-648365.80999999424
-648365.81000000483
The data is identical as I've checked it, assuming data compare can be trusted and the dev db is a restored backup from production.
I seem to remember something about the float datatype from way back when, which is why i don't personally use it, however this is a client database. collation etc. is the same - anyone any ideas?
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
October 15, 2007 at 10:10 am
you have to take into account the number of "significant positions", meaning - the precision. The purpose of FLOAT is to store "approximate numbers", so some of the display digits might not be within the precision (that's actually usually a given).
Looks to me that you're summing Float(24) numbers (which have a 7-digit precision). Note that the precision has nothing to do with the decimal point - that's ALL digits).
Take a look here - BOL describes this a bit:
http://msdn2.microsoft.com/en-us/library/aa258876(SQL.80).aspx
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 15, 2007 at 10:31 am
can you change the comparison to a money or decimal type ?
SELECT CONVERT(MONEY,-648365.80999999424)AS P1,
CONVERT(MONEY,-648365.81000000483) AS P2
SELECT CONVERT(DECIMAL(10,4),-648365.80999999424)AS P1,
CONVERT(DECIMAL(10,4),-648365.81000000483) AS P2
P1 P2
--------------------- ---------------------
-648365.8100 -648365.8100
Lowell
October 16, 2007 at 2:18 am
I know all about this and as I say i wouldn't normally use a float datatype, the issue remains that the identical data gives different results ( consistantly ) on two different servers.
I seem to remember something about float accuracy relating to the cpu's but my searches couldn't unearth anything.
both sql server is 2000 ent sp4 but the cpu and o/s on one box is intel and x64 whilst the other is amd and x32
The "problem" is that you should be able to take a financial system ( database ) from one server to another without the figures changing - despite the small variance it's still not the same - my client is concerned!
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
October 16, 2007 at 2:37 am
As other has said before me, FLOAT is just an approximation based on binary storage of the number.
See here http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=81849
Can't you just change the datatype to DECIMAL(20, 12) ?
N 56°04'39.16"
E 12°55'05.25"
October 16, 2007 at 2:41 am
The 'problem' with financial systems is that they all too often use floats.. And expect them to be 'reliable'. Wrong tool for the wrong purpose.
/Kenneth
October 16, 2007 at 6:47 am
I've seen a financial system that used floats as surrogate primary keys. I'm not sure what was going on there, but I assume that the database was converted from some other SQL engine. At least I hope that's how it happened....
Colin, I would say that your sums are approximately equal. If you want to make sure that X = Y using the float datatype, you'll need to decide how close they have to be for them to be considered "equal".
I agree that it is counterintuitive that you would get different results from the same database on different hardware, but that is yet another reason not to use an imprecise data type for precise data.
Have you tried converting the floats to a decimal type and then performing the sums? Perhaps comparing the converted types row-by-row. It would be interesting to see where the error is showing up.
October 16, 2007 at 7:13 am
oh I do agree that float is a bad choice.
select sum( convert(numeric(38,28),xxxxxxx)) from dbo.yyyyyy
returns the same value from each server and if I physically convert the float column to numeric(38,28) the sum on each table returns the same result set.
-648365.8100000000199072000000000000
however it doesn't explain why moving a database to another server returns a different result - interesting but not the whole answer
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
October 16, 2007 at 7:44 am
I'm sorry to be hard-nosed about it - but it's a very clear-cut answer: because you're outside of the stated precision of the calculation/ data type. The only way floats/reals would be "accurate" would be if they could hold an unlimited amount of digits. Until they do - they're accurate to a certain precision, and the non-significant digits are NOT to be used.
The fact that you'd extend it that far past what the stated precision doesn't ultimately add anything to the conversation: those digits are non-significant and can't be used. Not SHOULDN'T, CAN'T. It's a common notation in science to include 1-2 non significant digits, so that you don't start introducing error into the significant digits; any more than that denotes a lack of understanding of the concept.
It's the same thing with datetime - it's precise within +/- 3.3 ms. I'm sure you could conjure up a calculation that needs ns, but it still doesn't change the fact that you wouldn't be able to accurately show it in a SQL datetime field. It's part of the definition, which is there for a reason.
If they need something other than that - they need to switch their data to money/decimal. I understand that their system design made some poor choices, but that doesn't change the fact that it will remain scientifically approximate until you switch the data type. No amount of display tricks will change that.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 16, 2007 at 8:10 am
Hi Colin,
Are there any differences in CPU, O/S, or SQL build between the two boxes?
October 16, 2007 at 8:47 am
Colin,
One more question. Can you attempt the float sum again, but this time specify the ordering of the rows? I ask because an unconfirmed source (wikipedia) seems to indicate that addition is NOT associative.
Thus (a + b) + c = a + (b + c) is not always true for floating point numbers, if this article is true.
select ID,[some float column] from [your table]
order by ID
Compute sum([some float column])
October 16, 2007 at 9:15 am
I just tried a little experiment myself:
create table foo (
ID int identity,
X float)
set nocount on
declare @ctr int
select @ctr = 0
while @ctr < 1000
BEGIN
insert foo (X)
select rand()
select @ctr= @ctr + 1
END
select X from foo
order by newid()
compute sum(X)
select X from foo
order by newid()
compute sum(X)
drop table foo
The compute clauses in the identical select statements return different results. This is apparently caused by the ordering of the data.
October 16, 2007 at 9:23 am
Not for me.
I have tested on all these and your code gives exact same sum.
SQL Server Express 2005 9.0.3186
SQL Server Enterprise 2000 8.00.818
SQL Server Standard 2000 8.00.818
SQL Server Enterprise 2000 8.00.2039
N 56°04'39.16"
E 12°55'05.25"
October 16, 2007 at 9:41 am
Jeff - the differences you're seeing are past the precision scale (7th number), aren't they? I'll bet you even get a substantially more "precise" result if and when you declare it as a Float(53). Funny how float point arithmetic works EXACTLY like BOL says it would.....
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
October 16, 2007 at 10:04 am
I'm not pointing out anything other than the fact that the order in which the rows are added changes the result, thus suggesting that addition of floats is not associative. I'm precision agnostic, just trying to help Colin explain the results he is getting.
I get different results every time I run the code.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply