Friday 19 June 2009

Floating point values are evil

If you need comparisons like >, < or =, always consider converting your floating point numbers to integers and storing them in integer variables. The basic problem is that simple decimal values like 0.1 have no exact representation in the double type.
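As a minimal sketch of the problem (assuming Delphi, or Free Pascal in Delphi mode; the variable names are only illustrative), the program below compares 0.1 + 0.2 with 0.3 as doubles and then repeats the comparison with the same quantities counted in integer tenths.

```pascal
program TenthDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  a, b, sum, expected: Double;
  aTenths, bTenths: Integer;
begin
  a := 0.1;
  b := 0.2;
  sum := a + b;
  expected := 0.3;
  // None of 0.1, 0.2 and 0.3 is exactly representable in binary,
  // so the rounded sum need not match the rounded constant.
  WriteLn('0.1 + 0.2 = 0.3 as Double: ', sum = expected);
  WriteLn('difference:                ', sum - expected);

  // The same quantities counted in tenths are exact integers.
  aTenths := 1;
  bTenths := 2;
  WriteLn('1 + 2 = 3 in tenths:       ', aTenths + bTenths = 3);
end.
```

With IEEE 754 doubles the first comparison should come out as FALSE, while the integer comparison is always TRUE.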

There are several solutions:

* Instead of storing currency amounts as a number of dollars, where 0.20 is 20 cents, consider storing the number of cents, so that $0.20 is stored as the integer value 20.

* Instead of comparing timestamps like "if a<b+1/86400", consider storing time as an integer number of seconds, as on Linux, or even milliseconds.

* Instead of storing fractions like factor:=height/width, keep the height and width in two integer variables. If you want to calculate value:=factor*xposition, you can turn this into an integer calculation like value:=height*xposition div width.

* Instead of adding up fractions to get an integer value, like fraction:=height/width and then "for x:=1 to width do y:=y+fraction", keep the fraction in two integer variables: "for x:=1 to width do begin ynumerator:=ynumerator+height; y:=ynumerator div width; end;".
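The last suggestion can be sketched as follows. This is only an illustration: the values 7 and 10 are made up, and the program runs the floating point accumulation next to the integer one so that the two can be compared step by step.

```pascal
program FractionStep;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  Height = 7;   // illustrative values, not taken from real code
  Width  = 10;

var
  x, yNumerator, y: Integer;
  fraction, yFloat: Double;
begin
  fraction := Height / Width;
  yFloat := 0;
  yNumerator := 0;
  y := 0;
  for x := 1 to Width do
  begin
    // Floating point version: each addition may add a rounding error.
    yFloat := yFloat + fraction;
    // Integer version: the running numerator is always exact.
    yNumerator := yNumerator + Height;
    y := yNumerator div Width;
    WriteLn(x:3, '  float=', yFloat:0:17, '  trunc=', Trunc(yFloat):2, '  int=', y:2);
  end;
  // After Width steps the integer result is exactly Height.
  WriteLn('final integer y = ', y);
end.
```

Whether Trunc(yFloat) ever differs from y depends on the values chosen, which is exactly the kind of uncertainty the integer version avoids.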

The benefit is that your code becomes much more deterministic, debuggable and explainable. However, it does not always make sense, and sometimes the code gets more difficult to read, which is why I kept using the word "consider".


Victor said...

Come on, Floating point values aren't really evil, they were just born that way :-)
I'd consider changing the title to "Floating point comparisons are tricky". Or what about: "All floating point values are equal, but some values are more equal than others"?

Anonymous said...

Basically the reason why TBCD and Currency exist ;-)
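For illustration, here is a small sketch that adds up $0.20 three times, once as plain integer cents and once using Currency, which stores four decimal places in a scaled 64-bit integer. The variable names are made up for the example.

```pascal
program MoneyDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

var
  priceCents, totalCents: Int64;
  price, total, expected: Currency;
  i: Integer;
begin
  // Plain integer cents: $0.20 is stored as 20.
  priceCents := 20;
  totalCents := 0;
  for i := 1 to 3 do
    totalCents := totalCents + priceCents;
  WriteLn('total in cents:     ', totalCents);
  WriteLn('equals 60 cents:    ', totalCents = 60);

  // Currency: fixed point, so decimal amounts add up exactly.
  price := 0.20;
  expected := 0.60;
  total := 0;
  for i := 1 to 3 do
    total := total + price;
  // The decimal separator in the output depends on the locale.
  WriteLn('total as Currency:  ', CurrToStr(total));
  WriteLn('equals expected:    ', total = expected);
end.
```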

Anonymous said...

Regarding the height/width example: I've got into the habit of using MulDiv in such cases, e.g.
value:=MulDiv(xposition, height, width)
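The Windows API function MulDiv multiplies two 32-bit values into a 64-bit intermediate, divides by the third value and rounds to the nearest integer, whereas div truncates. The sketch below imitates that behaviour with a hypothetical helper, MulDivSketch, so that it does not depend on the Windows unit. It assumes non-negative operands and a positive denominator, and it does not reproduce MulDiv's error handling.

```pascal
program MulDivDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper, not the Windows API function itself.
// Assumes Number >= 0, Numerator >= 0 and Denominator > 0.
function MulDivSketch(Number, Numerator, Denominator: Integer): Integer;
var
  product: Int64;
begin
  product := Int64(Number) * Numerator;   // 64-bit intermediate
  Result := Integer((product + Denominator div 2) div Denominator);
end;

var
  xposition, height, width: Integer;
begin
  xposition := 5;
  height := 7;
  width := 10;
  // 5 * 7 / 10 = 3.5: div truncates, the helper rounds to nearest.
  WriteLn('div:          ', height * xposition div width);            // 3
  WriteLn('MulDivSketch: ', MulDivSketch(xposition, height, width));  // 4
end.
```

The 64-bit intermediate also means that a product like 100000 * 100000 does not overflow before the division.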

Lars Frische said...

For comparing floating point numbers, the RTL (Math unit) comes with a set of helpful functions that let you define an epsilon within which two numbers are considered the same:

function CompareValue(const A: Extended; const B: Extended; Epsilon: Extended = 0): TValueRelationship; overload;

function SameValue(const A: Extended; const B: Extended; Epsilon: Extended = 0): Boolean; overload;

function IsZero(const A: Extended; Epsilon: Extended = 0): Boolean; overload;
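A short usage sketch of these three functions, assuming the Math unit of Delphi or Free Pascal. The tolerance of 1E-9 is an arbitrary choice for the example. It is passed explicitly to CompareValue because the meaning of the default Epsilon of 0 is an RTL detail that is worth checking for your compiler.

```pascal
program EpsilonDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  Math;

const
  Tolerance: Double = 1E-9;   // assumed tolerance, pick one that fits your data

var
  x, y, sum, target, diff: Double;
begin
  x := 0.1;
  y := 0.2;
  sum := x + y;
  target := 0.3;
  diff := sum - target;

  WriteLn('sum = target:           ', sum = target);
  WriteLn('SameValue(sum, target): ', SameValue(sum, target));
  WriteLn('IsZero(diff):           ', IsZero(diff));
  // CompareValue returns -1, 0 or 1.
  WriteLn('CompareValue = 0:       ', CompareValue(sum, target, Tolerance) = 0);
end.
```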

Anonymous said...

"* Instead of storing fractions like factor:=height/width, keep storing the height and width in two integer values."

This error exists in the .NET framework forms. Having made the same mistake myself, I recognize the odd behavior.

Lars D said...

My point with "evil" was that many programmers start to use floating point when they need to store fractional values, even though floating point variables are often not suited to the overall goal.

I had a good example where we were comparing two floating point values: one was a limit, and the other one was calculated. The only realistic scenario where the two values could be extremely close to each other was if one was a copy of the other, so our "<" comparison worked perfectly. However, a source code change suddenly made one of them the result of a calculation, so that rounding errors could make the calculated value larger than the limit, breaking the code.

If this had been done using integers, everything would have worked perfectly. Most of the code in that unit was initially written using integers and integer fractions, but the temptation to use a floating point variable in an extra feature became the first step to a bug in a later version.

Warren said...

It might be better to write and regularly use FloatCompare functions that specify the degree of similarity (significant figures) to use when determining equality.

if FloatCompareDigits(float1,float2,significantdecimals) = 0 then
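FloatCompareDigits is not an RTL function, so the following is only one possible sketch of such a helper. It assumes that the third parameter means the number of decimal places, and it compares the two values after scaling and rounding them to integers.

```pascal
program DigitsDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  Math;

// Hypothetical helper: returns -1, 0 or 1 after rounding both values
// to the given number of decimal places. Assumes the scaled values
// fit into an Int64.
function FloatCompareDigits(const A, B: Double; Decimals: Integer): Integer;
var
  scale: Double;
  ia, ib: Int64;
begin
  scale := IntPower(10, Decimals);
  ia := Round(A * scale);
  ib := Round(B * scale);
  if ia < ib then
    Result := -1
  else if ia > ib then
    Result := 1
  else
    Result := 0;
end;

begin
  WriteLn(FloatCompareDigits(0.01000001, 0.01001, 2));   // 0: equal at two decimals
  WriteLn(FloatCompareDigits(0.01000001, 0.01001, 5));   // -1: different at five decimals
end.
```

A rounding-based comparison like this has a weakness: two values that are very close but lie on opposite sides of a rounding boundary, such as 0.0149999 and 0.0150001 at two decimals, still compare as different. An epsilon-based comparison does not have that problem.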

Jolyon Smith said...

w.r.t fractions, I have a fraction implementation that keeps numerator and denominator separate, exactly as you describe (among other things).

Quite apart from anything else, you HAVE to do this if you wish to perform arithmetic on fractions with any semblance of accuracy, not least because floating point has no exact representation for many rational numbers such as 1/3 (or for any irrational number).


given : a := 1 / 3;
and : b := a * 3;

you can get the result : b <> 1 ! (for example when a is stored with less precision than the multiplication is carried out in)
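A minimal sketch of the numerator/denominator approach, with made-up names (TFraction, MakeFraction, MulFraction) and no overflow handling. It is not the implementation mentioned above, just an illustration that 1/3 times 3 comes out as exactly 1/1 when both parts are kept as integers.

```pascal
program FractionDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TFraction = record
    Num, Den: Int64;
  end;

// Greatest common divisor by Euclid's algorithm.
function FractionGcd(a, b: Int64): Int64;
var
  t: Int64;
begin
  a := Abs(a);
  b := Abs(b);
  while b <> 0 do
  begin
    t := a mod b;
    a := b;
    b := t;
  end;
  Result := a;
end;

// Builds a reduced fraction with a positive denominator. Assumes Den <> 0.
function MakeFraction(Num, Den: Int64): TFraction;
var
  g: Int64;
begin
  if Den < 0 then
  begin
    Num := -Num;
    Den := -Den;
  end;
  g := FractionGcd(Num, Den);
  Result.Num := Num div g;
  Result.Den := Den div g;
end;

function MulFraction(const A, B: TFraction): TFraction;
begin
  Result := MakeFraction(A.Num * B.Num, A.Den * B.Den);
end;

var
  third, product: TFraction;
begin
  third := MakeFraction(1, 3);
  product := MulFraction(third, MakeFraction(3, 1));
  WriteLn(product.Num, '/', product.Den);   // 1/1
end.
```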

But as is pointed out elsewhere, the problem is usually simply that in most cases an application is not interested in the complete precision of a Double, so it's not so much that floating points are bad as that applications use floating points badly.

After all, if you really are testing two doubles for equality, then two doubles that were assigned the same value will have the SAME representation, even if that value has no exact double representation.

The problems - for comparison - really arise when you compare (e.g):

0.01000001 vs 0.01001

and wish to treat them both as 0.01

The only way around that is to use a storage type where that excess precision is avoided in the first place, OR use the CompareValue() routine (or equivalent) with an appropriate Epsilon, as Lars suggests.
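Both options can be sketched with the two values from the example above. The tolerance of 0.001 and the choice of hundredths as the storage unit are assumptions made for the illustration.

```pascal
program ExcessPrecision;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  Math;

const
  Tolerance: Double = 0.001;   // assumed: differences below 0.001 are noise

var
  a, b: Double;
  aHundredths, bHundredths: Int64;
begin
  a := 0.01000001;
  b := 0.01001;
  WriteLn('a = b:                   ', a = b);
  // Option 1: compare with an explicit epsilon.
  WriteLn('SameValue, eps = 0.001:  ', SameValue(a, b, Tolerance));
  // Option 2: store the values without the excess precision.
  aHundredths := Round(a * 100);
  bHundredths := Round(b * 100);
  WriteLn('as hundredths:           ', aHundredths, ' and ', bHundredths);
end.
```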

Cobus Kruger said...

All the other strategies seem to add work but not value. BCD, for example, is unpleasant to work with and offers excessive accuracy.

I have found that using SameValue, IsZero and CompareValue as listed by Lars F normally gives me adequate accuracy. I suppose remembering to always use them is a fundamental thing like keeping in mind objects are held by reference or that events are not multicast and need to be chained.

Lars D said...

We have an external system that delivers floating point values with "single" accuracy in variants via a COM interface. However, since they arrive as generic floating point values, nothing in the API specifies the actual precision, so we store them in double variables. When single values are stored in doubles without knowing that their precision is only "single", SameValue() does not work.

The solution in this case is to document the precision of the API, either by commenting/describing the API or by designing the API so that the precision is obvious.
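The effect can be reproduced with a small sketch, assuming the Math unit of Delphi or Free Pascal. The value 0.1 and the epsilon of 1E-7 (roughly the resolution of a single near that value) are illustrative choices, not part of the external system described above.

```pascal
program SingleInDouble;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  Math;

const
  SingleEps: Double = 1E-7;   // assumed tolerance sized for single precision

var
  s: Single;
  fromSingle, exact: Double;
begin
  s := 0.1;
  fromSingle := s;    // widened to double, still carries single's rounding error
  exact := 0.1;       // rounded directly to double
  WriteLn('from single: ', fromSingle:0:17);
  WriteLn('as double:   ', exact:0:17);
  // The default epsilon is sized for double precision and is too strict here.
  WriteLn('SameValue, default epsilon: ', SameValue(fromSingle, exact));
  // Once the precision is known, a matching epsilon can be passed.
  WriteLn('SameValue, single epsilon:  ', SameValue(fromSingle, exact, SingleEps));
end.
```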