3.3.4.1 Why Are My Floating-point Results Not Quite What I Am Expecting?
The size of the floating-point type can be adjusted for both float and double types only when building to the C90 standard (see 4.6.14.2 Short Float Option and 4.6.14.1 Short Double Option).
Because a floating-point variable has only a finite number of bits with which to represent a value, it holds only an approximation of most values assigned to it (see 5.3.4 Floating-Point Data Types). A floating-point variable can hold only one of a set of discrete real number values. If you attempt to assign a value that is not in this set, the value is rounded to the nearest member of the set. The more bits used by the mantissa in the floating-point variable, the more values can be represented exactly, and the smaller the average rounding error.
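The rounding on assignment can be checked with a short test. The following is a minimal sketch, not taken from the compiler documentation: the helper name is_exact_as_float is hypothetical, and the code assumes a host compiler whose float has a 24-bit mantissa and whose double is wider (IEEE 754 single and double precision). Type widths on the target device may differ.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of any library): returns true when
 * `value` is in the set of values a float can hold exactly.
 * Assumes double is wider than float, as on IEEE 754 hosts. */
bool is_exact_as_float(double value)
{
    float stored = (float)value;     /* rounds to the nearest float value */
    return (double)stored == value;  /* widening back to double is exact */
}
```

Under those assumptions, is_exact_as_float(0.5) is true because 0.5 is a power of two, whereas is_exact_as_float(0.1) is false because 0.1 has no finite binary expansion. Likewise, 16777216.0 (2^24) is held exactly, but 16777217.0 needs one more mantissa bit than float provides and is rounded.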
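Rounding also occurs whenever floating-point arithmetic is performed, because the exact result of an operation might not be in the set of representable values. These rounding errors can accumulate over a sequence of operations and lead to results that do not appear to be correct.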
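The effect of rounding during arithmetic can be seen by adding the same value repeatedly. The following is a minimal sketch under the assumption of IEEE 754 single-precision float evaluated at float precision. The function names sum_repeated and nearly_equal are hypothetical, and the tolerance passed to nearly_equal is an illustrative choice that must suit the magnitude of the values being compared.

```c
#include <stdbool.h>

/* Hypothetical helper: adds `step` to a running total `count` times.
 * Each addition rounds the total to the nearest float value. */
float sum_repeated(float step, int count)
{
    float total = 0.0f;
    for (int i = 0; i < count; i++) {
        total += step;
    }
    return total;
}

/* Hypothetical helper: compares two floats within an absolute tolerance
 * instead of testing for exact equality. */
bool nearly_equal(float a, float b, float tolerance)
{
    float diff = a - b;
    if (diff < 0.0f) {
        diff = -diff;
    }
    return diff <= tolerance;
}
```

Under the stated assumptions, sum_repeated(0.1f, 10) does not compare equal to 1.0f, although nearly_equal(sum_repeated(0.1f, 10), 1.0f, 1e-6f) is true. By contrast, sum_repeated(0.5f, 2) is exactly 1.0f, since every intermediate value is representable. Comparing against a tolerance rather than with == is a common way to keep such rounding from changing program behavior.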