10.3 Floating-Point Data Types

The compiler uses the IEEE-754 format for floating-point values. The following table shows the supported floating-point data types. All floating-point data types are real arithmetic types.

Table 10-2. Floating-Point Data Types
Type          Bits   E Min   E Max   N Min      N Max
float         32     -126    127     2^-126     2^128
double*       32     -126    127     2^-126     2^128
long double   64     -1022   1023    2^-1022    2^1024
E = Exponent

N = Normalized (approximate)

* double is equivalent to long double if -fno-short-double is used.
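
Because the width of double depends on a compiler option, code that relies on 64-bit precision may want to check which format is in effect. The following is a minimal sketch that assumes only the standard <float.h> macros; the names double_is_short and require_wide_double are hypothetical helpers, not part of the compiler's library.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: returns 1 when double has the same precision as
 * float (the 32-bit default described above), 0 when it is wider
 * (for example, when -fno-short-double is in effect). */
int double_is_short(void)
{
    /* FLT_MANT_DIG and DBL_MANT_DIG give the number of significand
     * digits, in base FLT_RADIX, for float and double respectively. */
    return DBL_MANT_DIG == FLT_MANT_DIG;
}

/* Hypothetical guard: aborts (unless NDEBUG is defined) if the program
 * was built with the 32-bit double format. */
void require_wide_double(void)
{
    assert(!double_is_short());
}
```

A routine that needs the 64-bit format regardless of compiler options can declare its variables as long double instead of relying on such a check.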

All floating-point values are stored in little-endian format, which means:

  • The least significant byte (LSB) is stored at the lowest address
  • The least significant bit (LSb) is stored at the lowest-numbered bit position

As an example, the double value 1.2345678 (a 32-bit double, the default, with the bit pattern 0x3F9E0651) is stored at address 0x100 as follows:

0x100   0x101   0x102   0x103
0x51    0x06    0x9E    0x3F

As another example, the same double value of 1.2345678 is stored in registers w4 and w5, with the least significant word in w4:

w4       w5
0x0651   0x3F9E
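
The byte and register layouts in these two examples can be reproduced with a short sketch. It uses float rather than double so that the value is 32 bits wide on any compiler, including host compilers where double is 64 bits. The helper names (float_bits, little_endian_bytes, low_word, high_word) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the raw IEEE-754 single-precision bit pattern of f. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* float must be 32 bits wide */
    memcpy(&bits, &f, sizeof bits);    /* copy the object representation */
    return bits;
}

/* Splits a 32-bit pattern into bytes in little-endian order:
 * out[0] is the byte that would sit at the lowest address. */
void little_endian_bytes(uint32_t bits, uint8_t out[4])
{
    int i;
    for (i = 0; i < 4; i++) {
        out[i] = (uint8_t)(bits >> (8 * i));
    }
}

/* Low and high 16-bit words, corresponding to w4 and w5 in the example. */
uint16_t low_word(uint32_t bits)  { return (uint16_t)(bits & 0xFFFFu); }
uint16_t high_word(uint32_t bits) { return (uint16_t)(bits >> 16); }
```

For 1.2345678f, float_bits yields 0x3F9E0651, little_endian_bytes produces 0x51, 0x06, 0x9E, 0x3F, and the low and high words are 0x0651 and 0x3F9E, matching the values listed above.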

Floating-point types are always signed; using the unsigned keyword when specifying a floating-point type is illegal. Preprocessor macros that specify the valid ranges are available after including <float.h> in your source code. For information on the implementation-defined behavior of floating-point numbers, see section Floating Point.
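
As a rough illustration of how the <float.h> macros relate to Table 10-2, the sketch below derives the float row of the table from standard macros. The function names are hypothetical. Note that the C standard defines FLT_MIN_EXP and FLT_MAX_EXP relative to a significand in the range [0.5, 1), so each is one greater than the corresponding E Min or E Max value in the table.

```c
#include <assert.h>
#include <float.h>

/* E Min for float: FLT_MIN_EXP is defined so that
 * FLT_RADIX^(FLT_MIN_EXP - 1) is the smallest normalized value. */
int float_e_min(void)
{
    assert(FLT_RADIX == 2);     /* the table's exponents are base 2 */
    return FLT_MIN_EXP - 1;
}

/* E Max for float: FLT_RADIX^(FLT_MAX_EXP - 1) is still representable. */
int float_e_max(void)
{
    return FLT_MAX_EXP - 1;
}

/* N Min: smallest positive normalized float (2^-126). */
float float_n_min(void) { return FLT_MIN; }

/* N Max: largest finite float, slightly below the 2^128 shown in the
 * table, which lists approximate values. */
float float_n_max(void) { return FLT_MAX; }
```

The same pattern applies to the DBL_ and LDBL_ macro families for double and long double.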