9.3 Floating-Point Data Types
The compiler uses the IEEE-754 format. The following table shows the floating-point data types that are supported. All floating-point data types are of the arithmetic type real.
Type | Bits | E Min | E Max | N Min | N Max |
---|---|---|---|---|---|
float | 32 | -126 | 127 | 2^-126 | 2^128 |
double* | 32 | -126 | 127 | 2^-126 | 2^128 |
long double | 64 | -1022 | 1023 | 2^-1022 | 2^1024 |

E = Exponent, N = Normalized (approximate) *
All floating-point values are specified in little-endian format, which means:
- The least significant byte (LSB) is stored at the lowest address
- The least significant bit (LSb) is stored at the lowest-numbered bit position
As an example, the double value of 1.2345678 is stored at address 0x100 as follows:

0x100 | 0x101 | 0x102 | 0x103 |
---|---|---|---|
0x51 | 0x06 | 0x9E | 0x3F |
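
The byte order in this example can be reproduced on a host machine. The following is a minimal sketch, not compiler-supplied code: the helper names `float_bits` and `float_byte_at` are hypothetical, and the sketch assumes that the host `float` is the same 32-bit IEEE-754 type as the 32-bit `double` in the table above. The bytes are extracted with shifts, so the result does not depend on the byte order of the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw IEEE-754 bit pattern of a 32-bit float. */
uint32_t float_bits(float value)
{
    uint32_t bits;

    /* Assumption: float is 32 bits wide, as in the table above. */
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: return the byte that little-endian storage places
 * at (base address + offset), for offset 0..3. Offset 0 is the LSB. */
uint8_t float_byte_at(float value, unsigned offset)
{
    return (uint8_t)(float_bits(value) >> (8u * offset));
}
```

Calling `float_byte_at(1.2345678f, 0)` through `float_byte_at(1.2345678f, 3)` yields the four bytes in the order they would appear at addresses 0x100 through 0x103.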
As another example, the double value of 1.2345678 is stored in registers w4 and w5:

w4 | w5 |
---|---|
0x0651 | 0x3F9E |
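
The register split can be sketched the same way: the lower-numbered register holds the least significant 16 bits and the next register holds the most significant 16 bits. The helper names below (`float_to_u32`, `float_low_word`, `float_high_word`) are hypothetical, and the sketch again assumes a 32-bit IEEE-754 `float` on the host standing in for the 32-bit `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw IEEE-754 bit pattern of a 32-bit float. */
uint32_t float_to_u32(float value)
{
    uint32_t bits;

    /* Assumption: float is 32 bits wide. */
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Least significant 16 bits: the word held in w4 in the example above. */
uint16_t float_low_word(float value)
{
    return (uint16_t)(float_to_u32(value) & 0xFFFFu);
}

/* Most significant 16 bits: the word held in w5 in the example above. */
uint16_t float_high_word(float value)
{
    return (uint16_t)(float_to_u32(value) >> 16);
}
```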
Floating-point types are always signed, and the unsigned keyword is illegal when specifying a floating-point type. Preprocessor macros that specify valid ranges are available after including <float.h> in your source code. For information on implementation-defined behavior of floating-point numbers, see section 22.6 Floating Point.
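
As a rough illustration of how the `<float.h>` macros relate to the table at the start of this section, the sketch below compares the standard C macros `FLT_MIN`, `FLT_MIN_EXP`, and `FLT_MAX_EXP` with the 32-bit row. The function names are hypothetical. Note that standard C defines `FLT_MIN_EXP` and `FLT_MAX_EXP` one higher than the E Min and E Max columns, so the sketch subtracts 1 before comparing. It assumes a 32-bit IEEE-754 `float`.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: compute 2^-126 by repeated halving.
 * Every intermediate value is exactly representable as a float. */
float two_pow_minus_126(void)
{
    float x = 1.0f;
    int i;

    for (i = 0; i < 126; i++) {
        x *= 0.5f;
    }
    return x;
}

/* Returns 1 if FLT_MIN equals the N Min column for a 32-bit type. */
int flt_min_matches_table(void)
{
    /* The power-of-two reading only makes sense for a binary format. */
    assert(FLT_RADIX == 2);
    return FLT_MIN == two_pow_minus_126();
}

/* Returns 1 if the exponent macros agree with E Min and E Max
 * after adjusting for the offset of 1 in the C definitions. */
int flt_exponents_match_table(void)
{
    return (FLT_MIN_EXP - 1) == -126 && (FLT_MAX_EXP - 1) == 127;
}
```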