7.8.1 Optimization

-mfast-float

This switch selects fast, non-IEEE-compliant versions of some of the optimized AVR 32-bit floating-point library functions. It is enabled by default when the ‘-ffast-math’ switch is used.

-funsafe-math-optimizations

Allow optimizations for floating-point arithmetic that (a) assume that arguments and results are valid and (b) may violate IEEE or ANSI standards. When used at link time, it may include libraries or start-up files that change the default FPU control word or enable similar optimizations. This option is not turned on by any ‘-O’ option since it can result in incorrect output for programs that depend on an exact implementation of IEEE or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications. Enables ‘-fno-signed-zeros’, ‘-fno-trapping-math’, ‘-fassociative-math’, and ‘-freciprocal-math’.
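As an illustration of why reassociation (permitted through ‘-fassociative-math’) can change results, the following C sketch evaluates the same three-term sum with two different groupings. The function names and the input values are chosen for illustration only and are not part of GCC; IEEE 754 double-precision arithmetic with round-to-nearest is assumed.

```c
/* Sum three values grouped as (a + b) + c. The volatile temporary
   forces the intermediate result to be rounded to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Sum the same three values grouped as a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e16, b = -1e16 and c = 1.0, sum_left returns 1.0, whereas sum_right returns 0.0 because b + c rounds back to -1e16. A compiler that is allowed to reassociate may evaluate a plain a + b + c with either grouping, so a program that depends on one specific result should not be built with this option.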

-ffast-math

This option causes the preprocessor macro __FAST_MATH__ to be defined. This option is not turned on by any ‘-O’ option since it can result in incorrect output for programs that depend on an exact implementation of IEEE or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications. It sets ‘-fno-math-errno’, ‘-funsafe-math-optimizations’, ‘-ffinite-math-only’, ‘-fno-rounding-math’, ‘-fno-signaling-nans’, and ‘-fcx-limited-range’.
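Source code can test the __FAST_MATH__ macro to find out whether it is being compiled with ‘-ffast-math’. The following is a minimal sketch; the function name fast_math_enabled is hypothetical.

```c
/* Returns 1 if this translation unit was compiled with -ffast-math
   (the compiler then defines __FAST_MATH__), and 0 otherwise. */
int fast_math_enabled(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

A similar check can be used to select a fallback code path, or to stop the build with #error, in code that depends on exact IEEE behavior.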

-fpic

If supported for the target machine, generate position-independent code (PIC) suitable for use in a shared library. Such code accesses all constant addresses through a global offset table (GOT). The dynamic loader resolves the GOT entries when the program starts (the dynamic loader is not part of GCC; it is part of the operating system). If the GOT size for the linked executable exceeds a machine-specific maximum size, you get an error message from the linker indicating that ‘-fpic’ does not work; in that case, recompile with ‘-fPIC’ instead. (These maximums are 8k on the SPARC and 32k on the m68k and RS/6000. The 386 has no such limit.) Position-independent code requires specific support and therefore works only on some machines. For the 386, GCC supports PIC for System V but not for the Sun 386i. Code generated for the IBM RS/6000 is always position-independent. When this flag is set, the macros __pic__ and __PIC__ are defined to 1.
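The macros mentioned above can be tested in source code. The following is a minimal sketch that reports whether position-independent code is being generated; the function name pic_level is hypothetical.

```c
/* Returns the value of __PIC__ when position-independent code is being
   generated, or 0 when the macro is not defined. */
int pic_level(void)
{
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}
```

According to the description above, this function returns 1 when the file is compiled with ‘-fpic’. The value seen with other options or other toolchain defaults is not covered here.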

-mno-init-got

Do not initialize the GOT register before using it when compiling PIC code.

-masm-addr-pseudos

This option is enabled by default and causes GCC to output the pseudo-instructions call and lda.w for calling functions directly and loading symbol addresses, respectively. It can be turned off by specifying the switch ‘-mno-asm-addr-pseudos’. The advantage of using these pseudo-instructions is that the linker can optimize them at link time if linker relaxation is enabled. The ‘-mrelax’ option can be passed to GCC to signal to the assembler that it should generate a relaxable object file.

-mforce-double-align

Force double-word alignment for double-word memory accesses.

-mimm-in-const-pool

When GCC needs to move an immediate value that is not suitable for a single move instruction into a register, it has two choices. It can either put the constant into the code somewhere near the current instruction (the constant pool) and then use a single load instruction to load the value, or it can use two immediate instructions to load the value directly without a memory load. If a load from code memory is faster than executing two simple one-cycle immediate instructions, putting these immediate values into the constant pool is the faster choice. This is often true for MCU architectures implementing an instruction cache, whereas architectures with code executing from the internal Flash will probably need several cycles to load values from code memory. By default, GCC uses the constant pool for AVR 32-bit products with an instruction cache and two immediate instructions for Flash-based MCUs. Override this by using the option ‘-mimm-in-const-pool’ or its negated option ‘-mno-imm-in-const-pool’.
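The following C sketch shows the kind of source code that this option affects. The function name and the constant are illustrative, and it is an assumption that 0x12345678 is too wide for a single move instruction on the target.

```c
#include <stdint.h>

/* Returns a 32-bit constant that is assumed not to fit a single move
   instruction. Depending on the option, the compiler may load it from
   the constant pool or build it with two immediate instructions. */
uint32_t wide_constant(void)
{
    return UINT32_C(0x12345678);
}
```

Compiling this function once with ‘-mimm-in-const-pool’ and once with ‘-mno-imm-in-const-pool’, and comparing the assembler output (for example with the ‘-S’ option), should show which of the two strategies is used.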

-muse-rodata-section

By default, GCC outputs read-only data into the code (.text) section. If the code memory is slow, performance may improve if read-only data is placed in another, faster memory, if one is available. Do this by specifying the switch ‘-muse-rodata-section’, which makes GCC put read-only data into the .rodata section. The linker file can then specify where to place the content of the .rodata section. However, this might mean that the read-only data must be located in Flash and then copied to another memory at start-up, so this scheme requires extra memory on systems that run code from Flash.
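A constant lookup table is a typical example of read-only data whose placement is affected by this choice. The following C sketch is illustrative only; the names bit_count_table and bit_count are hypothetical, and where the table actually ends up depends on the compiler options and the linker file.

```c
#include <stdint.h>

/* Read-only lookup table: the number of set bits in each 4-bit value.
   A const object like this is a candidate for the .rodata section. */
static const uint8_t bit_count_table[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Counts the set bits in an 8-bit value using two table lookups. */
unsigned bit_count(uint8_t value)
{
    return bit_count_table[value & 0x0Fu] + bit_count_table[value >> 4];
}
```

If such a table is read frequently, the speed of the memory that holds it has a direct effect on the speed of the function that uses it.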