7.6.6.1 Options For Specific Optimization Control

The following options control specific optimizations. The -O2 option turns on all of these optimizations except -funroll-loops, -funroll-all-loops and -fstrict-aliasing.

Use the following flags in the rare cases where fine-tuning of the optimizations to be performed is desired.

Table 7-12. Specific Optimization Options
Option

Definition
-falign-functions

-falign-functions=n

Align the start of functions to the next power-of-two greater than or equal to n, skipping up to n bytes. For instance, -falign-functions=32 aligns functions to the next 32-byte boundary, but -falign-functions=24 would align to the next 32-byte boundary only if this can be done by skipping 23 bytes or fewer.

-fno-align-functions and -falign-functions=1 are equivalent and mean that functions will not be aligned.

The assembler supports this flag only when n is a power of two, so n is rounded up. If n is not specified, a machine-dependent default is used.
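
The alignment arithmetic described above can be sketched in C. This is only an illustrative model, not the compiler's or assembler's actual implementation: the helper names next_pow2 and model_align are hypothetical, and the rule that padding is inserted only when fewer than n bytes are skipped is an assumption taken from the -falign-functions=24 example.

```c
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= n (for n >= 1). */
static uint32_t next_pow2(uint32_t n)
{
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical model of -falign-functions=n: given the address where the
 * previous code ends, return the address where the next function would start.
 */
static uint32_t model_align(uint32_t addr, uint32_t n)
{
    uint32_t boundary = next_pow2(n);
    uint32_t padding = (boundary - addr % boundary) % boundary;

    /* Assumption: pad only when fewer than n bytes have to be skipped. */
    return (padding < n) ? addr + padding : addr;
}
```

In this model, an address 28 bytes short of a 32-byte boundary is padded when n is 32 but left alone when n is 24, which matches the example in the text.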

-falign-labels

-falign-labels=n

Align all branch targets to a power-of-two boundary, skipping up to n bytes like -falign-functions. This option can easily make code slower, because the compiler must insert dummy operations that are executed whenever the branch target is reached in the usual flow of the code.

If -falign-loops or -falign-jumps are applicable and are greater than this value, then their values are used instead.

If n is not specified, a machine-dependent default is used, which is very likely to be 1, meaning no alignment.

-falign-loops

-falign-loops=n

Align loops to a power-of-two boundary, skipping up to n bytes like -falign-functions. The hope is that the loop will be executed many times, which will make up for any execution of the dummy operations.

If n is not specified, a machine-dependent default is used.

-fcaller-saves

Enable values to be allocated in registers that will be clobbered by function calls, by emitting extra instructions to save and restore the registers around such calls. Such allocation is done only when it seems to result in better code than would otherwise be produced.

-fcse-follow-jumps

In common subexpression elimination, scan through jump instructions when the target of the jump is not reached by any other path. For example, when CSE encounters an if statement with an else clause, CSE will follow the jump when the condition tested is false.

-fcse-skip-blocks

This is similar to -fcse-follow-jumps, but causes CSE to follow jumps which conditionally skip over blocks. When CSE encounters a simple if statement with no else clause, -fcse-skip-blocks causes CSE to follow the jump around the body of the if.

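
The two code shapes that -fcse-follow-jumps and -fcse-skip-blocks refer to can be sketched as follows. The function names are hypothetical, and whether the repeated multiplication is actually removed depends on the compiler version, the target and the optimization level in use.

```c
/* if/else shape: the product a * b is already available before the branch. */
int cse_follow_jumps(int a, int b, int flag)
{
    int r = a * b;

    if (flag)
        r += a * b;   /* same value as the product computed above */
    else
        r -= a * b;   /* reached only through the "condition false" jump */
    return r;
}

/* if without else: the jump around the body skips a block. */
int cse_skip_blocks(int a, int b, int flag)
{
    int r = a * b;

    if (flag)
        r += 1;       /* body does not change a or b */
    return r + a * b; /* a * b is still the same value on both paths */
}
```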
-fexpensive-optimizations

Perform a number of minor optimizations that are relatively expensive.

-ffunction-sections

-fdata-sections

Place each function or data item into its own section in the output file. The name of the function or the name of the data item determines the section’s name in the output file.

Only use these options when there are significant benefits for doing so. When you specify these options, the assembler and linker may create larger object and executable files and will also be slower.

See also The -ffunction-sections Option.
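
As a rough illustration, consider the small translation unit below. The identifiers are hypothetical, and the section names in the comments are an assumption based on the usual GCC naming scheme; the exact names depend on the toolchain.

```c
/* With -ffunction-sections, each function gets its own section,
 * typically named after the function (assumed: .text.used_helper). */
int used_helper(int x)
{
    return x + 1;
}

/* Assumed section name: .text.unused_helper. */
int unused_helper(int x)
{
    return x * 2;
}

/* With -fdata-sections, each data item gets its own section
 * (assumed: .data.counter). */
int counter = 5;
```

Without these options, both functions would normally share a single text section, and the variable would share a data section with other initialized data.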

-fgcse

Perform a global common subexpression elimination pass. This pass also performs global constant and copy propagation.

-fgcse-lm

When -fgcse-lm is enabled, global common subexpression elimination will attempt to move loads which are only killed by stores into themselves. This allows a loop containing a load/store sequence to be changed to a load outside the loop, and a copy/store within the loop.

-fgcse-sm

When -fgcse-sm is enabled, a store motion pass is run after global common subexpression elimination. This pass will attempt to move stores out of loops. When used in conjunction with -fgcse-lm, loops containing a load/store sequence can be changed to a load before the loop and a store after the loop.

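
The combined effect of -fgcse-lm and -fgcse-sm can be pictured with a hand-written before/after pair. This is a sketch of the idea, not actual compiler output; the variable total and both function names are hypothetical.

```c
static int total;   /* hypothetical file-scope accumulator */

/* As written: total is loaded and stored on every iteration. */
int accumulate_original(int n)
{
    for (int i = 0; i < n; i++)
        total += i * i;
    return total;
}

/* Hand-transformed: one load before the loop, one store after it. */
int accumulate_transformed(int n)
{
    int t = total;              /* load moved out of the loop */

    for (int i = 0; i < n; i++)
        t += i * i;             /* loop works on a local copy */
    total = t;                  /* store moved after the loop */
    return total;
}
```

Both functions leave the same value in total; the second simply keeps the running value in a local variable, which the compiler can hold in a register for the duration of the loop.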
-fno-defer-pop

Always pop the arguments to each function call as soon as that function returns. The compiler normally lets arguments accumulate on the stack for several function calls and pops them all at once.

-fno-peephole

-fno-peephole2

Disable machine-specific peephole optimizations. Peephole optimizations occur at various points during the compilation. -fno-peephole disables peephole optimization on machine instructions, while -fno-peephole2 disables high-level peephole optimizations. To disable peephole optimization entirely, use both options.

-foptimize-register-move

-fregmove

Attempt to reassign register numbers in move instructions and as operands of other simple instructions in order to maximize the amount of register tying.

-fregmove and -foptimize-register-move are the same optimization.

-frename-registers

Attempt to avoid false dependencies in scheduled code by making use of registers left over after register allocation. This optimization will most benefit processors with lots of registers. It can, however, make debugging impossible, since variables will no longer stay in a “home register”.

-frerun-cse-after-loop

Rerun common subexpression elimination after loop optimizations have been performed.

-frerun-loop-opt

Run the loop optimizer twice.

-fschedule-insns

Attempt to reorder instructions to eliminate Read-After-Write stalls (see your device Family Reference Manual (FRM) for more details). Typically improves performance with no impact on code size.

-fschedule-insns2

Similar to -fschedule-insns, but requests an additional pass of instruction scheduling after register allocation has been done.

-fstrength-reduce

Perform the optimizations of loop strength reduction and elimination of iteration variables.

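
Loop strength reduction replaces a costly operation inside a loop with a cheaper one that yields the same values. The pair below is a hand-written sketch of that idea with hypothetical function names; it is not actual compiler output.

```c
/* As written: one multiplication per iteration. */
int sum_multiples_original(int n, int stride)
{
    int sum = 0;

    for (int i = 0; i < n; i++)
        sum += i * stride;
    return sum;
}

/* Hand-reduced: the multiplication becomes a running addition. */
int sum_multiples_reduced(int n, int stride)
{
    int sum = 0;
    int term = 0;               /* always equal to i * stride */

    for (int i = 0; i < n; i++) {
        sum += term;
        term += stride;
    }
    return sum;
}
```

In the reduced form, i is used only to count iterations, which is the kind of variable that iteration-variable elimination may then rewrite or remove.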
-fstrict-aliasing

Allows the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C, this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type.

Pay special attention to code like this:

union a_union {
  int i;
  double d;
};

int f() {
  union a_union t;
  t.d = 3.0;    /* write the double member */
  return t.i;   /* read the int member through the union type */
}

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So the code above will work as expected, but the following code might not:

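int f() {
  union a_union t;   /* the union keyword is required in C */
  int* ip;
  t.d = 3.0;
  ip = &t.i;
  return *ip;        /* access goes through an int pointer, not the union type */
}
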
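
When a value must be reinterpreted without going through a union, copying the bytes with memcpy is a commonly used alternative that does not rely on aliasing assumptions. The sketch below is an illustration, not a recommendation from this guide: the function names are hypothetical, and it assumes that int is no larger than double on the target.

```c
#include <string.h>

/* Copy the first sizeof(int) bytes of a double into an int. */
int first_int_via_memcpy(double d)
{
    int i;

    memcpy(&i, &d, sizeof i);   /* byte copy; no int* ever points at d */
    return i;
}

/* Same bytes, read through a union type as described above. */
int first_int_via_union(double d)
{
    union { int i; double d; } t;

    t.d = d;
    return t.i;
}
```

Both functions read the same bytes of the double, so they return the same value; the actual value depends on the target's byte order and type sizes.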
-fthread-jumps

Perform optimizations where a check is made to see if a jump branches to a location where another comparison subsumed by the first is found. If so, the first branch is redirected to either the destination of the second branch or a point immediately following it, depending on whether the condition is known to be true or false.

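
A small example of a comparison that is subsumed by an earlier one is shown below. The function name is hypothetical, and the comments describe what jump threading is permitted to do here, not what a particular compiler version is guaranteed to emit.

```c
int classify(int x)
{
    int r = 0;

    if (x > 10)
        r += 1;

    /* On the path where x > 10 was true, x > 5 is already known to be
     * true, so that path can jump straight into the body below without
     * repeating the comparison. */
    if (x > 5)
        r += 2;

    return r;
}
```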
-funroll-loops

Perform the optimization of loop unrolling. This is only done for loops whose number of iterations can be determined at compile time or run time. -funroll-loops implies both -fstrength-reduce and -frerun-cse-after-loop.

-funroll-all-loops

Perform the optimization of loop unrolling. This is done for all loops and usually makes programs run more slowly. -funroll-all-loops implies -fstrength-reduce, as well as -frerun-cse-after-loop.
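
Loop unrolling can be pictured with a hand-written before/after pair. The unroll factor of four, the function names and the sample array are all assumptions made for illustration; the compiler chooses its own factor and code shape.

```c
/* Hypothetical sample data for trying the two functions. */
static const int sample[6] = { 1, 2, 3, 4, 5, 6 };

/* As written: one addition and one loop test per element. */
int sum_array(const int *v, int n)
{
    int sum = 0;

    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Hand-unrolled by four: one loop test per four elements. */
int sum_array_unrolled(const int *v, int n)
{
    int sum = 0;
    int i = 0;

    for (; i + 3 < n; i += 4)
        sum += v[i] + v[i + 1] + v[i + 2] + v[i + 3];

    for (; i < n; i++)          /* leftover elements when n % 4 != 0 */
        sum += v[i];
    return sum;
}
```

The unrolled version executes fewer branches but occupies more program memory, which is the trade-off to weigh before enabling these options.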