FPU Status Conditional Branches

The CPU can branch conditionally on various status bits generated within a coprocessor. The FPU maintains an internal status register (FSR), which is updated at the end of each floating-point operation.

The FPU FSR comprises instruction exception status and FCPS/FCPQ/FTST instruction status. The CPU supports conditional branching on the status of the FCPS/FCPQ compare instructions only.

The CPU ISA includes a set of generic coprocessor conditional branch instructions, CBRA0 through CBRA15. Each can operate with any instantiated coprocessor and branches based on the state of the corresponding bit within a vector supplied by that coprocessor. The FPU uses CBRA0 through CBRA13 for the FCPS/FCPQ instruction status branch conditions; each is represented as an FBRA instruction with its corresponding assembler attribute. The FCPS/FCPQ status is held in FSR[19:16] and indicates the comparison result. CBRA[n] timing is the same as that of any other CPU conditional branch: the condition is examined at the end of the CBRA[n] R-stage. If the condition is true, the branch is taken. If the condition is not true, the branch is not taken and sequential execution continues.
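As an illustration of the FSR[19:16] field described above, the following C fragment is a minimal sketch of how a software model might isolate the FCPS/FCPQ compare status from a 32-bit FSR value. The names FSR_CMP_SHIFT, FSR_CMP_MASK and fsr_compare_status are hypothetical and are not taken from any device header; the meaning of the individual status bits is not defined here and is left uninterpreted.

```c
#include <stdint.h>

/* Hypothetical names. Only the field position, FSR[19:16], comes from the text. */
#define FSR_CMP_SHIFT 16u   /* lowest bit of the compare status field */
#define FSR_CMP_MASK  0xFu  /* field is four bits wide: [19:16]       */

/* Return the 4-bit FCPS/FCPQ compare status held in FSR[19:16].
 * All other FSR bits (for example, FTST status) are ignored. */
static inline uint32_t fsr_compare_status(uint32_t fsr)
{
    return (fsr >> FSR_CMP_SHIFT) & FSR_CMP_MASK;
}
```

For example, fsr_compare_status(0x000A0000) yields 0xA, and changing any FSR bit outside [19:16] leaves the returned value unchanged.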

As is the case for all conditional branches, the instruction(s) immediately following the branch are speculatively executed and will belong to either the taken or the not-taken path, depending on the direction of the branch. These instructions are permitted to be floating-point operations. The FPU must therefore accommodate the possibility that these instructions are ultimately killed due to a branch misprediction.

Note that the FPU will not return the result of an FBRA instruction until any FCPS/FCPQ instruction already underway in the coprocessor pipeline has retired. The CPU will consequently stall until the most significant word (MSW) of the FSR is available to be read (though these are fast operations, so stalls should be minimal). In effect, a CPU conditional branch instruction synchronizes the integer and floating-point pipelines with respect to FPU FCPS/FCPQ status.

The three least significant bits of the branch opcode, concatenated with the sub-opcode bit (such that the sub-opcode bit becomes the LSb of the resulting 4-bit value), may be used by the CPU decoder as a bit pointer into the 16-bit branch status test value to select the corresponding branch predicate result. The branch outcome is true (taken) when the selected bit is set and false (not taken) when it is clear.
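The bit-pointer selection described above can be sketched in C as follows. This is an illustrative software model only, not the decoder implementation; the function and parameter names (cbra_bit_index, cbra_taken, opcode_ls3, subop_bit, status_vector) are hypothetical, and the sketch assumes that the resulting 4-bit index selects the bit of the 16-bit status vector directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form the 4-bit bit pointer: the three least significant opcode bits
 * become bits [3:1] and the sub-opcode bit becomes bit [0] (the LSb). */
static inline unsigned cbra_bit_index(unsigned opcode_ls3, unsigned subop_bit)
{
    assert(opcode_ls3 < 8u && subop_bit < 2u);
    return (opcode_ls3 << 1) | subop_bit;
}

/* Select the predicate bit from the 16-bit branch status test value.
 * Returns true (branch taken) when the selected bit is set. */
static inline bool cbra_taken(uint16_t status_vector,
                              unsigned opcode_ls3, unsigned subop_bit)
{
    unsigned index = cbra_bit_index(opcode_ls3, subop_bit);
    return ((status_vector >> index) & 1u) != 0u;
}
```

With opcode bits 0b101 and a sub-opcode bit of 0, the pointer is 10, so the branch is taken only when bit 10 of the status vector is set.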

Note: FCPS/FCPQ and FTST instructions update two different portions of the FSR. Consequently, execution of an FTST instruction (which also updates the FSR) does not prevent the CPU CBRA[n] instructions from using the branch status generated from the FSR ordering relation bits.