5.5.3 Multiplication
The PIC18 instruction set includes several 8-bit by 8-bit hardware multiply instructions, and the compiler uses these in many situations. Non-PIC18 targets always use a library routine for multiplication operations.
There are three ways that 8x8-bit integer multiplication can be implemented by the compiler:
- Hardware Multiply Instructions (HMI)
- These assembly instructions are the most efficient method of multiplication, but they are only available on PIC18 devices.
- A bitwise iteration (8loop)
- Where dedicated multiplication instructions are not available, this implementation produces the smallest amount of code: a loop cycles through the bit pattern in the operands and constructs the result bit by bit. The speed of this implementation depends on the operand values; however, this is typically the slowest method of performing multiplication.
- An unrolled bitwise sequence (8seq)
- This implementation performs the same sequence of operations as the bitwise iteration (above), but with the loop unrolled. The generated code is larger, but execution is faster than the loop version.
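The difference between the bitwise iteration and the unrolled sequence can be sketched in portable C. This is only an illustration of the shift-and-add technique, not the compiler's library code; the function names `mul8_loop` and `mul8_seq` are hypothetical, and returning the full 16-bit product is an assumption made here for clarity.

```c
#include <stdint.h>

/* Bitwise iteration: one loop pass per bit of the multiplier.
 * Compact, but the loop test and shifts add run-time overhead. */
uint16_t mul8_loop(uint8_t multiplier, uint8_t multiplicand)
{
    uint16_t product = 0;
    uint16_t addend = multiplicand;  /* widened so shifted bits are kept */

    while (multiplier != 0) {        /* stops once no set bits remain */
        if (multiplier & 1u) {
            product += addend;       /* add the shifted multiplicand */
        }
        addend <<= 1;
        multiplier >>= 1;
    }
    return product;
}

/* One step of the unrolled sequence: test a fixed bit position. */
#define MUL8_STEP(bit) \
    if (multiplier & (1u << (bit))) { product += (uint16_t)(addend << (bit)); }

/* Unrolled sequence: the same eight steps written out with no loop.
 * More code, but no loop counter or branch back to the top. */
uint16_t mul8_seq(uint8_t multiplier, uint8_t multiplicand)
{
    uint16_t product = 0;
    uint16_t addend = multiplicand;

    MUL8_STEP(0) MUL8_STEP(1) MUL8_STEP(2) MUL8_STEP(3)
    MUL8_STEP(4) MUL8_STEP(5) MUL8_STEP(6) MUL8_STEP(7)
    return product;
}
#undef MUL8_STEP
```

Both functions return the same product for any pair of operands; for instance, calling either with 10 and 200 gives 2000.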
Multiplication of operands larger than 8 bits can be performed in one of the following two ways:
- A bitwise iteration (xloop)
- This is the same algorithm used by 8-bit multiplication (above), but the loop runs over all x bits of the operands. Like its 8-bit counterpart, this implementation produces the smallest amount of code but is typically the slowest method of performing multiplication.
- A bytewise decomposition (bytdec)
- This method decomposes the multiplication into a summation of many 8-bit multiplications, each of which can be performed using any of the 8-bit methods described above. The decomposition is advantageous for PIC18 devices, which can then use hardware multiply instructions. For other devices, this method is still fast, but the code size can become impractical.
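As a rough illustration of bytewise decomposition, the following sketch splits a 16-bit by 16-bit multiplication into four 8-bit partial products and sums them at the appropriate byte offsets. The names `mul8x8` and `mul16_bytdec` are hypothetical, and `mul8x8` simply stands in for whichever 8-bit method (HMI, 8loop or 8seq) is in use; the actual library routines may be organized differently.

```c
#include <stdint.h>

/* Stand-in for any 8-bit by 8-bit method (hypothetical helper). */
static uint16_t mul8x8(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}

/* 16x16 -> 32-bit product built from four 8x8 partial products:
 *   a * b = (a_lo * b_lo)
 *         + (a_lo * b_hi + a_hi * b_lo) << 8
 *         + (a_hi * b_hi) << 16
 */
uint32_t mul16_bytdec(uint16_t a, uint16_t b)
{
    uint8_t a_lo = (uint8_t)(a & 0xFFu);
    uint8_t a_hi = (uint8_t)(a >> 8);
    uint8_t b_lo = (uint8_t)(b & 0xFFu);
    uint8_t b_hi = (uint8_t)(b >> 8);

    uint32_t product = mul8x8(a_lo, b_lo);
    product += (uint32_t)mul8x8(a_lo, b_hi) << 8;
    product += (uint32_t)mul8x8(a_hi, b_lo) << 8;
    product += (uint32_t)mul8x8(a_hi, b_hi) << 16;
    return product;
}
```

The number of partial products grows with the square of the operand size in bytes (four for 16-bit operands, sixteen for 32-bit), which is consistent with the code size concern noted above for devices without hardware multiply instructions.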
Multiplication of floating-point operands operates in a similar way: the integer mantissas can be multiplied using either a bitwise loop (xfploop) or a bytewise decomposition.
The following tables indicate which of the multiplication methods are chosen by the compiler when performing multiplication of both integer and floating-point operands. The method depends on the size of the operands, the type of optimizations enabled, and the target device.
The table below shows the methods chosen when speed optimizations are enabled (see 4.6.6 Options for Controlling Optimization).
Device | 8-bit | 16-bit | 24-bit | 32-bit | 24-bit FP | 32-bit FP |
---|---|---|---|---|---|---|
PIC18 | HMI | bytdec+HMI | bytdec+HMI | bytdec+HMI | bytdec+HMI | bytdec+HMI |
Enhanced Mid-range | 8seq | bytdec+8seq | bytdec+8seq | bytdec+8seq | bytdec+8seq | bytdec+8seq |
Mid-range/Baseline | 8seq | 16loop | 24loop | 32loop | 24fploop | 32fploop |
The table below shows the methods chosen when space optimizations are enabled or when no C-level optimizations are enabled.
Device | 8-bit | 16-bit | 24-bit | 32-bit | 24-bit FP | 32-bit FP |
---|---|---|---|---|---|---|
PIC18 | HMI | bytdec+HMI | 24loop | 32loop | 24fploop | 32fploop |
Enhanced Mid-range | 8loop | bytdec+8loop | 24loop | 32loop | 24fploop | 32fploop |
Mid-range/Baseline | 8loop | 16loop | 24loop | 32loop | 24fploop | 32fploop |
The source code for the multiplication routines (documented with the algorithms employed) is available in the pic/c99/sources directory, located in the compiler’s installation directory. Look for files whose names have the form Umulx.c, where x is the size of the operation in bits.
If your device and optimization settings dictate the use of a bitwise multiplication loop, you can sometimes arrange the multiplication operands in your C code to improve the operation’s speed. Where possible, ensure that the left operand of the multiplication is the smaller of the two operands.
For example, in the code:
x = 10;
y = 200;
result = x * y; // first multiply
result = y * x; // second multiply
the variable result will be assigned the same value in both statements, but the first multiplication expression will be performed faster than the second.
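The effect of operand order can be illustrated with a small model of a bitwise loop. The sketch below assumes that the loop shifts through the bits of the left operand and exits as soon as that operand reaches zero; this is an assumption made for illustration, and the Umulx.c sources remain the reference for the actual routines. The function name `mul16_loop_passes` is hypothetical.

```c
#include <stdint.h>

/* Model of a 16-bit shift-and-add multiply that also reports how many
 * loop passes were needed. Assumes the loop ends when the left operand
 * has no set bits remaining. */
unsigned mul16_loop_passes(uint16_t left, uint16_t right, uint16_t *product)
{
    unsigned passes = 0;
    uint16_t result = 0;

    do {
        if (left & 1u) {
            result += right;   /* add the shifted right operand */
        }
        right <<= 1;
        left >>= 1;
        passes++;
    } while (left != 0);       /* early exit: small left operand, few passes */

    *product = result;
    return passes;
}
```

Under this model, a left operand of 10 (binary 1010) needs four passes, whereas a left operand of 200 (binary 11001000) needs eight, although both orderings produce the product 2000.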