12.7 Array Alignment and Data Layout

The compiler lets you specify the alignment of a variable with __attribute__ ((aligned (bytes))). Alignment matters when loading or storing variables of the SIMD types v4i8 and v2q15. If an array is aligned to a 4-byte boundary, that is, word-aligned, the compiler can load or store four 8-bit elements of a v4i8 variable (or two 16-bit elements of a v2q15 variable) at a time using the load-word or store-word class of instructions. In the following example, the char array A is aligned to a 4-byte boundary, so it can be cast to a v4i8 array and four elements can be loaded into a v4i8 variable at a time with the lwx instruction. If A were not aligned to a 4-byte boundary, executing the same code would raise an address error exception because of the misaligned load.

Example:

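    /* v4i8 Example */
    typedef signed char v4i8 __attribute__ ((vector_size (4)));

    char A[128] __attribute__ ((aligned (4)));   /* word-aligned */

    v4i8 test (int i)
    {
      v4i8 a;
      v4i8 *myA = (v4i8 *)A;   /* view A as an array of 4-byte vectors */
      a = myA[i];              /* one word load fetches four bytes */
      return a;
    }

  # Illustrative generated assembly with optimizations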
  test:
    lui     $2,%hi(A)
    sll     $4,$4,2
    addiu   $2,$2,%lo(A)
    lwx     $2,$2($4)
    j       $31
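
When the alignment of a buffer is not known at compile time, the requirement described above can be checked before the cast. The following is only a sketch under stated assumptions: the helper names is_word_aligned and load_v4i8 are hypothetical and not part of the compiler or the DSP ASE, and the memcpy fallback is one possible way to handle an unaligned address, not something this section prescribes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same definition that GCC documents for the MIPS DSP v4i8 type. */
typedef signed char v4i8 __attribute__ ((vector_size (4)));

/* Hypothetical helper: nonzero when p lies on a 4-byte boundary. */
int is_word_aligned (const void *p)
{
  return ((uintptr_t) p & 3u) == 0;
}

/* Hypothetical helper: loads the four bytes starting at p. */
v4i8 load_v4i8 (const signed char *p)
{
  v4i8 v;

  if (is_word_aligned (p))
    v = *(const v4i8 *) p;       /* aligned: a single word load is safe */
  else
    memcpy (&v, p, sizeof v);    /* assumed fallback: valid for any address */
  return v;
}
```

The check costs a few instructions per call, so it fits best at the boundary where a buffer of unknown origin enters the code. Arrays you declare yourself can use the aligned attribute and skip the check.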

After SIMD data is loaded from memory into a register, it should be ready for use without any rearrangement. Rearranging data inside registers costs extra instructions and reduces the benefit of parallelism, so design your arrays with a data layout that suits the SIMD calculations you intend to perform.
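
To make the layout point concrete, the sketch below adds two sequences of 16-bit values stored in two different ways. It is an illustration, not code from this manual: the function names add_split and add_interleaved are hypothetical, all arrays are assumed to be 4-byte aligned, and the addition uses GCC's generic vector "+" operator (a plain wrap-around add) rather than a DSP built-in, so it compiles on any target.

```c
#include <assert.h>

/* Same definition that GCC documents for the MIPS DSP v2q15 type. */
typedef short v2q15 __attribute__ ((vector_size (4)));

/* Split layout: x[] and y[] are separate word-aligned arrays, so each
   32-bit word holds two neighbouring elements of the same operand. */
void add_split (const short *x, const short *y, short *sum, int n)
{
  const v2q15 *vx = (const v2q15 *) x;
  const v2q15 *vy = (const v2q15 *) y;
  v2q15 *vs = (v2q15 *) sum;

  assert (n % 2 == 0);             /* elements are processed in pairs */
  for (int i = 0; i < n / 2; i++)
    vs[i] = vx[i] + vy[i];         /* two sums per vector add */
}

/* Interleaved layout: xy[] = {x0, y0, x1, y1, ...}.  Each word holds one
   element of each operand, so the lanes are pulled apart and added as
   scalars, which gives up the parallelism. */
void add_interleaved (const short *xy, short *sum, int n)
{
  const v2q15 *v = (const v2q15 *) xy;

  for (int i = 0; i < n; i++)
    sum[i] = (short) (v[i][0] + v[i][1]);
}
```

Both functions produce the same sums, but only the split layout delivers operands that are usable the moment they are loaded: add_split needs n/2 vector adds, while add_interleaved needs n lane extractions and scalar adds.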