12.6 Accessing Elements in SIMD Variables

SIMD variables allow a single operation to act on multiple data elements in parallel. In some situations, however, a program needs to access the individual elements inside a SIMD variable. This can be done with a union type that overlays a SIMD type and an array of the corresponding basic type, as follows.

  typedef union
  {
    v4i8 a;
    unsigned char b[4];
  } v4i8_union;

  typedef short q15;
  typedef union
  {
    v2q15 a;
    q15 b[2];
  } v2q15_union;

As shown in the figure above for a v4i8 variable, b[0] is used to access the first element in the variable. Element b[0] occupies the right-most position. The following examples show how to extract or assign elements.

Example:

  /* v4i8 Example */
  v4i8 i;
  unsigned char j, k, l, m;
  v4i8_union temp;
  /* Assume we want to extract from i.  */
  temp.a = i;
  j = temp.b[0];
  k = temp.b[1];
  l = temp.b[2];
  m = temp.b[3];
  /* Assume we want to assign j, k, l, m to i.  */
  temp.b[0] = j;
  temp.b[1] = k;
  temp.b[2] = l;
  temp.b[3] = m;
  i = temp.a;
  /* -------------------------------------------------------- */

Example:

  /* v2q15 Example */
  v2q15 i;
  q15 j, k;
  v2q15_union temp;
  /* Assume we want to extract from i.  */
  temp.a = i;
  j = temp.b[0];
  k = temp.b[1];
  /* Assume we want to assign j, k to i.  */
  temp.b[0] = j;
  temp.b[1] = k;
  i = temp.a;
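
The extract and assign patterns above can be wrapped in small helper functions. The following is only a minimal sketch and not part of the compiler's built-in API: the helper names v4i8_get and v4i8_set are hypothetical, and the v4i8 typedef is repeated so that the fragment compiles on its own with a GCC-compatible compiler that supports the vector_size attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Same 4-byte vector type and union used in this section. */
typedef signed char v4i8 __attribute__ ((vector_size(4)));

typedef union
{
  v4i8 a;
  unsigned char b[4];
} v4i8_union;

/* Hypothetical helper: return element idx (0..3) of v.  */
static unsigned char v4i8_get (v4i8 v, size_t idx)
{
  v4i8_union temp;
  assert (idx < 4);
  temp.a = v;            /* copy the whole vector into the union */
  return temp.b[idx];    /* read one element through the array view */
}

/* Hypothetical helper: return a copy of v with element idx replaced.  */
static v4i8 v4i8_set (v4i8 v, size_t idx, unsigned char value)
{
  v4i8_union temp;
  assert (idx < 4);
  temp.a = v;
  temp.b[idx] = value;   /* modify one element through the array view */
  return temp.a;
}
```

With these helpers, v4i8_get (i, 0) corresponds to temp.b[0] in the example above, and i = v4i8_set (i, 2, l) replaces only the third element. Both helpers copy the vector by value, so the original variable changes only when the result is assigned back.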

SIMD data types can improve performance significantly. Programmers can take advantage of them by calling the DSP built-in functions (see 29 Built-In Functions), by using generic C operators, or both. For SIMD data types, the compiler can map C operators (e.g., +, -, *, /) directly to hardware instructions, provided that the selected target PIC32 MCU features the DSP-enhanced core.

Note: In many cases, optimization level -O1 or greater may be required for the compiler to generate SIMD instructions.

Here are some examples:

  typedef signed char v4i8 __attribute__ ((vector_size(4)));
  v4i8 a, b, c;
  c = a + b; // compiler generates addu.qb
  c = a - b; // compiler generates subu.qb
  /* -------------------------------------------------------- */
  typedef short v2q15 __attribute__ ((vector_size(4)));
  v2q15 d, e, f;
  f = d + e; // compiler generates addq.ph
  f = d - e; // compiler generates subq.ph
  /* -------------------------------------------------------- */
  typedef short v2i16 __attribute__ ((vector_size(4)));
  v2i16 x, y, z;
  z = x * y; // compiler generates mul.ph
  /* -------------------------------------------------------- */
  typedef _Sat _Fract sat_v2hq __attribute__ ((vector_size(4)));
  sat_v2hq a, b, c;
  c = a + b; // compiler generates addq_s.ph
  c = a - b; // compiler generates subq_s.ph
  c = a * b; // compiler generates mulq_rs.ph

Note: When char or short data elements are packed into SIMD data types, the first element must be aligned to a 32-bit boundary; otherwise, unaligned memory accesses may generate general exceptions or reduce performance.
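
As a hedged illustration of the operator mapping and the alignment note, the sketch below adds two byte arrays four elements at a time using the generic + operator. It makes several assumptions that are not from this guide: the function name add_bytes_simd and the buffer names are hypothetical, the element count is assumed to be a multiple of 4, and the buffers are placed on a 32-bit boundary with the GCC aligned attribute. Whether the addition is actually compiled to addu.qb depends on the target device and the optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef signed char v4i8 __attribute__ ((vector_size(4)));

/* Example buffers, forced onto a 32-bit boundary (hypothetical names).  */
static signed char in_a[8] __attribute__ ((aligned (4))) =
  { 1, 2, 3, 4, 5, 6, 7, 8 };
static signed char in_b[8] __attribute__ ((aligned (4))) =
  { 10, 20, 30, 40, 50, 60, 70, 80 };
static signed char out[8] __attribute__ ((aligned (4)));

/* Hypothetical helper: dst[i] = x[i] + y[i], four elements per step.
   Assumes n is a multiple of 4.  */
static void add_bytes_simd (signed char *dst, const signed char *x,
                            const signed char *y, size_t n)
{
  size_t i;
  assert (n % 4 == 0);
  for (i = 0; i < n; i += 4)
    {
      v4i8 vx, vy, vsum;
      memcpy (&vx, &x[i], sizeof vx);       /* load four packed elements */
      memcpy (&vy, &y[i], sizeof vy);
      vsum = vx + vy;                       /* one SIMD add via the C operator */
      memcpy (&dst[i], &vsum, sizeof vsum); /* store four results */
    }
}
```

A call such as add_bytes_simd (out, in_a, in_b, 8) processes the eight elements in two steps. The memcpy calls keep the loads and stores well defined in C; with aligned buffers and optimization enabled, a compiler is generally able to reduce each one to a single word access, but that behavior should be confirmed in the generated code for the target in use.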