20.2.1 Examples:

Insert Bit-field

This example demonstrates how to use the BFI instruction to insert a bit-field into a 32-bit wide variable. This function-like macro uses inline assembly to emit the BFI instruction, which is not commonly generated from C/C++ code.

/* Thumb2 insert bits */
#define _ins(tgt,val,pos,sz) __extension__({                      \
    unsigned int __t = (tgt), __v = (val);                        \
    __asm__ ("bfi\t%0,%1,%2,%3"                    /* template  */ \
             : "+r" (__t)                          /* output    */ \
             : "r" (__v), "M" (pos), "M" (sz));    /* input     */ \
    __t;                                                          \
})

Here __v, pos, and sz are input operands. The __v operand is constrained to be of type 'r' (a register). The pos and sz operands are constrained to be of type 'M' (a constant in the range 0-32 or any power of 2).

The __t output operand is constrained to be of type 'r' (a register). The '+' modifier means that this operand is both read and written by the instruction and so the operand is both an input and an output.
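
To make the effect of BFI concrete, the following is a minimal portable sketch of what the instruction computes. The function name bfi_ref and its parameter names are hypothetical and are not part of the macro above; it is meant only as a reference model for checking expected values on a host machine, not as a replacement for the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of BFI (illustration only):
   copy the low sz bits of val into tgt, starting at bit position pos.
   All other bits of tgt are left unchanged. */
static inline uint32_t bfi_ref(uint32_t tgt, uint32_t val,
                               unsigned pos, unsigned sz)
{
    /* The field must be non-empty and fit inside a 32-bit word. */
    assert(sz >= 1 && pos < 32 && pos + sz <= 32);

    /* Mask of sz one-bits; sz == 32 is handled separately because
       shifting a 32-bit value by 32 is undefined in C. */
    uint32_t mask = (sz == 32) ? 0xFFFFFFFFu : ((1u << sz) - 1u);

    /* Clear the destination field, then OR in the new field. */
    return (tgt & ~(mask << pos)) | ((val & mask) << pos);
}
```

Under this model, bfi_ref(0xAAAAAAAAu, 0x12u, 4, 8) clears bits 4 to 11 of the target and writes 0x12 there, giving 0xAAAAA12A.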

The following example shows this macro in use.

unsigned int result;
void example (void)
{
    unsigned int insertval = 0x12;
    result = 0xAAAAAAAAu;
    result = _ins(result, insertval, 4, 8);
    /* result is now 0xAAAAA12A */
}

For this example, the compiler may generate assembly code similar to the following.

    movs   r2, #18             @ 0x12
    mov    r3, #-1431655766    @ 0xaaaaaaaa

    bfi    r3,r2,#4,#8         @ inline assembly

    ldr    r2, .L2             @ load result address
    str    r3, [r2]            @ assign the result
    bx     lr                  @ return
    ...
 .L2:
    .word result

Multiple Assembler Instructions

This example demonstrates how to use a pair of REV instructions to perform a 64-bit byte swap. The REV instruction reverses the order of the bytes in a 32-bit word. This function-like macro uses inline assembly to create a “byte-swap double word” using instructions that are not commonly generated from C/C++ code. However, the same functionality is available through one of the GCC built-in functions, __builtin_bswap64(). As a general rule, prefer built-ins over inline assembly whenever possible.
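
For comparison, a minimal sketch of the built-in alternative is shown here. The wrapper name bswapdw_builtin is hypothetical and is used only for illustration; __builtin_bswap64() is the GCC built-in named above.

```c
#include <stdint.h>

/* Hypothetical wrapper (illustration only): byte-swap a 64-bit value
   using the GCC built-in instead of inline assembly. The compiler is
   free to choose the instructions it emits for the target. */
static inline uint64_t bswapdw_builtin(uint64_t val)
{
    return __builtin_bswap64(val);
}
```

With this wrapper, bswapdw_builtin(0x0123456789ABCDEFull) evaluates to 0xEFCDAB8967452301.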

The following shows the definition of the function-like macro, _bswapdw.

/* Thumb2 byte-swap double word */
#define _bswapdw(val) __extension__({            \
  union { uint32_t i[2]; uint64_t l; } __i, __o; \
  __i.l = (val);                                 \
  __asm__ ("rev\t%0, %3\n\t"                     \
           "rev\t%1, %2" /* template */          \
           : "=&r" (__o.i[0]), "=r" (__o.i[1])   \
           : "r" (__i.i[0]), "r" (__i.i[1]));    \
  __o.l;                                         \
})

A union is used to reference the two 32-bit halves of a 64-bit integer. The C expressions for the input operands are '__i.i[0]' and '__i.i[1]', and those for the output operands are '__o.i[0]' and '__o.i[1]'.

All operands use the constraint 'r' (32-bit register). Note the '&' modifier on operand 0, which marks it as an “early-clobber” operand: it is written before all the input operands have been consumed, so the compiler must allocate it a register different from those of the inputs. This is needed because the 32-bit halves are themselves swapped: the first REV instruction writes operand 0 before the second REV instruction has read operand 2.
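
The operand mapping described above can be modelled in portable C as a rough sketch. The function name bswapdw_model and the variable names in and out are hypothetical; __builtin_bswap32() is a GCC built-in standing in for each REV instruction. The sketch relies on reading a union member other than the one last written, which GCC supports.

```c
#include <stdint.h>

/* Hypothetical model of the _bswapdw macro (illustration only):
   byte-reverse each 32-bit half and exchange the two halves. */
static inline uint64_t bswapdw_model(uint64_t val)
{
    union { uint32_t i[2]; uint64_t l; } in, out;

    in.l = val;
    out.i[0] = __builtin_bswap32(in.i[1]);  /* models: rev %0, %3 */
    out.i[1] = __builtin_bswap32(in.i[0]);  /* models: rev %1, %2 */
    return out.l;
}
```

Because out is a separate object from in, the C model has no ordering hazard. In the inline assembly the operands live in registers chosen by the compiler, which is why operand 0 needs the '&' modifier there.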

The following example uses the function-like macro to assign the byte-swapped content of value to result.

uint64_t result;
void example (void)
{
   uint64_t value = 0x0123456789ABCDEFull;
   result = _bswapdw (value);
   /* result == 0xEFCDAB8967452301 */
}

The compiler may generate assembly code similar to the following for this example:

    ldr    r2, .L6             @ r2 = 0x01234567
    ldr    r3, .L6+4           @ r3 = 0x89ABCDEF

    rev    r1, r2              @ from inline asm
    rev    r3, r3              @ from inline asm

    ldr    r2, .L6+8           @ r2 = address of 'result'
    stm    r2, {r1, r3}        @ store value to 'result'
    bx     lr                  @ return
    ...
    .align 2
 .L6:
    .word 19088743             @ 0x01234567
    .word -1985229329          @ 0x89ABCDEF
    .word result