26.7 ACLE Built-in Functions

Table 26-1.
Function prototype	Description
`uint32_t __sadd8(uint32_t op1, uint32_t op2)`	Performs four parallel signed 8-bit additions.
`uint32_t __qadd8(uint32_t op1, uint32_t op2)`	Performs four parallel signed 8-bit additions with saturation.
`uint32_t __shadd8(uint32_t op1, uint32_t op2)`	Performs four parallel signed 8-bit additions, then divides each sum by 2.
`uint32_t __uadd8(uint32_t op1, uint32_t op2)`	Performs four parallel unsigned 8-bit additions.
`uint32_t __uqadd8(uint32_t op1, uint32_t op2)`	Performs four parallel unsigned 8-bit additions with saturation.
`uint32_t __uhadd8(uint32_t op1, uint32_t op2)`	Performs four parallel unsigned 8-bit additions, then divides each sum by 2.
`uint32_t __ssub8(uint32_t op1, uint32_t op2)`	Performs four parallel signed 8-bit subtractions.
`uint32_t __qsub8(uint32_t op1, uint32_t op2)`	Performs four parallel signed 8-bit subtractions with saturation.
`uint32_t __shsub8(uint32_t op1, uint32_t op2)`	Performs four parallel signed 8-bit subtractions, then divides each difference by 2.
`uint32_t __usub8(uint32_t op1, uint32_t op2)`	Performs four parallel unsigned 8-bit subtractions.
`uint32_t __uqsub8(uint32_t op1, uint32_t op2)`	Performs four parallel unsigned 8-bit subtractions with saturation.
`uint32_t __uhsub8(uint32_t op1, uint32_t op2)`	Performs four parallel unsigned 8-bit subtractions, then divides each difference by 2.
`uint32_t __sadd16(uint32_t op1, uint32_t op2)`	Performs two parallel signed 16-bit additions.
`uint32_t __qadd16(uint32_t op1, uint32_t op2)`	Performs two parallel signed 16-bit additions with saturation.
`uint32_t __shadd16(uint32_t op1, uint32_t op2)`	Performs two parallel signed 16-bit additions, then divides each sum by 2.
`uint32_t __uadd16(uint32_t op1, uint32_t op2)`	Performs two parallel unsigned 16-bit additions.
`uint32_t __uqadd16(uint32_t op1, uint32_t op2)`	Performs two parallel unsigned 16-bit additions with saturation.
`uint32_t __uhadd16(uint32_t op1, uint32_t op2)`	Performs two parallel unsigned 16-bit additions, then divides each sum by 2.
`uint32_t __ssub16(uint32_t op1, uint32_t op2)`	Performs two parallel signed 16-bit subtractions.
`uint32_t __qsub16(uint32_t op1, uint32_t op2)`	Performs two parallel signed 16-bit subtractions with saturation.
`uint32_t __shsub16(uint32_t op1, uint32_t op2)`	Performs two parallel signed 16-bit subtractions, then divides each difference by 2.
`uint32_t __usub16(uint32_t op1, uint32_t op2)`	Performs two parallel unsigned 16-bit subtractions.
`uint32_t __uqsub16(uint32_t op1, uint32_t op2)`	Performs two parallel unsigned 16-bit subtractions with saturation.
`uint32_t __uhsub16(uint32_t op1, uint32_t op2)`	Performs two parallel unsigned 16-bit subtractions, then divides each difference by 2.
`int32_t __ssat16(int32_t x, const unsigned int sat_width)`	Saturates each 16-bit halfword of a 32-bit integer to a specified signed bit width.
`uint32_t __usat16(uint32_t x, const unsigned int sat_width)`	Saturates each signed 16-bit halfword of a 32-bit value to the unsigned range 0 to 2^`sat_width` - 1.
`uint32_t __smlad(uint32_t op1, uint32_t op2, uint32_t op3)`	Performs two signed 16x16-bit multiplications and adds both products to `op3`.
`uint32_t __smuad(uint32_t op1, uint32_t op2)`	Performs two signed 16x16-bit multiplications and returns the sum of the products.
`uint32_t __smlsd(uint32_t op1, uint32_t op2, uint32_t acc)`	Performs two signed 16x16-bit multiplications, subtracts the upper-halfword product from the lower-halfword product, and adds the difference to `acc`.
`uint64_t __smlald(uint32_t op1, uint32_t op2, uint64_t acc)`	Performs two signed 16x16-bit multiplications and adds both products to the 64-bit accumulator `acc`.
`uint32_t __smlabb(uint32_t value1, uint32_t value2, uint32_t value3)`	Multiplies the signed lower 16-bit halfword of `value1` by the signed lower 16-bit halfword of `value2` and adds the product to `value3`.
`uint32_t __smlabt(uint32_t value1, uint32_t value2, uint32_t value3)`	Multiplies the signed lower 16-bit halfword of `value1` by the signed upper 16-bit halfword of `value2` and adds the product to `value3`.
`uint32_t __smlatb(uint32_t value1, uint32_t value2, uint32_t value3)`	Multiplies the signed upper 16-bit halfword of `value1` by the signed lower 16-bit halfword of `value2` and adds the product to `value3`.
`uint32_t __smlatt(uint32_t value1, uint32_t value2, uint32_t value3)`	Multiplies the signed upper 16-bit halfword of `value1` by the signed upper 16-bit halfword of `value2` and adds the product to `value3`.
`uint32_t __smlawb(uint32_t value1, uint32_t value2, uint32_t value3)`	Multiplies the signed 32-bit `value1` by the signed lower 16-bit halfword of `value2` and adds the top 32 bits of the 48-bit product to `value3`.
`uint32_t __smlawt(uint32_t value1, uint32_t value2, uint32_t value3)`	Similar to `__smlawb` but operates on the upper 16-bit halfword of the second operand.
`int32_t __smmls(int32_t value1, int32_t value2, int32_t value3)`	Performs a signed 32x32-bit multiply-subtract and keeps only the most significant 32 bits of the result.
`int32_t __smmlar(int32_t value1, int32_t value2, int32_t value3)`	Performs a signed 32x32-bit multiply-accumulate and keeps only the most significant 32 bits of the result, rounded rather than truncated.
`int32_t __smmlsr(int32_t value1, int32_t value2, int32_t value3)`	Performs a signed 32x32-bit multiply-subtract and keeps only the most significant 32 bits of the result, rounded rather than truncated.
`int32_t __smmulr(int32_t value1, int32_t value2)`	Multiplies two signed 32-bit values and returns the most significant 32 bits of the 64-bit product, rounded rather than truncated.
`uint32_t __smulbb(uint32_t value1, uint32_t value2)`	Executes a signed multiply between the lower 16-bit halfwords of both operands.
`uint32_t __smulbt(uint32_t value1, uint32_t value2)`	Executes a signed multiply between the lower 16-bit halfword of `value1` and the upper 16-bit halfword of `value2`.
`uint32_t __smultb(uint32_t value1, uint32_t value2)`	Executes a signed multiply between the upper 16-bit halfword of `value1` and the lower 16-bit halfword of `value2`.
`uint32_t __smultt(uint32_t value1, uint32_t value2)`	Executes a signed multiply between the upper 16-bit halfwords of both operands.
`uint32_t __smulwb(uint32_t value1, uint32_t value2)`	Multiplies the signed 32-bit `value1` by the signed lower 16-bit halfword of `value2` and returns the top 32 bits of the 48-bit product.
`uint32_t __smulwt(uint32_t value1, uint32_t value2)`	Similar to `__smulwb` but operates on the upper 16-bit halfword of the second operand.
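`uint64_t __umaal(uint32_t value1, uint32_t value2, uint32_t value3, uint32_t value4)`	Performs an unsigned multiply-accumulate-accumulate long operation: multiplies two unsigned 32-bit operands and adds the two remaining unsigned 32-bit operands to the 64-bit product.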
`uint64_t __smlalbb(uint32_t value1, uint32_t value2, uint64_t value3)`	Performs a signed multiply-accumulate long operation on the lower 16-bit halfwords of both operands.
`uint64_t __smlalbt(uint32_t value1, uint32_t value2, uint64_t value3)`	Performs a signed multiply-accumulate long operation on the lower 16-bit halfword of `value1` and the upper 16-bit halfword of `value2`.
`uint64_t __smlaltb(uint32_t value1, uint32_t value2, uint64_t value3)`	Performs a signed multiply-accumulate long operation on the upper 16-bit halfword of `value1` and the lower 16-bit halfword of `value2`.
`uint64_t __smlaltt(uint32_t value1, uint32_t value2, uint64_t value3)`	Performs a signed multiply-accumulate long operation on the upper 16-bit halfwords of both operands.
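
The difference between the plain, saturating, and halving variants of the parallel additions is easiest to see on concrete lane values. The following is a minimal portable sketch, not the compiler's implementation: the `ref_` functions and the `lane_s8`/`pack_lane` helpers are hypothetical names introduced here for illustration, lane 0 is assumed to be the least significant byte, and any status flags the real instructions may update are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed value of 8-bit lane i (lane 0 is least significant). */
static int32_t lane_s8(uint32_t v, int i)
{
    uint32_t b = (v >> (8 * i)) & 0xFFu;
    return (b & 0x80u) ? (int32_t)b - 256 : (int32_t)b;
}

/* Hypothetical helper: place the low 8 bits of x into lane i. */
static uint32_t pack_lane(int32_t x, int i)
{
    return ((uint32_t)x & 0xFFu) << (8 * i);
}

/* Model of __sadd8: each lane sum wraps modulo 2^8. */
uint32_t ref_sadd8(uint32_t op1, uint32_t op2)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++)
        r |= pack_lane(lane_s8(op1, i) + lane_s8(op2, i), i);
    return r;
}

/* Model of __qadd8: each lane sum is clamped to [-128, 127]. */
uint32_t ref_qadd8(uint32_t op1, uint32_t op2)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        int32_t s = lane_s8(op1, i) + lane_s8(op2, i);
        if (s > 127)  s = 127;
        if (s < -128) s = -128;
        r |= pack_lane(s, i);
    }
    return r;
}

/* Model of __shadd8: each lane sum is halved, rounding toward minus infinity. */
uint32_t ref_shadd8(uint32_t op1, uint32_t op2)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        int32_t s = lane_s8(op1, i) + lane_s8(op2, i);
        int32_t h = (s >= 0) ? s / 2 : -((-s + 1) / 2); /* floor(s / 2) */
        r |= pack_lane(h, i);
    }
    return r;
}

/* Lanes of a, most significant first: 127, -128, -1, 1.
 * Lanes of b, most significant first:   1,   -1,  1, 2. */
void check_lane_models(void)
{
    uint32_t a = 0x7F80FF01u;
    uint32_t b = 0x01FF0102u;
    assert(ref_sadd8(a, b)  == 0x807F0003u); /* 128 and -129 wrap */
    assert(ref_qadd8(a, b)  == 0x7F800003u); /* 128 and -129 saturate */
    assert(ref_shadd8(a, b) == 0x40BF0001u); /* 64, -65, 0, 1 */
}
```

Calling `check_lane_models()` from a test program exercises all three models on the same inputs. The halving form never overflows because the intermediate 9-bit sum is shifted back into 8 bits.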

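
The dual 16x16-bit multiply entries (`__smuad`, `__smlad`, `__smlsd`, `__smlald`) and the halfword-selecting forms such as `__smlabt` differ only in which halfwords are multiplied and how the products are combined. The sketch below models that arithmetic in portable C under stated assumptions: the `ref_` functions and the `bottom_s16`/`top_s16` helpers are hypothetical names, results wrap modulo 2^32 (or 2^64), and overflow status flags are not modelled. It is an illustration of the descriptions in Table 26-1, not the toolchain's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: signed value of the lower / upper 16-bit halfword. */
static int32_t bottom_s16(uint32_t v)
{
    uint32_t h = v & 0xFFFFu;
    return (h & 0x8000u) ? (int32_t)h - 65536 : (int32_t)h;
}

static int32_t top_s16(uint32_t v)
{
    return bottom_s16(v >> 16);
}

/* Model of __smuad: lower*lower + upper*upper. */
uint32_t ref_smuad(uint32_t op1, uint32_t op2)
{
    int64_t s = (int64_t)bottom_s16(op1) * bottom_s16(op2)
              + (int64_t)top_s16(op1) * top_s16(op2);
    return (uint32_t)s;
}

/* Model of __smlad: the __smuad sum added to op3. */
uint32_t ref_smlad(uint32_t op1, uint32_t op2, uint32_t op3)
{
    return ref_smuad(op1, op2) + op3;
}

/* Model of __smlsd: (lower*lower - upper*upper) added to acc. */
uint32_t ref_smlsd(uint32_t op1, uint32_t op2, uint32_t acc)
{
    int64_t d = (int64_t)bottom_s16(op1) * bottom_s16(op2)
              - (int64_t)top_s16(op1) * top_s16(op2);
    return (uint32_t)d + acc;
}

/* Model of __smlald: the sum of both products added to a 64-bit accumulator. */
uint64_t ref_smlald(uint32_t op1, uint32_t op2, uint64_t acc)
{
    int64_t s = (int64_t)bottom_s16(op1) * bottom_s16(op2)
              + (int64_t)top_s16(op1) * top_s16(op2);
    return acc + (uint64_t)s;
}

/* Model of __smlabt: lower halfword of value1 times upper halfword of value2, plus value3. */
uint32_t ref_smlabt(uint32_t value1, uint32_t value2, uint32_t value3)
{
    int32_t p = bottom_s16(value1) * top_s16(value2);
    return (uint32_t)p + value3;
}

/* x packs (upper = 3, lower = 2); y packs (upper = -1, lower = 5). */
void check_dual_multiply_models(void)
{
    uint32_t x = 0x00030002u;
    uint32_t y = 0xFFFF0005u;
    assert(ref_smuad(x, y) == 7u);                 /* 2*5 + 3*(-1) */
    assert(ref_smlad(x, y, 100u) == 107u);         /* 7 + 100 */
    assert(ref_smlsd(x, y, 100u) == 113u);         /* (10 - (-3)) + 100 */
    assert(ref_smlald(x, y, 0x100000000ull) == 0x100000007ull);
    assert(ref_smlabt(x, y, 100u) == 98u);         /* 2*(-1) + 100 */
}
```

The `__smlsd` case is the one most easily misread: the subtraction is between the two products, and the accumulator is then added, rather than the products being subtracted from the accumulator.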