Arithmetic With ap_[u]fixpt Types
The Arbitrary Precision Fixed Point library supports all standard arithmetic, logical
bitwise, shift, and comparison operations. During arithmetic, intermediate results are kept
in a type wide enough to hold all possible resulting values. Operands are shifted to
line up their decimal points, and sign- or zero-extended to match widths, before an operation
is performed. For fixed point arithmetic, whenever the result of a calculation can be negative,
the intermediate type is an ap_fixpt instead of an ap_ufixpt, regardless of whether any of the
operands were ap_fixpt. Overflow and quantization handling only happens when the result is
assigned to a fixed point type.
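
The width rules in the paragraph above can be sketched in plain C++ without the library. This is only a minimal model: `FixFormat`, `add_format`, `sub_format`, and `mul_format` are hypothetical names introduced here for illustration and are not part of the ap_[u]fixpt API. The treatment of mixed-sign addition and the exact widths chosen for subtraction are assumptions of the sketch, not documented library behaviour.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical descriptor of a fixed point format (not a library type).
struct FixFormat {
    int width;       // total number of bits (W)
    int int_width;   // bits above the decimal point (I)
    bool is_signed;  // true models ap_fixpt, false models ap_ufixpt
};

// Addition: keep every fractional bit of both operands and add one
// integer bit for the carry. Operands of mixed signedness may need a
// further bit, which this sketch does not model.
inline FixFormat add_format(FixFormat a, FixFormat b) {
    const int frac = std::max(a.width - a.int_width, b.width - b.int_width);
    const int integer = std::max(a.int_width, b.int_width) + 1;
    return FixFormat{frac + integer, integer, a.is_signed || b.is_signed};
}

// Subtraction can be negative even for two unsigned operands, so the
// result is modeled as signed. Reusing the addition widths is an
// assumption of this sketch.
inline FixFormat sub_format(FixFormat a, FixFormat b) {
    FixFormat r = add_format(a, b);
    r.is_signed = true;
    return r;
}

// Multiplication: widths and integer widths add, and the result is
// signed if either operand is signed.
inline FixFormat mul_format(FixFormat a, FixFormat b) {
    return FixFormat{a.width + b.width, a.int_width + b.int_width,
                     a.is_signed || b.is_signed};
}
```

Under these assumptions, `add_format({65, 14, false}, {15, 15, false})` gives a 67-bit unsigned format with 16 integer bits, and `mul_format({15, 15, false}, {8, 4, true})` gives a 23-bit signed format with 19 integer bits.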
Overflow and quantization handling is not performed for assigning shifts (<<=, >>=) on
ap_[u]fixpt types. Also, non-assigning shifts (<<, >>, .ashr(x)) do not change the width or
type of the fixed point they are applied to. This means that bits can be shifted out of range.

Fixed point types can be mixed freely with other arbitrary precision and C++ numeric types
for arithmetic, logical bitwise, and comparison operations, with some caveats for floating
point types.
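
To make the earlier remark about non-assigning shifts concrete, the following plain C++ sketch models a value with the shape of an ap_ufixpt<8, 4> as a raw 8-bit pattern. `UFix8_4`, `shift_left`, and `shift_right` are hypothetical names used only for illustration. The sketch assumes that a shift moves the stored bits while the width and the position of the decimal point stay fixed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an unsigned 8-bit fixed point value with
// 4 integer bits and 4 fractional bits (the shape of ap_ufixpt<8, 4>).
struct UFix8_4 {
    std::uint8_t raw;  // stored bits; represented value = raw / 16.0

    double to_double() const { return raw / 16.0; }
};

// Non-assigning left shift: the result keeps the same 8-bit shape, so
// any bits moved past the top of the word are dropped.
inline UFix8_4 shift_left(UFix8_4 x, unsigned n) {
    return UFix8_4{static_cast<std::uint8_t>((x.raw << n) & 0xFFu)};
}

// Non-assigning right shift: bits moved below the lowest fractional
// bit are dropped.
inline UFix8_4 shift_right(UFix8_4 x, unsigned n) {
    return UFix8_4{static_cast<std::uint8_t>(x.raw >> n)};
}
```

In this model, 12.5 is stored as raw 200. Shifting it left by 2 yields raw 32, which is 2.0 rather than 50, because the two highest bits fall out of the 8-bit word. Widening the value before shifting would avoid the loss.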
For arithmetic and logical bitwise operations, floating point types must be explicitly cast
to an ap_[u]fixpt type before being used, because of the wide range of possible values the
floating point type could represent. It is also a good idea, but not required, to use
ap_[u]int types in place of C++ integers when less width
is required.

Floating point types can be used directly in comparisons, in which case they are implicitly
cast to an ap_fixpt just big enough to hold all values of the ap_[u]fixpt type being compared
against, with the AP_TRN and AP_WRAP modes on.

    #include "hls/ap_fixpt.hpp"
    #include <iostream>
    #include <stdio.h>

    using namespace hls;

    //...

    ap_ufixpt<65, 14> a = 32.5714285713620483875274658203125;
    ap_ufixpt<15, 15> b = 7;
    ap_fixpt<8, 4> c = -3.125;

    // the resulting type is wide enough to hold all
    // 51 fractional bits of a, and 15 integer bits of b
    // the width, and integer width are increased by 1 to hold
    // all possible results of the addition
    ap_ufixpt<67, 16> d = a + b; // 39.5714285713620483875274658203125
    std::cout << "d = " << d << std::endl;

    // the resulting type is a signed fixed point
    // with width, and integer width that are the sum
    // of the two operands' widths
    ap_fixpt<23, 19> e = b * c; // -21.875
    std::cout << "e = " << e << std::endl;

    // Assignment triggers the AP_TRN_ZERO quantization mode
    ap_fixpt<8, 7, AP_TRN_ZERO> f = e; // -21.5
    std::cout << "f = " << f << std::endl;

    // Keep only the bits above the decimal (the fractional bit is cleared)
    f &= 0xFF; // -22
    std::cout << "f = " << f << std::endl;

    // Assignment triggers the AP_SAT overflow mode,
    // and saturates the negative result to 0
    ap_ufixpt<8, 4, AP_TRN, AP_SAT> g = b * c; // 0
    std::cout << "g = " << g << std::endl;
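
The implicit cast applied to floating point operands in comparisons can produce surprising results, and a small plain C++ model may help show why. The sketch below is not library code: `trn_wrap` and `as_compared_with_ufix8_4` are hypothetical helpers. It assumes that the type "just big enough" for an ap_ufixpt<8, 4> is a signed 9-bit format with 4 fractional bits, that AP_TRN truncates toward negative infinity, and that AP_WRAP discards bits above the top of the word.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: quantize v into a signed fixed point format with
// `width` total bits and `frac` fractional bits, truncating toward
// negative infinity and wrapping on overflow. Returns the represented
// value as a double.
inline double trn_wrap(double v, int width, int frac) {
    const double scale = std::ldexp(1.0, frac);     // 2^frac
    const double modulus = std::ldexp(1.0, width);  // 2^width
    double raw = std::floor(v * scale);             // drop extra fraction bits
    raw -= modulus * std::floor(raw / modulus);     // wrap into [0, 2^width)
    if (raw >= modulus / 2) {
        raw -= modulus;                             // read as two's complement
    }
    return raw / scale;
}

// Value a floating point operand would take when compared against the
// shape of ap_ufixpt<8, 4>, under the assumptions stated in the text.
inline double as_compared_with_ufix8_4(double v) {
    return trn_wrap(v, 9, 4);
}
```

In this model, 3.3 is seen as 3.25, so a fixed point value of 3.25 would compare equal to the literal 3.3. A value outside the representable range, such as 20.0, wraps to -12.0. Casting the floating point operand explicitly to a suitably sized fixed point type makes the intended behaviour visible in the code.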