3.5.1.13 Multi-threading with SmartHLS Threads

In an FPGA hardware system, the same module can be instantiated multiple times to exploit spatial parallelism, where all module instances execute in parallel to achieve higher throughput. SmartHLS allows easily inferring such parallelism with the use of SmartHLS Threads which is a simplified API of std::thread commonly used in software. Parallelism described in software with SmartHLS threads is automatically compiled to parallel hardware with SmartHLS. Each thread in software becomes an independent module that concurrently executes in hardware.

For example, the code snippet below creates N threads running the Foo function in software. SmartHLS will correspondingly create N hardware instances all implementing the Foo function, and parallelize their executions. SmartHLS also supports mutex and barrier APIs so that synchronization between threads can be specified using locks and barriers.

void Foo (int* arg);

for (i = 0; i < N; i++) {
    thread[i] = hls::thread<void>(Foo, &args[i]);
}

SmartHLS supports hls::thread APIs, which are listed below in Supported HLS Thread APIs.

Note that for a hls::thread kernel, SmartHLS will automatically in-line any of its descendant functions. The inlining cannot be overridden with the noinline pragma (see SmartHLS Pragmas Manual).