3.5.1.17.11 Memory Allocation Library

For SoC designs on platforms running Linux, the SmartHLS Memory Allocation Library can be used to allocate memory in special regions outside the areas normally used by the operating system. Memory in these regions is guaranteed to be physically contiguous and is not subject to the virtual-to-physical mapping that the operating system normally imposes. Contiguous, physically pinned memory is required for efficient transfers between DDR and components on the FPGA fabric, because it avoids paging and virtual-to-physical address translation. The SmartHLS Memory Allocation Library should be used for any accelerator arguments that use the DMA Copy or Accelerator Direct Access transfer methods (see 3.5.1.22.2 SoC Data Transfer Methods). An example of how to use the library is shown below:

#include <stdint.h>

#include "hls/hls_alloc.h"

// Allocate enough memory for an array of 8 32-bit numbers in the default memory
// region using hls_malloc. This call has the same function signature as
// standard C malloc().
uint32_t array_size = 8 * sizeof(uint32_t);
uint32_t *array_ptr = (uint32_t *)hls_malloc(array_size);

// Allocate the same amount of memory in the non-cached DDR region. The second
// optional argument is used to specify which memory region to use.
uint32_t *noncached_array_ptr =
    (uint32_t *)hls_malloc(array_size, HLS_ALLOC_NONCACHED);

// Use hls_memcpy to move data from one array to another. This call has the same
// signature as standard C memcpy(), with additional arguments to specify where
// the transfer is going and what method to use. In this example, we move data
// from one array in MSS DDR to another array in MSS DDR, using the hard DMA
// controller in the MSS.
hls_memcpy(noncached_array_ptr, array_ptr, array_size, HLS_ALLOC_MSS_TO_MSS,
           HLS_ALLOC_PDMA);

// Free the allocated buffers using hls_free, which has the same function
// signature as standard C free().
hls_free(array_ptr);
hls_free(noncached_array_ptr);

The optional second argument of hls_malloc selects the memory region in which to allocate, and is of type hls_alloc_memory_type_t, defined in hls_alloc.h. With the SmartHLS reference SoC Linux image, three memory regions are available to the Memory Allocation Library, as outlined in the table below. The address and size of each region can be modified to fit other Linux images by changing the hls_alloc_buffer_regions struct in hls_alloc.h and recompiling the library.

Region                   Address     Size (bytes)  Description
HLS_ALLOC_CACHED         0xae000000  0x02000000    Cached DDR. Default if region unspecified. Recommended for best overall transfer times.
HLS_ALLOC_NONCACHED_WCB  0xd8000000  0x08000000    Non-cached DDR with write-combine buffer. Slightly better performance than Cached DDR for writes, but worse for reads.
HLS_ALLOC_NONCACHED      0xc0000000  0x08000000    Non-cached DDR. Not recommended (lower performance than other options).
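
When the region addresses or sizes are changed for another Linux image, it can help to check that the new regions do not overlap. The following is a minimal sketch of such a check, using the values from the table above. The DemoRegion struct and all helper names are hypothetical and exist only for this illustration; the actual layout of hls_alloc_buffer_regions in hls_alloc.h may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one row of the region table. This is NOT the
// real hls_alloc_buffer_regions struct from hls_alloc.h.
struct DemoRegion {
    const char *name;
    uint64_t base;  // physical start address
    uint64_t size;  // size in bytes
};

// Values copied from the table for the SmartHLS reference SoC Linux image.
constexpr DemoRegion kDemoRegions[] = {
    {"HLS_ALLOC_CACHED",        0xae000000ULL, 0x02000000ULL},
    {"HLS_ALLOC_NONCACHED_WCB", 0xd8000000ULL, 0x08000000ULL},
    {"HLS_ALLOC_NONCACHED",     0xc0000000ULL, 0x08000000ULL},
};
constexpr size_t kDemoRegionCount =
    sizeof(kDemoRegions) / sizeof(kDemoRegions[0]);

// First address past the end of a region.
constexpr uint64_t region_end(const DemoRegion &r) { return r.base + r.size; }

// True if the half-open ranges [base, end) of a and b share any address.
constexpr bool regions_overlap(const DemoRegion &a, const DemoRegion &b) {
    return a.base < region_end(b) && b.base < region_end(a);
}

// True if no pair of regions in the table overlaps.
bool table_is_disjoint(const DemoRegion *regions, size_t count) {
    assert(regions != nullptr);
    for (size_t i = 0; i < count; ++i)
        for (size_t j = i + 1; j < count; ++j)
            if (regions_overlap(regions[i], regions[j]))
                return false;
    return true;
}
```

Calling table_is_disjoint(kDemoRegions, kDemoRegionCount) returns true for the reference values. The sizes in the table correspond to 32 MiB for the cached region and 128 MiB for each non-cached region.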

hls_memcpy takes two arguments in addition to those of the standard C memcpy. The first extra argument is of type hls_alloc_direction_t and describes the direction in which data is moving; the underlying library needs this to move data correctly between the MSS DDR and buffers on the FPGA fabric. The second extra argument is of type hls_alloc_transfer_type_t and lets the user choose between two transfer methods. Selecting HLS_ALLOC_MEMCPY invokes memcpy internally and lets the OS choose the best way to move the data. Selecting HLS_ALLOC_PDMA uses the platform DMA engine in the MSS to move the data. Both argument types are defined as enums in hls_alloc.h.
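
To illustrate the shape of such a call, the sketch below models a copy routine that takes the memcpy arguments plus a direction and a transfer type, and dispatches on the transfer type. It is a stand-in model only, not the SmartHLS implementation: every name prefixed with demo_ or DEMO_ is hypothetical, the direction values other than MSS-to-MSS are assumptions, and the "DMA" path is simulated in software so the code can run on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical enums for illustration. The real hls_alloc_direction_t and
// hls_alloc_transfer_type_t are defined in hls_alloc.h.
enum DemoDirection { DEMO_MSS_TO_MSS, DEMO_MSS_TO_FABRIC, DEMO_FABRIC_TO_MSS };
enum DemoTransferType { DEMO_MEMCPY, DEMO_PDMA };

// Counts how often the simulated DMA path was taken.
int demo_dma_calls = 0;

// Stub standing in for a DMA engine. A real driver would program the
// controller with physical addresses; this stub simply copies in software.
void demo_dma_copy(void *dst, const void *src, size_t n) {
    ++demo_dma_calls;
    std::memcpy(dst, src, n);
}

// memcpy-like routine with two extra arguments: direction and transfer type.
// The direction is ignored in this host-only model; a real implementation
// would use it to decide how source and destination addresses are handled.
void *demo_memcpy(void *dst, const void *src, size_t n, DemoDirection dir,
                  DemoTransferType type) {
    assert(dst != nullptr && src != nullptr);
    (void)dir;
    if (type == DEMO_PDMA)
        demo_dma_copy(dst, src, n);
    else
        std::memcpy(dst, src, n);
    return dst;
}
```

A call such as demo_memcpy(dst, src, size, DEMO_MSS_TO_MSS, DEMO_PDMA) mirrors the argument order of the hls_memcpy call in the example earlier in this section.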

Note: The hls_memcpy function is used automatically by the accelerator driver that SmartHLS generates. Users normally do not need to call this function directly to use SmartHLS accelerators.