2 CoreMark

CoreMark is a benchmark suite that measures CPU performance under specific workloads. It targets the internal CPU architecture, which is independent of the on-chip memory and peripherals. CoreMark exercises the processor core, the CPU pipeline, and cache memory reads and writes. Its main advantage is its small size: the benchmark fits easily in a processor's memory, which makes it well suited to measuring core performance. To evaluate processor performance, CoreMark uses about 20 KB of code, which includes I/O computations, and all computations are made at run time. CoreMark focuses mainly on basic read/write operations and integer operations, and it uses realistic workloads. The CoreMark workload comprises several commonly used algorithms, including:

  • Matrix manipulation, to exercise common math operations.
  • Linked list manipulation, to exercise the common use of pointers.
  • State machine operation, to exercise data-dependent branches.
  • Cyclic Redundancy Check (CRC), to exercise a function common in embedded systems.
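
To make the CRC workload concrete, the following is a minimal sketch of a bitwise CRC-16 routine of the kind used in embedded code. It is not taken from the CoreMark sources: the function name crc16_update and the choice of the reflected polynomial 0xA001 (the CRC-16/ARC variant) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bitwise CRC-16 using the reflected polynomial 0xA001
 * (CRC-16/ARC). The variant and the function name are assumptions,
 * not the CoreMark implementation. */
uint16_t crc16_update(uint16_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                      /* fold the next byte into the low bits */
        for (int bit = 0; bit < 8; bit++) {  /* process one bit at a time */
            if (crc & 1u) {
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            } else {
                crc >>= 1;
            }
        }
    }
    return crc;
}
```

With an initial value of 0, the standard check string "123456789" yields 0xBB3D for this variant. The loop consists only of shifts, XORs, and data-dependent branches, which is why a CRC is a useful integer workload.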

The outcome of the CoreMark execution is as follows:

  • The four parts of the workload tested are matrix manipulation, linked lists, state machines, and CRC. The compiler output for each stage is intended to yield the best performance results.
  • The PolarFire SoC device can load the entire test into cache, DDR, or Flash memory.

GCC on RISC-V gives good performance with the -O2 optimization flag and without the compressed extension (RVC).

For more information about CoreMark, see EEMBC.

CoreMark Benchmarking Results using Signed Index

The CoreMark benchmarking results are captured for one U54 application processor core, for both Bare Metal and Linux. The following table lists the CoreMark benchmarking results when a signed index is used as the loop counter. RISC-V is designed to handle the common case of loop index variables of type “(signed) int” more efficiently. Hence, the ee_u32 typedef in the header file was modified to be a signed data type.
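
The effect of the index type can be illustrated with two otherwise identical loops. The sketch below is a minimal illustration, not the CoreMark source: the type names idx_signed_t and idx_unsigned_t and both functions are hypothetical. On a 64-bit RISC-V core, 32-bit values are held sign-extended in registers, so a signed int index can typically feed address arithmetic directly, while an unsigned 32-bit index may need extra zero-extension instructions. The code actually generated depends on the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two variants of the index typedef. */
typedef signed int   idx_signed_t;    /* modified header: signed 32-bit index  */
typedef unsigned int idx_unsigned_t;  /* default header: unsigned 32-bit index */

/* Sum n elements using a signed 32-bit loop counter. */
long sum_signed_index(const int *data, idx_signed_t n)
{
    assert(data != NULL && n >= 0);
    long total = 0;
    for (idx_signed_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Same loop with an unsigned 32-bit loop counter. */
long sum_unsigned_index(const int *data, idx_unsigned_t n)
{
    assert(data != NULL);
    long total = 0;
    for (idx_unsigned_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}
```

Both functions return the same result. Any difference shows up only in the generated instructions, which can be inspected by compiling with the -S option of GCC.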

Table 2-1. CoreMark Benchmarking Results when Signed Index is used as Loop Counter

| CoreMark | Bare Metal | Bare Metal | Bare Metal | Bare Metal | Bare Metal | Bare Metal | Linux |
|---|---|---|---|---|---|---|---|
| Memory Section | LIM (1.8 MB) | ITIM (28 KB) | eNVM (128 KB) | eNVM (128 KB) | Scratchpad mem (512 KB) | LPDDR4 | LPDDR4 |
| Code (main loop is run from) | LIM | ITIM | eNVM | eNVM | Scratchpad | LPDDR4 | LPDDR4 |
| Stack located in | LIM | Scratchpad | LIM | Scratchpad | Scratchpad | LPDDR4 | LPDDR4 |
| CoreMark Result (CoreMark/MHz) | 0.950 | 3.128 | 0.950 | 3.030 | 3.128 | 3.128 | 3.125 |

Code Size (Bare Metal): Full application image size is 70 KB. CoreMark code size is less than 20 KB.
Code Size (Linux): Full image size is around 50 KB.

Compiler Flags for Bare Metal: -Wno-maybe-uninitialized -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts
Compiler Flags for Linux: -O3 -O2 -DPERFORMANCE_RUN=1 -DHAS_STDIO -DHAS_TIME_H -DUSE_CLOCK -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -lpthread -DHAS_FLOAT=0 -mtune=sifive-7-series -lrt
Note:

The CoreMark application can use either the heap or the stack for its data during a run. Here, it is configured to use the stack.
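
The CoreMark/MHz figure reported in the table is the CoreMark score (benchmark iterations per second) divided by the core clock frequency in MHz. The following sketch shows that arithmetic. The function names are hypothetical, and any input values passed to them are placeholders rather than measurements from this document.

```c
#include <assert.h>

/* CoreMark score: benchmark iterations completed per second. */
double coremark_score(unsigned long iterations, double seconds)
{
    assert(seconds > 0.0);
    return (double)iterations / seconds;
}

/* Score normalized by the core clock frequency, in CoreMark/MHz. */
double coremark_per_mhz(unsigned long iterations, double seconds,
                        double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return coremark_score(iterations, seconds) / clock_mhz;
}
```

Normalizing by clock frequency makes results comparable across cores that run at different speeds, which is why the table reports CoreMark/MHz rather than the raw score.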

CoreMark Benchmarking Results using Unsigned Index

The following table lists the CoreMark benchmarking results obtained with the default header file (with no modifications to the typedef unsigned int ee_u32).

Table 2-2. CoreMark Benchmarking Results when Unsigned Index is used as Loop Counter

| CoreMark | Bare Metal | Bare Metal | Bare Metal | Bare Metal | Bare Metal | Bare Metal | Linux |
|---|---|---|---|---|---|---|---|
| Memory Section | LIM (1.8 MB) | ITIM (28 KB) | eNVM (128 KB) | eNVM (128 KB) | Scratchpad mem (512 KB) | LPDDR4 | LPDDR4 |
| Code (main loop is run from) | LIM | ITIM | eNVM | eNVM | Scratchpad | LPDDR4 | LPDDR4 |
| Stack located in | LIM | Scratchpad | LIM | Scratchpad | Scratchpad | LPDDR4 | LPDDR4 |
| CoreMark Result (CoreMark/MHz) | 0.92 | 2.645 | 0.916 | 2.568 | 2.65 | 2.65 | 2.56 |

Code Size (Bare Metal): Full application image size is 70 KB. CoreMark code size is less than 20 KB.
Code Size (Linux): Full image size is around 50 KB.

Compiler Flags for Bare Metal: -Wno-maybe-uninitialized -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts
Compiler Flags for Linux: -O3 -O2 -DPERFORMANCE_RUN=1 -DHAS_STDIO -DHAS_TIME_H -DUSE_CLOCK -fno-common -funroll-loops -finline-functions -falign-functions=16 -falign-jumps=4 -falign-loops=4 -finline-limit=1000 -fno-if-conversion2 -fselective-scheduling -fno-tree-dominator-opts -lpthread -DHAS_FLOAT=0 -mtune=sifive-7-series -lrt
Note:

The CoreMark application can use either the heap or the stack for its data during a run. Here, it is configured to use the stack.
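
The two result sets can be compared directly. The sketch below computes the percentage gain of the signed-index build over the unsigned-index build for each configuration, using the CoreMark/MHz values from Table 2-1 and Table 2-2 in column order. The array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CONFIGS 7

/* CoreMark/MHz values copied from Table 2-1 (signed index) and
 * Table 2-2 (unsigned index), in table column order. */
static const double signed_results[NUM_CONFIGS] = {
    0.950, 3.128, 0.950, 3.030, 3.128, 3.128, 3.125
};
static const double unsigned_results[NUM_CONFIGS] = {
    0.92, 2.645, 0.916, 2.568, 2.65, 2.65, 2.56
};

/* Percentage gain of the signed-index result over the unsigned-index
 * result for one configuration (0 = first table column). */
double signed_gain_percent(size_t config)
{
    assert(config < NUM_CONFIGS);
    return (signed_results[config] / unsigned_results[config] - 1.0) * 100.0;
}
```

For the tabulated values, the gain ranges from roughly 3% to roughly 22% across the seven configurations.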