3.5.1.17.12 Error Correction Code (ECC) Library
The Error Correction Code (ECC) library offers API functions for exposing ECC signals from the hardware memory. To use the ECC library, include the header file:
#include "hls/ecc.hpp"
Accessing ECC Signals
ECC signals can be accessed through the API function read_ecc. read_ecc takes a pointer to the array element as its argument and returns a data wrapper that contains three elements:
data - The data read from the address of the array element. When a single-bit error is detected, the data output is automatically corrected.
sb_correct - True if a single-bit error was detected and corrected.
db_detect - True if a double-bit error was detected.
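To make the two flags concrete, the following is a minimal host-side sketch of a single-error-correcting, double-error-detecting (SECDED) code for a 4-bit value. It is an illustration only: the source does not describe the encoding used by the hardware memory, and EccReadModel, secded_encode and secded_decode are hypothetical names that are not part of hls/ecc.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software model; mirrors the three fields described above.
struct EccReadModel {
    uint8_t data;     // payload after any correction
    bool sb_correct;  // one flipped bit was found and repaired
    bool db_detect;   // two flipped bits were found; data is unreliable
};

// Encode a 4-bit value as an 8-bit extended Hamming codeword.
// Positions 1, 2, 4 hold parity bits, positions 3, 5, 6, 7 hold data bits,
// and position 0 holds an overall parity bit.
uint8_t secded_encode(uint8_t nibble) {
    assert(nibble < 16u);
    const unsigned d1 = nibble & 1u;
    const unsigned d2 = (nibble >> 1) & 1u;
    const unsigned d3 = (nibble >> 2) & 1u;
    const unsigned d4 = (nibble >> 3) & 1u;
    const unsigned p1 = d1 ^ d2 ^ d4;  // checks positions 3, 5, 7
    const unsigned p2 = d1 ^ d3 ^ d4;  // checks positions 3, 6, 7
    const unsigned p4 = d2 ^ d3 ^ d4;  // checks positions 5, 6, 7
    const unsigned word = (p1 << 1) | (p2 << 2) | (d1 << 3) | (p4 << 4) |
                          (d2 << 5) | (d3 << 6) | (d4 << 7);
    // Overall parity makes the number of set bits in the codeword even.
    const unsigned p0 = p1 ^ p2 ^ d1 ^ p4 ^ d2 ^ d3 ^ d4;
    return static_cast<uint8_t>(word | p0);
}

// Decode a codeword, assuming at most two bits have flipped.
EccReadModel secded_decode(uint8_t word) {
    unsigned syndrome = 0;  // XOR of the positions of all set bits
    unsigned overall = 0;   // parity of the whole codeword
    for (unsigned pos = 0; pos < 8; ++pos) {
        if ((word >> pos) & 1u) {
            syndrome ^= pos;
            overall ^= 1u;
        }
    }
    EccReadModel out{0, false, false};
    if (overall != 0) {
        // Odd parity: one bit flipped, and the syndrome is its position.
        word = static_cast<uint8_t>(word ^ (1u << syndrome));
        out.sb_correct = true;
    } else if (syndrome != 0) {
        // Even parity but a non-zero syndrome: two bits flipped.
        out.db_detect = true;
    }
    out.data = static_cast<uint8_t>(((word >> 3) & 1u) |
                                    (((word >> 5) & 1u) << 1) |
                                    (((word >> 6) & 1u) << 2) |
                                    (((word >> 7) & 1u) << 3));
    return out;
}
```

Decoding an unmodified codeword leaves both flags clear, flipping any one bit sets sb_correct and still yields the original value, and flipping any two bits sets db_detect.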
The following example reads an ECC-enabled array with read_ecc:

    #include <stdio.h>
    #include "hls/ecc.hpp"

    #define SIZE 100

    using namespace hls;

    #pragma HLS memory impl variable(x) ecc(true)
    int x[SIZE];

    int f(int c) {
    #pragma HLS function top
        int sb_count = 0;
        int db_count = 0;

        // Initialize the memory
        for (int i = 0; i < SIZE; i++)
            x[i] = c;

        int sum = 0;
        for (int i = 0; i < SIZE; i++) {
            auto ecc_out = read_ecc(&x[i]);

            // Check for single-bit error (corrected)
            if (ecc_out.sb_correct) {
                // Write back to correct the contents of the RAM
                x[i] = ecc_out.data;
                sb_count++;
            }

            // Check for double-bit error (detected)
            if (ecc_out.db_detect)
                db_count++;

            sum += ecc_out.data;
        }

        printf("sb_count: %d\n", sb_count);
        printf("db_count: %d\n", db_count);
        return sum;
    }
In the example, variable x has ECC enabled by setting ecc(true) in the memory pragma. In the top-level function f, the elements of x are initialized using the argument c. After the initialization, x is read element-by-element using read_ecc(&x[i]) instead of directly using x[i]. The call to read_ecc returns a data wrapper ecc_out that contains the data read from memory and the ECC signals. The ECC signals are used to increment the corresponding counters sb_count and db_count, and the read data is added to sum.
Note that when a single-bit error is detected (sb_correct = true), the read data output is corrected but the data in the RAM location is not updated. Thus, in the example, the corrected data is manually written back to the RAM for single-bit errors.
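The reason for the write-back can be illustrated with a small host-side model. This is a sketch under stated assumptions, not the hardware behavior: each word simply tracks how many of its bits have flipped, and ReadModel, FaultyRamModel, flip_bit and read_and_repair are hypothetical names that do not exist in hls/ecc.hpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side model; none of these names come from hls/ecc.hpp.
struct ReadModel {
    int data;         // value seen by the reader
    bool sb_correct;  // one flipped bit, corrected on the read path
    bool db_detect;   // two flipped bits, data not trustworthy
};

template <std::size_t Depth>
class FaultyRamModel {
public:
    // A write stores a freshly encoded word, which clears earlier bit flips.
    void write(std::size_t i, int value) {
        assert(i < Depth);
        good_[i] = value;
        flips_[i] = 0;
    }

    // Test hook: flip one more (distinct) bit of the stored word.
    void flip_bit(std::size_t i) {
        assert(i < Depth);
        ++flips_[i];
    }

    // Corrects the returned value only; the stored word keeps its flips.
    ReadModel read(std::size_t i) const {
        assert(i < Depth);
        if (flips_[i] == 0) return {good_[i], false, false};
        if (flips_[i] == 1) return {good_[i], true, false};
        return {good_[i] ^ 0x3, false, true};  // stand-in for corrupted data
    }

private:
    int good_[Depth] = {};   // payload that was last written
    int flips_[Depth] = {};  // number of flipped bits per word
};

// Read a word and, on a single-bit error, write the corrected value back.
template <std::size_t Depth>
int read_and_repair(FaultyRamModel<Depth>& ram, std::size_t i) {
    const ReadModel r = ram.read(i);
    if (r.sb_correct) ram.write(i, r.data);
    return r.data;
}
```

In this model, a word that is only read keeps its flipped bit, so a second flip in the same word turns a correctable error into a double-bit error. A word that is repaired with read_and_repair starts again from zero flips, so a later flip is still correctable.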
ECC RAM Wrapper
The ECC library also offers a C++ wrapper, ECC_RAM, that encapsulates some of the common error-handling functionality. ECC_RAM is a pure C++ implementation using read_ecc to showcase how access to the low-level ECC signals can be abstracted and used seamlessly in a design.

    ECC_RAM<data_type, depth, SB_WRITE_BACK, DB_OVERRIDE, DB_DEFAULT> ecc_ram
ECC_RAM uses template parameters to configure the memory and error handling:
Data type - the element type of the memory.
Depth - the number of elements in the memory.
SB_WRITE_BACK - if true, when a single-bit error is detected, the corrected value is immediately written back to correct the corrupted data in the memory.
CAUTION: Enabling immediate write-back using SB_WRITE_BACK can affect performance, since a load operation can invoke an immediate store to the same address. Take this into consideration if latency is critical (e.g. in a loop pipeline).
DB_OVERRIDE - if true, a default value is returned instead of the corrupted data when a double-bit error is detected.
DB_DEFAULT - the default value used when a double-bit error is detected.
The following example uses ECC_RAM:

    #include <hls/ecc.hpp>
    #include <stdio.h>

    using namespace hls;

    #define __REPORT_ECC__
    #define N 1000

    #pragma HLS memory impl variable(ecc_ram) ecc(true)
    ECC_RAM<int,  // data type
            N,    // depth
            true, // SB_WRITE_BACK
            true, // DB_OVERRIDE
            -1    // DB_DEFAULT
            > ecc_ram;

    // ----- Top function: Read i, j and write to k
    int f(int i, int j, int k, int val) {
    #pragma HLS function top
        // ----- Reading and handling errors implicitly
        int d_i = ecc_ram[i];

        // ----- Reading and handling errors explicitly
        int d_j = 0;
        if (!ecc_ram.read(j, d_j))
            d_j = -2;

        int sum = d_i + d_j;

        // ----- Writing
        ecc_ram[k] = val;

        // ----- Reporting
        auto sb_count = ecc_ram.sb_count();

        // ----- Scrubbing after a certain number of SB errors
        if (ecc_ram.sb_count() > N / 2)
            ecc_ram.scrub();

        return d_i + d_j;
    }
In the example, ecc_ram uses ECC_RAM to instantiate an int array with 1000 elements. ECC_RAM can be accessed like a C++ array using []; however, the implementation uses read_ecc to access the data and the ECC signals, and handles errors based on the template configuration. With the configuration in the example:
- ecc_ram[i] will read the data and implicitly handle errors:
  - If a single-bit error is detected, write back the corrected data to the RAM and return the corrected data.
  - If a double-bit error is detected, discard the read data and return the DB_DEFAULT value -1.
- ecc_ram.read(j, d_j) will read the data at j and return the RAM data in d_j (correct or erroneous) to the caller:
  - If a single-bit error is detected, write back the corrected data to the RAM and set d_j to the corrected read data.
  - If a double-bit error is detected, set d_j to the erroneous read data.
  - Return true if the data is correct, false if a double-bit error is detected.
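The two access styles can be modeled on the host as follows. This is a hedged sketch of the documented policies, not the ECC_RAM source: the real wrapper is built on read_ecc and may differ in detail. EccRamModel, inject, store and load are hypothetical names, with load standing in for the read side of ecc_ram[i], and each word is modeled by a count of flipped bits rather than a real code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side model of the documented policies.
template <typename T, std::size_t Depth, bool SbWriteBack, bool DbOverride,
          T DbDefault>
class EccRamModel {
public:
    // Test hook: mark word i as having `flipped_bits` bit errors.
    void inject(std::size_t i, int flipped_bits) {
        assert(i < Depth);
        flips_[i] = flipped_bits;
    }

    // Models a write: a fresh word has no bit errors.
    void store(std::size_t i, T value) {
        assert(i < Depth);
        good_[i] = value;
        flips_[i] = 0;
    }

    // Models the read side of ecc_ram[i]: errors are handled implicitly.
    T load(std::size_t i) {
        T value;
        const bool ok = read(i, value);
        return (!ok && DbOverride) ? DbDefault : value;
    }

    // Models ecc_ram.read(i, out): the caller decides what to do on failure.
    bool read(std::size_t i, T& out) {
        assert(i < Depth);
        if (flips_[i] >= 2) {
            ++db_count_;
            // Stand-in for the erroneous data returned by the memory.
            out = static_cast<T>(good_[i] ^ static_cast<T>(3));
            return false;
        }
        if (flips_[i] == 1) {
            ++sb_count_;
            if (SbWriteBack) flips_[i] = 0;  // repair the stored word
        }
        out = good_[i];
        return true;
    }

    int sb_count() const { return sb_count_; }
    int db_count() const { return db_count_; }
    void reset_counters() { sb_count_ = 0; db_count_ = 0; }

private:
    T good_[Depth] = {};     // payload that was last written
    int flips_[Depth] = {};  // number of flipped bits per word
    int sb_count_ = 0;
    int db_count_ = 0;
};
```

With the configuration from the example (write-back enabled, override enabled, default -1), a word with one flipped bit is returned corrected and is clean on the next access, while a word with two flipped bits yields -1 through load and a false return value with erroneous data through read.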
ECC_RAM has internal counters that count the number of single-bit and double-bit errors seen by read operations. The counters can be accessed using ecc_ram.sb_count() and ecc_ram.db_count(), and reset using ecc_ram.reset_counters().
ecc_ram.scrub() scrubs the memory by reading it element-by-element and writing each element back. This can be useful when there are many errors, to refresh the entries with single-bit errors and avoid further corruption.
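A threshold-based scrubbing policy like the one in the example can be sketched on the host as shown here. The model is an assumption-laden illustration: ScrubModel and scrub_if_needed are hypothetical names, words carry at most one flipped bit, and double-bit errors are left out because the source does not say how scrub() treats them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model, not the ECC_RAM implementation.
struct ScrubModel {
    std::vector<int> good;   // correct payload of each word
    std::vector<int> flips;  // 1 if the stored word has a flipped bit, else 0
    int sb_count = 0;        // single-bit errors seen by reads

    explicit ScrubModel(std::size_t depth) : good(depth, 0), flips(depth, 0) {}

    // Corrected read: counts the error but leaves the stored word as it is.
    int read(std::size_t i) {
        assert(i < good.size());
        if (flips[i] == 1) ++sb_count;
        return good[i];
    }

    // A write stores a fresh word, which clears the flipped bit.
    void write(std::size_t i, int value) {
        assert(i < good.size());
        good[i] = value;
        flips[i] = 0;
    }

    // Read every word and write the corrected value straight back.
    void scrub() {
        for (std::size_t i = 0; i < good.size(); ++i) write(i, read(i));
        sb_count = 0;  // scrub() is documented to reset the error counters
    }
};

// Scrub only once the number of observed single-bit errors exceeds a threshold.
bool scrub_if_needed(ScrubModel& ram, int threshold) {
    if (ram.sb_count <= threshold) return false;
    ram.scrub();
    return true;
}
```

Scrubbing visits every element, so its cost grows with the depth of the memory. Gating it on the error counter, as the example does with N / 2, keeps that cost out of the common case.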
ECC_RAM can report the error handling process to the standard output when __REPORT_ECC__ is defined. For example:
-- DB error detected at 0 Overriding with default value -1
- SB error corrected at 2 Writing back corrected value
ECC_RAM is a wrapper around the array that represents the actual memory. In this release, it is not possible to apply memory optimizations (e.g. partitioning) to the underlying data.

Class Method | Description |
---|---|
ECC_RAM<data_type, depth, SB_WRITE_BACK, DB_OVERRIDE, DB_DEFAULT>() | Create a new ECC RAM with the specified parameters. |
operator[i] | Read/write data at index i and handle errors based on the configuration. |
bool read(i, d_i) | Read data at index i and save the read data from memory into d_i. Return true if the data is correct, false otherwise. |
int sb_count() | Return the number of single-bit errors detected and corrected. |
int db_count() | Return the number of double-bit errors detected. |
void reset_counters() | Reset both single-bit and double-bit counters to 0. |
void scrub() | Read the memory element-by-element and write each element back. Note that scrub() automatically calls reset_counters() to reset the error counters. |