1.8 SRAM - March C-Minus Algorithm

Test Name: SRAM Test with March Algorithms.

Purpose of test: Detect stuck bits and coupling faults in SRAM and on the data bus, as well as any addressing problems.

Failure Modes Covered: Stuck at Faults (SAFs), Transition Faults (TFs), Address Decoder Faults (AFs), Inversion Coupling Faults (CFins), Static/State Coupling Faults (CFsts) and Idempotent Coupling Faults (CFid).

Acceptable Measure (Annex. H): Static Memory Test (H.2.19.6).

Description: The internal SRAM is used for volatile storage of data, and any fault in it can be catastrophic for the correct operation of an application. This can be mitigated by executing the Word Oriented Memory (WOM) March C- algorithm implemented in this diagnostic test. Note that the AoUs listed in the Assumptions of Use section below shall be followed to ensure correct operation. The Background section gives a brief introduction to some common March tests, a more detailed description of the fault models they cover, and a rationale for choosing the March C- algorithm for the implementation. Finally, the Word Oriented Memory Conversion section outlines the disadvantages of using a Bit Oriented Memory (BOM) March algorithm on a WOM and summarizes how and why the BOM March C- algorithm was converted into a WOM March C- algorithm that covers a so-called unrestricted coupling fault model.

API Documentation: SRAM - March C-Minus Algorithm

Assumptions of Use

  • AoU-SRAM_MARCH_TEST-01: The SRAM March test shall be called before any initialization, configuration or self-test functions executed prior to the execution of the main function.

    Reason: As the test is destructive, any data placed in SRAM before the test will be lost. Calling the test before any other functions are executed ensures that no data is lost.

    Exception: A bootloader may run before the SRAM March test, if it does not depend on data passed from the application in SRAM.

  • AoU-SRAM_MARCH_TEST-02: The March test function shall be run on all SRAM memory being used by the application, including the stack area of the application.

    Reason: This is required to ensure test coverage of the Fault Models covered by the algorithm.

  • AoU-SRAM_MARCH_TEST-03: The used SRAM shall be allocated as a continuous block and include the lowest address (start address) in SRAM.

    Reason: If the application requires less SRAM than is available, the SRAM size can be restricted through linker flag settings so that only this section is tested, which reduces the Worst Case Execution Time (WCET) of the SRAM March test.

  • AoU-SRAM_MARCH_TEST-04: The value of the SRAM_DATA_REGION_LEN macro shall correspond to the size of the SRAM area used by the application.

    Reason: The SRAM March test function uses the value of the SRAM_DATA_REGION_LEN macro to calculate the start of the stack.

  • AoU-SRAM_MARCH_TEST-05: The stack pointer (SP) shall be set to the highest address in the SRAM area used by the application before calling the SRAM March test function. If the size of the SRAM is restricted to a smaller size than physically available on the device through a linker option, an additional linker option to configure the SP to point at the new end address must be used.

    Reason: When the SRAM March test function is called, the two-byte return address is pushed onto the stack. Since the test is destructive, the SRAM March test stores the return address in two CPU registers and restores it on the stack after the test is completed, so that the function returns correctly. The SRAM March test requires that the return address is stored at the two highest SRAM addresses used by the application, as defined by the SRAM_DATA_REGION_LEN macro value. However, the linker flag option that points the SP to the highest address in the SRAM area used by the application only takes effect after the SRAM March test has been executed (in a later init section). The SP must therefore additionally be set to the end address in code before calling the SRAM March test.

  • AoU-SRAM_MARCH_TEST-06: The SRAM March test shall only be used for devices with 128 kB of flash memory or less.

    Reason: For devices with a Flash larger than 128 kB, the program counter (PC) is extended to three bytes and, subsequently, a function call will involve pushing three bytes to the stack, which has not been accounted for in the current implementation.

  • AoU-SRAM_MARCH_TEST-07: The SRAM March test function shall be called with all interrupts disabled, including Non-Maskable Interrupts (NMI), High-Priority Interrupts and Normal Priority Interrupts.

    Reason: Interrupting the SRAM March test can lead to undefined behavior and cause the test to return an incorrect test result.

  • AoU-SRAM_MARCH_TEST-08A: The system integrator shall ensure that the WDT does not expire, i.e., cause a reset, during the execution of the SRAM March test.

    Reason: If the worst-case minimum duration of the WDT timeout is longer than the worst-case execution time (WCET) of the SRAM March test function call, a WDT reset instruction can be issued after the test is completed. Refer to Table 1-2 below for the number of cycles required to execute the SRAM March test when no failure is detected, for different optimization levels. Divide the number of cycles by the main clock frequency used during the test (selected by the SRAM_MARCH_CLK_FRQ macro), expressed in Hz, and multiply by 1000 to find a good approximation of the WCET in milliseconds. Since the size of the SRAM directly affects the WCET of the SRAM March test function call, a reasonable approximation of the WCET for a different SRAM size can be found by reducing the WCET by the same factor by which the SRAM size is reduced.

  • AoU-SRAM_MARCH_TEST-08B: If the WDT timeout cannot be selected to be longer than the WCET of the SRAM March test, the system integrator may insert appropriate WDT reset instructions in the SRAM March test function itself to ensure timely resetting of the WDT and avoid a system reset during the SRAM March test.

    Reason: Executing WDT reset instructions (by using the macro wdt_reset() defined in avr/wdt.h) does not interfere with the March C- test.

  • AoU-SRAM_MARCH_TEST-09: The March test will only be performed if the WDT Reset Flag is not set. All other reset sources will trigger the execution of the March test after the reset. It is the responsibility of the system integrator to ensure that the SRAM March test is executed upon a reset after an unintentional WDT reset is detected.

    Reason: As the WDT Diagnostic test involves issuing WDT resets intentionally, the SRAM March test should not be run again while the WDT test is executed. One way to ensure that the March test is run after an unintentional WDT reset is to check for WDT resets in the application code, clear the flag and issue a software reset.

  • AoU-SRAM_MARCH_TEST-10: The system shall operate correctly even if the SRAM March test exits early.

    Reason: The SRAM March test returns earlier than the listed WCET if a memory fault is detected, which can affect the timing of a WDT reset instruction.

  • AoU-SRAM_MARCH_TEST-11: The system integrator shall ensure that the SRAM_MARCH_CLK_FRQ macro is defined such that the device operates within the datasheet specification.

    Reason: The SRAM March test uses the SRAM_MARCH_CLK_FRQ macro to set the main clock frequency used when executing the March algorithm to support reduction of the WCET of the test. The main clock frequency is restored to the default value upon completion of the test. However, for some devices, the maximum main clock frequency is limited by the supply voltage and temperature range of the device. Refer to the Electrical Characteristics chapter of the respective device datasheet for more information.

Table 1-2. SRAM March test cycles

The number of cycles used to execute the SRAM March test on the entire physically available SRAM with different optimization levels on different devices, and the Worst Case Execution Time (WCET) in milliseconds using the maximum allowed main clock frequency for the device (24 MHz on AVR DA and 16 MHz on the automotive version of the AVRtiny1 family) and -Os optimization. The values were obtained by using the stopwatch feature in the MPLAB X IDE 5.50 (Window -> Debugging -> Stopwatch) to count the number of cycles used to execute the SRAM March test on the simulator. The number of cycles is used to calculate the WCET, depending on the main clock frequency, and has been verified to match values measured using both an on-device timer (TCA) and an external logic analyzer. However, the listed WCET does not account for any potential inaccuracy of the main clock source frequency.
Device       -O0       -O1       -O2       -O3       -Os       WCET (max freq, -Os)
AVR128DAXX   5358195   3244283   3047631   3047631   3342524   139 ms
AVR64DAXX    2679413   1622267   1523919   1523919   1671356   69 ms
AVR32DAXX    1340019   811257    762061    762061    835770    35 ms
ATtiny321X   670323    405753    381133    381133    417978    26 ms
ATtiny161X   670323    405753    381133    381133    417978    26 ms
ATtiny81X    168050    101624    95435     95435     104632    7 ms
ATtiny41X    66073     50936     47819     47819     52408     3 ms
ATtiny21X    42468     25593     24012     24012     26304     2 ms
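
The WCET approximation described in AoU-SRAM_MARCH_TEST-08A can be sketched in C as follows. This is a minimal illustration only: the function names march_wcet_ms and march_scale_cycles are hypothetical and not part of the library, and the result is rounded up so the estimate is never optimistic.

```c
#include <stdint.h>

/* Hypothetical helper: approximate WCET in milliseconds,
 * i.e. cycles / clock frequency (Hz) * 1000, rounded up. */
static uint32_t march_wcet_ms(uint32_t cycles, uint32_t clk_hz)
{
    /* A 64-bit intermediate avoids overflow of cycles * 1000. */
    return (uint32_t)(((uint64_t)cycles * 1000u + clk_hz - 1u) / clk_hz);
}

/* Hypothetical helper: scale a known cycle count linearly to a smaller
 * tested SRAM region, as suggested for restricted SRAM sizes. */
static uint32_t march_scale_cycles(uint32_t full_cycles,
                                   uint32_t full_sram_bytes,
                                   uint32_t used_sram_bytes)
{
    return (uint32_t)(((uint64_t)full_cycles * used_sram_bytes
                       + full_sram_bytes - 1u) / full_sram_bytes);
}
```

For example, march_wcet_ms(3342524, 24000000) yields 140, one millisecond above the 139 ms listed for AVR128DAXX because this sketch rounds up. The linear scaling is only the approximation stated in the AoU, not a measured value.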

Background

March tests are a family of memory tests used to test specific fault models of volatile memory, namely RAM or Static RAM (SRAM). The main idea is to “march” through the memory in ascending and descending order, cell by cell, performing write and read operations to sensitize and detect certain fault models. The design of a specific march element requires a systematic approach to mathematically prove that all the fault models in question are covered. The different march elements must be executed in a specific order, specified by the March algorithm, to ensure that all transitions are covered. In the notation of March algorithms, each march element is prefixed with an arrow and the march elements are separated by semicolons. The arrows indicate an ascending (⇑ - start at the lowest address in the memory under test) or descending (⇓ - start at the highest address in the memory under test) order for the following march element. There is also the case where the address order does not matter (⇕). A march element consists of a sequence of write operations (w0 – write 0 or w1 – write 1) and read operations (r0 – read 0 or r1 – read 1). For a read operation, the cell is read and compared with the expected value (r0 expects 0 and r1 expects 1). All operations in a march element are performed on a single RAM cell (bit) before moving to the next cell in either ascending or descending order, where the sequence is repeated. Once all cells in the memory under test have been tested with a march element, the next march element is executed.

The most commonly used March algorithms are called March C- and March B. March C- is a slightly more efficient version of March C, where a redundant ‘r0’ operation is removed. In March notation, they are described as:

March C-: {⇕(w0);⇑(r0,w1);⇑(r1,w0);⇓(r0,w1);⇓(r1,w0);⇕(r0)}

March B: {⇕(w0);⇑(r0,w1,r1,w0,r0,w1);⇑(r1,w0,w1);⇓(r1,w0,w1,w0);⇓(r0,w1,w0)}

Where e.g. ⇑(r0,w1,r1,w0,r0,w1) is March element number 1, or M1, of March B.
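
To make the notation concrete, the following is a minimal host-side sketch of the bit-oriented March C- sequence above, assuming one array entry per cell. The function name march_c_minus_bom is hypothetical, and this is not the library implementation, which runs on the device and must not depend on the SRAM under test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: each array entry models one memory cell (bit).
 * Returns true if every read matched its expected value. */
static bool march_c_minus_bom(volatile uint8_t *cell, size_t n)
{
    size_t i;

    /* M0: any order (w0) */
    for (i = 0; i < n; i++) { cell[i] = 0; }

    /* M1: ascending (r0,w1) */
    for (i = 0; i < n; i++) { if (cell[i] != 0) return false; cell[i] = 1; }

    /* M2: ascending (r1,w0) */
    for (i = 0; i < n; i++) { if (cell[i] != 1) return false; cell[i] = 0; }

    /* M3: descending (r0,w1) */
    for (i = n; i-- > 0; ) { if (cell[i] != 0) return false; cell[i] = 1; }

    /* M4: descending (r1,w0) */
    for (i = n; i-- > 0; ) { if (cell[i] != 1) return false; cell[i] = 0; }

    /* M5: any order (r0) */
    for (i = 0; i < n; i++) { if (cell[i] != 0) return false; }

    return true;
}
```

Each loop is one march element: all operations of the element are applied to a cell before the address moves on, and the next element starts only after the whole memory has been visited.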

Both March B and March C- have been designed to cover the following fault models:

  • Stuck at Faults (SAFs): The logic value of a cell (or line in the sense amplifier or driver) is always 0 or 1.
  • Transition Faults (TFs): A cell fails to make a 0 to 1 transition or a 1 to 0 transition when it is written.
  • Address Decoder Faults (AFs): Possible functional faults in the address decoders:
    • With a certain address, no cell will be accessed
    • Certain address accesses multiple cells
    • A certain cell is accessed with multiple addresses
    • Certain cells are accessed with their own and other addresses
  • Coupling Faults (CFs): Coupling faults are faults that occur in a cell because of coupling with other cells. The terminology distinguishes between an aggressor cell (a-cell) and a victim cell (v-cell). That is, an operation performed on an a-cell can trigger a fault in the v-cell. There can be an exponential number of combinations in which a cell can be coupled with other cells. However, in the widely used coupling fault model, it is assumed that any two cells can be coupled together, leading to irregular behavior in these two cells. This is called a 2-cell coupling fault model. There are several categories of 2-cell CFs, and both March C- and March B cover the most common:
    • Inversion Coupling Faults (CFins): An upper (0 to 1) or lower (1 to 0) transition write operation in an a-cell causes an inversion (toggle) in the v-cell.
    • Static/State coupling Faults (CFsts): A given value 0 or 1 of the a-cell forces a certain value 0 or 1 in a v-cell.
    • Idempotent coupling Faults (CFid): An upper (0 to 1) or lower (1 to 0) transition write operation in an a-cell forces a certain value (0 or 1) in a v-cell.
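
The fault models above can be illustrated with a small host-side simulation. The sketch below is an assumption-laden toy model, not the library code: the types fault_t and sim_mem_t and all function names are hypothetical, and only a stuck-at-0 fault, a rising transition fault and a rising inversion coupling fault are modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_CELLS 16u

typedef enum { FAULT_NONE, FAULT_SA0, FAULT_TF_UP, FAULT_CFIN_UP } fault_t;

typedef struct {
    uint8_t cell[N_CELLS]; /* one entry per simulated bit cell */
    fault_t fault;         /* injected fault type */
    size_t aggressor;      /* a-cell, used by the coupling fault only */
    size_t victim;         /* v-cell, or the single faulty cell */
} sim_mem_t;

static void sim_write(sim_mem_t *m, size_t addr, uint8_t v)
{
    uint8_t old = m->cell[addr];
    if (addr == m->victim) {
        if (m->fault == FAULT_SA0) v = 0;               /* always reads 0 */
        if (m->fault == FAULT_TF_UP && old == 0) v = 0; /* cannot rise */
    }
    m->cell[addr] = v;
    /* CFin: a 0 to 1 write on the a-cell toggles the v-cell. */
    if (m->fault == FAULT_CFIN_UP && addr == m->aggressor &&
        old == 0 && v == 1) {
        m->cell[m->victim] ^= 1u;
    }
}

static uint8_t sim_read(const sim_mem_t *m, size_t addr)
{
    return m->cell[addr];
}

/* One march element of the form (r<expect>, w<inverse of expect>). */
static bool elem_rw(sim_mem_t *m, bool descending, uint8_t expect)
{
    for (size_t k = 0; k < N_CELLS; k++) {
        size_t a = descending ? N_CELLS - 1u - k : k;
        if (sim_read(m, a) != expect) return false;
        sim_write(m, a, (uint8_t)!expect);
    }
    return true;
}

static bool march_c_minus(sim_mem_t *m)
{
    for (size_t a = 0; a < N_CELLS; a++) sim_write(m, a, 0); /* M0 */
    if (!elem_rw(m, false, 0)) return false;                 /* M1 asc  */
    if (!elem_rw(m, false, 1)) return false;                 /* M2 asc  */
    if (!elem_rw(m, true, 0)) return false;                  /* M3 desc */
    if (!elem_rw(m, true, 1)) return false;                  /* M4 desc */
    for (size_t a = 0; a < N_CELLS; a++) {                   /* M5 */
        if (sim_read(m, a) != 0) return false;
    }
    return true;
}

/* Runs March C- on a zero-initialized memory with one injected fault. */
static bool run_with_fault(fault_t f, size_t aggressor, size_t victim)
{
    sim_mem_t m = { .fault = f, .aggressor = aggressor, .victim = victim };
    return march_c_minus(&m);
}
```

In this model the test passes on a fault-free memory and fails for each injected fault, regardless of whether the a-cell has a lower or higher address than the v-cell. That is why the algorithm marches in both ascending and descending order.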

March B additionally covers some Linked Faults (LFs). LFs can be defined as combinations of single-cell and/or two-cell faults in which the faulty behavior of a cell can be masked by the faulty behavior caused by another fault in the same cell. Linked Faults take place when more than one Fault Primitive (FP) is sensitized in a defective cell of a memory. Specifically, March B covers CFid-TF (CFid linked with TF) and CFid-CFid (CFid linked with CFid). Since March B covers more fault models than March C-, the two differ in Test Length (TL), that is, the combined number of read and write operations necessary to complete the test. The TL for March C- is 10*n while the TL for March B is 17*n, where n is the number of bits/cells in the memory under test.
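
The test lengths quoted above follow directly from counting the read and write operations in each algorithm. A minimal sketch, assuming an ASCII transcription of the notation in which "any", "up" and "down" stand in for the arrows (the macro and function names are hypothetical):

```c
#include <stddef.h>

/* ASCII transcriptions of the two algorithms (arrows replaced by words). */
#define MARCH_C_MINUS \
    "{any(w0);up(r0,w1);up(r1,w0);down(r0,w1);down(r1,w0);any(r0)}"
#define MARCH_B \
    "{any(w0);up(r0,w1,r1,w0,r0,w1);up(r1,w0,w1);" \
    "down(r1,w0,w1,w0);down(r0,w1,w0)}"

/* Counts operations of the form r0, r1, w0 or w1 in a notation string.
 * The 'w' in "down" is not counted because no digit follows it. */
static unsigned march_ops_per_cell(const char *notation)
{
    unsigned ops = 0;
    for (const char *p = notation; *p != '\0'; p++) {
        if ((p[0] == 'r' || p[0] == 'w') && (p[1] == '0' || p[1] == '1')) {
            ops++;
        }
    }
    return ops;
}

/* Test length for n cells: operations per cell times number of cells. */
static unsigned long march_test_length(const char *notation, unsigned long n)
{
    return (unsigned long)march_ops_per_cell(notation) * n;
}
```

march_ops_per_cell(MARCH_C_MINUS) gives 10 and march_ops_per_cell(MARCH_B) gives 17, matching the TL of 10*n and 17*n.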

As the March C algorithm has the same coverage as March C-, but with a longer execution time, it is not implemented for this test. Since March B is so much more expensive in terms of test length and code size and does not fully cover all LFs, while LFs are comparatively rare faults to occur, it is not implemented for this test.

Word Oriented Memory Conversion

March tests are designed with Bit Oriented Memories (BOM) in mind. That is, it is assumed that it is possible to read and write a single cell or bit at a time. However, this is not the case for Word Oriented Memories (WOM), where it is only possible to read or write an entire word at a time, which is the case for most architectures. On AVR devices, 8 bits (one byte) are read or written simultaneously when executing a read or write instruction. One way to circumvent this issue is to use bit masks to isolate a single bit of the word being read with a bitwise AND operation. This implementation requires additional instructions for either bit-shifting a single mask variable to iterate through the bits in a word or loading a predefined mask for each bit in a word, in addition to performing the masking operation itself before a compare can be done to verify the value. When setting a specific bit to 1, a bitwise OR operation is used between the bit mask for that bit and the read word, before writing the word back to memory. When setting a specific bit to 0, a bitwise AND operation is used with the inverted bit mask. All these additional instructions significantly increase the execution time of the test.
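
The bit-mask approach described above can be sketched as follows for a byte-wide memory. The function names bom_read_bit and bom_write_bit are hypothetical. The sketch only illustrates why each single-bit operation costs several instructions (mask computation, read, modify, write back).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reads one bit: load the byte, then isolate the bit with a bitwise AND. */
static bool bom_read_bit(const volatile uint8_t *mem, size_t bit_index)
{
    uint8_t mask = (uint8_t)(1u << (bit_index % 8u));
    return (mem[bit_index / 8u] & mask) != 0;
}

/* Writes one bit with a read-modify-write of the containing byte. */
static void bom_write_bit(volatile uint8_t *mem, size_t bit_index, bool value)
{
    uint8_t mask = (uint8_t)(1u << (bit_index % 8u));
    uint8_t word = mem[bit_index / 8u];   /* read the whole word */

    if (value) {
        word |= mask;                     /* set: OR with the mask */
    } else {
        word &= (uint8_t)~mask;           /* clear: AND with inverted mask */
    }
    mem[bit_index / 8u] = word;           /* write the whole word back */
}
```

A bit-oriented March test built on these helpers touches every byte eight times per march operation, which is the overhead the word-oriented conversion avoids.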

This issue can be circumvented by converting a BOM March algorithm into a WOM March algorithm as outlined in [1][2][3][4]. That is, the algorithm is converted into one that writes an entire word in each step of a march element. Generally, to create a WOM March test, a set of data backgrounds (DBs) that satisfies coverage for the fault models in question is defined. A DB is a specific bit string whose length equals the number of bits in a word of a specific architecture (e.g. 00001111 or 10101010 for an 8-bit architecture). A march element is then created for each data background, in which the data background and its inverted value are read and written in ascending and descending order. Moreover, it is possible to make this even more efficient by separating the test into an inter-word test and an intra-word test and then concatenating them into one algorithm. Finally, the test can be optimized by removing redundant operations. By following the methods described in [4], the BOM March C- algorithm was converted into a WOM Unrestricted Coupling Fault (uCFs) March C- algorithm and implemented in this diagnostic test. The uCFs fault model ensures full coverage of all the listed coupling faults covered by the BOM March algorithm between every cell in the memory under test and makes no assumptions about the underlying architecture.

The WOM uCFs March C- algorithm for an 8-bit architecture is as follows:

{⇕0(w00000000); ⇑1(r00000000, w11111111); ⇑2(r11111111, w00000000);
⇓3(r00000000, w11111111); ⇓4(r11111111, w00000000)}

{⇓5(r00000000, w01010101); ⇑6(r01010101, w10101010);
⇓7(r10101010, w01010101); ⇑8(r01010101, w00110011);
⇓9(r00110011, w11001100); ⇑10(r11001100, w00110011);
⇓11(r00110011, w00001111); ⇑12(r00001111, w11110000);
⇓13(r11110000, w00001111); ⇑14(r00001111)}
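
The sequence above can be expressed as a table-driven loop. The following is a host-side sketch under stated assumptions: the type wom_elem_t and the names WOM_MARCH, wom_march_run and wom_march_ops_per_word are hypothetical, and the actual diagnostic runs on the device without relying on the SRAM under test, which this sketch does not attempt to model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool descending;  /* address order of the element */
    bool do_read;     /* element starts with a read-and-compare */
    uint8_t expect;   /* expected data background on read */
    bool do_write;    /* element ends with a write */
    uint8_t write;    /* data background to write */
} wom_elem_t;

static const wom_elem_t WOM_MARCH[] = {
    /* Inter-word part (M0..M4) */
    { false, false, 0x00, true,  0x00 }, /* M0:  any  (w00000000) */
    { false, true,  0x00, true,  0xFF }, /* M1:  asc  (r00000000,w11111111) */
    { false, true,  0xFF, true,  0x00 }, /* M2:  asc  (r11111111,w00000000) */
    { true,  true,  0x00, true,  0xFF }, /* M3:  desc (r00000000,w11111111) */
    { true,  true,  0xFF, true,  0x00 }, /* M4:  desc (r11111111,w00000000) */
    /* Intra-word part (M5..M14) */
    { true,  true,  0x00, true,  0x55 }, /* M5:  desc (r00000000,w01010101) */
    { false, true,  0x55, true,  0xAA }, /* M6:  asc  (r01010101,w10101010) */
    { true,  true,  0xAA, true,  0x55 }, /* M7:  desc (r10101010,w01010101) */
    { false, true,  0x55, true,  0x33 }, /* M8:  asc  (r01010101,w00110011) */
    { true,  true,  0x33, true,  0xCC }, /* M9:  desc (r00110011,w11001100) */
    { false, true,  0xCC, true,  0x33 }, /* M10: asc  (r11001100,w00110011) */
    { true,  true,  0x33, true,  0x0F }, /* M11: desc (r00110011,w00001111) */
    { false, true,  0x0F, true,  0xF0 }, /* M12: asc  (r00001111,w11110000) */
    { true,  true,  0xF0, true,  0x0F }, /* M13: desc (r11110000,w00001111) */
    { false, true,  0x0F, false, 0x00 }, /* M14: asc  (r00001111) */
};

#define WOM_MARCH_LEN (sizeof WOM_MARCH / sizeof WOM_MARCH[0])

/* Runs all march elements over n_bytes; returns false on first mismatch. */
static bool wom_march_run(volatile uint8_t *mem, size_t n_bytes)
{
    for (size_t e = 0; e < WOM_MARCH_LEN; e++) {
        const wom_elem_t *el = &WOM_MARCH[e];
        for (size_t k = 0; k < n_bytes; k++) {
            size_t a = el->descending ? n_bytes - 1u - k : k;
            if (el->do_read && mem[a] != el->expect) return false;
            if (el->do_write) mem[a] = el->write;
        }
    }
    return true;
}

/* Number of read and write operations applied to each word. */
static unsigned wom_march_ops_per_word(void)
{
    unsigned ops = 0;
    for (size_t e = 0; e < WOM_MARCH_LEN; e++) {
        ops += (unsigned)WOM_MARCH[e].do_read + (unsigned)WOM_MARCH[e].do_write;
    }
    return ops;
}
```

In this sketch a fault-free buffer passes and is left holding the last data background written (00001111), which reflects the destructive nature of the test. wom_march_ops_per_word() evaluates to 28, the per-word operation count of the sequence.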

The WOM uCFs March C- algorithm has a TL of 28*n/8, where n is the number of bits in the memory under test. This reduces the number of read/write operations by approximately 65% compared to the BOM March C- algorithm (3.5*n versus 10*n).

[1] A. J. van de Goor and I. B. S. Tlili, "March tests for word-oriented memories," Proceedings Design, Automation and Test in Europe, Paris, France, 1998, pp. 501-508, doi: 10.1109/DATE.1998.655905.

[2] A. J. van de Goor, I. B. S. Tlili and S. Hamdioui, "Converting March tests for bit-oriented memories into tests for word-oriented memories," Proceedings International Workshop on Memory Technology, Design and Testing, 1998, pp. 46-52, doi: 10.1109/MTDT.1998.705945.

[3] V. G. Mikitjuk, V. N. Yarmolik and A. J. van de Goor, "RAM testing algorithms for detection multiple linked faults," Proceedings ED&TC European Design and Test Conference, Paris, France, 1996, pp. 435-439, doi: 10.1109/EDTC.1996.494337.

[4] A. J. van de Goor and I. B. S. Tlili, "A systematic method for modifying march tests for bit-oriented memories into tests for word-oriented memories," in IEEE Transactions on Computers, vol. 52, no. 10, pp. 1320-1331, Oct. 2003, doi: 10.1109/TC.2003.1234529.