9.2 Variable Memory

A March algorithm has been selected for testing the variable memory. The specific March algorithm implemented is known as March X test, which can be described as follows:

⇕(w0);⇑(r0,w1);⇓(r1,w0);⇕(r0)

The first phase is to write a 0 to all memory locations, in any order. The second phase consists of three operations that are performed on each bit, starting with the lowest address:

  • Read a bit and verify that it is 0. If it is 1, a Fault has occurred
  • Write 1 to its location
  • Repeat for the next bit

The third phase also consists of three operations that are done in the opposite order of addresses (with respect to the second phase):

  • Read a bit and verify that it is 1. If it is 0, a Fault has occurred
  • Write 0 to its location
  • Repeat for the next bit

The fourth and final phase consists of verifying that all bits are 0, in any order. Note that the actual address order used in phases two and three does not matter, as long as phase three runs in the exact reverse order of phase two. This test is equivalent to the more common March C- test with its third and fourth March elements omitted.
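The four phases above can be sketched in C for a bit-oriented memory. This is a minimal illustration, not the routine shipped with this document: the function name march_x_bits and the helpers bit_read and bit_write are hypothetical, and the memory is assumed to be a byte array addressed bit by bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read bit i of a packed bit array. */
static bool bit_read(const volatile uint8_t *mem, size_t i)
{
    return (mem[i / 8] >> (i % 8)) & 1u;
}

/* Hypothetical helper: write bit i of a packed bit array. */
static void bit_write(volatile uint8_t *mem, size_t i, bool v)
{
    uint8_t mask = (uint8_t)(1u << (i % 8));
    if (v)
        mem[i / 8] |= mask;
    else
        mem[i / 8] &= (uint8_t)~mask;
}

/* Bit-oriented March X sketch. Returns true if a fault is detected.
 * The test is destructive: a fault-free memory is left all zero. */
bool march_x_bits(volatile uint8_t *mem, size_t nbits)
{
    size_t i;

    for (i = 0; i < nbits; i++)          /* phase 1: any order, w0 */
        bit_write(mem, i, false);

    for (i = 0; i < nbits; i++) {        /* phase 2: ascending, r0 then w1 */
        if (bit_read(mem, i))
            return true;
        bit_write(mem, i, true);
    }

    for (i = nbits; i-- > 0; ) {         /* phase 3: descending, r1 then w0 */
        if (!bit_read(mem, i))
            return true;
        bit_write(mem, i, false);
    }

    for (i = 0; i < nbits; i++)          /* phase 4: any order, r0 */
        if (bit_read(mem, i))
            return true;

    return false;
}
```

The descending loop in phase 3 visits the addresses in exactly the reverse order of phase 2, which is the only ordering constraint the algorithm imposes.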

March X can detect the following faults:

  • Address decoder faults
  • Single cell faults: stuck-at, transition, or data retention faults
  • Faults between memory cells: some, but not all, possible coupling faults (CFs)

Note: Detecting all possible coupling faults is very hard to do in a timely fashion. There are several reasons for this; one of them is the partitioning of the SRAM. Because the test operates on one partition of the SRAM at a time, it cannot check for CFs between partitions. Even if partitions overlap, there is still no guarantee; the overlap only reduces the risk of CFs going undetected.

The March test that has been described is defined for bit-oriented memories (BOMs). The SRAM in AVRs is a word-oriented memory (WOM). Replacing r0, r1, w0, and w1 by, respectively, rD, rD̄, wD, and wD̄, where D can be any data background and D̄ is its bitwise complement, converts the BOM March test to a WOM March test that covers inter-word CFs. In our implementation, the data background D=0x00 has been chosen.
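The word-oriented conversion can be sketched as follows, assuming 8-bit words. The function name march_x_words and its parameters are hypothetical and chosen for illustration only; the background is passed as d, and its bitwise complement takes the role that 1 plays in the bit-oriented test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Word-oriented March X sketch for 8-bit words.
 * d is the data background D; nd is its bitwise complement.
 * Returns true if a fault is detected. Destructive: a fault-free
 * region is left filled with d. */
bool march_x_words(volatile uint8_t *mem, size_t len, uint8_t d)
{
    const uint8_t nd = (uint8_t)~d;
    size_t i;

    for (i = 0; i < len; i++)            /* any order: wD */
        mem[i] = d;

    for (i = 0; i < len; i++) {          /* ascending: rD, w~D */
        if (mem[i] != d)
            return true;
        mem[i] = nd;
    }

    for (i = len; i-- > 0; ) {           /* descending: r~D, wD */
        if (mem[i] != nd)
            return true;
        mem[i] = d;
    }

    for (i = 0; i < len; i++)            /* any order: rD */
        if (mem[i] != d)
            return true;

    return false;
}
```

Calling march_x_words(region, size, 0x00) corresponds to the background chosen in the text; the volatile qualifier is there so that a compiler does not optimize away the read-back of a value it has just written.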

It is important to consider the physical location of bits in a row of the memory cell array. The proposed self-diagnostic routine can be configured to append an additional March element with the background sequence {0x55, 0xAA, 0x33, 0xCC, 0x0F, 0xF0}. This will add coverage for the intra-word state CFs considered in the unrestricted intra-word CFs model.
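One possible shape for such an additional element is sketched below. The text does not spell out the exact operation sequence, so the write-then-read-back order used here, the function name march_intra_word, and the final write of 0x00 are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Background sequence taken from the text. */
static const uint8_t backgrounds[] = { 0x55, 0xAA, 0x33, 0xCC, 0x0F, 0xF0 };

/* Assumed extra March element: for every word, write each background
 * in turn and read it back. Returns true if a fault is detected.
 * Each word is left at 0x00 (an assumption, to match D = 0x00). */
bool march_intra_word(volatile uint8_t *mem, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        for (size_t b = 0; b < sizeof backgrounds; b++) {
            mem[i] = backgrounds[b];
            if (mem[i] != backgrounds[b])
                return true;
        }
        mem[i] = 0x00;
    }
    return false;
}
```

The backgrounds come in complementary pairs (0x55/0xAA, 0x33/0xCC, 0x0F/0xF0), so every pair of bits within a byte takes differing values under at least one of the patterns.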

In order to make it possible to run the test even with application data in SRAM, the memory is divided into a configurable number of sections that are tested in turn. The simplest behavior of the test is when there is no overlap between memory sections. In this case, all sections have the same size, except possibly the last one. The first memory section (referred to as the buffer) is reserved. It is used by the test to store the content of the other sections while they are being tested. This is necessary given that the implemented March test is destructive.

Given that the March X algorithm is run on one memory section at a time, there is a user-configurable overlap between memory sections that decreases the probability that inter-word CFs go undetected. Every time a memory section is tested, a part of the previous section is tested as well. Note that this does not apply to the buffer, since it is the first section. The buffer must be larger than in the non-overlapping case (the size of the second section is decreased correspondingly).
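The save, test, and restore cycle with overlapping sections can be sketched as below. This is a simplified model under stated assumptions, not the implemented routine: the names sram_test_sections and march_x_region are hypothetical, the buffer is assumed to occupy the start of the array, and the first section after the buffer is tested without overlapping into the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compact word-oriented March X with D = 0x00 (destructive). */
static bool march_x_region(volatile uint8_t *mem, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) mem[i] = 0x00;
    for (i = 0; i < len; i++) { if (mem[i] != 0x00) return true; mem[i] = 0xFF; }
    for (i = len; i-- > 0; )  { if (mem[i] != 0xFF) return true; mem[i] = 0x00; }
    for (i = 0; i < len; i++) { if (mem[i] != 0x00) return true; }
    return false;
}

/* Returns 0 if no fault is found, 1 on a detected fault, and -1 if the
 * configuration is invalid. Assumed layout: sram[0 .. buf_size) is the
 * reserved buffer, followed by sections of sec_size bytes (the last
 * one may be shorter). */
int sram_test_sections(uint8_t *sram, size_t total, size_t buf_size,
                       size_t sec_size, size_t overlap)
{
    if (sec_size == 0 || overlap >= sec_size ||
        buf_size < sec_size + overlap || buf_size > total)
        return -1;

    /* The buffer holds no application data, so test it destructively. */
    if (march_x_region(sram, buf_size))
        return 1;

    for (size_t start = buf_size; start < total; start += sec_size) {
        size_t end = start + sec_size;
        if (end > total)
            end = total;

        /* Extend the tested range back into the previous section. */
        size_t lo = (start == buf_size) ? start : start - overlap;
        size_t n = end - lo;

        memcpy(sram, sram + lo, n);              /* save to buffer */
        bool fault = march_x_region(sram + lo, n);
        memcpy(sram + lo, sram, n);              /* restore content */
        if (fault)
            return 1;
    }
    return 0;
}
```

The check buf_size >= sec_size + overlap reflects the point made above: with overlap enabled, each tested range is longer than one section, so the buffer has to grow by the overlap to hold it.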

An example application that shows how the memory test can be embedded in an application is included. An LED that signals correct behavior of the system is switched ON, and the program then stays in a loop where the SRAM is tested for as long as no errors are found.

The self-diagnostic routine can be tested as follows:

  • Set a number of breakpoints in the function that implements the March X algorithm
  • Modify the content of some memory locations to model different types of inter-word coupling faults

The test module will then set the error flag, which causes the application to exit the main loop and switch the LED OFF.
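The same kind of fault injection can also be tried on a host machine with a simulated memory, as an alternative to the debugger procedure above. The sketch below is not part of the described routine: the struct sim_mem, its fields, and the functions sim_write, sim_read, and sim_march_x are hypothetical, and the fault model (a non-zero write to an aggressor word forces bits in a victim word to 1) is one simple kind of inter-word coupling fault chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_SIZE 16u

/* Hypothetical simulated memory with one optional coupling fault. */
struct sim_mem {
    uint8_t cells[SIM_SIZE];
    bool    cf_enabled;    /* true: the coupling fault is active */
    size_t  aggressor;     /* address whose writes disturb the victim */
    size_t  victim;        /* address that gets disturbed */
    uint8_t victim_mask;   /* victim bits forced to 1 */
};

static void sim_write(struct sim_mem *m, size_t addr, uint8_t value)
{
    m->cells[addr] = value;
    /* Fault model: a non-zero write to the aggressor sets victim bits. */
    if (m->cf_enabled && addr == m->aggressor && value != 0x00)
        m->cells[m->victim] |= m->victim_mask;
}

static uint8_t sim_read(const struct sim_mem *m, size_t addr)
{
    return m->cells[addr];
}

/* Word-oriented March X (D = 0x00) on the simulated memory.
 * Returns true if a fault is detected. */
bool sim_march_x(struct sim_mem *m)
{
    size_t i;

    for (i = 0; i < SIM_SIZE; i++)
        sim_write(m, i, 0x00);

    for (i = 0; i < SIM_SIZE; i++) {
        if (sim_read(m, i) != 0x00)
            return true;
        sim_write(m, i, 0xFF);
    }

    for (i = SIM_SIZE; i-- > 0; ) {
        if (sim_read(m, i) != 0xFF)
            return true;
        sim_write(m, i, 0x00);
    }

    for (i = 0; i < SIM_SIZE; i++)
        if (sim_read(m, i) != 0x00)
            return true;

    return false;
}
```

In this model, a fault with the aggressor at a lower address than the victim is caught in the ascending phase, because the victim is disturbed before it is read. With the roles swapped, the victim already holds 0xFF when the aggressor is written, so this particular fault goes unnoticed, which illustrates why only some coupling faults are covered.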