1.18 Flash Erase/Write Operation Security

Issue Overview

Whenever code has the ability to change the flash content, those operations need to be safeguarded as much as possible against accidental writes/erases. One source of accidental operation execution is invalid code jumps. These could be caused by software bugs or by hardware issues such as mis-execution from improper brown-out reset (BOR) guarding at system power up, power down, or during a power spike/glitch.

Below is example code for writing some data to flash that we'll use for discussion:

Example 1
#define FLASH_UNLOCK_KEY 0x00AA0055u
if(UpdateNeeded() == true){
    if(dataValid() == true){
        FLASH_Unlock(FLASH_UNLOCK_KEY);
        FLASH_Write(data);
        FLASH_Lock();
    }
}

In example 1 above, the flash unlock key is hard-coded into the program. The checks that decide whether we want to write the data into flash are done before unlocking the flash. Under normal operating conditions, this code looks very reasonable. But in the mis-execution situations mentioned previously, this code is actually dangerous. Consider the case where mis-executing code jumps from somewhere else in program memory directly to the FLASH_Unlock() call. Both validity checks are skipped, the code itself supplies the key, and a write to the flash is going to happen, possibly with invalid data.

As you can see, this code carries some risk around the flash operations.
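The failure mode can be reproduced on a PC with a small simulation. This is only a sketch under stated assumptions: the FLASH_* functions below are mock stand-ins for the real driver, the single-word flash model is a simplification, and the strayJump parameter (not part of the original example) uses a goto to imitate execution landing directly on the unlock call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_UNLOCK_KEY 0x00AA0055u

/* Mock flash (hypothetical): one erased word that accepts writes only while unlocked. */
static bool flashUnlocked = false;
static uint32_t flashWord = 0xFFFFFFFFu;

void FLASH_Unlock(uint32_t key) { flashUnlocked = (key == FLASH_UNLOCK_KEY); }
void FLASH_Write(uint32_t data) { if (flashUnlocked) { flashWord = data; } }
void FLASH_Lock(void) { flashUnlocked = false; }

/* strayJump == true models execution landing directly on FLASH_Unlock(). */
void UnguardedWrite(bool strayJump, bool updateNeeded, bool dataValid, uint32_t data)
{
    if (strayJump) {
        goto unlock; /* skips both validity checks */
    }
    if (updateNeeded) {
        if (dataValid) {
unlock:
            FLASH_Unlock(FLASH_UNLOCK_KEY); /* key is hard-coded here */
            FLASH_Write(data);
            FLASH_Lock();
        }
    }
}

/* Returns 0 when both scenarios behave as described in the text. */
int RunHardCodedKeyDemo(void)
{
    /* Normal flow: the checks fail, so nothing is written. */
    UnguardedWrite(false, false, false, 0xDEADBEEFu);
    assert(flashWord == 0xFFFFFFFFu);

    /* Stray jump: the checks are skipped and the unwanted data lands in flash. */
    UnguardedWrite(true, false, false, 0xDEADBEEFu);
    assert(flashWord == 0xDEADBEEFu);
    return 0;
}
```

Calling RunHardCodedKeyDemo() from a test harness exercises both paths. The second case is the one the mitigation techniques are meant to block.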

Mitigation Techniques

There are a few methods that can be used to help mitigate some of the risk around these flash operations. We will cover the techniques used in this bootloader solution to help reduce that risk.

Externally Sourced Key:

The first mitigation technique is to not hard-code the key, but instead get the key from an external source that is not always present. For example, for the flash operations associated with downloading a new flash image from an external bootloader host, that host can provide the key in the command data sent to the target device. We use this technique for all of our host-driven flash operations. In this case the flash unlock key isn't in the code, and thus it becomes very unlikely that a flash operation can occur when you aren't in a firmware upgrade session. Below is an example of what this might look like.

Example 2
uint8_t commandPacket[100] = {0};
uint32_t flashKey = 0;

UART_GetCommand(&commandPacket); //Read command from host
    
memcpy(&flashKey, &commandPacket[4], sizeof(uint32_t)); //Key is in bytes 4-7
FLASH_Unlock(flashKey); //Use the host-supplied key, not a constant
FLASH_Write(data);
FLASH_Lock();
memset(commandPacket, 0x00, sizeof(commandPacket));

In example 2, the host is sending the flash key as part of the protocol. As a result, the target device code doesn't have a copy of the key at all. So if we are outside of a bootload operation and a command hasn't been sent by the host, then a flash operation is much less likely.

Let's say we are outside of the bootload operation and, because of a code error, execution jumps to the memcpy() call in example 2. In this case the flash will only unlock if commandPacket[4] through commandPacket[7] still hold the unlock key. After each flash operation, we destroy that copy of the key in RAM (the memset() call at the end of example 2). So unless the errant code manages not only to jump to this location but also to write the correct 32-bit value to the right location, the flash write operation will be blocked. This greatly reduces the risk of an accidental flash write.
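The scenario above can be sketched as a host-side simulation. Everything outside the original example is an assumption made for illustration: the mock FLASH_* functions, the HostSendKey() helper that stands in for the host protocol, the native byte order of the key, and the single-word flash model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define KEY_OFFSET 4u /* assumed: key occupies commandPacket[4..7] */

/* Mock flash hardware (hypothetical): unlocks only for the correct key. */
static bool flashUnlocked = false;
static uint32_t flashWord = 0xFFFFFFFFu; /* one erased word */

void FLASH_Unlock(uint32_t key) { flashUnlocked = (key == 0x00AA0055u); }
void FLASH_Write(uint32_t data) { if (flashUnlocked) { flashWord = data; } }
void FLASH_Lock(void) { flashUnlocked = false; }
uint32_t MockFlashRead(void) { return flashWord; }

static uint8_t commandPacket[100];

/* Host side (hypothetical): place the key in the packet, native byte order. */
void HostSendKey(uint32_t key)
{
    memcpy(&commandPacket[KEY_OFFSET], &key, sizeof(key));
}

/* Target side: the code from the memcpy() onward, where a stray jump lands. */
void WriteUsingPacketKey(uint32_t data)
{
    uint32_t flashKey = 0;

    memcpy(&flashKey, &commandPacket[KEY_OFFSET], sizeof(flashKey));
    FLASH_Unlock(flashKey);
    FLASH_Write(data);
    FLASH_Lock();
    memset(commandPacket, 0x00, sizeof(commandPacket)); /* destroy the key */
}

/* Returns 0 when both scenarios behave as described in the text. */
int RunExternalKeyDemo(void)
{
    /* Legitimate session: the host supplied the key, so the write lands. */
    HostSendKey(0x00AA0055u);
    WriteUsingPacketKey(0x11111111u);
    assert(MockFlashRead() == 0x11111111u);

    /* Stray jump later: the packet was wiped, so the unlock fails. */
    WriteUsingPacketKey(0x22222222u);
    assert(MockFlashRead() == 0x11111111u);
    return 0;
}
```

The key constant appears only inside the mock hardware and the simulated host. The target-side function never contains it, which is the point of the technique.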

Calculated Flash Key:

There are times when you may need to do flash operations while not connected to an external source that can limit the device's access to the flash key to only when it is allowed. In this case another method of safeguarding the key must be used.

This bootloader solution uses a calculated flash key to help reduce the risk of invalid key usage.

We saw in example 1 that the mis-execution case allowed the code to bypass all of the validity checks we had in place for the write operation (the UpdateNeeded() and dataValid() checks). To prevent that from happening, instead of supplying the entire key in one place, we provide part of the key in two places: one part before the validity checks and the second part after them. This makes it so that the code must go through the validity checks in order to do the flash operation. Let's look at an example of what this might look like.

Example 3
#define FLASH_UNLOCK_KEY 0x00AA0055u
#define PARTIAL_KEY_1 (FLASH_UNLOCK_KEY + PARTIAL_KEY_2)
#define PARTIAL_KEY_2 (0x12345678u)

volatile uint32_t flashKey = 0;

flashKey = PARTIAL_KEY_1;
if(UpdateNeeded() == true){
    if(dataValid() == true){
        FLASH_Unlock(flashKey - PARTIAL_KEY_2);
        FLASH_Write(data);
        FLASH_Lock();
    }
}
flashKey = 0;

In the example above we use the real flash key to create two partial keys that are used together to calculate the real key. We use these partial keys rather than the real key. We place one partial key before the validity checks (the flashKey = PARTIAL_KEY_1 assignment) and the other after the validity checks (the subtraction of PARTIAL_KEY_2 inside the FLASH_Unlock() call).

Just like with example 1, let's say the mis-executing code jumps straight to the FLASH_Unlock() call. By default the flashKey variable is 0, and we always reset it to 0 after a flash operation (the final flashKey = 0 assignment). So if the flashKey variable is 0, the FLASH_Unlock() function gets called with 0 - 0x12345678, which wraps around to 0xEDCBA988 in 32-bit unsigned arithmetic, and thus will not unlock the flash, since a value of 0x00AA0055 is required to unlock the flash.
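Working the key arithmetic through with the constants from example 3 makes both cases concrete. The sketch below assumes 32-bit unsigned arithmetic; UnlockArgument() and CheckKeyArithmetic() are hypothetical helpers introduced only to check the numbers.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_UNLOCK_KEY 0x00AA0055u
#define PARTIAL_KEY_2    (0x12345678u)
#define PARTIAL_KEY_1    (FLASH_UNLOCK_KEY + PARTIAL_KEY_2)

/* Value that reaches FLASH_Unlock() for a given flashKey content. */
uint32_t UnlockArgument(uint32_t flashKey)
{
    return flashKey - PARTIAL_KEY_2; /* unsigned subtraction wraps modulo 2^32 */
}

/* Returns 0 when the arithmetic matches the values stated below. */
int CheckKeyArithmetic(void)
{
    /* 0x00AA0055 + 0x12345678 */
    assert(PARTIAL_KEY_1 == 0x12DE56CDu);

    /* Normal path: flashKey was loaded before the checks, so the real key results. */
    assert(UnlockArgument(PARTIAL_KEY_1) == FLASH_UNLOCK_KEY);

    /* Stray jump to the unlock call: flashKey is still 0, so the key is wrong. */
    assert(UnlockArgument(0u) == 0xEDCBA988u);
    return 0;
}
```

Only a flashKey value of 0x12DE56CD produces the real unlock key, so a stray jump succeeds only if that exact value happens to be in the variable.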

What happens if we instead jump to the flashKey = PARTIAL_KEY_1 assignment when we don't want to update the flash? The flashKey variable will be loaded with the first partial key, but the code will now be forced to run through the validity checks. If these validity checks are able to determine that a flash write is not desired, then the invalid flash write is prevented and flashKey is set back to 0 to help prevent other bad writes.

This doesn't completely prevent invalid flash operations from occurring, but it helps reduce the likelihood of them happening.
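Both stray-jump cases for the calculated key can be checked with a PC simulation. As before, this is a sketch rather than the bootloader's actual code: the FLASH_* functions are mocks, the flash is modeled as a single word, and the EntryPoint parameter with its goto is a hypothetical device for imitating a jump into the middle of the routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_UNLOCK_KEY 0x00AA0055u
#define PARTIAL_KEY_2    (0x12345678u)
#define PARTIAL_KEY_1    (FLASH_UNLOCK_KEY + PARTIAL_KEY_2)

/* Mock flash (hypothetical): one erased word that accepts writes only while unlocked. */
static bool flashUnlocked = false;
static uint32_t flashWord = 0xFFFFFFFFu;

void FLASH_Unlock(uint32_t key) { flashUnlocked = (key == FLASH_UNLOCK_KEY); }
void FLASH_Write(uint32_t data) { if (flashUnlocked) { flashWord = data; } }
void FLASH_Lock(void) { flashUnlocked = false; }

typedef enum { ENTER_AT_KEY_LOAD, ENTER_AT_UNLOCK } EntryPoint;

static volatile uint32_t flashKey = 0;

void GuardedWrite(EntryPoint entry, bool updateNeeded, bool dataValid, uint32_t data)
{
    if (entry == ENTER_AT_UNLOCK) {
        goto unlock; /* models a stray jump past the key load and the checks */
    }
    flashKey = PARTIAL_KEY_1; /* first partial key, before the checks */
    if (updateNeeded) {
        if (dataValid) {
unlock:
            FLASH_Unlock(flashKey - PARTIAL_KEY_2); /* second part, after the checks */
            FLASH_Write(data);
            FLASH_Lock();
        }
    }
    flashKey = 0;
}

/* Returns 0 when all three scenarios behave as described in the text. */
int RunCalculatedKeyDemo(void)
{
    /* Stray jump to the unlock call: flashKey is 0, so the unlock fails. */
    GuardedWrite(ENTER_AT_UNLOCK, false, false, 0xDEADBEEFu);
    assert(flashWord == 0xFFFFFFFFu);

    /* Stray jump to the key load: the checks run and reject the write. */
    GuardedWrite(ENTER_AT_KEY_LOAD, false, false, 0xDEADBEEFu);
    assert(flashWord == 0xFFFFFFFFu);
    assert(flashKey == 0u);

    /* Legitimate update: both checks pass and the write lands. */
    GuardedWrite(ENTER_AT_KEY_LOAD, true, true, 0x11111111u);
    assert(flashWord == 0x11111111u);
    return 0;
}
```

The simulation covers only the two entry points discussed in the text. A jump that lands on the unlock call while flashKey still holds the first partial key would succeed, which is why the technique reduces the risk rather than eliminating it.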

NOTE: In the example above we declared the flashKey variable volatile to help reduce the likelihood that the compiler would fold the entire flashKey calculation into the FLASH_Unlock() call. If the compiler determined that flashKey was a constant value and injected that constant directly into the FLASH_Unlock() call instead of creating a variable, it would break the safeguard we just tried to put in place.