3.4.1 Architectural Overview
The PBU is a direct-mapped, 128-line cache that speeds up program data fetches from Flash memory to the CPU. The PBU supplies program data from an internal instruction buffer when the data is available there; otherwise, the PBU fetches the program data from Flash. Fetch operations are therefore accelerated whenever data is sourced from the internal PBU buffers.
The PBU provides an interface between the Program Flash Memory (PFM) and the CPU instruction bus and has the following components:
- Instruction Stream Buffer (ISB) - Also termed the Prefetch Unit (PFU), the ISB prefetches and caches linear PFM instruction flows. It is the component that buffers program data words from program memory. The ISB consists of one or more buffers of a fixed depth. Each buffer holds one or more lines of data fetched from Flash memory, and the data held in each buffer represents a linear code flow. These buffers are termed the internal PBU buffers.
- Instruction Cache (IC) - Also known as the Branch Target Instruction Cache (BTIC), the IC caches the target instructions that are hit most frequently. The IC refers to the cache memory and the associated control logic that together form the cache. In general, a cache consists of N lines and is either direct-mapped or M-way associative. The PBU supports a direct-mapped 128-line cache. The required width of the cache is 129 bits. The PBU cache has two operating modes: IC mode and BTIC mode.
- Integrity Checking Logic - Provides parity checking and fault injection on the contents of the RAM associated with the IC to ensure the integrity of the program data stored there.
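The direct-mapped organization of the IC described above can be illustrated with a small behavioral model. This is only a sketch, not the PBU's actual implementation: the 128-line count comes from the text, but the 16-byte line size, the byte-address tag/index split, and all names (`pbu_cache_t`, `pbu_access`, and so on) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PBU_NUM_LINES  128u  /* from the text: direct-mapped, 128 lines */
#define PBU_LINE_BYTES 16u   /* ASSUMPTION: line size is not stated in the text */

/* One cache line entry: only the bookkeeping, no data payload. */
typedef struct {
    bool     valid;
    uint32_t tag;
} pbu_line_t;

typedef struct {
    pbu_line_t line[PBU_NUM_LINES];
} pbu_cache_t;

/* In a direct-mapped cache each address maps to exactly one line. */
static uint32_t pbu_index(uint32_t addr)
{
    return (addr / PBU_LINE_BYTES) % PBU_NUM_LINES;
}

/* The tag distinguishes addresses that share the same line index. */
static uint32_t pbu_tag(uint32_t addr)
{
    return addr / (PBU_LINE_BYTES * PBU_NUM_LINES);
}

static void pbu_reset(pbu_cache_t *c)
{
    for (uint32_t i = 0; i < PBU_NUM_LINES; i++) {
        c->line[i].valid = false;
        c->line[i].tag = 0;
    }
}

/* Returns true on a hit. On a miss, the line is (re)filled, which
 * evicts whatever address previously occupied that index. */
static bool pbu_access(pbu_cache_t *c, uint32_t addr)
{
    uint32_t idx = pbu_index(addr);
    assert(idx < PBU_NUM_LINES);

    pbu_line_t *l = &c->line[idx];
    if (l->valid && l->tag == pbu_tag(addr)) {
        return true;
    }
    l->valid = true;
    l->tag = pbu_tag(addr);
    return false;
}
```

Under these assumptions, two addresses that are 2048 bytes apart (16 bytes x 128 lines) map to the same line and evict each other, which is the characteristic trade-off of a direct-mapped cache compared with an M-way associative one.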
The PBU assumes that the Flash data width and Flash access speed are sufficient to allow linear program execution at the required speed using only the ISB. The ISB serves as the prefetch buffer and allows the next line of Flash to be fetched while instructions from the current line are executed.
The PBU IC becomes useful when the code contains frequent program flow changes. A program flow change costs extra clock cycles because the current Flash fetch must be allowed to complete before a new fetch can be initiated at the new location. If the desired program data is available in the IC, it may be sourced immediately without waiting for the ISB to complete a new fetch from Flash. The PBU uses a larger, direct-mapped Instruction Cache and, because its operation is transparent, exposes only a small control and status interface to the user.
The ISB has multiple buffers, also called slices. The ISB slices help increase performance with CALL/RETURN and other flow changes in the code that return to the previous code stream.
The ISB is two levels deep in the dsPIC33A PBU. For the first generation of dsPIC33A devices, Flash access time is fast enough to support linear code execution with the given program data word width. Therefore, only one level of prefetch buffer is required: the CPU executes from the first level while the next fetch is loaded into the second level.
When the code to be executed has a linear flow, no further caching of data is necessary. However, program flow changes insert latency into the code flow. The prior Flash fetch must be completed and discarded, and then a new Flash fetch must be started in the new flow. This process can add a variable number of clock cycles to the execution time, depending on when the flow change occurred relative to the prefetch that was in progress.
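The variable penalty described above can be expressed as a toy model. This is a sketch under stated assumptions, not a statement of actual PBU timing: it assumes that a prefetch is in flight when the flow change occurs, that every Flash fetch takes a fixed number of cycles, and that the target is not found in the IC. The function name and its parameters are hypothetical; the text gives no cycle counts.

```c
#include <assert.h>
#include <stdint.h>

/* Extra cycles added by a flow change that misses the IC.
 *
 * fetch_cycles   - ASSUMED fixed duration of one Flash fetch, in cycles
 * elapsed_cycles - cycles already spent on the prefetch in progress
 *
 * The in-flight fetch must first run to completion (and is discarded),
 * then a full new fetch is started at the target address. */
static uint32_t flow_change_penalty(uint32_t fetch_cycles,
                                    uint32_t elapsed_cycles)
{
    assert(elapsed_cycles <= fetch_cycles);

    uint32_t remaining = fetch_cycles - elapsed_cycles;
    return remaining + fetch_cycles;
}
```

With a hypothetical 4-cycle fetch, a flow change taken just as a prefetch starts costs 8 cycles in this model, while one taken just as the prefetch completes costs 4. That spread is the variability the paragraph refers to.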
When Flash access time is fast enough to support continuous linear program flow, full instruction caching is not required. The cache can instead be configured as a BTIC, in which only the targets of program flow changes are cached. This mode increases the effective cache size because not every program data word has to be cached. However, program data must be transferred from the cache memory to the ISB when a flow change occurs so that data words are prefetched.
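The difference between the two cache modes can be sketched as an allocation policy: in IC mode every missed line is cached, while in BTIC mode a line is cached only when it is the target of a flow change. The model below is illustrative only. The 16-byte line size, the `is_target` flag, and all type and function names are assumptions, and the transfer of data from the cache to the ISB is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_LINES  128u  /* from the text */
#define LINE_BYTES 16u   /* ASSUMPTION: line size is not stated in the text */

typedef enum { MODE_IC, MODE_BTIC } cache_mode_t;

typedef struct {
    cache_mode_t mode;
    bool     valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
    uint32_t fills;          /* number of lines allocated so far */
} cache_model_t;

static void cache_init(cache_model_t *c, cache_mode_t mode)
{
    memset(c, 0, sizeof *c);
    c->mode = mode;
}

/* Returns true on a hit.
 * is_target: true when this fetch is the first one after a program
 * flow change (a branch, call, or return target). */
static bool cache_fetch(cache_model_t *c, uint32_t addr, bool is_target)
{
    uint32_t line = addr / LINE_BYTES;
    uint32_t idx  = line % NUM_LINES;
    uint32_t tag  = line / NUM_LINES;

    if (c->valid[idx] && c->tag[idx] == tag) {
        return true;
    }

    /* IC mode allocates on every miss; BTIC mode only for targets. */
    if (c->mode == MODE_IC || is_target) {
        c->valid[idx] = true;
        c->tag[idx]   = tag;
        c->fills++;
    }
    return false;
}
```

Running the same trace of four linear lines followed by one branch target through both modes allocates five lines in IC mode but only one in BTIC mode, which is the sense in which BTIC mode increases the effective cache size.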