2.7 Cache Controller (Cache)
This section provides a brief overview of the L1 Cache peripheral in the PIC32MZ family of devices and of the library used to control it.
The L1 cache is divided into two parts, a Data Cache (D-cache) and an Instruction Cache (I-cache). These blocks of high-speed memory both serve to compensate for the lengthy access time of main memory, by fetching instructions and data for the CPU ahead of time. The CPU can then access the information directly through the cache in a single clock cycle, rather than having to wait multiple clock cycles for accesses to main memory. The L1 cache provides a drastic increase in performance, but the user must be aware of hazards that exist when using the cache.
Using The Library
Cache Coherency
Cache coherency is the discipline of ensuring that the data stored in main memory matches the corresponding data in the cache. The majority of the cache-related APIs deal with cache coherency. These functions allow the user to flush, clean and invalidate entire cache(s), or just a range of addresses within the cache.
Caches most often lose coherency when a bus master other than the CPU modifies memory. This happens frequently with DMA. Two examples are provided in the following section.
Example 1:
Suppose you want to transfer data from a source buffer to a destination buffer using DMA. You write data to the source buffer, start the DMA transfer, and expect the same data to appear in the destination buffer. With the cache in write-back mode (the default mode for the PIC32MZ family), this is not guaranteed. In write-back mode, the data you wrote to the source buffer may be held in the D-cache without ever having been written back to main memory. The DMA does not go through the cache: it reads the source buffer directly from RAM and copies it to the destination buffer in RAM. Because the fresh data is still only in the cache, stale data is copied instead of the intended data. What is needed is a way to force the cache to write its data back to main memory before the DMA transfer starts. This is known as a write-back (clean) operation and is performed with the function:
CACHE_DataCacheClean(uint32_t addr, size_t len)
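The hazard in Example 1 can be reproduced on a host machine with a toy model. The sketch below is not the library's implementation and does not call any PIC32MZ API: `ram_src`, `ram_dst`, `cache_line`, `model_clean_src` and `transfer` are hypothetical names, and the 16-byte line size is an assumption made only for the model. It shows that a DMA-style copy sees the CPU's data only if the cached line is written back first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 16u                  /* assumed line size for this model only */

/* Toy model: a single cache line shadowing the source buffer in "RAM". */
static uint8_t ram_src[LINE_SIZE];     /* source buffer as the DMA sees it */
static uint8_t ram_dst[LINE_SIZE];     /* destination buffer in RAM */
static uint8_t cache_line[LINE_SIZE];  /* cached copy of the source buffer */
static bool    line_dirty = false;     /* true when the cache is newer than RAM */

/* CPU store in write-back mode: only the cached copy is updated. */
static void cpu_write_src(const void *data, size_t len)
{
    assert(len <= LINE_SIZE);
    memcpy(cache_line, data, len);
    line_dirty = true;
}

/* Hypothetical stand-in for a clean (write-back) of the source buffer. */
static void model_clean_src(void)
{
    if (line_dirty) {
        memcpy(ram_src, cache_line, LINE_SIZE);
        line_dirty = false;
    }
}

/* The DMA bypasses the cache and copies RAM to RAM. */
static void dma_copy_src_to_dst(void)
{
    memcpy(ram_dst, ram_src, LINE_SIZE);
}

/* Runs one transfer; returns true if the destination holds the fresh data. */
static bool transfer(const char *msg, bool clean_first)
{
    memset(ram_src, 0, sizeof ram_src);
    memset(ram_dst, 0, sizeof ram_dst);
    memset(cache_line, 0, sizeof cache_line);
    line_dirty = false;

    cpu_write_src(msg, strlen(msg) + 1);
    if (clean_first) {
        model_clean_src();             /* write the dirty line back to RAM */
    }
    dma_copy_src_to_dst();
    return strcmp((const char *)ram_dst, msg) == 0;
}
```

In this model, `transfer("fresh", false)` leaves the destination with the old RAM contents, while `transfer("fresh", true)` delivers the new data. On the device, the clean step corresponds to cleaning the source buffer's address range before starting the DMA channel.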
Example 2:
The second situation involves writing data into memory using DMA. Suppose the cache already holds a copy of a buffer called destination_buffer. You then execute a DMA transfer that copies new data from a source buffer into destination_buffer. Main memory now contains the correct data, but the cache still holds a stale copy of destination_buffer. The CPU cannot detect this and keeps reading the data from the cache, unaware that it is stale. What is needed is a way to tell the cache to discard its copy, so that the next read fetches the fresh data from main memory. This is known as an invalidate operation. It is performed with the function:
CACHE_DataCacheInvalidate(uint32_t addr, size_t len)
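Example 2 can be modelled the same way. The following is a minimal host-side sketch, not the library's code: `ram_dst`, `cache_line`, `model_invalidate_dst` and `read_after_dma` are hypothetical names, and the 16-byte line size is an assumption for the model. It shows the CPU reading a stale value after a DMA-style write unless the cached line is invalidated first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 16u                  /* assumed line size for this model only */

/* Toy model: a single cache line shadowing the destination buffer in "RAM". */
static uint8_t ram_dst[LINE_SIZE];     /* destination buffer as the DMA sees it */
static uint8_t cache_line[LINE_SIZE];  /* cached copy of the destination buffer */
static bool    line_valid = false;     /* true when the cache holds a copy */

/* CPU load: a hit returns the cached byte, a miss refills from RAM first. */
static uint8_t cpu_read_dst(size_t index)
{
    assert(index < LINE_SIZE);
    if (!line_valid) {
        memcpy(cache_line, ram_dst, LINE_SIZE);
        line_valid = true;
    }
    return cache_line[index];
}

/* Hypothetical stand-in for invalidating the destination buffer's line. */
static void model_invalidate_dst(void)
{
    line_valid = false;
}

/* The DMA bypasses the cache and writes straight to RAM. */
static void dma_write_dst(const void *data, size_t len)
{
    assert(len <= LINE_SIZE);
    memcpy(ram_dst, data, len);
}

/* Returns the first byte the CPU sees after the DMA has written new_value. */
static uint8_t read_after_dma(uint8_t old_value, uint8_t new_value, bool invalidate)
{
    memset(ram_dst, old_value, sizeof ram_dst);
    line_valid = false;
    (void)cpu_read_dst(0);             /* CPU touches the buffer: old data is cached */

    dma_write_dst(&new_value, 1);      /* RAM is updated behind the cache's back */
    if (invalidate) {
        model_invalidate_dst();        /* drop the stale cached copy */
    }
    return cpu_read_dst(0);
}
```

Here `read_after_dma(0xAA, 0x55, false)` returns the stale `0xAA`, and `read_after_dma(0xAA, 0x55, true)` returns `0x55`. Note that invalidating discards whatever the line holds, so a line that also contains unwritten CPU data would lose it.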
The example application below shows how to use the cache maintenance APIs to avoid cache coherency issues when the data cache is enabled.
The application uses the USART and the DMA PLIBs to demonstrate the cache maintenance APIs provided by the Cache peripheral library. It registers a callback with the DMA transmit and receive channels. The application first transmits a message using the DMA transmit channel and then schedules a read of ten characters using the DMA receive channel. Once the DMA read is complete, it reads the received data and echoes the same on the terminal using the DMA transmit channel.
The application calls the DCACHE_CLEAN_BY_ADDR API on the write buffer before transmitting it. Calling this API copies the data from the cache memory to the main memory, thereby ensuring that the DMA peripheral uses the updated values in the write buffer.
The application calls the DCACHE_INVALIDATE_BY_ADDR API on the read buffer; in the listing below, this is done just before the DMA receive transfer is scheduled. Calling this API invalidates the cache lines corresponding to the read buffer, so that once the DMA receive channel has filled the buffer, the CPU reads the updated values from main memory (refilling the cache) instead of stale cached values.
Cache maintenance operations always act on whole cache lines, so the read and write buffers must be aligned to a cache-line boundary and their size must be a multiple of the cache line size. In the listing below, the buffers are aligned to 16 bytes and sized as 2*CACHE_LINE_SIZE, which gives 32 bytes. For this reason, although only 10 bytes are received and echoed back, the receive and echo buffers are each 32 bytes in size.
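The alignment and size requirement can be checked with a little arithmetic. The helpers below are an illustrative sketch, not part of the Cache library: `round_up_to_line`, `is_line_aligned` and `lines_touched` are hypothetical names, and the line size is passed as a parameter so that the value for the actual device can be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if x is a non-zero power of two (cache line sizes are). */
static int is_power_of_two(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Rounds len up to the next multiple of the cache line size. */
static size_t round_up_to_line(size_t len, size_t line)
{
    assert(is_power_of_two(line));
    return (len + line - 1) & ~(line - 1);
}

/* Returns 1 if addr lies on a cache-line boundary, 0 otherwise. */
static int is_line_aligned(uintptr_t addr, size_t line)
{
    assert(is_power_of_two(line));
    return (addr & (uintptr_t)(line - 1)) == 0;
}

/* Number of cache lines that a maintenance operation on
 * [addr, addr + len) has to touch. */
static size_t lines_touched(uintptr_t addr, size_t len, size_t line)
{
    assert(is_power_of_two(line));
    if (len == 0) {
        return 0;
    }
    uintptr_t first = addr & ~(uintptr_t)(line - 1);
    uintptr_t last  = (addr + len - 1) & ~(uintptr_t)(line - 1);
    return (size_t)((last - first) / line) + 1;
}
```

With a 16-byte line, a 10-byte buffer that starts on a line boundary occupies one line, but the same buffer starting 8 bytes into a line straddles two lines and shares both with neighbouring variables. Invalidating those lines would also discard any unwritten data belonging to the neighbours, which is why the buffers are padded and aligned to whole lines.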
    #include <stddef.h>
    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>
    #include "definitions.h"    /* project header generated by the tooling (assumed name) */

    #define READ_SIZE    10
    #define BUFFER_SIZE  (2*CACHE_LINE_SIZE)    // Buffer size in terms of cache lines

    char __attribute__ ((aligned (16))) messageStart[] =
        "**** CACHE maintenance demo with UART ****\r\n"
        "**** Type a buffer of 10 characters and observe it echo back ****\r\n"
        "**** LED toggles on each time the buffer is echoed ****\r\n";
    char __attribute__ ((aligned (16))) receiveBuffer[BUFFER_SIZE] = {};
    char __attribute__ ((aligned (16))) echoBuffer[BUFFER_SIZE] = {};

    /* Set in the DMA callbacks and polled in main(), hence volatile */
    volatile bool writeStatus = false;
    volatile bool readStatus = false;

    void TX_DMAC_Callback(DMAC_TRANSFER_EVENT status, uintptr_t contextHandle)
    {
        writeStatus = true;
    }

    void RX_DMAC_Callback(DMAC_TRANSFER_EVENT status, uintptr_t contextHandle)
    {
        readStatus = true;
    }

    int main ( void )
    {
        /* Initialize all modules */
        SYS_Initialize ( NULL );

        /* Register callback functions for both write and read contexts */
        DMAC_ChannelCallbackRegister(DMAC_CHANNEL_0, TX_DMAC_Callback, 0);
        DMAC_ChannelCallbackRegister(DMAC_CHANNEL_1, RX_DMAC_Callback, 1);

        /* Write the start message back to main memory before the DMA reads it */
        DCACHE_CLEAN_BY_ADDR((uint32_t)messageStart, sizeof(messageStart));
        DMAC_ChannelTransfer(DMAC_CHANNEL_0, messageStart, sizeof(messageStart), (const void *)&U2TXREG, 1, 1);

        while(1)
        {
            if(readStatus == true)
            {
                /* Reception complete: build the echo message and send it */
                readStatus = false;
                memcpy(echoBuffer, receiveBuffer, READ_SIZE);
                echoBuffer[READ_SIZE] = '\r';
                echoBuffer[(READ_SIZE + 1)] = '\n';

                /* Write the echo buffer back to main memory before the DMA reads it */
                DCACHE_CLEAN_BY_ADDR((uint32_t)echoBuffer, sizeof(echoBuffer));
                DMAC_ChannelTransfer(DMAC_CHANNEL_0, echoBuffer, READ_SIZE+2, (const void *)&U2TXREG, 1, 1);
            }
            else if(writeStatus == true)
            {
                /* Transmission complete: schedule the next read */
                writeStatus = false;

                /* Invalidate cache lines having received buffer before using it
                 * to load the latest data in the actual memory to the cache */
                DCACHE_INVALIDATE_BY_ADDR((uint32_t)receiveBuffer, sizeof(receiveBuffer));
                DMAC_ChannelTransfer(DMAC_CHANNEL_1, (const void *)&U2RXREG, 1, receiveBuffer, READ_SIZE, 1);
            }
        }

        /* Execution should not come here during normal operation */
        return ( EXIT_FAILURE );
    }
Library Interface
The Cache Controller peripheral library provides the following interfaces:
Functions
Name | Description |
---|---|
CACHE_CacheInit | Initialize the L1 cache |
CACHE_CacheFlush | Flushes the L1 cache |
CACHE_DataCacheFlush | Flushes the L1 data cache |
CACHE_InstructionCacheFlush | Flushes (invalidates) the L1 instruction cache |
CACHE_CacheClean | Write back and invalidate an address range in either cache |
CACHE_DataCacheClean | Write back and invalidate an address range in the data cache |
CACHE_DataCacheInvalidate | Invalidate an address range in the data cache |
CACHE_InstructionCacheInvalidate | Invalidate an address range in the instruction cache |
CACHE_InstructionCacheLock | Fetch and lock a block of instructions in the instruction cache |
CACHE_DataCacheLock | Fetch and lock a block of data in the data cache |
CACHE_CacheSync | Synchronize the instruction and data caches |
CACHE_CacheCoherencySet | Set the cache coherency attribute for kseg0 |
CACHE_CacheCoherencyGet | Returns the current cache coherency attribute for kseg0 |
CACHE_DataCacheAssociativityGet | Returns the number of ways in the data cache |
CACHE_InstructionCacheAssociativityGet | Returns the number of ways in the instruction cache |
CACHE_DataCacheLineSizeGet | Returns the data cache line size |
CACHE_InstructionCacheLineSizeGet | Returns the instruction cache line size |
CACHE_DataCacheLinesPerWayGet | Returns the number of lines per way in the data cache |
CACHE_InstructionCacheLinesPerWayGet | Returns the number of lines per way in the instruction cache |
CACHE_DataCacheSizeGet | Returns the total number of bytes in the data cache |
CACHE_InstructionCacheSizeGet | Returns the total number of bytes in the instruction cache |
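The geometry queries in the table are related by simple arithmetic: in a set-associative cache, the total size is the number of ways multiplied by the lines per way and the line size. The sketch below is a standalone illustration of that relation, not library code. `cache_total_bytes` and `cache_way_bytes` are hypothetical helpers, and the figures used with them are illustrative values rather than values read from a device.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes held by a single way: lines per way multiplied by the line size. */
static size_t cache_way_bytes(size_t lines_per_way, size_t line_size)
{
    assert(lines_per_way > 0 && line_size > 0);
    return lines_per_way * line_size;
}

/* Total cache size: number of ways multiplied by the bytes per way. */
static size_t cache_total_bytes(size_t ways, size_t lines_per_way, size_t line_size)
{
    assert(ways > 0);
    return ways * cache_way_bytes(lines_per_way, line_size);
}
```

For example, a hypothetical 4-way cache with 64 lines per way and 16-byte lines holds 4 * 64 * 16 = 4096 bytes. On the device, the three factors would come from the associativity, lines-per-way and line-size functions listed above, and the product can be compared against the corresponding size function as a sanity check.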
Data types and constants
Name | Type | Description |
---|---|---|
CACHE_COHERENCY | Enum | L1 cache coherency settings |