9.6 Thread-local Storage
The MPLAB XC32 compiler implements a thread-local storage (TLS) memory management feature that uses memory local to each thread to allow unique storage of objects that otherwise appear to have global scope. Using such storage can reduce the risk of race conditions when accessing shared data, reduce complexity in thread synchronization, and eliminate the need for locks, thereby improving runtime performance in multithreaded applications.
Objects qualified with __thread (see Thread Qualifier) are known as
thread-safe objects and will be allocated to dedicated sections so that they can be
linked separately from other objects. Initialized objects are allocated to the
.tdata section; uninitialized objects to the .tbss
section. The compiler will generate calls to a special routine to access the memory
associated with thread-safe objects, based on which program thread is currently
active.
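As a minimal sketch of the placement rule above, the following C fragment declares one initialized and one uninitialized object with the __thread qualifier. The object and function names (error_count, scratch, record_error) are hypothetical and chosen only for illustration; the section placement noted in the comments restates the rule described above rather than anything verified against a particular build.

```c
#include <assert.h>
#include <string.h>

/* Initialized thread-local object: expected to be placed in .tdata. */
__thread int error_count = 3;

/* Uninitialized thread-local object: expected to be placed in .tbss. */
__thread char scratch[16];

/*
 * Hypothetical helper. Each thread that calls it reads and modifies only
 * its own copies of error_count and scratch, so no lock is taken here.
 */
int record_error(const char *tag)
{
    assert(tag != NULL);

    /* Copy at most 15 characters and always terminate the string. */
    strncpy(scratch, tag, sizeof scratch - 1);
    scratch[sizeof scratch - 1] = '\0';

    return ++error_count;
}
```

Because each thread has its own instance of these objects, a call to record_error() in one thread should not disturb the count or the scratch buffer seen by another thread.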
The Arm ABI Addendum documentation specifies how thread-safe variables are managed and accessed, and indicates that these variables are stored in a special section of memory that is unique to each thread and are accessed via a thread pointer (TP) variable, which points to the base of the TLS area for the current thread.
Memory for the TLS sections can be allocated by either of the following:
- The Best Fit Allocator (BFA)
- A customized linker script
When TLS sections are handled by the BFA, the allocator collates .tdata
and .tbss sections not allocated by a linker script, then concatenates
them in the output, where they will be allocated space in program flash memory. Unless
prevented by use of the -Wl,--no-tls-first-copy linker option, memory
is allocated for the TLS block in RAM and the runtime startup code copies the
.tdata sections to this memory and clears the
.tbss section for the initial thread executed after Reset. The
--tls-first-copy option makes this action explicit. The TLS block
initialization code uses a dedicated xc32_init_tls() routine to perform
the initialization. This routine is provided as a stub if the
--no-tls-first-copy option has been used.
| Symbol | Represents |
|---|---|
| __tdata_source | The start of .tdata section in flash |
| __tdata_start | The start of .tdata section in RAM if a copy is created in RAM, otherwise the same value as __tdata_source |
| __tdata_size | The size of .tdata |
| __tdata_end | The end of .tdata |
| __tbss_start | The start of .tbss |
| __tbss_size | The size of .tbss |
| __tbss_end | The end of .tbss |
| __tbss_offset | Equivalent to (__tbss_start - __tdata_start) |
| __tls_align | Equivalent to max(alignof(.tdata), alignof(.tbss)) |
| __arm32_tls_tcb_offset | Equivalent to (max(8, __tls_align)) |
| __tls_base | Equivalent to __tdata_start |
| __tls_end | Equivalent to __tbss_end |
| __tls_size | Equivalent to __tbss_offset + __tbss_size |
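The relationships between the derived symbols in the table can be checked with ordinary arithmetic. The sketch below is host-side C that takes the start addresses, sizes and alignments of the two sections as plain values and computes the "Equivalent to" entries. The type and function names (tls_sections_t, tls_derived_t, tls_derive) are hypothetical; on a target, the values would come from the linker-provided symbols themselves rather than from a structure like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inputs: placement of the .tdata and .tbss sections. */
typedef struct {
    uintptr_t tdata_start;  /* stands in for __tdata_start */
    size_t    tdata_size;   /* stands in for __tdata_size  */
    size_t    tdata_align;  /* alignof(.tdata)             */
    uintptr_t tbss_start;   /* stands in for __tbss_start  */
    size_t    tbss_size;    /* stands in for __tbss_size   */
    size_t    tbss_align;   /* alignof(.tbss)              */
} tls_sections_t;

/* Values that the table lists as "Equivalent to" expressions. */
typedef struct {
    size_t    tbss_offset;  /* __tbss_start - __tdata_start           */
    size_t    tls_align;    /* max(alignof(.tdata), alignof(.tbss))   */
    size_t    tcb_offset;   /* max(8, __tls_align)                    */
    uintptr_t tls_base;     /* __tdata_start                          */
    uintptr_t tls_end;      /* __tbss_end                             */
    size_t    tls_size;     /* __tbss_offset + __tbss_size            */
} tls_derived_t;

static size_t max_size(size_t a, size_t b)
{
    return a > b ? a : b;
}

tls_derived_t tls_derive(const tls_sections_t *s)
{
    tls_derived_t d;

    /* .tbss is assumed to follow .tdata within the TLS block. */
    assert(s != NULL && s->tbss_start >= s->tdata_start);

    d.tbss_offset = (size_t)(s->tbss_start - s->tdata_start);
    d.tls_align   = max_size(s->tdata_align, s->tbss_align);
    d.tcb_offset  = max_size(8, d.tls_align);
    d.tls_base    = s->tdata_start;
    d.tls_end     = s->tbss_start + s->tbss_size;
    d.tls_size    = d.tbss_offset + s->tbss_size;
    return d;
}
```

For example, with .tdata at 0x20000000 (12 bytes, 4-byte aligned) and .tbss at 0x20000010 (20 bytes, 8-byte aligned), the offset of .tbss is 16 bytes and the total TLS size is 36 bytes, which matches the distance from the TLS base to the TLS end.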
This allocation method uses the runtime startup code and linker scripts provided in the DFPs. The XC32 Picolibc library provides functions to initialize the TLS block for additional threads created.
Alternatively, a customized linker script can be written to allocate memory for the TLS. This linker script must gather all TLS sections and provide the symbols tabled above, which represent the allocated space for the TLS block. This script should be paired with customized runtime startup code that performs initialization of the initial TLS block.
The _set_tls() function has the prototype void _set_tls(void *tls) and sets the TLS thread pointer for the core. It is architecture-specific and is used to point to the TLS area for the current thread. The _init_tls() function has the prototype void _init_tls(void *tls) and initializes the TLS area for a new thread. This typically involves copying the initial values from the .tdata section and zeroing out the .tbss section.
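The copy-and-clear step described for _init_tls() can be illustrated with a small host-side sketch. The function below is a hypothetical stand-in (tls_block_init is not a library routine and is not the Picolibc implementation); it takes the initialization image, sizes and .tbss offset as explicit parameters, whereas a real implementation would presumably obtain them from the linker symbols tabled earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration of initializing a TLS block for a new thread:
 * copy the .tdata initialization image to the start of the block, then
 * zero the .tbss region located tbss_offset bytes into the block.
 */
void tls_block_init(void *tls, const void *tdata_image, size_t tdata_size,
                    size_t tbss_offset, size_t tbss_size)
{
    unsigned char *block = tls;

    assert(tls != NULL);
    /* The .tbss region is assumed to start at or after the end of .tdata. */
    assert(tbss_offset >= tdata_size);

    if (tdata_size != 0) {
        assert(tdata_image != NULL);
        memcpy(block, tdata_image, tdata_size);
    }
    memset(block + tbss_offset, 0, tbss_size);
}
```

On a target, code that creates a thread would more likely reserve a suitably aligned block of at least __tls_size bytes, pass it to _init_tls(), and arrange for _set_tls() to be called with that block whenever the thread is made current. The exact point at which this happens depends on the scheduler or RTOS in use and is not specified here.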