Equivalent Assembly Symbols

Most C symbols map to a corresponding assembly equivalent.

This mapping ensures that an “ordinary” symbol defined in the assembly domain cannot interfere with an “ordinary” symbol in the C domain. For example, if the symbol main is defined in the assembly domain, it is quite distinct from the main symbol used in C code, and the two refer to different locations.

The name of a C function maps to an assembly label that will have the same name, but with an underscore prepended. So the function main() will define an assembly label _main.

Baseline PIC devices can use alternate assembly domain symbols for functions. The destinations of call instructions on these devices are limited to the first half of a program memory page. The compiler, thus, encodes functions in two parts, as illustrated in the following example of a C function, add(), compiled for a Baseline device.

entry__add:
ljmp    _add

The label entry__add is the function’s entry point and will always be located in the first half of a program memory page. The code associated with this label is simply a long jump (see Long Jumps And Calls) to the actual function body located elsewhere and identified by the label _add.

If you plan to call routines from assembly code, you must be aware of this limitation in the device and the way the compiler works around it for C functions. Hand-written assembly code should always call the entry__funcName label rather than the usual assembly-equivalent function label.

If a C function is qualified static and there is more than one static function in the program with exactly the same name, the name of the first(1) function will map to the usual assembly symbol and the subsequent functions will map to a special symbol with the form: functionName@fileName$Fnumber, where functionName is the name of the function, fileName is the name of the file that contains the function, and number is a unique sequence of digits.

For example, a program contains the definition for two static functions, both called add. One lives in the file main.c and the other in lcd.c. The first function will generate an assembly label _add. The second might generate the label add@lcd$F38, for example.

The name of a C variable with static storage duration also maps to an assembler label that will have the same name, but with an underscore prepended. So the variable result will define an assembly label: _result.
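The underscore rule for functions and static-storage variables can be modelled in a few lines of C. This is only an illustrative sketch that runs on a host machine; the helper name asm_label is hypothetical and is not provided by the compiler. It does not cover the special forms used for duplicate static names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the compiler): writes the assembly
 * label that corresponds to an ordinary C function or a variable with
 * static storage duration, i.e. the C name with an underscore prepended.
 * Returns the number of characters the full label requires. */
int asm_label(const char *c_name, char *buf, size_t size)
{
    assert(c_name != NULL && strlen(c_name) > 0);
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "_%s", c_name);
}
```

Under these assumptions, calling asm_label with "main" yields _main and calling it with "result" yields _result, matching the examples in the text.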

If the C variable is qualified static, there is a chance that there could be more than one variable in the program with exactly the same C name. The rules that apply to static variables defined outside of functions are the same as those that apply to static functions. The name of the first variable will map to a symbol prepended with an underscore; the subsequent symbols will have the form: functionName@fileName$Fnumber@variableName, where variableName is the name of the variable.

All local static variables (i.e., those defined inside a function definition) have an assembly name of the form: functionName@variableName. If there is a static variable called output in the function read() and another static variable with the same name defined in the function update(), then these variables can be accessed in assembly code using the symbols read@output and update@output, respectively.

Functions that use the reentrant model do not define any symbols that allow you to access auto and parameter variables. You should not attempt to access these in assembly code. Special symbols for these variables are defined, however, by functions that use the nonreentrant model. These symbols are described in the following paragraphs.

To allow easy access to parameter and auto variables on the compiled stack, special equates are defined which map a unique symbol to each variable. The symbol has the form: functionName@variableName. Thus, if the function main() defines an auto variable called foobar, the symbol main@foobar can be used in assembly code to access this C variable.

Function parameters use the same symbol mapping as auto variables. If a function called read() has a parameter called channel, then the assembly symbol for that parameter is read@channel.
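The functionName@variableName form shared by local static variables and by the auto and parameter variables of nonreentrant functions can be sketched as a small host-side C helper. The name asm_local_symbol is hypothetical and used only for illustration; it simply joins the two names and does not check which stack model a function uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the compiler): builds the assembly
 * symbol for a variable that belongs to a function, in the form
 * functionName@variableName. Returns the number of characters the
 * full symbol requires. */
int asm_local_symbol(const char *func_name, const char *var_name,
                     char *buf, size_t size)
{
    assert(func_name != NULL && strlen(func_name) > 0);
    assert(var_name != NULL && strlen(var_name) > 0);
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s@%s", func_name, var_name);
}
```

For the examples in the text, the arguments "main" and "foobar" give main@foobar, and "read" and "channel" give read@channel.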

Function return values have no C identifier associated with them. The return value for a function shares the same memory as that function’s parameter variables, if they are present. The assembly symbol used for return values has the form ?_funcName, where funcName is the name of the function returning the value. Thus, if a function getPort() returns a value, it will be located at the address held by the assembly symbol ?_getPort. If this return value is more than one byte in size, then an offset is added to the symbol to access each byte, e.g., ?_getPort+1.
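As a rough illustration of how the per-byte symbols for a multi-byte return value are formed, the following host-side C sketch builds the expression for a given byte offset. The helper name asm_return_byte_symbol is hypothetical, and the sketch assumes that offset 0 is written without a +0 suffix, as in the ?_getPort example above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the compiler): builds the assembly
 * expression that addresses one byte of a function's return value.
 * Offset 0 gives ?_funcName; other offsets give ?_funcName+offset.
 * Returns the number of characters the full expression requires. */
int asm_return_byte_symbol(const char *func_name, unsigned offset,
                           char *buf, size_t size)
{
    assert(func_name != NULL && strlen(func_name) > 0);
    assert(buf != NULL && size > 0);
    if (offset == 0) {
        return snprintf(buf, size, "?_%s", func_name);
    }
    return snprintf(buf, size, "?_%s+%u", func_name, offset);
}
```

With "getPort" as the function name, offsets 0 and 1 produce ?_getPort and ?_getPort+1, the two symbols named in the text.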

If the compiler creates temporary variables to hold intermediate results, these will behave like auto variables. As there is no corresponding C variable, the assembly symbol is based on the symbol that represents the auto block for the function plus an offset. That symbol is ??_funcName, where funcName is the function in which the symbol is being used. So for example, if the function main() uses temporary variables, they will be accessed as an offset from the symbol ??_main.

(1) The definition of “first” is complex. Typically, if the symbol is contained in the source module that defines main(), it will be processed first. If it is not in this module, then the order in which the source files are listed on the compiler command line determines which is considered first.