9.8 Constant Types and Formats
A constant is used to represent a numerical value in the source code; for example, 123 is a constant. Like any value, a constant must have a C/C++ type. In addition to a constant's type, the actual value can be specified in one of several formats. The format of integral constants specifies their radix. MPLAB XC32 C/C++ supports the ANSI standard radix specifiers as well as one which enables binary constants to be specified in C code.
The formats used to specify the radices are given in the table below. The letters used to specify binary or hexadecimal radices are case insensitive, as are the letters used to specify the hexadecimal digits.
Radix | Format | Example |
---|---|---|
binary | 0b*number* or 0B*number* | 0b10011010 |
octal | 0*number* | 0763 |
decimal | *number* | 129 |
hexadecimal | 0x*number* or 0X*number* | 0x2F |
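As a brief illustration of the formats in the table, the sketch below writes the same value, 154, in each radix. The identifier names are hypothetical and chosen only for this example. Note that the `0b` prefix is an extension to ANSI C, so other compilers may reject it.

```c
#include <assert.h>

/* The same value, 154, written in each supported radix.
   All identifier names here are illustrative only. */
static const int from_binary  = 0b10011010; /* binary (compiler extension) */
static const int from_octal   = 0232;       /* octal: 2*64 + 3*8 + 2 */
static const int from_decimal = 154;        /* decimal */
static const int from_hex     = 0X9a;       /* prefix and digits are case insensitive */

/* Returns 1 when all four spellings denote the same value. */
static int radices_agree(void)
{
    return from_binary == from_octal
        && from_octal == from_decimal
        && from_decimal == from_hex;
}
```

Calling `radices_agree()` from application code should return 1, since only the spelling of the constant differs, not its value.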
Any integral constant will have a type of int, long int or long long int, so that the type can hold the value without overflow. Constants specified in octal or hexadecimal may also be assigned a type of unsigned int, unsigned long int or unsigned long long int if the signed counterparts are too small to hold the value.
The default types of constants may be changed by the addition of a suffix after the digits, for example, 23U, where U is the suffix. The table below shows the possible combinations of suffixes and the types that are considered when assigning a type. For example, if the suffix l is specified and the value is a decimal constant, the compiler will assign the type long int, if that type will hold the constant; otherwise, it will be assigned long long int. If the constant was specified as an octal or hexadecimal constant, then unsigned types are also considered.
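Suffix | Decimal | Octal or Hexadecimal |
---|---|---|
u or U | unsigned int, unsigned long int, unsigned long long int | unsigned int, unsigned long int, unsigned long long int |
l or L | long int, long long int | long int, unsigned long int, long long int, unsigned long long int |
u or U, and l or L | unsigned long int, unsigned long long int | unsigned long int, unsigned long long int |
ll or LL | long long int | long long int, unsigned long long int |
u or U, and ll or LL | unsigned long long int | unsigned long long int |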
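The effect of the suffixes can be checked at compile time. The following is a minimal sketch, assuming a C11-capable build (for `_Generic`) and a 32-bit int. `TYPE_NAME` and `suffixes_select_types` are hypothetical helpers written for this example, not part of the compiler or its libraries.

```c
#include <assert.h>
#include <string.h>

/* Maps an expression to the name of its type using C11 _Generic.
   TYPE_NAME is a hypothetical helper for illustration only. */
#define TYPE_NAME(x) _Generic((x),                  \
    int:                "int",                      \
    unsigned int:       "unsigned int",             \
    long:               "long int",                 \
    unsigned long:      "unsigned long int",        \
    long long:          "long long int",            \
    unsigned long long: "unsigned long long int",   \
    default:            "other")

/* Returns 1 when each suffixed constant has the type its suffix requests. */
static int suffixes_select_types(void)
{
    return strcmp(TYPE_NAME(23),    "int") == 0
        && strcmp(TYPE_NAME(23U),   "unsigned int") == 0
        && strcmp(TYPE_NAME(23L),   "long int") == 0
        && strcmp(TYPE_NAME(23UL),  "unsigned long int") == 0
        && strcmp(TYPE_NAME(23LL),  "long long int") == 0
        && strcmp(TYPE_NAME(23ULL), "unsigned long long int") == 0;
}

/* Returns 1 when a hexadecimal constant too large for int becomes
   unsigned int (assumes int is 32 bits wide). */
static int hex_may_become_unsigned(void)
{
    return strcmp(TYPE_NAME(0x7FFFFFFF), "int") == 0
        && strcmp(TYPE_NAME(0xFFFFFFFF), "unsigned int") == 0;
}
```

The second function shows the radix rule: 0xFFFFFFFF does not fit in a 32-bit int, so the unsigned counterpart is chosen before any wider type is considered.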
Here is an example of code that may fail because the default type assigned to a constant is not appropriate:
unsigned long long int result;
unsigned char shifter;
int main(void)
{
    shifter = 40;
    result = 1 << shifter;
    // code that uses result
}
The constant 1 will be assigned an int type, hence the shift is performed in int, which is 32 bits wide with this compiler (as is long), and the upper bits of the long long variable, result, can never be set, regardless of how much the constant is shifted. In this case, the value 1 shifted left 40 bits will not yield 0x10000000000; because the shift count exceeds the width of int, the result is undefined.
The following uses a suffix to change the type of the constant, hence ensuring the shift result has an unsigned long long type.
result = 1ULL << shifter;
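One way to avoid this class of error is to wrap the shift in a helper whose constant already has a 64-bit type. The sketch below is illustrative only: `bit_mask64` is a hypothetical name, and the caller is assumed to pass a bit position from 0 to 63.

```c
#include <assert.h>
#include <stdint.h>

/* Builds a single-bit mask in a 64-bit type.
   bit_mask64 is a hypothetical helper; n must be in the range 0..63. */
static uint64_t bit_mask64(unsigned char n)
{
    assert(n < 64);     /* larger shift counts would be undefined */

    /* The ULL suffix gives the left operand unsigned long long type,
       so the shift is carried out in at least 64 bits. */
    return 1ULL << n;
}
```

With this helper, `bit_mask64(40)` evaluates to 0x10000000000, because the shift never takes place in the narrower int type.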
Floating-point constants have double type unless suffixed by f or F, in which case the constant has float type. The suffixes l or L specify a long double type.
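The floating-point suffix rules can be confirmed in the same compile-time style. This is a sketch assuming a C11-capable build. `FP_KIND` and the `fp_kind` enumeration are hypothetical names introduced for the example.

```c
#include <assert.h>

/* Labels for the three standard floating-point types (illustrative). */
enum fp_kind { FP_FLOAT, FP_DOUBLE, FP_LONG_DOUBLE, FP_OTHER };

/* Classifies an expression by its floating-point type using C11 _Generic.
   FP_KIND is a hypothetical helper for illustration only. */
#define FP_KIND(x) _Generic((x),        \
    float:       FP_FLOAT,              \
    double:      FP_DOUBLE,             \
    long double: FP_LONG_DOUBLE,        \
    default:     FP_OTHER)

/* Returns 1 when each spelling of 1.5 has the type its suffix requests. */
static int float_suffixes_select_types(void)
{
    return FP_KIND(1.5)  == FP_DOUBLE        /* no suffix: double */
        && FP_KIND(1.5f) == FP_FLOAT         /* f or F: float */
        && FP_KIND(1.5F) == FP_FLOAT
        && FP_KIND(1.5l) == FP_LONG_DOUBLE   /* l or L: long double */
        && FP_KIND(1.5L) == FP_LONG_DOUBLE;
}
```

The check relies on type identity rather than `sizeof`, so it still distinguishes double from long double on targets where the two types have the same size.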
Character constants are enclosed by single quote characters, ', for example 'a'. A character constant has int type, although this may be optimized to a char type later in the compilation.
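The int type of a character constant can be observed directly when the code is compiled as C (in C++ the rule differs). The following sketch assumes a C11-capable build, and the function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 when the character constant 'a' has int type (C rules). */
static int char_constant_is_int(void)
{
    return _Generic('a', int: 1, default: 0);
}

/* Storing the constant in a char object narrows it to char;
   the value itself is unchanged. */
static char first_letter(void)
{
    char c = 'a';
    return c;
}
```

Although `'a'` itself is an int, `first_letter()` returns a char holding the same character, which is the usual way such constants are used.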
Multi-byte character constants are accepted by the compiler but are not supported by the standard libraries.
String constants, or string literals, are enclosed by double quote characters, ", for example "hello world". The type of string constants is const char * and the characters that make up the string are stored in the program memory, as are all objects qualified const.
To comply with the ANSI C standard, the compiler does not support the extended character set in characters or character arrays. Instead, they need to be escaped using the backslash character, as in the following example:
const char name[] = "Bj\370rk";
printf("%s's Resum\351", name); // prints "Bjørk's Resumé"
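To make the escape mechanism concrete, the sketch below inspects the bytes that the octal escape produces. It assumes the output device interprets bytes above 0x7F as ISO 8859-1, which the text above does not state explicitly. `check_escaped_name` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* The octal escape \370 stores the single byte 0xF8,
   which is the letter ø in ISO 8859-1 (an assumed encoding). */
static const char name[] = "Bj\370rk";

/* Verifies the layout of the escaped string. */
static void check_escaped_name(void)
{
    assert(strlen(name) == 5);               /* B, j, 0xF8, r, k */
    assert((unsigned char)name[2] == 0xF8);  /* 0370 octal == 0xF8 */
}
```

The escape occupies one byte in the array, so the string length is the same as it would be with a plain ASCII letter in that position.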
Assigning a string literal to a pointer to a non-const char will generate a warning from the compiler. This code is legal, but any attempt to write to the string through the pointer will fail. For example:
char * cp= "one"; // "one" in ROM, produces warning
const char * ccp= "two"; // "two" in ROM, correct
Defining and initializing a non-const array (i.e., not a pointer definition) with a string,
char ca[]= "two"; // "two" different to the above
is a special case and produces an array in data space which is initialized at start-up with the string "two" (copied from program space), whereas a string constant used in other contexts represents an unnamed const-qualified array, accessed directly in program space.
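The difference between the array and the pointer forms can be sketched as follows. The function name is hypothetical, and the example modifies only the array copy, never the literal itself.

```c
#include <assert.h>
#include <string.h>

/* Edits a writable array initialized from a literal and reports whether
   the edit took effect while the literal stayed unchanged. */
static int edit_array_copy(void)
{
    char ca[] = "two";        /* writable array, initialized from the literal */
    const char *ccp = "two";  /* points at the read-only literal */

    ca[0] = 'T';              /* legal: changes the array copy only */

    return strcmp(ca, "Two") == 0     /* the array now holds "Two" */
        && strcmp(ccp, "two") == 0    /* the literal is untouched */
        && sizeof(ca) == 4;           /* three characters plus the terminator */
}
```

Writing through `ccp` would be rejected by the compiler because of the const qualifier, which is the protection that the pointer-to-non-const form gives up.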
The MPLAB XC32 C/C++ Compiler will use the same storage location and label for strings that have identical character sequences. For example, in the code snippet
if(strncmp(scp, "hello world", 6) == 0)
fred = 0;
if(strcmp(scp, "hello world") == 0)
fred++;
the two identical character string greetings will share the same memory locations. Link-time optimization must be enabled for the strings to be shared when they are located in different modules.
Two adjacent string constants (that is, two strings separated only by white space) are concatenated by the compiler. Thus:
const char * cp = "hello" "world";
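will assign the pointer with the address of the string "helloworld". No space is inserted between the concatenated strings; to obtain "hello world", include the space inside one of the literals, for example "hello " "world".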
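A short sketch of adjacent-literal concatenation follows. The variable names are hypothetical. The point to note is that the compiler joins the literals exactly as written, without adding a separator.

```c
#include <assert.h>
#include <string.h>

/* Adjacent literals are joined exactly as written. */
static const char *joined = "hello" "world";    /* no separator is added */
static const char *spaced = "hello " "world";   /* the space is part of the first literal */

/* Concatenation is also a convenient way to split a long string
   over several source lines. */
static const char *banner =
    "constant "
    "types "
    "and formats";
```

Splitting a long literal across lines, as `banner` does, produces one contiguous string with a single terminating null character.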