The next important directive is #define, which creates a named text substitution, or macro, which the preprocessor then uses to replace any later references to the macro name. The simplest type of macro is a constant text value:
#define macro value
This sort of macro is replaced with the text following the name, up to the end of the line. Such macros are most often used for defining constant values.
The second type of macro takes one or more arguments:
#define macro(argument-list) transformation-rule
Any use of an argument name in the transformation rule is replaced with the text value of the argument. These macros are invoked in a manner similar to a function call.
By convention, macro names are usually (but not always) written in all-uppercase lettering, to distinguish them from variables and functions. Nonetheless, many of the standard 'functions' are in fact macros defined in the header files to encapsulate calls to lower-level or more general-purpose functions.
When the preprocessor passes through the code, it searches for references to macro names or invocations, and replaces them with the text of the macro; in the case where a macro has arguments, the text of the argument replaces the argument name in the inserted code. For example, the file
/* bar.c */
/* (the #include <stdio.h> that printf needs is left out to keep the output short) */
#define PI 3.14
#define SQUARE(x) x * x

int main() {
    printf("PI * SQUARE(r) == %f", PI * SQUARE(3));
}
generates the following code:
int main() {
    printf("PI * SQUARE(r) == %f", 3.14 * 3 * 3);
}
You will see that the preprocessor recognizes C string literals and does not attempt to expand macro names mentioned within them. Macro argument names are replaced by the exact text of the argument given to the macro.
Macros of this sort are often used as a substitute for simple functions. Not only are the macro definitions simpler than the equivalent function definitions, they do not require function prototypes (though the macro does need to be defined before use). Because the code is inserted directly into the source, macros avoid the calling overhead of functions as well; thus, for simple operations, macros are often slightly faster than function calls, and can in some cases use less code as well. However, repeated use of even a small macro can cause the resultant code to grow significantly. In C++ (and in C programs following the C99 standard), such macros can often be replaced with inline functions.
The C and C++ standards both define a group of standard macros, which serve various purposes. Among the standard macros defined by the C99 standard are:
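A few of the macros that C99 requires every implementation to predefine are __FILE__, __LINE__, __DATE__, __TIME__ and __STDC_VERSION__. This is a sample rather than the full list. The sketch below formats them into a buffer; describe_build and C_VERSION are hypothetical names chosen for the illustration, and the guard around __STDC_VERSION__ is only there so the code still compiles under a pre-C99 compiler mode.

```c
#include <stdio.h>

/* __STDC_VERSION__ is absent in very old language modes, so fall back to 0. */
#ifdef __STDC_VERSION__
#define C_VERSION ((long)__STDC_VERSION__)
#else
#define C_VERSION 0L
#endif

/* Hypothetical helper: writes a description of this translation unit
   into buf. The preprocessor replaces __FILE__, __LINE__, __DATE__ and
   __TIME__ with literals before the compiler proper sees the call. */
int describe_build(char *buf, size_t size)
{
    return snprintf(buf, size, "%s:%d (built %s %s, C version %ld)",
                    __FILE__, __LINE__, __DATE__, __TIME__, C_VERSION);
}
```

Calling describe_build(buf, sizeof buf) on a sufficiently large character array yields the source file name, the line number of the snprintf call, and the date and time of compilation.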
In addition, most compilers define certain default macros for identifying the compiler and the system type.
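As a rough sketch of how such compiler-supplied macros are typically used, the functions below test a handful of widely known ones (__clang__, __GNUC__, _MSC_VER, _WIN32, __APPLE__, __linux__). These names are vendor conventions rather than part of the C standard, and compiler_name and platform_name are hypothetical helpers written for this example.

```c
/* Returns a short name for the compiler, based on macros that the
   compilers themselves predefine. Clang also defines __GNUC__ for
   compatibility, so __clang__ is tested first. */
const char *compiler_name(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#elif defined(_MSC_VER)
    return "msvc";
#else
    return "unknown";
#endif
}

/* Returns a short name for the target system, on the same principle. */
const char *platform_name(void)
{
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown";
#endif
}
```

Only one branch of each conditional survives preprocessing, so the compiled function contains a single return statement.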
The C standard reserves all names (macro or otherwise) beginning with an underscore followed by an uppercase letter for system use. The C++ standard further reserves all macro names containing a pair of underscores anywhere in the name. Thus, user-defined macros of the form
#define _Foo
are illegal in both languages, while those of the forms
#define __bar
and
#define quux__
are also outlawed in C++.
Normally, the value of a macro definition is limited to a single line of code, ending at the newline character. However, a backslash (\) character at the end of a line acts as a line extension, causing the macro to include the text of the following line as well. This can be used to extend a macro to indefinite size:
#define REALLY_LONG_MACRO(x) if ((x) == 0) \
        do_something(); \
    else \
        do_something_else();
Needless to say, this holds considerable potential for abuse, and should be used very carefully, if at all.
In the C99 language standard, support for macros with a variable number of arguments was added. A series of variadic arguments is represented by an ellipsis (three periods in sequence). In the macro body, the arguments are represented by the special preprocessor token __VA_ARGS__, which is used wherever the text of those arguments should be inserted.
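A minimal sketch of a variadic macro follows. LOG_TO is a hypothetical name; it simply forwards a format string and its arguments to the standard snprintf function, with __VA_ARGS__ standing in for everything passed in place of the ellipsis.

```c
#include <stdio.h>

/* LOG_TO is a hypothetical macro. The ellipsis accepts the format
   string and any values to be formatted; __VA_ARGS__ is replaced by
   exactly the text that was passed in that position. */
#define LOG_TO(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)
```

For example, LOG_TO(buf, sizeof buf, "%s=%d", "x", 42) expands to snprintf((buf), (sizeof buf), "%s=%d", "x", 42) and leaves the text x=42 in buf.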
The C99 and C++ standards define methods for manipulating the preprocessor tokens, allowing the text of a macro argument to be turned into a string literal or joined to another token.
Within a directive (after the directive name), a hash mark followed by a macro argument causes the argument to be inserted into the body of the macro as a string literal. For example, for the macro
#define foo(x) printf("Argument %s evaluates to %d", #x, x)
// ...
foo(2 + 2);
...would print:
Argument 2 + 2 evaluates to 4
Conversely, the use of a pair of hash marks followed by a token has the effect of inserting the body of that token directly after the preceding one, joining the two into a single token.
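The pair of hash marks (##) joins the tokens on either side of it into a single token, which is mostly useful for building identifiers from macro arguments. The following is a small sketch of that idea; CONCAT, DEFINE_COUNTER and the generated names are all hypothetical.

```c
/* Joins two tokens into one: CONCAT(4, 2) becomes the single token 42. */
#define CONCAT(a, b) a ## b

/* Generates a counter variable and a function that increments it.
   DEFINE_COUNTER(apple) defines apple_count and next_apple(). */
#define DEFINE_COUNTER(name) \
    static int name ## _count = 0; \
    static int next_ ## name(void) { return ++name ## _count; }

DEFINE_COUNTER(apple)
DEFINE_COUNTER(pear)
```

Each use of DEFINE_COUNTER produces an independent variable and function, so next_apple() and next_pear() count separately.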
As noted earlier, macros are inserted into the code without being evaluated in any manner, except for expanding macro arguments and invocations of other macros. Furthermore, macro argument names are replaced with the exact text of a macro argument. This can lead to a number of problems which can be very difficult to debug. For example, if the invocation in the code given earlier is changed to
printf("PI * SQUARE(r) == %f", PI * SQUARE(2 + 3));
it would expand to
printf("PI * SQUARE(r) == %f", 3.14 * 2 + 3 * 2 + 3);
which, because of the operator precedence rules, would give a very different result than would be expected.
Such subtle (or not-so-subtle) expansion errors are a common problem with macros. Thus, it is generally advised to use macros sparingly, and to parenthesize the arguments defensively when you do:
#define SQUARE(x) ((x) * (x))
With this definition, the invocation above would result in the line
printf("PI * SQUARE(r) == %f", 3.14 * ((2 + 3) * (2 + 3)));
which, while clumsy, would result in the correct value.
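The difference can be checked with integer arithmetic. In this sketch the names SQUARE_UNSAFE, SQUARE_SAFE, unsafe_result and safe_result are hypothetical; the two macros differ only in their parentheses.

```c
#define SQUARE_UNSAFE(x) x * x          /* no parentheses */
#define SQUARE_SAFE(x)   ((x) * (x))    /* argument and result parenthesized */

/* Expands to 3 * 2 + 3 * 2 + 3, which is 6 + 6 + 3. */
int unsafe_result(void)
{
    return 3 * SQUARE_UNSAFE(2 + 3);
}

/* Expands to 3 * ((2 + 3) * (2 + 3)), which is 3 * 25. */
int safe_result(void)
{
    return 3 * SQUARE_SAFE(2 + 3);
}
```

Here unsafe_result() returns 15, while safe_result() returns the intended 75.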
Even this would not solve all potential problems, however. For example, macros do no type checking of their arguments; had it been written
SQUARE("Hello, world!")
the preprocessor would have blithely expanded it to
(("Hello, world!") * ("Hello, world!"))
resulting in an obscure compiler error. Expressions which modify variables, such as the increment operator, can also cause problems. For example,
y = SQUARE(x++)
would expand to
y = ((x++) * (x++))
... the behavior of which is undefined according to the language standard.
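The underlying cause is that the argument text appears twice in the expansion and is therefore evaluated twice. The sketch below makes this visible without relying on undefined behavior, by passing a function call instead of x++; calls, next_value and squared_once are hypothetical names.

```c
#define SQUARE(x) ((x) * (x))

static int calls = 0;   /* counts how often next_value() runs */

/* Always returns 4, and records that it was called. */
static int next_value(void)
{
    ++calls;
    return 4;
}

/* Expands to ((next_value()) * (next_value())): two calls, not one. */
int squared_once(void)
{
    return SQUARE(next_value());
}
```

After a single call to squared_once(), which returns 16, the counter calls holds 2 rather than 1.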
Another common error in macro processing is when a macro 'swallows' the semicolon on the line where it is invoked. This occurs primarily in multi-line macros in which there is an
These are only a few of the possible pitfalls of incautious use of macros. Like many parts of the C language, preprocessor macros are a powerful but potentially dangerous tool that must be handled with care.
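One widely used guard against semicolon trouble in multi-statement macros is to wrap the body in do { ... } while (0), so that the macro together with the semicolon that follows it forms exactly one statement. The sketch below assumes a hypothetical SWAP_INT macro and a hypothetical max_first function to show the idiom inside an if/else.

```c
/* Swaps two int lvalues. The do/while (0) wrapper turns the three
   statements into a single statement that still needs its semicolon. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Puts the larger value in *x; returns 1 if a swap took place. */
int max_first(int *x, int *y)
{
    if (*x < *y)
        SWAP_INT(*x, *y);   /* one statement, so the else still pairs with the if */
    else
        return 0;
    return 1;
}
```

Had the macro body been wrapped in plain braces instead, the semicolon after SWAP_INT(*x, *y) would end the if statement early and the else would fail to compile.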
As noted earlier, macros can often be replaced with inline functions which perform the same task; while the coding of inlines is slightly more complex than that of a macro, they have the advantage that they enforce the usual C/C++ type and syntax checking, and have some effect of isolating the inserted code to avoid many of the side effects. However, because the functions are not treated as literal text insertions, they cannot be used for some of the purposes which macros can be.
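As a brief sketch of the inline alternative (C99 or later), the SQUARE macro could be written as the function below. The names square and square_then_count are hypothetical, and the version shown handles int only, which is one of the trade-offs against the untyped macro.

```c
/* Inline replacement for the SQUARE macro. The argument is a real
   parameter, so it is type-checked and evaluated exactly once. */
static inline int square(int x)
{
    return x * x;
}

/* Squares *n and then increments it. With the macro, (*n)++ would
   appear twice in the expansion; here it runs a single time. */
int square_then_count(int *n)
{
    return square((*n)++);
}
```

With n initially 3, square_then_count(&n) returns 9 and leaves n equal to 4, and square(2 + 3) yields 25 without any extra parentheses.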
The C++ language has another macro language, the template system, which is separate from the preprocessor. The template sub-language, which is intended to support generic programming, is restricted to certain specific types of transformations, and like inline functions, is more tightly integrated into the C++ language syntax.