My Favorite C++ Pattern: X Macros (2023)
60 comments
March 25, 2025
jchw
X Macros are a classic C++ pattern. They're less well known than more popular techniques like the curiously recurring template pattern, but still quite common in large codebases.
(Although it is a neat trick, it's mostly useful because C++ macros are not very powerful. If they were more powerful, most uses of X Macros could be replaced by a single macro that does all of the magic.)
I recently saw something I hadn't seen before in a similar vein. There are a million different bin2c generators, and most of them will generate a declaration for you. However, I just saw a setup where the bin2c conversion generates only the actual byte array text, e.g. "0x00, 0x01, ...", so that you can #include it into whatever declaration you want. Of course, this is basically a poor man's #embed, but I found it intriguing nonetheless. (It's nice that programming languages have been adding file embedding as a first-class feature. This really simplifies a lot of stuff and removes the need to rely on more complex solutions for problems that don't really warrant them.)
WalterBright
> X Macros are a classic C++ pattern
I've seen them in 1970s assembler code.
kragen
https://en.wikipedia.org/wiki/X_macro claims they go back to the 01960s, but unfortunately the reference is to a DDJ article. Is there a chance we could like run a Kickstarter to buy DDJ's archives and fix their website?
jchw
Y'know, I've never actually seen this before, but that makes a lot of sense. They're also (of course) a common pattern in C, and probably basically any language that has a C-style preprocessor.
gpderetta
> If they were more powerful, most uses of X Macros could be replaced by just having a single macro do all of the magic
You can do exactly that, it is just very tedious without a supporting preprocessor library like Boost.PP.
malkia
I've learned about them, staring endlessly at the luajit source code - for example (not the best example I can remember, but still) - https://github.com/LuaJIT/LuaJIT/blob/v2.1/src/lj_lex.h#L15
there it defines a TOKEN(_,__) macro listing the LuaJIT tokens/keywords, and later generates enums from it.
I've used it recently to wrap a "C" api with lots of functions, such that I can redirect calls.
loeg
I hate X-Macros, but they're very useful in some situations. We use them to generate an enum of error values and also a string table of those names. (Unlike two separate tables, the X macro version is guaranteed to be the exact correct length / values align with the corresponding string.)
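A minimal sketch of that enum-plus-string-table trick (the error names here are invented for illustration):

```cpp
#include <cassert>
#include <cstring>

// Single source of truth: the error list. Each expansion below decides
// what to do with one X(name) entry.
#define ERROR_LIST \
    X(OK)          \
    X(NOT_FOUND)   \
    X(TIMEOUT)

// Expand once into an enum...
#define X(name) name,
enum Error { ERROR_LIST ERROR_COUNT };
#undef X

// ...and once into a parallel string table; the two cannot drift apart.
#define X(name) #name,
static const char* kErrorNames[] = { ERROR_LIST };
#undef X
```

Adding a new error is a one-line change to ERROR_LIST, and kErrorNames[e] stays valid for every Error e, which is exactly the alignment guarantee described above.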
msarnoff
That's been my primary usage of them. Another is to create a list of config file options (or command line arguments) like
X(foo, int, "set the number of foos")
X(filename, std::string, "input filename")
X(verbose, bool, "verbose logging")
which can then be used to (a) generate the fields of a config struct, (b) define the mapping from string to field (using the stringifying macro operators), (c) define what functions to use for parsing each field, (d) create the help message, etc. Basically like `argparse` or `clap` but much hackier.

As gross as they are, the ability to define one table of data that's used multiple ways in multiple places is handy.
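A minimal sketch of two of those expansions, using the three entries above (a real version would also wire up parsing):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// One table of options; each expansion decides what to do with a row.
#define OPTION_LIST(X)                         \
    X(foo, int, "set the number of foos")      \
    X(filename, std::string, "input filename") \
    X(verbose, bool, "verbose logging")

// (a) generate the fields of a config struct
#define DECLARE_FIELD(name, type, help) type name{};
struct Config {
    OPTION_LIST(DECLARE_FIELD)
};
#undef DECLARE_FIELD

// (b) generate the help message from the same table, stringifying the
// field name with the # operator
#define ADD_HELP(name, type, help) out << "  --" #name "\t" help "\n";
std::string help_text() {
    std::ostringstream out;
    OPTION_LIST(ADD_HELP)
    return out.str();
}
#undef ADD_HELP
```

Both expansions read the same OPTION_LIST, so adding an option updates the struct and the help text at once.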
IshKebab
I agree. Terrible hack, but if you're in terrible hack land they're quite useful. LLVM uses them extensively.
Blackthorn
I understand the use case for this, but when I see it I always wonder if I would prefer some external code generation step instead of falling back on macros in the preprocessor. Like an external script or something.
wat10000
Now you have an additional stage in your build, a bunch of new code to maintain, and either a bespoke language embedded in your standard C++ or a bunch of code emitting C++ separately from the code it logically belongs with.
Compare with a solution that's 100% standard C++, integrates into your build with zero work, can be immediately understood by anyone reasonably skilled in the language, and puts the "generated" code right where it belongs.
MITSardine
CMake makes this pretty painless. My codegen targets have only two additional instructions to handle the generation itself and dependencies: add_custom_command to call the codegen exec, and then add_custom_target to wrap my outputs in a "virtual" target I can then make the rest of my program depend on, but this is just for tidying up.
And I'll dispute that any complex C preprocessor task "can be immediately understood by anyone reasonably skilled in the language". Besides, code should ideally be understood by "anyone reasonably likely to look at this code to work in it", not "anyone reasonably skilled".
wat10000
This isn't complex. It's a bit unusual, but not hard to understand if you understand the basics of how #include and #define work.
If you're working on the sort of C++ codebase that would benefit from this sort of code generation, and you're not reasonably skilled in C++, then god help you.
mauvehaus
I've done this in C with the C preprocessor and Java with m4[0].
The upside of doing it natively is that it keeps the build simpler. And everybody at least knows about the existence of the C preprocessor, even if they don't know it well. And it's fairly limited, which prevents you from getting too clever.
The big downside of doing it with the C preprocessor is that the resulting code looks like vomit if it's more than a line or two because of the lack of line breaks in the generated code. Debugging it is unenjoyable. I'd recommend against doing anything super clever.
The upside of doing it out of band is that your generated source files look decent. m4 tends to introduce a little extra whitespace, but it's nothing objectionable. Plus you get more power if you really need it.
The downside is that almost nobody knows m4[1]. If you choose something else, it becomes a question of what to use, whether anyone else knows it, and whether it's available everywhere you need to build.
Honestly, integrating m4 into the build in ant really wasn't too bad. We were building on one OS on two different architectures. For anything truly cross-platform, you'll likely run into all the usual issues.
ETA: Getting an IDE to understand the out-of-band generation might be a hassle, as other folks have mentioned. I'm a vim kinda guy for most coding, and doing it either way was pretty frictionless. The generated Java code was read-only and trivial, so there wasn't a lot of reason to ever look at it. By the time you get to debugging, it would be entirely transparent because you're just looking at another set of Java files.
[0] This was so long ago, I no longer remember why it seemed like a good idea. I think there was an interface, a trivial implementation, and some other thing? Maybe something JNI-related? At least at first, things were changing often enough that I didn't want to have to keep three things in sync by hand.
[1] Including me. I re-learn just enough to get done with the job at hand every time I need it.
writebetterc
IDEs understand preprocessor macros, so IDE features (jump2def, etc) work with this. IDEs also can expand the macro invocations. So, I prefer the macros when possible :-).
Someone
> IDEs understand preprocessor macros, so IDE features (jump2def, etc) work with this.
Do they? X macros often are used with token pasting (https://gcc.gnu.org/onlinedocs/cpp/Concatenation.html), as for example in (FTA)
#define AST_BEGIN_SUBCLASSES(NAME) START_##NAME ,
Are modern IDEs/compiler toolchains smart enough to tell you that START_foo was created by an expansion of that macro?tom_
Yes. I use this with VS2019 for generating enum names, and they interact fine with auto complete and go to definition.
jcelerier
any non-toy IDE can do that. Most IDEs use clang directly for parsing nowadays.
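To make the pasting concrete, here is a tiny runnable version of the quoted pattern (the subclass list is invented; only AST_BEGIN_SUBCLASSES comes from the article):

```cpp
#include <cassert>

// A made-up subclass list in the style of the article's example.
#define AST_SUBCLASS_LIST(X) \
    X(Expr)                  \
    X(Stmt)

// Token pasting (##) manufactures the START_* names, just like the
// quoted macro from the article.
#define AST_BEGIN_SUBCLASSES(NAME) START_##NAME,
enum AstKind { AST_SUBCLASS_LIST(AST_BEGIN_SUBCLASSES) AST_KIND_COUNT };
#undef AST_BEGIN_SUBCLASSES
```

Per the comments above, clang-based tooling can resolve a use of START_Expr back to this expansion.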
jasonthorsness
The C# "source generator" approach is a good compromise; it runs within the build chain, so it has the ease-of-use of macros in that respect, but generators don't need to be written in a weird macro language (they are C#, or can call an external tool), and when you debug your program you debug through the generated source and can see it, which is more accessible than macros. Not sure if there is something similar in C/C++ integrated with the common toolchains.
But when working outside C/C++ I've found myself missing the flexibility of macros more times than I can count.
MITSardine
After trying to wrangle Boost PP and other advertised compile-time libraries such as Boost Hana (which still has some runtime overhead compared to the same logic with hardcoded values), I've finally converged to simply writing C++ files that write other C++ files. Could be Python, but I rather keep the build simple in my C++ project. Code generation is painless with CMake, no idea with other build configuration utilities.
rcxdude
CMake has a particularly irritating flaw here, though, in that it makes no distinction between host and target when cross-compiling, which makes it really difficult to do this kind of code generation while supporting that use-case (which is becoming more and more common).
MITSardine
Right, I hadn't thought of that, to be honest. If I understand correctly, you're saying the codegen targets will be compiled to the target arch, and then can't be run on the machine doing the compiling?
I think one solution might be to use target_compile_options() which lets you specify flags per target (instead of globally), assuming you're passing flags to specify the target architecture.
Arech
> Boost Hana (which still has some runtime overhead compared to the same logic with hardcoded values)
Can you elaborate on that? What was your use-case for which this was true?
MITSardine
One case I benchmarked was Bernstein/Bézier and Lagrange element evaluation. This is: given a degree d triangle or tetrahedron, given some barycentric coordinates, get the physical coordinate and the Jacobian matrix of the mapping.
Degree 2, Lagrange:
- runtime: 3.6M/s
- Hana: 16.2M/s
- hardcoded: 37.7M/s
Degree 3, Lagrange: 2.6M/s, 6.4M/s, 13.4M/s (same order).
"Runtime" here means everything is done using runtime loops, "Hana" using Boost Hana to make loops compile-time and use some constexpr ordering arrays, "hardcoded" is a very Fortran-looking function with all hardcoded indices and operations all unrolled.
As you see, using Boost Hana does bring about some improvement, but there is still a factor 2x between that and hardcoded. This is all compiled with Release optimization flags. Technically, the Hana implementation is doing the same operations in the same order as the hardcoded version, all indices known at compile time, which is why I say there must be some runtime overhead to using hana::while.
In the case of Bernstein elements, the best solution is to use de Casteljau's recursive algorithm using templates (10x to 15x speedup over the runtime recursive version, depending on degree). But not everything recasts itself nicely as a recursive algorithm, or I didn't find the way for Lagrange anyways. I did enable -flto as, from my understanding (looking at call stacks), hana::while creates lambda functions, so perhaps a simple function optimization becomes a cross-unit affair if it calls hana::while. (speculating)
Similar results to compute Bernstein coefficients of the Jacobian matrix determinant of a Q2 tetrahedron, factor 5x from "runtime" to "hana" (only difference is for loops become hana::whiles), factor 3x from "hana" to "hardcoded" (the loops are unrolled). So a factor 15x between naive C++ and code generated files. In the case of this function in particular, we have 4 nested loops, it's branching hell where continues are hit very often.
cbuq
This sounds pragmatic, but are you writing C++ executables that when run create the generated code? Are there templating libraries involved?
MITSardine
Yeah, it's all done automatically when you build, and dependencies are properly taken into account: if you modify one of the code generating sources, its outputs are regenerated, and everything that depends on them is correctly recompiled. This doesn't take much CMake logic at all to make work.
In my case, no, it's dumb old code writing strings and dumping that to files. You could do whatever you want in there, it's just a program that writes source files.
I do use some template metaprogramming where it's practical versus code generation, and Boost Hana provides some algorithmic facilities at compile time but those incur some runtime cost. For instance, you can write a while loop with bounds evaluated at compile time, that lets you use its index as a template parameter or evaluate constexpr functions on. But sometimes the best solution has been (for me, performance/complexity wise) to just write dumb files that hardcode things for different cases.
dataflow
External codegen introduces a lot of friction in random places. Like how your editor can no longer understand the file before you start building. Or how it can go out of date with respect to the rest of your code until you build. If you can do it with a macro it tends to work better than codegen in some ways.
drwu
Another issue is cross-compiling.
External code generation requires (cross) execution of the (cross) compiled binary program.
rcxdude
Or to build the generating code for the host instead. Most build systems that support cross-compilation can do this, except CMake.
paulddraper
Spoken like a Go programmer :D
That introduces some other tool (with its own syntax), an extra build step, possible non-composability with other features.
A preprocessor (for good or bad) IS a code generation step.
wat10000
Some minor tweaks to what the author shows to make it even better.
Give the macro a more descriptive name. For their example, call it GLOBAL_STRING instead of X. I think this helps make things clearer.
#undef the macro at the end of the header. That removes one line of boilerplate from every use of the header.
Use #ifndef at the top of the header and emit a nice error if the macro isn't defined. This will make it easier to understand what's wrong if you forget the #define or misspell the macro.
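Put together, the header might look like this. It is inlined into one file here so the sketch is self-contained; in practice the middle section would live in a header, and the names (GLOBAL_STRING, kGreeting, kFarewell) are hypothetical:

```cpp
#include <cassert>
#include <cstring>

// The caller defines the descriptively named macro before including.
#define GLOBAL_STRING(name, value) const char* name = value;

// ---- what would be global_strings.h ----
#ifndef GLOBAL_STRING
#error "Define GLOBAL_STRING(name, value) before including global_strings.h"
#endif
GLOBAL_STRING(kGreeting, "hello")
GLOBAL_STRING(kFarewell, "goodbye")
#undef GLOBAL_STRING  // the header cleans up after itself
// ---- end of global_strings.h ----
```

If the caller forgets the #define (or misspells it), the #error fires with a readable message instead of a cascade of confusing expansion errors.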
zem
despite the extra boilerplate, I feel like it's still better to undef the macros in the same scope they were defined in, so that they clearly delimit the code in which the macro is active.
wat10000
I think this pattern is obvious enough that it's clear what's going on, but I can see where you're coming from.
kevin_thibedeau
"X" macros are great until you need two of them visible in the same translation unit. It is much better to pass a list macro as an argument to a uniquely named X macro and avoid the need to ever undef anything.
skribanto
I don’t think I follow, do you mind giving a concrete example?
rcxdude
Basically, instead of faffing around with undefing values and including different files, you define your list like this:

#define AN_X_LIST(X) \
  X(foo, bar) \
  X(bar, baz)

And then you use it like so:

#define AN_ASSIGNMENT_STATEMENT(a, b) a = STRINGIFY(b);

And so

AN_X_LIST(AN_ASSIGNMENT_STATEMENT)

will expand to

foo = "bar";
bar = "baz";
The nice thing about this approach is you can define multiple lists and macros, and make higher order macros which use them. I have a system like this which allows me to define reflective structs in C++ easily, i.e. I define a struct like:

#define STRUCT_FOO_LIST(X) \
  X(int, bar, 0), \
  X(float, baz, 4.0),

DECLARE_REFLECTIVE_STRUCT(Foo, STRUCT_FOO_LIST);

(where DECLARE_REFLECTIVE_STRUCT basically just does the same dance as above, passing different per-element macros into the list that it is passed, for the struct definition and the other utility functions associated with it) which then makes a struct Foo with members bar and baz with the right types and default values, but also lets me do 'foo_instance.fetch_variant("baz")' and other such operations.
The biggest pain with this approach is it's basically dealing with a bunch of multi-line macros, so it can get messy if there's a typo somewhere (and I strongly recommend an auto-formatter if you don't like having a ragged line of backslashes to the right of all the code that uses it).
elteto
_This_ is the best pattern for X macros, without any of that noise of undef'ing anything.
My approach is to wrap the list elements with two macros: an inner transformation one and an outer delimiter, like so:

#define AN_X_LIST(X, DELIM) \
  DELIM(X(int, foo)) \
  DELIM(X(int, bar)) \
  X(std::string, baz)

Then you can compose different pieces of code for different contexts by just swapping out the delimiter. A very contrived example:

#define SEMICOLON(x) x;
#define COMMA(x) x,

#define DECLARE(type, var) type var
#define INIT(type, var) var{}

struct s {
  AN_X_LIST(DECLARE, SEMICOLON);
  s() : AN_X_LIST(INIT, COMMA) {}
};
gibibit
It is a clever trick. Very useful in C also, maybe more than in C++.
It can be overused, though.
Kind of works like Rust declarative macros (`macro_rules!`) in that it is often used to repeat and expand something common across an axis of things that vary.
It's funny that the simple name X Macros has stuck and is a de facto standard. E.g. https://en.wikipedia.org/wiki/X_macro suggests they date to the 1960s.
PaulHoule
Famous in IBM 360 assembly language, in particular these were used for building code based on data schemas in CICS [1] applications which were remarkably similar to 2000-era cgi-bin applications in that you drew forms on 3270 terminals [2] and instead of sending a character for each keystroke like the typical minicomputer terminal, users would hit a button to submit the form, like an HTML form, and the application mostly shuttled data between forms and the database.
randomNumber7
On the one hand it is great and probably usable to solve actual problems.
On the other hand it seems fishy that you need all these hacks to do it. There must be a simpler language than C++ where you could do the same thing more easily.
sfpotter
No need for any of this if you use D!
This is really more a C pattern than a C++ pattern, isn't it?
kragen
Frustrated by C's limitations relative to Golang, last year I sketched out an approach to using X-macros to define VNC protocol message types as structs with automatically-generated serialization and deserialization code. So for example you would define the VNC KeyEvent message as follows:
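(The code block appears to have been lost from this copy of the comment. Below is a self-contained sketch in the spirit of the description; all macro and field names here are my own reconstruction, not necessarily what binmsg_cpp.h actually uses:)

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t u8;
typedef uint32_t u32;

// The field list takes its X macros as parameters, so nothing is ever
// globally #defined or #undef'd. (FIELD/PAD and the field names are my
// invention; see binmsg_cpp.h for the real definitions.)
#define KeyEvent_fields(FIELD, PAD) \
    FIELD(u8, down_flag)            \
    PAD(2)                          \
    FIELD(u32, key)

// First X: declare the struct members.
#define DECLARE_FIELD(type, name) type name;
#define DECLARE_PAD(n) u8 padding_[n];
typedef struct { KeyEvent_fields(DECLARE_FIELD, DECLARE_PAD) } KeyEvent;

// Second X: emit a big-endian serializer, dispatching on the field type
// via token pasting (write_u8_big_endian / write_u32_big_endian).
static u8* write_u8_big_endian(u8* p, u8 v) { *p++ = v; return p; }
static u8* write_u32_big_endian(u8* p, u32 v) {
    *p++ = (u8)(v >> 24); *p++ = (u8)(v >> 16);
    *p++ = (u8)(v >> 8);  *p++ = (u8)v;
    return p;
}
#define WRITE_FIELD(type, name) p = write_##type##_big_endian(p, msg->name);
#define WRITE_PAD(n) for (int i = 0; i < (n); i++) *p++ = 0;
static u8* write_KeyEvent_big_endian(u8* p, const KeyEvent* msg) {
    KeyEvent_fields(WRITE_FIELD, WRITE_PAD)
    return p;
}
```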
And that would generate a KeyEvent typedef (to an anonymous struct type) and two functions named read_KeyEvent_big_endian and write_KeyEvent_big_endian. (It wouldn't be difficult to add debug_print_KeyEvent.) Since KeyEvent is a typedef, you can use it as a field type in other, larger structs just like u8 and u32.

Note that here there are two Xes, and they are passed as parameters to the KeyEvent_fields macro rather than being globally defined and undefined over time. To me this feels cleaner than the traditional way.
The usage above is in http://canonical.org/~kragen/sw/dev3/binmsg_cpp.c, MESSAGE_TYPE and its ilk are in http://canonical.org/~kragen/sw/dev3/binmsg_cpp.h, and an alternative approach using Python instead of the C preprocessor to generate the required C is in http://canonical.org/~kragen/sw/dev3/binmsg.py.