
What the hell is a target triple?

106 comments · April 15, 2025

arp242

> There are also many fictitious names for 64-bit x86, which you should avoid unless you want the younger generation to make fun of you. amd64 refers to AMD’s original implementation of long mode in their K8 microarchitecture, first shipped in their Athlon 64 product. Calling it amd64 is silly and also looks a lot like arm64, and I am honestly kinda annoyed at how much Go code I’ve seen with files named fast_arm64.s and fast_amd64.s. Debian also uses amd64/arm64, which makes browsing packages kind of annoying.

I prefer amd64 as it's so much easier to type, and it scans better. x86_64 is so awkward.

Bikeshed, I guess, and in the abstract I can see how x86_64 is better, but pragmatism > purity, and you'll have to pry amd64 from my cold dead hands.

As for Go, you can get the GOARCH/GOOS combinations from "go tool dist list". That can be useful if you want to ensure your code cross-compiles in CI.
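
For example, a quick CI smoke test can be as simple as the shell loop below (a sketch; it assumes a pure-Go module, since cgo targets would each need a C toolchain):

  for t in $(go tool dist list); do
    GOOS=${t%/*} GOARCH=${t#*/} go build ./... || echo "does not build: $t"
  done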

peterldowns

Some other sources of target triples (some mentioned in the article, some not):

rustc: `rustc --print target-list`

golang: `go tool dist list`

zig: `zig targets`

As the article points out, the complete lack of standardization and consistency in what constitutes a "triple" (sometimes actually a quad!) is kind of hellishly hilarious.
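
For a taste of that inconsistency, here are roughly the same targets as each tool spells them (sample entries, abridged):

  rustc:  x86_64-unknown-linux-gnu   aarch64-apple-darwin   wasm32-unknown-unknown
  go:     linux/amd64                darwin/arm64           js/wasm
  zig:    x86_64-linux-gnu           aarch64-macos          wasm32-freestanding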

lifthrasiir

> what constitutes a "triple" (sometimes actually a quad!)

It is actually a quintuple at most, because the first part (the architecture) may contain a version, e.g. for ARM. And yet it still doesn't fully describe the actual target, because some targets also need an OS version, e.g. macOS. Doubly silly.
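
For example, reading the real target `armv7-unknown-linux-gnueabihf` component by component gives five pieces of information:

  armv7    architecture + version
  unknown  vendor
  linux    kernel
  gnu      libc
  eabihf   ABI + float ABI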

ycombinatrix

at least we don't have to deal with --build, --host, --target nonsense anymore

ComputerGuru

Great article, but I was really put off by this bit, which, aside from being very condescending, simply isn't true and reveals a lack of appreciation for an innovation that I would have expected someone writing about target triples and compilers to admire:

> Why the Windows people invented a whole other ABI instead of making things clean and simple like Apple did with Rosetta on ARM MacBooks? I have no idea, but http://www.emulators.com/docs/abc_arm64ec_explained.htm contains various excuses, none of which I am impressed by. My read is that their compiler org was just worse at life than Apple’s, which is not surprising, since Apple does compilers better than anyone else in the business.

I was already familiar with ARM64EC from reading about its development from Microsoft over the past years, but I had not come across the emulators.com link before - it's a stupendous (long) read and well worth the time if you are interested in lower-level shenanigans. The truth is that Microsoft's ARM64EC solution is a hundred times more brilliant and a thousand times better for backwards (and forwards) compatibility than Rosetta on macOS. Rosetta gave the user a far inferior experience to native code, executed (sometimes far) slower, prevented interop between legacy and modern code, and left app devs having to do a full port to adopt newer tech (or even just to have a UI that matched the rest of the system). It was always intended as a merely transitional bit of tech, meant to last the few years it took for native x86 apps to be developed and usurp the old PPC ones.

Microsoft's solution has none of these drawbacks (except the noted lack of AVX support). It doesn't require every app to be 2x or 3x as large as a sacrifice to the fat-binaries hack, offers a much more elegant path for developers to migrate their code (piecemeal or otherwise) to a new platform where they don't know if it will be worth their time/money to invest in a full rewrite, lets users keep using all the apps they love, and maintains Microsoft's very much well-earned legacy of backwards compatibility.

When you run an app for Windows 2000 on Windows 11 (x86 or ARM), you don't see the old Windows 2000 aesthetic (and if you do, there's an easy way for users to opt into newer theming rather than requiring the developer to do something about it) and you aren't stuck with bugs from 30 years ago that were long since patched by the vendor many OS releases ago.

Zamiel_Snawley

Do those criticisms of Rosetta hold for Rosetta 2?

I assumed the author was talking about the x86 emulator released for the ARM migration a few years ago, not the PowerPC one.

Philpax

This author has a tendency to be condescending about things they find disagreeable. It's why I stopped reading them.

juped

You have neglected to consider that Microsoft bad; consider how they once did something differently from a Linux distribution I use. (This sentiment is alive and well among otherwise intelligent people; it's embarrassing to read.)

jcranmer

I did start to take clang's TargetInfo code (https://github.com/llvm/llvm-project/blob/main/clang/lib/Bas...) and port it over to TableGen, primarily so somebody could actually extract useful auto-generated documentation out of it, like "what are all the targets available?"

I actually do have working code for the triple-to-TargetInfo instantiation portion (which is fun because there are one or two cases that juuuust aren't quite like all of the others, and I'm not sure if that's a bad copy-paste job or actually intentional). But I never got around to working out how to integrate the actual bodies of the TargetInfo implementations--which provide things like the properties of C/C++ fundamental types or default macros--into the TableGen easily, so that patch is still languishing somewhere on my computer.

psanford

As a Go developer, I certainly find the complaints about the Go conventions amusing. I guess if you have invested so much into understanding all the details in the rest of this article, you might be annoyed that it doesn't translate 1:1 to Go.

But for the rest of us, I'm so glad that I can just cross-compile things in Go without thinking about it. The annoying thing about setting up cross-compilation in GCC is not learning the naming conventions; it is getting the correct toolchains installed and wired up correctly in your build system. Go just ships that out of the box, and it is so much more pleasant.

It's also one thing that is great about Zig. Using Go+Zig when I need to cross-compile something that includes cgo is so much better than trying to get GCC toolchains set up properly.
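
Concretely, the combo looks something like this (a sketch; the -target string uses Zig's own arch-os-abi spelling):

  CGO_ENABLED=1 GOOS=linux GOARCH=arm64 \
    CC="zig cc -target aarch64-linux-gnu" \
    go build ./...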

cbmuser

»32-bit x86 is extremely not called “x32”; this is what Linux used to call its x86 ILP32 variant before it was removed.«

x32 support has not been removed from the Linux kernel. In fact, we're still maintaining Debian for x32 in Debian Ports.

jkelleyrtp

The author's blog is a FANTASTIC source of information. I recommend checking out some of their other posts:

- https://mcyoung.xyz/2021/06/01/linker-script/

- https://mcyoung.xyz/2023/08/09/yarns/

- https://mcyoung.xyz/2023/08/01/llvm-ir/

eqvinox

Given TFA's bias against GCC, I'm not so sure. e.g. looking at the linker script article… it's also missing the __start_XYZ and __stop_XYZ symbols automatically created by the linker.

sramsay

I was really struck by the antipathy toward GCC. I'm not sure I quite understand where it's coming from.

matheusmoreira

It also focuses exclusively on sections. I wish it had at least mentioned segments, also known as program headers. Linux kernel's ELF loader does not care about sections, it only cares about segments.

Sections and segments are more or less the same concept: metadata that tells the loader how to map each part of the file into the correct memory regions with the correct memory protection attributes. The biggest difference is that segments don't have names. They also aren't neatly organized into logical blocks like sections are; they're just big file extents. The segments table is essentially a table of arguments for the mmap system call.

Learning this stuff from scratch was pretty tough. Linker scripts have commands to manipulate the program header table, but I couldn't figure those out. In the end I asked developers to add command-line options instead, and the maintainer of mold actually obliged.

Looks like very few people know about stuff like this. One can use it to do some heavy wizardry though. I leveraged this machinery into a cool mechanism for embedding arbitrary data into ELF files. The kernel just memory maps the data in before the program has even begun execution. Typical solutions involve the program finding its own executable on the file system, reading it into memory and then finding some embedded data section. I made the kernel do almost all of that automatically.

https://www.matheusmoreira.com/articles/self-contained-lone-...
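
If you want to poke at this yourself, here is a minimal sketch (mine, not from the article above) using Go's standard debug/elf package to dump the segment table - roughly the loader's-eye view of the file:

  package main

  import (
      "debug/elf"
      "fmt"
      "os"
  )

  func main() {
      f, err := elf.Open(os.Args[1]) // e.g. /bin/ls
      if err != nil {
          panic(err)
      }
      defer f.Close()
      // Each program header is roughly one mmap: a file extent mapped
      // to a virtual address range with the given permissions.
      for _, p := range f.Progs {
          fmt.Printf("%-14v flags=%-8v off=%#08x vaddr=%#010x filesz=%#08x memsz=%#08x\n",
              p.Type, p.Flags, p.Off, p.Vaddr, p.Filesz, p.Memsz)
      }
  }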

o11c

I wouldn't call them the "same concept" at all. Segments (program headers) are all about the runtime (executables and shared libraries) and are low-cost. Sections are all about development (.o files) and are detailed.

Generally there are many sections combined into a single segment, other than special-purpose ones. Unless you are reimplementing ld.so, you almost certainly don't want to touch segments; sections are far easier to work with.

Also, normally you just call `getauxval`, but if needed the type is already named `ElfW(auxv_t)*`.

throw0101d

Noticed the endianness column in the table. It seems like little-endian has basically taken over the world as of 2025:

* https://en.wikipedia.org/wiki/Endianness#Hardware

Is there anything that is used a lot that is not little? IBM's stuff?

Network byte order is BE:

* https://en.wikipedia.org/wiki/Endianness#Networking

rv3392

Apart from IBM Power/AIX systems, SPARC/Solaris is another one. I wouldn't say either of these is used a lot, but there's a reasonable number of legacy systems out there that are still being supported by IBM and Oracle.

Palomides

IBM's Power chips can run in either little or big modes, but "used a lot" is a stretch

inferiorhuman

Most PowerPC-related stuff (e.g. the Freescale MPC5xx found in a bunch of automotive applications) can run in either big- or little-endian mode, as can most ARM and MIPS (routers, IP cameras) stuff. Can't think of the last time I've seen any of them configured to run in big-endian mode tho.

classichasclass

For the large Power ISA machines, it's most commonly when running AIX or IBM i these days, though the BSDs generally run big too.

forrestthewoods

BE isn't technically dead, but it's practically dead for almost all projects. You can static_assert byte order and then never think about BE ever again.

All of my custom network serialization formats use LE because there’s literally no reason to use BE for network byte order. It’s pure legacy cruft.
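
In Go, for instance, committing to LE is a one-liner in each direction with the standard encoding/binary package (a sketch, not any particular wire format):

  package main

  import (
      "encoding/binary"
      "fmt"
  )

  func main() {
      buf := make([]byte, 8)
      // The format is LE by definition, regardless of host byte order.
      binary.LittleEndian.PutUint64(buf, 0x0102030405060708)
      fmt.Printf("% x\n", buf)                             // 08 07 06 05 04 03 02 01 on every host
      fmt.Printf("%#x\n", binary.LittleEndian.Uint64(buf)) // 0x102030405060708
  }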

dharmab

LEON, used by the European Space Agency, is big endian.

formerly_proven

Ten years ago, the fastest practical BE machines were then-ten-year-old PowerMacs. This hasn't really changed. I guess they're more expensive now.

eqvinox

The e6500/T4240 are faster than PowerMacs. Not sure how rare they are nowadays; we didn't have any trouble buying some (on eBay). 12×2 cores, 48GB RAM - for BE, that's essentially heaven…

richardwhiuk

Some ARM stuff.

thro3838484848

The Java VM is BE.

kbolino

This is misleading at best. The JVM only exposes multibyte values to ordinary applications in such a way that byte order doesn't matter. You can't break out a pointer and step through the bytes of a long field to see what order it's in, at least not without the unsafe memory APIs.

In practice, any real JVM implementation will simply use native byte order as much as possible. While bytecode and other data in class files is serialized in big-endian order, it will be converted to native order whenever it's actually used. If you do pull out the unsafe APIs, you can see that e.g. values are little-endian on x86(-64). The JVM would suffer major performance issues if it tried to impose a byte order different from the underlying platform.

PhilipRoman

One relatively commonly used class which exposes this is ByteBuffer and its Int/Long variants, but there you can specify the endianness explicitly (or set it to match the native one).

ycombinatrix

>There’s a few variants. wasm32-unknown-unknown (here using unknown instead of none as the system, oops)

Why isn't it called wasm32-none-none?

pie_flavor

As far as I can tell, it's because libstd exists (but is full of do-nothing stubs). There is another `wasm32-none` target which is no_std.

vient

> Kalimba, VE

> No idea what this is, and Google won’t help me.

Seems that Kalimba is a DSP, originally by CSR and now by Qualcomm. The CSR8640 uses it, for example: https://www.qualcomm.com/products/internet-of-things/consume...

VE is harder to find with such a short name.

AKSF_Ackermann

NEC Vector Engine. Basically not a thing outside supercomputers.

cwood-sdf

"And no, a “target quadruple” is not a thing and if I catch you saying that I’m gonna bonk you with an Intel optimization manual. "

https://github.com/ziglang/zig/issues/20690

debugnik

The argument is that they're called triples even when they have more or fewer components than three. They should have simply been called target tuples or target monikers.

o11c

"gnu tuple" and "gnu type" are also common names.

The comments in `config.guess` and `config.sub`, which are the origin of triples, use a large variety of terms, at least the following:

  configuration name
  configuration type
  [machine] specification
  system name
  triplet
  tuple

IshKebab

Funny thing I found when I gave up trying to find documentation and read the LLVM source code (which seems to be what happened to the author too!): there are actually five components of the triple, not four.

I can't remember what the fifth one is, but yeah... insane system.

Thanks for writing this up! I wonder if anyone will ever come up with something more sensible.

o11c

There are up to 7 components in a triple, but not all are used at once. The general format is:

  <machine>-<vendor>-<kernel>-<libc?><abi?><fabi?>
But there's also <obj>, see below.

Note that there are both canonical and non-canonical triples in use. Canonical triples are output by `config.guess` or `config.sub`; non-canonical triples are input to `config.sub` and used as prefixes for commands.

The <machine> field (1st) is what you're running on, and on some systems it includes a version number of sorts. Most 64-bit vs 32-bit differences go here, except if the runtime differs from what is natural (commonly "32-bit pointers even though the CPU is in 64-bit mode"), which goes in <abi> instead. Historically, "arm" and "mips" have been a mess here, but that has largely been fixed, in large part as a side-effect of Debian multiarch (whose triples only have to differ from GNU triples in that they canonicalize i[34567]86 to i386, but you should use dpkg-architecture to do the conversion for sanity).

The <vendor> field (2nd) is not very useful these days. It defaults to "unknown" but as of a few years ago "pc" is used instead on x86 (this means that the canonical triple can change, but this hasn't been catastrophic since you should almost always use the non-canonical triple except when pattern-matching, and when pattern-matching you should usually ignore this field anyway).

The <kernel> field (3rd) is pretty obvious when it's called that, but it's often called <os> instead since "linux" is an oddity for regularly having a <libc> component that differs. On many systems it includes version data (again, Linux is the oddity for having a stable syscall API/ABI). One notable exception: if a GNU userland is used on a BSD/Solaris system, a "k" is prepended. "none" is often used for freestanding/embedded compilation, but see <obj>.

The <libc> field (main part of the 4th) is usually absent on non-Linux systems, but mandatory for "linux". If it is absent, the dash after the kernel is usually removed, except if there are ABI components. Note that "gnu" can be both a kernel (Hurd) and a libc (glibc). Android uses "android" here, so maybe <libc> is a bit of a misnomer (it's not "bionic") - maybe <userland>?

<abi>, if present, means you aren't doing the historical default for the platform specified by the main fields. Other than "eabi" for ARM, most of this is for "use 32-bit pointers but 64-bit registers".

<fabi> can be "hf" for 32-bit ARM systems that actually support floats in hardware. I don't think I've seen anything else, though I admit the main reason I separately document this from <abi> is because of how Debian's architecture puts it elsewhere.

<obj> is the object file format, usually "aout", "coff", or "elf". It can be appended to the kernel field (but before the kernel version number), or replace it if "none", or it can go in the <abi> field.
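
To make that concrete, a few real triples read against this format (my own decomposition, so treat it as illustrative):

  x86_64-pc-linux-gnu            machine=x86_64  vendor=pc       kernel=linux  libc=gnu
  armv7-unknown-linux-gnueabihf  machine=armv7   vendor=unknown  kernel=linux  libc=gnu  abi=eabi  fabi=hf
  x86_64-unknown-linux-gnux32    machine=x86_64  vendor=unknown  kernel=linux  libc=gnu  abi=x32
  aarch64-linux-gnu              non-canonical: vendor omitted (config.sub canonicalizes it to aarch64-unknown-linux-gnu)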

IshKebab

Nah, I dunno where you're getting your information from, but LLVM only supports 5 components.

See the code starting at line 1144 here: https://llvm.org/doxygen/Triple_8cpp_source.html

The components are arch-vendor-os-environment-objectformat.

It's absolutely full of special cases and hacks. Really at this point I think the only sane option is an explicit list of fixed strings. I think Rust does that.

jcranmer

You're not really contradicting o11c here; what LLVM calls "environment" is a mixture of what they called libc/abi/fabi. There's also what LLVM calls "subarch" to distinguish between different architectures that may be relevant (e.g., i386 is not the same as i686, although LLVM doesn't record this difference since it's generally less interested in targeting old hardware), and there's also OS version numbers that may or may not be relevant.

The underlying problem with target triples is that architecture-vendor-system isn't sufficient to uniquely describe the relevant details for specifying a toolchain, so the necessary extra information has been somewhat haphazardly added to the format. On top of that, since the relevance of some of the information is questionable for some tasks (especially the vendor field), different projects have chosen not to care about subtle differences, so the normalization of a triple is different between different projects.

LLVM's definition is not more or less correct than gcc's here, nor are these the only definitions floating around.

o11c

LLVM didn't invent the scheme; why should we pay attention to their copy and not look at the original?

The GNU Config project is the original.