That Time Ken Thompson Wrote a Backdoor into the C Compiler
11 comments
·October 25, 2025tetris11
fjfaase
For the 'full' graph for stage0 see: https://fransfaase.github.io/Emulator/tdiagram.html and note that it even is not completely 'full' leaving out some steps that copy file from one location to another. Use mouse or fingers to zoom and pan.
I gave a talk about this at WHY2025 which also refers to this 'Reflections on Trusting Trust' paper. On YouTube https://www.youtube.com/watch?v=akzyyO5wvm0
tetris11
nice work!
kaem is a new one for me, what's its connection to mescc?
donatj
Hmm... I've read about "Reflections on Trusting Trust" a couple times including in college some twenty years ago, though never the paper itself.
I have never seen the actual examples before, but the way it's always been described to me I kind of expected more...
It was always described as completely undetectable... so my assumption was one could not find it even with a decompiler and a lot of free time...
I guess I expected for instance it to filter patterns of itself out of fread for instance, such that a system built with it literally could not detect its existence at all. I expected it to make the operating system at large lie to you.
fjfaase
One can find it out with a decompiler and a lots and lots of free time. Compilers are not trivial programs, especially the ones needed to compile operating systems, with the required optimizations, and there are many ways to obfuscate code.
A better approach is to start with a small executable, one that translate hexadecimal numbers to binary, and from that build all the tools to compile a simple C compiler (such as the Tiny C Compiler, which is not very tiny), to compile the optimizing C compiler that can compile operating systems. That is the approach followed by the live-bootstrap project.
colejohnson66
That’s what Guix did. All the way back to a 357 byte assembly code blob that turns a hex file into a binary file, and can “compile” itself.
richardhenry
If the compromised compiler also compiles the decompiler…
fjfaase
Interesting. I reviewd the live-bootstrap project (a project to build a trusted C compiler for building Linux) in the past years, including writing a Linux on i368 simulator/interpreter, and gave a presentation about this at WHY2025.
turtleyacht
Assigning 11 to mean "v" in the sequence "\v" sounds like replacing words (or letters) with numbers.
Is that related to Godel's idea that a system can be either complete or inconsistent?
Digit-Al
"\v" is an ASCII control character, and means "vertical tab" (VT). In ASCII it is code 11, which is why they are inserting the number 11.
turtleyacht
Yes. In this case, it represents "teaching" the compiler something it didn't previously "know" about. But wanted to connect that to whether the computed output of a language could ever be falsifiable, i.e. we would not know it happened.
Ken Thompson's Reflections on Trusting Trust[0] was one of the motivations for Guix's single 357-byte seed (+libguile) full bootstrap[1].
0: https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
1: https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-...