Comparison of C/POSIX standard library implementations for Linux
57 comments
· May 10, 2025
pizlonator
skissane
What's Fil-C? Okay, found it myself, looks cool: https://github.com/pizlonator/llvm-project-deluge/
What's yoyoland? All I can find is an amusement park in Bangkok, and some 1990s-era communication software for Classic Mac OS: https://www.macintoshrepository.org/39495-yoyo-2-1
pizlonator
The Fil-C stack is composed of:
- Userland: the place where your C code lives. Like the normal userland you're familiar with, but everything is compiled with Fil-C, so it's memory safe.
- Yololand: the place where Fil-C's runtime lives. Fil-C's runtime is about 100,000 lines of C code (almost entirely written by me), which currently has libc as a dependency (because the runtime makes syscalls using the normal C functions for syscalls rather than using assembly directly; also the runtime relies on a handful of libc utility functions that aren't syscalls, like memcpy).
So Fil-C has two libc's. The yololand libc (compiled with a normal C compiler, only there to support the runtime) and the userland libc (compiled with the Fil-C compiler like everything else in Fil-C userland, and this is what your C code calls into).
skissane
Why does yoyoland need to use libc’s memcpy? Can’t you just use __builtin_memcpy?
On Linux, if all you need is syscalls, you can just write your own syscall wrapper, like Go does.
That doesn't work on some other operating systems (e.g. Solaris/illumos, OpenBSD, macOS, Windows), where the system call interface is private to the system shared libraries.
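For readers who haven't seen one, a hand-written syscall wrapper can be sketched roughly like this. It is a minimal illustration, not how Go or Fil-C actually do it: `raw_syscall3` and `raw_write` are hypothetical names, the inline assembly covers only x86-64 Linux, and every other target falls back to libc's `syscall()`.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write and friends: syscall numbers only */
#include <unistd.h>        /* syscall(): used only by the fallback path */

/* Hypothetical helper: issue a 3-argument syscall without a libc wrapper. */
static long raw_syscall3(long nr, long a, long b, long c)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86-64 Linux convention: number in rax, arguments in rdi, rsi, rdx.
       The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;  /* on failure the kernel returns a negative errno value */
#else
    /* Assumption: other targets just go through libc here. */
    return syscall(nr, a, b, c);
#endif
}

/* Hypothetical helper: write(2) built on the raw wrapper. */
static long raw_write(int fd, const void *buf, size_t len)
{
    return raw_syscall3(SYS_write, fd, (long)buf, (long)len);
}
```

Calling `raw_write(1, "hi\n", 3)` writes to standard output. Note the error convention differs between the two paths: the raw path returns a negative errno, the libc fallback returns -1 and sets `errno`.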
abnercoimbre
Interesting! Will you stick around with the musl build? And if so, why?
pizlonator
Not sure, but I'm likely to, because right now I want to use the same libc in userland (the Fil-C compiled part) and yololand (the part compiled by a normal C compiler that sits below the runtime), and the userland libc is musl.
Having them be the same means that if there is any libc function that is best implemented by having userland call a Fil-C runtime wrapper for the yololand implementation (say because what it’s doing requires platform specific assembly) then I can be sure that the yololand libc really implements that function the same way with all the same corner cases.
But there aren’t many cases of that, and they’re hacks that I might someday remove. So I probably won’t have this “libc sandwich” forever.
LukeShu
When I was working with Envoy Proxy, it was known that perf was worse with musl than with glibc. We went through silly hoops to have a glibc Envoy running in an Alpine (musl) container.
pjmlp
Are you sure they were being used at all?
GCC replaces memcpy/memmove/memset with its own intrinsics when compiling at high optimization levels.
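A small sketch of the distinction that matters here, under the assumption of an optimizing GCC or Clang build: copies with a small compile-time-constant size are typically expanded inline, while copies whose length is only known at run time usually still end up in the libc routine. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned buffer.
   The size is a constant 4 bytes, so optimizing compilers typically turn
   this into a single load rather than a call. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the length is only known at run time, so this
   usually remains a real call into the libc memcpy. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Inspecting the assembly (for example with `gcc -O2 -S`) is the reliable way to see which case a given call site falls into.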
pizlonator
Yes they were being used.
ObscureScience
That table is unfortunately quite old. I can't personally say what has changed, but it is hard to put much confidence in the relevance of the information.
lifthrasiir
Yeah, also it doesn't compare actual implementations, just plain checkboxes. I'm aware of two specific substantial performance regressions for musl: exact floating point printing (it uses Dragon4 but implemented it way slower than it could have been) and the memory allocator (for a long time it didn't have any sort of arena like pretty much every modern allocator; now it does with mallocng, though).
jay-barronville
Please note that the linked comparison table has been unmaintained for a while. This is even explicitly stated on the legacy musl libc website[0] (i.e., “The (mostly unmaintained) libc comparison is still available on etalabs.net.”).
ethan_smith
This comparison was last updated around 2016-2017. Since then, glibc has improved its size efficiency (particularly with link-time optimization), musl has enhanced its POSIX compliance, and several performance optimizations have landed in both projects.
weiwenhao
The static compilation of musl libc is a huge help for alpine linux and many system programming languages. My programming language https://github.com/nature-lang/nature is also built on musl libc.
thrtythreeforty
It really ought to lead with the license of each library. I was considering dietlibc until I got to the bottom - GPLv2. I am a GPL apologist and even I can appreciate that this is a nonstarter; even GNU's libc is only LGPL!
LeFantome
musl seems to have displaced dietlibc. Much more complete yet fairly small and light.
yusina
Note that dietlibc is the project of a sole coder in the CCC sphere from Berlin (Fefe). His main objective was to learn how low-level infra is implemented, and he started using it in some of his other projects after realizing how much bloat he could skip by implementing just the bare essentials. musl has a different set of objectives.
projektfu
I follow diet but it is definitely not ready for general use like musl and probably never will be. There aren't a lot of eyeballs on it.
josephg
It’s amazing how much code gets pulled in for printf. Using musl, printf apparently adds 13kb of code to your binary. Given format strings are almost always static, it’s so weird to me that they still get parsed at runtime in all cases. Modern compilers even parse printf format strings anyway to check your types match.
This sort of thing makes me really appreciate zig’s comptime. Even rust uses a macro for println!().
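To illustrate the split described above in C terms: with GCC and Clang, the `format` attribute (a compiler extension) makes the compiler check arguments against the format string at compile time, yet the string itself is still interpreted at run time by the printf family. `format_into` is a hypothetical wrapper written for this sketch.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper around vsnprintf. The attribute says: argument 3 is
   a printf-style format string and the values to check start at argument 4.
   That only enables compile-time diagnostics; parsing still happens at run
   time inside vsnprintf. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
static int format_into(char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

With warnings enabled, a call such as `format_into(buf, sizeof buf, "%d", "text")` gets flagged at compile time, but the generated code still carries the full run-time formatter.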
messe
In larger programs, that compile time parsing can lead to even more code, as the function is essentially instantiated and compiled separately for each and every invocation. The type erasure provided by printf can be a blessing in _some circumstances_.
That being said, in those larger programs, it's still likely going to be a negligible part of the binary size, and the additional code paths are unlikely to affect performance unless you're doing string formatting in multiple hot-paths which is generally a poor choice anyway.
jcelerier
If you use any level of compiler optimisation, both clang and GCC will convert calls to printf into calls to puts (which is much simpler) if they detect there's no formatting done.
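A minimal sketch of the two cases, assuming an optimizing build; the function names are made up for the example, and the exact conditions under which the rewrite happens are up to the compiler.

```c
#include <stdio.h>

static void greet_plain(void)
{
    /* No conversion specifiers, a trailing newline, and the return value is
       unused: optimizing GCC and Clang typically emit puts("hello, world"). */
    printf("hello, world\n");
}

static void greet_formatted(int n)
{
    /* A real conversion is needed here, so this stays a printf call. */
    printf("hello, %d worlds\n", n);
}
```

Comparing the assembly for the two functions (for example with `gcc -O1 -S`) shows whether the substitution was made on a given toolchain.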
moomin
No cosmopolitan, pity.
snickerer
Fun libc comparison by the author of musl.
My takeaway is: glibc is bloated but fast. Quite an unexpected combination. Am I right?
kstrauser
It’s not shocking. More complex implementations using more sophisticated algorithms can be faster. That’s not always true, but it often is. For example, look at some of the string search algorithms used by things like ripgrep. They’re way more complex than just looping across the input and matching character by character, and they pay off.
Something like glibc has had decades to swap in complex, fast code for simple-looking functions.
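As a small illustration of "more code, less work", here is a naive substring search next to Boyer-Moore-Horspool, which precomputes a skip table so it can jump over several positions at once. This is a textbook sketch with hypothetical function names; it is not claimed to be what ripgrep or glibc actually use.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Baseline: try every position and compare byte by byte. */
static const char *search_naive(const char *hay, size_t hlen,
                                const char *needle, size_t nlen)
{
    if (nlen == 0) return hay;
    if (nlen > hlen) return NULL;
    for (size_t i = 0; i + nlen <= hlen; i++)
        if (memcmp(hay + i, needle, nlen) == 0)
            return hay + i;
    return NULL;
}

/* Boyer-Moore-Horspool: look at the haystack byte aligned with the end of
   the needle and use a precomputed table to decide how far to jump. */
static const char *search_horspool(const char *hay, size_t hlen,
                                   const char *needle, size_t nlen)
{
    size_t shift[UCHAR_MAX + 1];

    if (nlen == 0) return hay;
    if (nlen > hlen) return NULL;

    /* Bytes that do not occur in the needle allow a full-length jump. */
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        shift[c] = nlen;
    /* Bytes that do occur (except the last position) allow a shorter jump. */
    for (size_t i = 0; i + 1 < nlen; i++)
        shift[(unsigned char)needle[i]] = nlen - 1 - i;

    size_t pos = 0;
    while (pos + nlen <= hlen) {
        if (memcmp(hay + pos, needle, nlen) == 0)
            return hay + pos;
        pos += shift[(unsigned char)hay[pos + nlen - 1]];
    }
    return NULL;
}
```

Both functions return the same result; the second one just touches fewer haystack positions on typical inputs, at the cost of a table and more code.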
weinzierl
In case of glibc I think what you said is orthogonal to its bloat. Yes, it has complex implementations but since they are for a good reason I'd hardly call them bloat.
Independently from that glibc implements a lot of stuff that could be considered bloat:
- Extensive internationalization support
- Extensive backward compatibility
- Support for numerous architectures and platforms
- Comprehensive implementations of optional standards
kstrauser
Ok, fair points, although internationalization seems like a reasonable thing to include at first glance.
Is there a fork of glibc that strips ancient or bizarre platforms?
ape4
Yeah look at even strlen()
https://github.com/lattera/glibc/blob/master/string/strlen.c
GabrielTFS
That's the generic implementation - it's not used on most popular architectures (I think the most popular architecture it's used on would be RISC-V or MIPS) because they all have architecture-specific implementations. The implementation running on the average (x86) computer is likely to be https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86... (if you have AVX512), https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86... (if you have AVX2 and not AVX512) or https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86... (if you have neither AVX2 nor AVX512 - rather rare these days)
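For context, the generic C version scans a word at a time rather than a byte at a time, relying on a bit trick to detect a zero byte inside a word. The sketch below shows that trick in a deliberately conservative form: the hypothetical `bounded_len_wordwise` takes an explicit buffer capacity and loads words through `memcpy`, so it never reads out of bounds, which the real implementations avoid through alignment instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if and only if at least one of the 8 bytes in w is zero. */
static uint64_t has_zero_byte(uint64_t w)
{
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    return (w - ones) & ~w & highs;
}

/* Hypothetical helper: length of the string stored in buf, never reading
   past buf[cap - 1]. Returns cap if no terminator is found. */
static size_t bounded_len_wordwise(const char *buf, size_t cap)
{
    size_t i = 0;

    /* Skip whole 8-byte words that contain no zero byte. */
    while (i + sizeof(uint64_t) <= cap) {
        uint64_t w;
        memcpy(&w, buf + i, sizeof w);
        if (has_zero_byte(w))
            break;
        i += sizeof w;
    }
    /* Finish byte by byte: either inside the word that held the zero,
       or in the tail shorter than one word. */
    while (i < cap && buf[i] != '\0')
        i++;
    return i;
}
```

The architecture-specific versions mentioned above apply the same idea with much wider SIMD registers.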
LeFantome
A lot of the “slowness” of MUSL is the default allocator. It can be swapped out.
For example, Chimera Linux uses MUSL with mimalloc and it is quite snappy.
jeffbee
That's a great combo. I like LLVM libc in overlay mode with musl beneath and mimalloc. Performance is excellent.
userbinator
Microbenchmarks tend to favour extreme unrolling and other "speed at any cost" tricks that often show up as negatives in macrobenchmarks.
flohofwoe
Choice still matters IMHO. E.g. a very small but slow malloc/free may be preferable if your code only allocates infrequently. Also linking musl statically avoids the whole glibc dll version mess, admittedly only useful for cmdline tools though.
timeinput
My takeaway is that it's not a meaningful chart? Just in the first row musl looks bloated at 426k compared to dietlibc at 120k. Why were those colors chosen? It's arbitrary and up to the author of the chart.
The author of musl made a chart, that focused on the things they cared about and benchmarked them, and found that for the things they prioritized they were better than other standard library implementations (at least from counting green rows)? neat.
I mean I'm glad they made the library, that it's useful, and that it's meeting the goals they set out to solve, but what would the same chart created by the other library authors look like?
cyberax
Not quite correct. glibc is slow if you need to be able to fork quickly.
However, it does have super-optimized string/memory functions. There are highly optimized assembly language implementations of them that use SIMD for dozens of different CPUs.
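The "different implementations for different CPUs" idea can be sketched as a one-time selection through a function pointer. This is only an illustration: glibc does its selection through its own mechanism (IFUNC), the names below are hypothetical, and `copy_tuned` is a stand-in that just defers to `memcpy` rather than real SIMD code. `__builtin_cpu_supports` is a GCC/Clang builtin for x86 targets.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Portable baseline: one byte at a time. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Stand-in for a CPU-specific variant (assumption: a real one would use
   SIMD instructions; this placeholder simply calls memcpy). */
static void *copy_tuned(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick an implementation once, based on what the running CPU reports. */
static copy_fn pick_copy(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy_tuned;
#endif
    return copy_bytewise;
}
```

A caller would store the result of `pick_copy()` once at startup and route every copy through that pointer afterwards.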
casey2
Where is the "# of regressions caused" box?
My own perf comparison: when I switched from Fil-C running on my system’s libc (recent glibc) for yololand to my own build of musl, I got a 1-2% perf regression. My best guess is that it’s because glibc’s memcpy/memmove/memset are better. Couldn’t have been the allocator since Fil-C’s runtime has its own allocator.