Bare metal printf – C standard library without OS

87 comments

April 26, 2025

ChuckMcM

I was feeling a bit like the Petunia and thought "Oh no, not again." :-) One of the annoyances of embedded programming can be having the wheel re-invented a zillion times. I was pleased to see that the author was just describing good software architecture that creates portable code on top of an environment specific library.

For doing 'bare metal' embedded work in C you need the crt0, which is the weirdly named C startup code that satisfies the assumptions the C compiler made when it compiled your code, and a set of primitives to do what the I/O drivers of an operating system would otherwise have been doing for you. And voila, your C program runs on 'bare metal.'
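As a rough illustration of what crt0 has to do before main() runs, here is a minimal sketch of its two usual memory-setup steps: copying initialized data to RAM and zeroing .bss. The name `crt0_init_memory` and its parameters are hypothetical. Real startup code takes these addresses from linker-script symbols, is often written in assembly, and also sets up the stack pointer.

```c
#include <stddef.h>

/* Hypothetical helper: performs the two memory-setup steps that a typical
 * crt0 carries out before calling main(). On real hardware the pointers
 * would come from linker-script symbols; they are parameters here so the
 * logic can be read (and exercised) on its own. */
void crt0_init_memory(unsigned char *data_dst, const unsigned char *data_src,
                      size_t data_len, unsigned char *bss, size_t bss_len)
{
    /* C assumes initialized globals hold their initial values: copy the
     * load image (e.g. in flash) to its run address (in RAM). */
    for (size_t i = 0; i < data_len; i++)
        data_dst[i] = data_src[i];

    /* C assumes uninitialized globals and statics start out as zero. */
    for (size_t i = 0; i < bss_len; i++)
        bss[i] = 0;
}
```

Only after both loops have run is it safe to call main(), because compiled C code relies on both conditions holding.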

Another good topic associated with this is setting up hooks to make STDIN and STDOUT work for your particular setup, so that when you type printf() it just automagically works.
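With newlib, one common way to set up such a hook is to supply the `_write()` stub that stdio output eventually reaches. The sketch below assumes that setup. `uart_putc` and the `uart_log` buffer are hypothetical stand-ins that capture bytes in memory, and the exact `_write` prototype varies a little between ports.

```c
#include <stddef.h>

/* Hypothetical stand-in for a UART driver: captures bytes in memory so the
 * sketch runs anywhere. On a real target this would poll the UART status
 * register and store the byte in its data register. */
static char uart_log[64];
static size_t uart_log_len;

static void uart_putc(char c)
{
    if (uart_log_len < sizeof uart_log)
        uart_log[uart_log_len++] = c;
}

/* newlib routes stdio output through a _write() stub that the platform
 * supplies. This version sends everything, whatever the descriptor, to the
 * UART and expands '\n' to "\r\n" for serial terminals. */
int _write(int fd, const char *buf, int len)
{
    (void)fd; /* stdout and stderr share the one UART in this sketch */
    for (int i = 0; i < len; i++) {
        if (buf[i] == '\n')
            uart_putc('\r');
        uart_putc(buf[i]);
    }
    return len; /* report every byte as written */
}
```

Once a stub like this is linked in, printf() needs no further changes: the library formats the text and hands the bytes to `_write()`.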

This will also then introduce you to the concept of a basic input/output system or BIOS which exports those primitives. Then you can take that code in flash/eprom and load a binary compilation into memory and start it and now you've got a monitor or a primitive one application at a time OS like CP/M or DOS.

It's a fun road for students who really want to understand computer systems to go down.

LelouBil

At my school, we did the following project: https://github.com/lse/k

It is a small kernel, from only a bootloader to running elf files.

It has like 10 syscalls if I remember correctly.

It is very fun, and really makes you understand the ton of legacy support still in modern x86_64 CPUs and what the os underneath is doing with privilege levels and task switching.

I even implemented a small rom for it that has an interactive ocarina from Ocarina of Time.

ChuckMcM

This is really neat. So many engineers come out of school without ever having had this sort of 'start to finish' level of hands on experience. If you ever want to do systems or systems analysis this kind of thing will really, really help.

LelouBil

Also, if anyone wants to try it, the whole course is available publicly here (in English) https://k.lse.epita.fr/

pyuser583

What is your school? I thought it was the London School of Economics, but it’s another LSE.

LelouBil

It's EPITA, in France.

LSE is the Systems laboratory of EPITA (https://www.lse.epita.fr/).

OnACoffeeBreak

No BIOS necessary when we're talking about bare metal systems. printf() will just resolve to a low-level UART-based routine that writes to a FIFO to be played out to the UART when it's not busy. Hell, I've seen systems that forgo the FIFO and just write to the UART, blocking while writing.
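The FIFO variant can be sketched as a small ring buffer: the output path queues bytes, and a transmit interrupt drains them. All names and the size below are hypothetical, and the sketch assumes a single writer and a single interrupt handler.

```c
#include <stdbool.h>
#include <stddef.h>

#define TX_FIFO_SIZE 8 /* hypothetical size; one slot always stays empty */

static volatile char tx_fifo[TX_FIFO_SIZE];
static volatile size_t tx_head; /* next slot to fill (writer side) */
static volatile size_t tx_tail; /* next slot to drain (ISR side) */

/* Called by the output path: queue one byte, or report that the FIFO is
 * full so the caller can decide whether to wait or drop. */
bool tx_fifo_put(char c)
{
    size_t next = (tx_head + 1) % TX_FIFO_SIZE;
    if (next == tx_tail)
        return false; /* full */
    tx_fifo[tx_head] = c;
    tx_head = next;
    return true;
}

/* Called from the (hypothetical) "transmitter empty" interrupt handler:
 * fetch the next byte to hand to the UART, if there is one. */
bool tx_fifo_get(char *out)
{
    if (tx_tail == tx_head)
        return false; /* empty */
    *out = tx_fifo[tx_tail];
    tx_tail = (tx_tail + 1) % TX_FIFO_SIZE;
    return true;
}
```

The blocking variant mentioned above drops all of this and simply spins on the UART status register before each byte, which is simpler but stalls the caller for the duration of the transfer.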

ChuckMcM

I hope nobody was confused into thinking I thought a BIOS was required; I was pointing out the evolution from this to a monitor. I've written some code[1] that runs on the STM32 series and uses the newlib printf(). I created the UART code[2], which is interrupt driven[3]; that gives you the fun feature that you can hit ^C and have it reset the program. (Useful when your code goes into an unexpected place :-)).

[1] https://github.com/ChuckM/

[2] https://github.com/ChuckM/nucleo/blob/master/f446re/uart/uar...

[3] https://github.com/ChuckM/nucleo/blob/master/f446re/common/u...

nonrandomstring

Yup, I recall Atari ST (68000) and BBC Micro (6502) having unbuffered and interrupt access to 6402 UART - which I used from C/ASM to fire MIDI bytes to and from.

marssaxman

This was my attempt at a minimal bare-metal C environment:

https://github.com/marssaxman/startc

ChuckMcM

That's awesome. Back in the day this was the strong point of eCos, which was a bare metal "platform" for running essentially one application on x86 hardware. The x86 ecosystem has gotten so complicated that being able to do this can get you better performance for an "embedded" app than running on top of Linux or another embedded OS. That translates into your appliance type device using lower cost chips, which is a win. When I was playing around with eCos a lot of the digital signage market was using it.

guestbest

Does anyone still do it that way?

dusanh

This sounds fascinating and absolutely alien to me, a Python dev. Any good books or other sources to learn more you can recommend?

pjmlp

You can start here, https://wiki.osdev.org/Expanded_Main_Page

Also regardless of what others say, you can have a go trying to feel how it was to use BASIC in 8 bit computers to do everything their hardware exposed, or even 16 bit systems like MS-DOS, but with Python.

Get an ESP32 board, and have a go at it with MicroPython or CircuitPython,

https://docs.micropython.org/en/latest/esp32/quickref.html

https://learn.adafruit.com/circuitpython-with-esp32-quick-st...

genewitch

There's always the Minix book!

Rochus

Newlib is huge and complex (even including old K&R syntax) and adapting the build process to a new system is not trivial. I spent a lot of time with it when I re-targeted chibicc and cparser to EiGen, and finally switched to PDCLib for libc and a part of uClibc for libm; see https://github.com/rochus-keller/EiGen/tree/master/ecc/lib. The result is platform independent besides essentially one file.

adrian_b

For a static library it does not matter whether it is huge and complex, because you will typically link into your embedded application only a small number of functions from it.

I have used a part of newlib with many different kinds of microcontrollers and its build process has always been essentially the same as a quarter of a century ago, so that the script that I wrote the first time, before 2000, has always worked without problems, regardless of the target CPU.

The only tricky part that I had to figure out the first time was how to split the compilation of the gcc cross-compiler into a part that is built before newlib and a part that is built after newlib.

However that is not specific to newlib, but is the method that must be used when compiling a cross-gcc with any standard C library and it has been simplified over the years, so that now there is little more to it than choosing the appropriate make targets when executing the make commands.

I have never needed to change the build process of newlib for a new system; I just needed to replace a few functions, for things like I/O peripherals or memory allocation. However, I have never used much of newlib, mostly only stdio and memory/string functions.

Rochus

> it does not matter whether it is huge and complex

I was talking about the migration effort and usage complexity, not what the compiler or linker actually sees. It may well be that Newlib can be configured for every conceivable application, but it was more important to me not to have such a behemoth and bag full of surprises in the project, with preprocessor rules and dependencies that a single developer can hardly understand or keep track of. My solution is lean, complete, and works with standard-conforming compilers on each platform I need it on.

adrian_b

The standard C library does not belong in any one project; it is normally shared, together with cross-compilers, linkers and other tools, by all projects that target a certain kind of hardware architecture.

So whatever preprocessor rules and dependencies may be needed to build the tool chain, they do not have any influence on the building processes for the software projects used to develop applications.

The building of the tool chain is done again only when new tool versions become available, not during the development of applications.

I assume that you have encountered problems because you have desired to build newlib with something else than gcc + binutils, with which it can be built immediately, as delivered.

Even if for some weird reason the use of gcc is avoided for the intended application, that should not have required the use of a newlib compiled with something other than gcc, as it should link without problems with any other ELF object files.

einpoklum

While "newlib" is an interesting idea - the approach taken here is, in many cases, the wrong one.

You see, the printf() family of functions doesn't actually require _any_ metal, bare or otherwise, beyond the ability to print individual characters.

For this reason, a popular approach for the case of not having a full-fledged standard library is to have a fully cross-platform implementation of the family which "exposes" a symbol dependency on a character printing function, e.g.:

  void putchar_(char c);

and variants of the printf functions which take the character-printing function as a runtime parameter:

  int fctprintf(void (*out)(char c, void* extra_arg), void* extra_arg, const char* format, ...);
  int vfctprintf(void (*out)(char c, void* extra_arg), void* extra_arg, const char* format, va_list arg);

this is the approach taken in the standalone printf implementation I maintain, originally by Marco Paland:

https://github.com/eyalroz/printf
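To make the callback idea concrete, here is a heavily reduced sketch with the same shape as the fctprintf() prototype quoted above. `tiny_fctprintf`, `mem_sink` and `mem_sink_out` are hypothetical names and this is not the code from the linked library. It handles only %s, %c, %d and %%, which is enough to show that formatting needs nothing from the platform except "emit one character".

```c
#include <stdarg.h>

/* Output callback type, modelled on the fctprintf() prototype quoted above. */
typedef void (*out_fn)(char c, void *extra_arg);

/* Hypothetical, heavily reduced formatter: handles only %s, %c, %d and %%. */
int tiny_fctprintf(out_fn out, void *extra_arg, const char *format, ...)
{
    va_list args;
    int count = 0;

    va_start(args, format);
    for (const char *p = format; *p != '\0'; p++) {
        if (*p != '%') {
            out(*p, extra_arg);
            count++;
            continue;
        }
        p++;
        if (*p == '\0')
            break; /* lone '%' at the end of the format: stop */
        if (*p == 's') {
            for (const char *s = va_arg(args, const char *); *s; s++) {
                out(*s, extra_arg);
                count++;
            }
        } else if (*p == 'c') {
            out((char)va_arg(args, int), extra_arg);
            count++;
        } else if (*p == 'd') {
            int value = va_arg(args, int);
            /* Work in unsigned so that INT_MIN negates safely. */
            unsigned int mag = value < 0 ? 0u - (unsigned int)value
                                         : (unsigned int)value;
            char digits[12];
            int n = 0;
            do {
                digits[n++] = (char)('0' + mag % 10);
                mag /= 10;
            } while (mag != 0);
            if (value < 0)
                digits[n++] = '-';
            while (n > 0) { /* digits were produced in reverse order */
                out(digits[--n], extra_arg);
                count++;
            }
        } else {
            out(*p, extra_arg); /* "%%" and unknown specifiers: echo */
            count++;
        }
    }
    va_end(args);
    return count; /* number of characters emitted */
}

/* Example sink: append to a buffer reached through extra_arg. */
struct mem_sink { char buf[32]; int len; };

void mem_sink_out(char c, void *extra_arg)
{
    struct mem_sink *sink = extra_arg;
    if (sink->len < (int)sizeof sink->buf - 1)
        sink->buf[sink->len++] = c;
}
```

Swapping `mem_sink_out` for a function that writes to a UART, an LCD or a log changes the destination without touching the formatter.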

eqvinox

As replied on your other comment, when you introduce a custom printf for an embedded platform it makes more sense to just edit in support for your local I/O backend rather than having the complexity of a putch() callback function pointer.

cf. https://news.ycombinator.com/item?id=43811191 for other notes.

p0w3n3d

Bare metal printf is usually faster but (surprise surprise) platform dependent.

I remember I was trying to program the Atari 8-bit using a C compiler, and writing characters directly to the Antic memory range WITH charcode translation was 100x faster than using printf.

However I'm not sharing this code because it won't work on UART... laughs nervously

smackeyacky

Has anybody played with newlib, but grown the complexity as the system came together?

It seems like one thing to get a bare-bones printf() working to get you started on a bit of hardware, but as the complexity of the system grows you might want to move on from (say) pushing characters out of a serial interface onto pushing them onto a bitmapped display.

Does newlib allow you to put different hooks in there as the complexity of the system increases?

adrian_b

Newlib provides both a standard printf, which is necessarily big, and a printf that does not support any of the floating-point format specifiers.

The latter is small enough so that I have used it in the past with various small microcontrollers, from ancient types based on PowerPC or ARM7TDMI to more recent MCUs with Cortex-M0+.

You just need to make the right configuration choice.

Gibbon1

You can always write a printf replacement that takes a minimal control block that provides put, get, control, and a context.

That way you can print to a serial port, an LCD Display, or a log.

Meaning, seriously, the standard printf is late-1970s hot garbage and no one should use it.
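A control block of the kind described in this comment might look like the sketch below. The struct layout, the function names and the `MEM_CTL_CLEAR` request code are all hypothetical. The in-memory log is just one example backend.

```c
#include <stddef.h>

/* Hypothetical control block: three operations plus an opaque context
 * pointer that is handed back to each of them. */
struct io_dev {
    int (*put)(void *ctx, char c);          /* emit one character, <0 on error */
    int (*get)(void *ctx);                  /* read one character, or -1       */
    int (*control)(void *ctx, int request); /* device-specific requests        */
    void *ctx;
};

/* Device-independent output built only on the control block. */
int dev_puts(const struct io_dev *dev, const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++, n++)
        if (dev->put(dev->ctx, *s) < 0)
            return -1;
    return n;
}

/* Example backend: an in-memory log. A serial port or LCD backend would
 * supply its own three functions and its own context. */
struct mem_log { char text[32]; size_t len; };

int mem_put(void *ctx, char c)
{
    struct mem_log *log = ctx;
    if (log->len >= sizeof log->text)
        return -1; /* log is full */
    log->text[log->len++] = c;
    return 0;
}

int mem_get(void *ctx)
{
    (void)ctx;
    return -1; /* this backend is write-only */
}

#define MEM_CTL_CLEAR 1 /* hypothetical request code */

int mem_control(void *ctx, int request)
{
    struct mem_log *log = ctx;
    if (request != MEM_CTL_CLEAR)
        return -1;
    log->len = 0;
    return 0;
}
```

A formatter written against `struct io_dev` never needs to know which device it is talking to, which is the point the comment makes.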

dailykoder

  // QEMU UART registers - these addresses are for QEMU's 16550A UART
  #define UART_BASE 0x10000000
  #define UART_THR  (*(volatile char *)(UART_BASE + 0x00)) // Transmit Holding Register
  #define UART_RBR  (*(volatile char *)(UART_BASE + 0x00)) // Receive Buffer Register
  #define UART_LSR  (*(volatile char *)(UART_BASE + 0x05)) // Line Status Register

This looks odd. Why are receive and transmit buffer the same and why would you use such a weird offset? Iirc RISC-V allows that, but my gut says I'd still align this to the word size.

bobmcnamara

> Why are receive and transmit buffer the same?

Backwards compatibility aside, why bother implementing additional register address decoding? Since the host already doesn't need to read THR or write RBR they can be safely combined. Some UARTs call this a DATA register instead.
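A minimal polled driver shows how the shared offset is used in practice: writes to offset 0 reach THR and reads from offset 0 come from RBR, with LSR saying when each is safe. The function names and the `base` parameter are assumptions for this sketch. With the definitions quoted above, `base` would be 0x10000000.

```c
#include <stdint.h>

/* Register offsets and status bits of a 16550-style UART. */
#define UART_DATA      0x00 /* THR when written, RBR when read */
#define UART_LSR       0x05 /* Line Status Register */
#define LSR_DATA_READY 0x01 /* a received byte is waiting in RBR */
#define LSR_THR_EMPTY  0x20 /* THR can accept another byte */

/* Blocking transmit: wait for room, then write offset 0 (THR). */
void uart16550_putc(volatile uint8_t *base, char c)
{
    while ((base[UART_LSR] & LSR_THR_EMPTY) == 0)
        ; /* spin until the transmitter can take a byte */
    base[UART_DATA] = (uint8_t)c;
}

/* Non-blocking receive: read offset 0 (RBR) if data is ready, else -1. */
int uart16550_getc(volatile uint8_t *base)
{
    if ((base[UART_LSR] & LSR_DATA_READY) == 0)
        return -1;
    return base[UART_DATA];
}
```

On real hardware the two accesses to offset 0 reach different internal registers, so a read does not return the byte that was just written.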

eqvinox

My sweet summer child… this is backwards compatibility to the I/O register set of NatSemi/Intel's 8250 UART chip…

…from 1978.

https://en.m.wikipedia.org/wiki/8250_UART

The definitions are correct; look up a 16550 datasheet if you want to lose some sanity :)

dailykoder

Oh damn, thanks!

Neywiny

In school we were taught that the OS does the printf. I think the professors were just trying to generalize to not go on tangents. But, once I learned that no embedded libc variants had printf just no output path, it got a lot easier to figure out how to get it working. I wish I knew about SWO and the magic of semihosting back then. I don't think those would be hard to explain, and interestingly, setting up _write is one of the few things students asked about that coworkers in the field also ask me how to do.

wrasee

> But, once I learned that no embedded libc variants had printf just no output path

Did you mean "once I learned that no, embedded libc variants have printf"?

To clarify as I had to check, embedded libc variants do indeed have some (possibly stripped-down) implementation of printf and as you say they just lack the output path (hence custom output backends like UART, etc).

Neywiny

Sorry yes I had a comma there and then deleted it. I don't remember why but I remember doing it.

sylware

I am coding RISC-V assembly (which I run on x86_64 with a mini-interpreter) but I am careful to avoid the usage of the pseudo-instructions and the register aliases (no compressed instructions, of course). I have a little tool to generate constant-loading code as one-liners (semicolon-separated instructions).

And as a pre-processor I use a simple C preprocessor (I don't want to tie the code to the pre-processor of a specific assembler): I did that for x86_64 assembly, and I could assemble with gas, nasm and fasmng(fasm2) transparently.

0x000xca0xfe

What's wrong with compressed instructions?

sylware

I don't feel comfy using duplicate instructions for a 'R'educed instruction set.

That said, I know in some cases it could increase performance since the code would use less memory (and certainly more things which I don't know because I am not into modern advanced hardware CPU micro-architecture design).

0x000xca0xfe

It's just an alternative encoding for exactly the same instructions, it does not make the ISA more complex.

If you are writing assembly you probably are using compressed instructions already since your assembler can do the substitutions transparently, e.g.

  addi a0,a0,10 -> c.addi a0,10

Example: https://godbolt.org/z/MG3v3jx7P (the disassembly shows addi but the instruction is only two bytes).

They offer a nice reduction in code size with basically no downsides :)
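The "alternative encoding" point can be seen from how a decoder tells the two forms apart: a compressed instruction is any 16-bit parcel whose two lowest bits are not 11. The helper below is a hypothetical sketch that ignores the longer-than-32-bit encodings the ISA reserves. For the example above, `c.addi a0,10` encodes as 0x0529 and `addi a0,a0,10` as 0x00a50513.

```c
#include <stdint.h>

/* Length in bytes of a RISC-V instruction, judged from its first 16-bit
 * parcel: encodings whose two lowest bits are not 11 are compressed.
 * Longer-than-32-bit encodings are ignored in this sketch. */
unsigned rv_insn_length(uint16_t first_parcel)
{
    return (first_parcel & 0x3u) != 0x3u ? 2u : 4u;
}
```

A mini-interpreter would use a check like this to decide how far to advance the program counter after each instruction.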

rurban

220k just to include stdio? That's insane. I have 12k and still do IO. Just without the overblown stdio and sbrk, uart_puts is enough. And only in DEBUG mode.

tails4e

I thought this was going to talk about how printf is implemented. I worked with a tiny embedded processor that had 8k imem, and printf is about 100k alone. Crazy. Switched to a more basic implementation that was around 2k, and ran much, much faster. It seems printf is pretty bloated, though I guess typical people don't care.

adrian_b

In most C standard libraries intended for embedded applications, including newlib, there is some configuration option to provide a printf that does not support any of the floating-point format specifiers.

That is normally enough to reduce the footprint of printf by more than an order of magnitude, making it compatible with small microcontrollers.

rurban

I implemented a secure printf_s and its API is the problem. You cannot dead-code eliminate all the unused methods. And it's type unsafe. There are much better APIs to implement a safe printer with all the formatting options still. format is not one of them.

bobmcnamara

The only hack I could think of is having the compiler front end generate calls to different functions based on the content of the format string, similar to how some compilers replace memset with a 32-bit memset based on the type of the destination pointer.

And it all falls apart as soon as a format string cannot be known at compile time.

MuffinFlavored

I always felt with these kinds of things you strip out `stdio.h` and your new API/ABI/blackbox becomes `syscall` for `write()`, etc.

up2isomorphism

If you are already doing bare metal, you should probably think twice about whether you need a printf; maybe you just need inb and outb.

adiabatichottub

I've used avr-libc's printf quite a lot. I always thought it was silly that Arduino never did the simple task of plumbing it up to the UART.

saagarjha

  char buffer[100];
  printf("Type something: ");
  scanf("%s", buffer);

Come on, it’s 2025, there’s no need to write trivial buffer overflows anymore.
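For reference, the overflow goes away once the conversion carries a field width one smaller than the buffer. `read_word` and `BUFFER_SIZE` are hypothetical names for this sketch; reading a whole line with fgets() would be another option.

```c
#include <stdio.h>

#define BUFFER_SIZE 100

/* Reads one whitespace-delimited word into buffer without overflowing it.
 * The field width in "%99s" is BUFFER_SIZE - 1 because scanf needs the
 * last byte for the terminating '\0'; the two must be kept in sync.
 * Returns 1 on success, 0 or EOF otherwise. */
int read_word(FILE *in, char buffer[BUFFER_SIZE])
{
    return fscanf(in, "%99s", buffer);
}
```

Input longer than 99 characters is then split across successive calls instead of running past the end of the array.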

anyfoo

It’s a feature to rewrite your OS kernel on the fly.

dbuder

It's 1990, maybe 1999, in embedded land.
