
No-Panic Rust: A Nice Technique for Systems Programming

wongarsu

The approach at the end of declaring invariants to the compiler so the compiler can eliminate panics seems accidentally genius. You can now add the same invariants as panicking asserts at the end of each function, and the compiler will prove to you that your functions are upholding the invariants. And of course you can add more panicking asserts to show other claims to be true, all tested at compile time. You've basically built a little proof system.

Sure, Rust is hardly the first language to include something like that, and adoption of such systems tends to be ... spotty. But if it were reliable enough and had a better interface (one that preferably allowed the rest of your program to still have panics) this could be very useful for writing correct software.

saghm

This isn't quite the same, but it reminds me of something a bit less clever (and a lot less powerful) I came up with a little while back when writing code to handle a binary format that used a lot of 32-bit integers, which I needed for math on indexes into vectors. I was fairly confident the code would never need to run on 16-bit platforms, but converting a 32-bit integer to a usize in Rust is technically fallible, because you can't assume that a usize is more than 16 bits, and frustratingly `usize` only implements `TryFrom<u32>` rather than conditionally implementing `From<u32>` on 32-bit and 64-bit platforms. I wanted to avoid any casting that could silently get messed up if I later switched any of the integer types, but I was also irrationally upset at the idea of a runtime check for something that should be obvious at compile time. The solution I came up with was putting a static assertion that the target pointer width is either 32 or 64 bits inside the error-handling path, followed by marking that path `unreachable!()` (because either the error path would never be taken, or the static assertion would stop the code from compiling in the first place). Even though this isn't meaningfully different from conditionally compiling to make sure the platform is suitable and then putting `unreachable!()` unconditionally in the error path, having the compile-time assertion locally in the spot where the error is handled felt like magically turning a runtime error into a compile-time one; it was quite literally possible to write it as a function that could be dropped into any codebase without any other changes to ensure it was used safely.
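A minimal sketch of the pattern described above (the helper name is made up, and it assumes Rust 1.57+ for `assert!` in const context):

```rust
use std::convert::TryFrom;

// Hypothetical helper sketching the pattern: the static assertion
// lives inside the error path, so the `unreachable!()` is provably
// dead on any target where this code compiles at all.
fn u32_to_usize(x: u32) -> usize {
    match usize::try_from(x) {
        Ok(v) => v,
        Err(_) => {
            // Fails the build on targets with a sub-32-bit usize.
            const _: () = assert!(usize::BITS >= 32);
            unreachable!("u32 always fits in usize on this target")
        }
    }
}
```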

lilyball

What about just doing something like

  #[cfg(any(target_pointer_width = "32", target_pointer_width = "64"))]
  #[inline(always)]
  const fn u32_to_usize(x: u32) -> usize {
      x as usize
  }
and this way you can just call this function and you'll get a compile-time error (no such function) if you're on a 16-bit platform.

saghm

Even though I can visually verify that it's safe in this context, I really don't like casting integers as a rule when there's a reasonable alternative. The solution I came up with is pretty much equally readable in my opinion, but has the distinction of not containing code that in other contexts could silently have issues (compared to an `unreachable!()` macro, which might also look sketchy but certainly wouldn't be quiet if it was accidentally used in the wrong spot). I also prefer a compiler error explaining the invariant that's expected over a missing function (which could just as easily be due to a typo or something). You could put a `compile_error!()` invocation behind a cfg for when the pointer width isn't at least 32, but I'd argue that tilts the balance even more in favor of my solution; having a single item defined instead of two is more readable in my opinion.

This wasn't a concern for me but I could also imagine some sort of linting being used to ensure that potentially lossy casts aren't done, and while it presumably could be manually suppressed, that also would just add to the noisiness.
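For concreteness, the conditionally-compiled `compile_error!()` alternative being weighed might look like this (a sketch, not the commenter's actual code):

```rust
// On an unsupported target the build fails with a message that
// states the invariant directly, instead of a missing function.
#[cfg(not(any(target_pointer_width = "32", target_pointer_width = "64")))]
compile_error!("this code assumes usize is at least 32 bits");

fn main() {
    // Wherever this compiles, u32 -> usize conversion is infallible.
    assert!(usize::BITS >= 32);
}
```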

haberman

While I would love this to be true, I'm not sure that this design can statically prove anything. For an assert to fail, you would have to actually execute a code sequence that causes the invariant to be violated. I don't see how the compiler could prove at compile time that the invariants are upheld.

wongarsu

An assert is just a fancy `if condition { panic(message) }`. If the optimizer can show that condition is always false, the panic is declared as dead code and eliminated. The post uses that to get the compiler to remove all panics to reduce the binary size. But you can also just check if panic code was generated (or associated code linked in), and if it was then the optimizer wasn't able to show that your assert can't happen.

Of course the converse doesn't hold: if the optimizer can't eliminate the assert, that doesn't prove the assert will ever fire; you would have to execute the code for that. But you can treat the optimizer's failure to eliminate your assert as a test failure, showing that either your code violates the assert, your preconditions combined with the code aren't enough to show the assert can't be violated, or the whole thing was too complicated for the optimizer to figure out and you have to restructure some code.
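A toy illustration of the idea (function name made up; the elimination happens only with optimizations enabled):

```rust
// The postcondition assert below compiles to roughly
// `if !(y <= 100) { panic!(...) }`. With optimizations on, the
// compiler can prove the condition always holds and delete the
// panic branch entirely, which is checkable in the binary.
fn clamp_to_hundred(x: u32) -> u32 {
    let y = if x > 100 { 100 } else { x };
    assert!(y <= 100); // provably dead panic path
    y
}
```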

haberman

Ah, I see what you are saying. Yes, if the optimizer is able to eliminate the postcondition check, I agree that it would constitute a proof that the code upholds the invariant.

The big question is how much real-world code the optimizer would be capable of "solving" in this way.

I wonder if most algorithms would eventually be solvable if you kept breaking them down into smaller pieces. Or whether some would have a step of irreducible complexity that the optimizer cannot figure out, no matter how much you break it down.

jwatte

For systems where correctness is actually important, not just a nice-to-have (and in most systems it's a nice-to-have), we have had an increasing number of options over the years.

From model checkers like Spin and TLA+ to proof assistants like Coq to dependently typed languages like Idris and Agda.

Some strongly typed languages already give us some of those benefits (Haskell, OCaml), and with restricted effects (as in Haskell) we can even make the compiler do much of this work without it leaking into other parts of the program if we don't want it to.

rtpg

I've had an unpleasant amount of crashes with Rust software because people are way too quick to grab `panic!` as an out.

This was most shocking to me in some of the Rust code Mozilla had integrated into Firefox (the CSS styling code). There was some font cache shenanigans that was causing their font loading to work only semi-consistently, and that would outright crash this subsystem, and tofu-ify CJK text entirely as a result.

And the underlying panic was totally recoverable in theory if you looked at the call stack! It's just that people had decided not to Result-ify a bunch of fallible code.

ninetyninenine

Sometimes the program is in an invalid state. You don't want to keep running the program. Better to fail spectacularly and clearly than to fail silently and try to hobble along.

jwatte

The thing with functional programming (specifically, immutable data) is that as long as the invalid state is immutable, you can just back up to some previous caller, and they can figure out whether to deal with it or to reject it up to their own caller.

This is why Result (or Maybe, or runExceptT, and so on in other languages) is a perfectly safe way of handling unexpected or invalid data. As long as you enforce your invariants in pure code (code without side effects) then failure is safe.

This is also why effects should ideally be restricted and traceable by the compiler, which, unfortunately, Rust, ML, and that chain of the evolution tree didn't quite stretch to encompass.
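The pattern described above can be sketched with a hypothetical validated type: the invariant is enforced in pure code, and the caller gets a Result to recover from, with no side effects having happened.

```rust
// Only constructible in a valid state; an invalid input comes back
// to the caller as an Err with nothing mutated along the way.
#[derive(Debug, PartialEq)]
struct Percent(u8);

fn percent(n: u8) -> Result<Percent, String> {
    if n <= 100 {
        Ok(Percent(n))
    } else {
        Err(format!("{n} is not a valid percentage"))
    }
}
```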

duped

Say a function has some return type Result<T, E>. If our only error-handling mechanism is Err(e), then we're restricted to E representing both the set of errors due to invalid arguments and state, and the set of errors due to the program itself being implemented incorrectly.

In a good software architecture (imo) panics and other hard failure mechanisms are there for splitting E into E1 and E2, where E1 is the set of errors that can happen because the caller screwed up, and E2 is the set of errors that mean the callee itself is implemented incorrectly. The caller shouldn't have to reason about the callee possibly being incorrect!

Functional programming doesn't really come into the discussion here - oftentimes this crops up in imperative or object-oriented code where function signatures are lossy because code relies on side effects or state that the type system can't/won't capture (for example, a database or file persisted somewhere). That's where you'll drop an assert or panic - not as a routine part of error handling.
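A minimal sketch of the E1/E2 split described above (hypothetical API and error type):

```rust
// E1: the caller passed a bad range - recoverable, in the Result.
#[derive(Debug, PartialEq)]
enum ReadError { OutOfBounds }

fn read_slice(buf: &[u8], start: usize, len: usize) -> Result<&[u8], ReadError> {
    let end = start.checked_add(len).ok_or(ReadError::OutOfBounds)?;
    if end > buf.len() {
        return Err(ReadError::OutOfBounds); // E1: caller's fault
    }
    let s = &buf[start..end];
    // E2: if this fired it would be *our* bug, so it's an assert
    // (panic), not another variant polluting ReadError.
    assert_eq!(s.len(), len);
    Ok(s)
}
```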

dralley

At least sources of panic! are easily greppable. Cutting corners on error handling is usually pretty obvious

haberman

I don't think grepping for panics is practical, unless you are trying to depend on exclusively no-panic libraries.

Even if you are no_std, core has tons of APIs like unwrap(), index slicing, etc. that can panic if you violate the preconditions. It's not practical to grep for all of them.
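A couple of those implicit panic sites, caught with `catch_unwind` purely for demonstration (in real code you wouldn't rely on catching these):

```rust
use std::panic;

fn main() {
    // Silence the default panic-message printing for this demo.
    panic::set_hook(Box::new(|_| {}));
    let v: Vec<u32> = vec![1, 2, 3];
    // Index slicing hides a panic on out-of-bounds access:
    assert!(panic::catch_unwind(|| v[5]).is_err());
    // unwrap() hides a panic when the Option is None:
    assert!(panic::catch_unwind(|| Option::<u32>::None.unwrap()).is_err());
}
```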

wongarsu

There is panic-analyzer [1] that searches for code that needlessly panics. You can also use the no-panic macro [2] to turn possible panics in a specific function (including main) into a compile error

1: https://crates.io/crates/panic-analyzer

2: https://crates.io/crates/no-panic

pluto_modadic

I mean... Rust dependencies aren't typically in your CWD, no? They're not in some node_modules that you can grep, but in a cargo folder with /all of the libraries you've ever used/, not just the ones for this one project.

gpm

Putting them all in the project root takes just a single `cargo vendor` command.

But I would assume that for mozilla their entire CSS subsystem is pulled in as a git (hg?) submodule or something anyways.

eru

For what it's worth, eg vscode can jump to definition even when your code is in a different crate that's not in your repository.

est31

If you run cargo vendor, they end up in a neat directory.

duped

While sure, more things could be baked as results, most of the time when you see a panic that's not the case. It's a violation of the callee's invariants that the caller fucked up.

Essentially, an error means the call failed in a way that's expected. A panic means the caller broke some contract that wasn't expressed in the arguments.

A good example of this is array indexing. If you're using it you're saying that the caller (whoever is indexing into the array) has already agreed not to access out of bounds. But we still have to double check if that's the case.

And if you were to say that hey, that implies that the checks and branches should just be elided - you can! But not in safe rust, because safe code can't invoke undefined behavior.
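The indexing contract above, next to its non-panicking sibling:

```rust
fn main() {
    let v = vec![10, 20, 30];
    // Indexing: the caller asserts the index is in bounds; the
    // hidden bounds check panics if that contract is broken.
    assert_eq!(v[1], 20);
    // `get` moves the contract into the type: the caller must
    // handle the None case, and no panic path exists.
    assert_eq!(v.get(10), None);
}
```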


andyferris

This seems to negate a lot of Rust's advantages (like a good standard library). I wonder what it would take to write a no-panic std library?

Panics really seem bad for composability. And relying on the optimizer here seems like a fragile approach.

(And how is there no -nopanic compiler flag?)

gpm

Rust doesn't want to add any proof-system that isn't 100% repeatable, reliable, and forwards compatible to the language. The borrow checker is ok, because it meets those requirements. The optimizer based "no panic" proof system is not. It will break between releases as LLVM optimizations change, and there's no way to avoid it.

Trying to enforce no-panics without a proof system helping out is just not a very practical approach to programming. Consider code like

    some_queue.push_back("new_value");
    process(some_queue.pop_front().unwrap());
This code is obviously correct. It never panics. There's no better way to write it. The optimizer will instantly see that and remove the panicking branch. The language itself doesn't want to be in the business of trying to see things like that.

Or consider code like

    let mut count: usize = 0;
    for item in some_vec {
        // Do some stuff with item
        if some_cond() {
            count += 1;
        }
    }
This code never panics. Integer arithmetic contains a hidden panic path on overflow, but that can't occur here because the length of a vector is always less than usize::MAX.

Or so on.

Basically every practical language has some form of "this should never happen" root. Rust's is panics. C's is undefined behavior. Java's is exceptions.

Finally consider that this same mechanism is used for things like stack overflows, which can't be statically guaranteed to not occur short of rejecting recursion and knowledge of the runtime environment that rustc does not have.

---

Proof systems on top of rust like creusot or kani do tend to try to prove the absence of panics, because they don't have the same compunctions about not approving code today that they aren't absolutely sure they will approve tomorrow as well.

wongarsu

The standard library is slowly adding non-panicking options. The article shows some of them (like vec.push_within_capacity()) and ignores some others (vec.get_unchecked()). There is still a lot of work to do, but it is an area where a lot of work gets done. The issue is just that a) Rust is still a fairly young language, barely a decade old counting from 1.0 release, and b) Rust is really slow and methodical in adding anything to stdlib because of how hard/impossible it is to reverse bad decisions in the stdlib.
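One already-stable example of the non-panicking direction is `Vec::try_reserve` (stable since Rust 1.57):

```rust
fn main() {
    let mut v: Vec<u32> = Vec::new();
    // Fallible allocation: a Result instead of a potential abort.
    assert!(v.try_reserve(16).is_ok());
    // A request that overflows capacity limits comes back as an
    // Err rather than panicking or aborting.
    assert!(v.try_reserve(usize::MAX / 4).is_err());
}
```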

The same article written a couple years in the future would look very different

hathawsh

What I would like to see is a reliable distinction of different types of panics. In the environments where software I write is typically run, panics due to heap allocation failure are generally acceptable and rarely an indication of fragility. (By the time a heap allocation failure occurs, the computer is probably already thrashing and needs to be rebooted.) On the other hand, other kinds of panics are a bad sign. For example, I would frown on any library that panics just because it can't reach the Internet.

In other environments, like embedded or safety-critical devices, I would need a guarantee that even heap allocation failure can not cause a panic.

staunton

This website makes my browser freeze... No idea why. Not able to read the article.

haberman

Author here -- that is surprising. What browser/OS are you on? I haven't had anyone else report this problem before.

TallonRain

I’m seeing the same problem, the page crashes on Safari on iOS, saying a problem repeatedly occurred. Haven’t seen a webpage do that in quite a while.

faitswulff

Yep, same experience, same platform. I guess straight to reader mode, it is.

EDIT - shockingly, reader mode also fails completely after the page reloads itself

IX-103

I'm also seeing this on Android Chrome. When I opened the page on my Linux desktop, I also saw the crashes (though they only affected the godbolt iframes).

Note that on Android process separation is not usually as good, so a crashing iframe can bring down the whole page.

anymouse123456

same for me. Chrome on Pixel 8

wbobeirne

Same issue on Android using Brave.

lacraig2

same on chrome android

DemetriousJones

Same, the web view in my Android client crashed after a couple seconds.

haberman

I wonder if it's all the Godbolt iframes. Do you have the same problem on other pages, like https://blog.reverberate.org/2025/01/27/an-ode-to-header-fil... ?

IX-103

Yeah, I think it's all those iframes. I'm seeing something weird on my Linux desktop - all the godbolt iframes crash on reload unless I have another tab with godbolt open. I didn't see anything obvious in Chrome's log.

I can't replicate the crash at all on my Linux cloud VM though. Usually the only difference there is that advertisers tend to not buy ads for clients on cloud IPs.

wavemode

Other pages on the site work fine for me yeah. But the OP blog post is crashing my Android browser, like the other commenters have mentioned.

btown

Does Rust have something like a deep-codemodding macro that could be used to un-panic-fy an entire function etc. automatically?

Something like: given a function, rewrite its signature to return a Result if it doesn't already, rewrite each non-Resulty return site to an Ok(), add a ? to every function call, then recurse into each called function and do the same.

dccsillag

It has `catch_unwind` [1], but that still retains the panicking runtime, so it's not sufficient in the context of the post.

[1] https://doc.rust-lang.org/std/panic/fn.catch_unwind.html

gpm

It's also not guaranteed to catch every panic - sometimes (notably if a destructor panics during unwinding) a panic can turn into a process-abort.

LegionMammal978

To add to that, Rust code is generally not written to be 'exception-safe' when panics occur: if a third-party function causes a panic, or if your own code panics from within a callback, then memory may be leaked, and objects in use may end up in an incorrect or unusable state.

You really want to avoid sharing mutable objects across a catch_unwind() boundary, and also avoid using it on a regular basis. Aside from memory leaks, panicking runs the thread's panic hook, which by default prints a stacktrace. You can override the panic hook to be a no-op, but then you won't see anything for actual panics.

davisp

Does anyone know if there's an obvious reason that adding a `no_panic` crate attribute wouldn't be feasible? It certainly seems like an "obvious" thing to add so I'm hesitant to take the obvious nerd snipe bait.

hathawsh

The standard library has a significant amount of code that panics, so a `no_panic` crate attribute would currently only work for crates that don't depend on the standard library. I imagine most interesting crates depend on the standard library.

davisp

What caught my eye in the article was the desire to have something that doesn't panic with a release profile, while allowing for panics in dev profiles. Based on other comments I think the general "allow use of std, but don't panic" seems like something that could be useful purely on the "Wait, why doesn't that exist?" reactions.

7e

You could do it, but I would prefer guarantees on a per-call chain basis using a sanitizer. It should be quite easy to write.

davisp

I'm no rustc expert, but from what little I know it seems like disabling panics for a crate would be an obvious first step. You make a great point though. Turning that into a compiler assertion of "this function will never panic" would also be useful.

amelius

> Protocol Buffers

Instead of serializing data (to disk, not the network), it would be much faster if Rust allowed us to allocate data structures directly in an mmapped file, and allowed us to read the data back (basically patching the pointers so they become valid if the base address changed).

lilyball

This is basically what Cap'n Proto does, and it uses offsets instead of pointers so that way the mmapped data can be used as-is.
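A toy sketch of the offset idea (not Cap'n Proto's actual wire layout):

```rust
use std::convert::TryInto;

// A "pointer" is a byte offset from the buffer start, so the data
// is position-independent: it can be mmapped at any address and
// used as-is, with no pointer patching.
fn read_u32_le(buf: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes(buf[offset..offset + 4].try_into().unwrap())
}

fn main() {
    // Tiny "message": bytes 0..4 hold the payload's offset (8),
    // bytes 8..12 hold the payload value (42).
    let mut buf = vec![0u8; 12];
    buf[0..4].copy_from_slice(&8u32.to_le_bytes());
    buf[8..12].copy_from_slice(&42u32.to_le_bytes());

    let payload_off = read_u32_le(&buf, 0) as usize;
    assert_eq!(read_u32_le(&buf, payload_off), 42);
}
```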

meltyness

While not as strict, you can filter lints by their description and apply a policy to your crate https://rust-lang.github.io/rust-clippy/rust-1.84.0/index.ht...

7e

It should be possible to write a sanitizer which verifies no panic behavior on a call graph, just as you can to verify no blocking, or no races.

XorNot

This seems... absurd for a programming language with goals like Rust's. Why isn't this a compiler option? Just set -nopanics and have the compiler error out and flag anything that pulls in a panic, at the very least?

dpc_01234

rcxdude

that just makes the panics unrecoverable. It doesn't statically guarantee no panics.

wongarsu

There is no-panic [1] to turn panics in a function into compile errors, or the older no-panics-whatsoever [2] to do the same for the entire binary

1: https://crates.io/crates/no-panic

2: https://crates.io/crates/no-panics-whatsoever

lmm

It presumably avoids the 300Kb linked in that was supposedly part of the motivation for doing this, though?

wongarsu

You can set `panic = "abort"` [1] if you don't want the unwinding mechanism. You still get a nice error message on panic, which causes the compiler to link in some formatting code from the standard library; I'm not sure if you can get rid of that.

On the same page are also the options for controlling debug assertions and overflow checks, which would get rid of the "panics in debug, but not release" behavior, if that bugs you.

1: https://doc.rust-lang.org/cargo/reference/profiles.html#pani...
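The profile options mentioned, as a Cargo.toml sketch (the last two values shown are already the release-profile defaults):

```toml
[profile.release]
panic = "abort"          # no unwinding machinery linked in
overflow-checks = false  # true would re-add overflow panics in release
debug-assertions = false # true would re-add debug_assert! checks
```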

winstonewert

Well, if they did that, then people could expect/demand stability with regard to which scenarios get the checks/panics optimized out. That would be a bit of a burden for the Rust maintainers: it would effectively make the optimizer part of the language specification, and that's undesirable.

dccsillag

Yeah, I'm fairly sure that there is such a flag/toplevel attribute... and if there isn't, there should be one.

It also feels like most of the pain in avoiding panics centers around allocations which, though a bit unfortunate, makes sense; it was an intentional design choice to make allocations panic instead of returning Results, because most users of the language would probably crash on allocation failure anyway, and it would introduce a lot of clutter. There was some effort a while ago on better fallible allocations, but I'm not sure what happened with that.