
It is time to standardize principles and practices for software memory safety

flitzofolov

Makes sense, good luck! I know that sounds snarky, I'm looking forward to rational progress and cooperation on the evolution and adoption of the standard. Just haven't seen that played out in such a planned orderly fashion yet (ipv6?).

nottorp

ipv6, unicode, usb...

Why am I more worried than excited about a new standard?

By the way bounds checking was introduced in Turbo Pascal in 1987. Iirc people ended up disabling it in release builds but it was always on in debug.

But ... it's Pascal, right? Toy language.

pjmlp

Bounds checking has existed at least since JOVIAL in 1958, or since 1957 if you count FORTRAN, whose compilers have offered a bounds-checking option for quite some time.

Here is my favourite quote, every time we discuss bounds checking.

"A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980 language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."

-- C.A.R. Hoare, "The 1980 ACM Turing Award Lecture"

Guess what programming language he is referring to by "1980 language designers and users have not learned this lesson".

znpy

> But ... it's Pascal, right? Toy language.

Not really.

It's just out of fashion. But there are really high-quality current-day implementations, like the one from Embarcadero (I think they acquired Borland a while ago?): https://www.embarcadero.com/products/delphi/features/design

AnimalMuppet

I think nottorp was being a bit sarcastic. I think the point was, if Pascal, which some in the C/C++ world regard as a "toy" language, had this in 1987, maybe we can actually think about having it in "real" languages in 2025.

GoblinSlayer

I heard ALGOL had bounds checking somewhere in the '60s as an implementation feature. Reportedly, customers very much liked that the programs didn't just produce wrong results faster.

nottorp

Pascal being derived from Algol, it makes a lot of sense.

leecommamichael

Yeah, I’m wondering what this even means. I’m assuming they’ll have to define “memory safety” which is already quite the task. Memory safe in what context? On what sort of machine? What sort of OS?

crabbone

> On what sort of machine? What sort of OS?

Just sharing an anecdote: recently, I had to create Linux images for x86 on an ARM machine using QEMU. During this process, I discovered that, for example, creation of initrd fails because of the memory page size (some code makes assumptions about page size and calculates the memory location to access instead of using the system interface to discover that location). There's a similar problem when using the "locate" utility, and probably in a bunch more programs that have been used successfully millions, well, probably trillions of times. This manifests itself in QEMU segfaulting when trying to perform these operations.

But, to answer the question: I think, one way to define memory safety is to ensure that the language doesn't have the ability to do I/O to a memory address not obtained through system interface. Not sure if this is too much to ask. Feels like for application development purposes this should be OK, and for system development this obviously will not work (someone has to create the system interface that supplies valid memory addresses).

pjc50

I think the usual context just requires language soundness; it doesn't depend on having an MMU or anything like that. In particular, protection against:

- out-of-bounds on array read/write

- stack corruption such as overwriting the return address

It doesn't directly say "you can't use C", but achieving this level of soundness in C is quite hard (see seL4 and its Isabelle/HOL proof).

chillingeffect

Everyone picks on C, but we have a standard for this. We've been following it for decades in regulated industries. If people take the time, it can be perfectly safe. It requires thinking of a computer as a precision machine, rather than a semantic "do what I'm thinking" box.

AnimalMuppet

Maybe I lack vision in such matters, but: how would you corrupt the stack without an out-of-bounds write?

But there's another aspect that I think you missed: use after free.

As you say, achieving this level of soundness with C is hard. Proving it is much harder. (Except, how do you know you've achieved it if you don't prove it?)

milesrout

Yet that is not what memory safety means. A program being memory safe or not depends on its actual behaviour, not what you can prove about that behaviour. There are plenty of safe C programs and plenty of unsafe ones. Proving something is safe doesn't make it safe.

Also these properties are a very small subset of general correctness. Who cares if you write a "safe" program if it computes the wrong answer?

User23

One of the most under appreciated things about the JVM is its well-defined memory model.

swiftcoder

Unfortunately immediately undermined by the decision to not address nullability out of the gate. It's fantastic that null pointer exceptions are well-defined on the JVM - but they still tend to bring your program to a screaming halt.

writebetterc

I believe that the parent is talking about how Java, and in turn the JVM, defines its memory model, which describes formally how memory reads and writes occur in the presence of multiple threads of execution. For example, data races are well-defined in Java: Either you read the old or the new value, you'll never read a mix of two values.

Having a well-defined memory model is important when running on infra with very different models, such as x86 and aarch64.
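Safe Rust reaches a similar place by requiring atomics (or locks) for shared mutable state; a minimal sketch of my own, not from the comment, of the "old value or new value, never a torn mix" guarantee:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// A racy read through an atomic: the reader observes either the old
// value (0) or the new value (u64::MAX), never a torn mix of the two.
fn racy_read() -> u64 {
    let x = Arc::new(AtomicU64::new(0));
    let writer = {
        let x = Arc::clone(&x);
        thread::spawn(move || x.store(u64::MAX, Ordering::Relaxed))
    };
    let seen = x.load(Ordering::Relaxed);
    writer.join().unwrap();
    seen
}
```

The difference from Java is that a plain non-atomic race would not compile in safe Rust at all, whereas the JVM assigns it defined behaviour.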

swiftcoder

[flagged]

GoblinSlayer

Is it a big difference if you have ValueNotPresentException from nullable unwrap instead of NullPointerException?

Oh, wait, it exists https://docs.oracle.com/javase/8/docs/api/java/util/Optional...

edflsafoiewq

The problem is that basically every type is implicitly Optional and basically every operation implicitly unwraps, instead of only the cases where nullability is actually desired.

swiftcoder

The big difference is that Optional has ergonomic features like .map(), .ifPresent(), .orElse(), that reduce the verbosity of repeated if blocks checking if values are present or not.
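For comparison, Rust's Option shows what it looks like when absence is opt-in at the type level and the combinators are the default path (a toy sketch of my own, not code from the thread):

```rust
// A plain &str can never be null; absence must be spelled Option<&str>,
// and combinators (map, unwrap_or_else, ...) replace repeated if-checks.
fn greet(name: Option<&str>) -> String {
    name.map(|n| format!("Hello, {}", n))
        .unwrap_or_else(|| "Hello, stranger".to_string())
}
```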

tcoff91

After the log4j vulnerability, I'd say that despite its good memory safety, the JVM's powerful serialization primitives make me pretty leery of it as far as security goes.

bruce343434

The log4j vulnerability happened because the log4j programmers made an overly generalized "do anything and everything" software. The kind of architecture that is made to be so generic that the software accidentally gains emergent properties (=not thought of/realized/considered execution paths and interactions). Although the desire to write software like that might have arisen under the influence of the object oriented mindset, I'm sure it could have happened in any other language.

naasking

> I’d say that despite having good memory safety I would say the JVM’s powerful serialization primitives make me pretty leery of it as far as security goes.

That's a sound deduction. The JVM had such a vulnerability because it's full of ambient authority and rights amplification patterns. More such vulnerabilities probably exist, they're just hard to see.

aeonik

You can do the same thing in Rust as Log4J.

I haven't tested this code, and definitely don't do this.

This code is intended to let attackers run any shell command by sending JSON with a "debug_command" field - similar to Log4J, it's a "feature" being misused rather than a memory bug that Rust would catch.

```rust
use serde_json::Value;
use std::process::Command;

fn process_log_entry(log: &str) {
    // UNSAFE: allows command injection via the "debug_command" field
    if let Ok(json) = serde_json::from_str::<Value>(log) {
        if let Some(cmd) = json.get("debug_command") {
            Command::new("sh")
                .arg("-c")
                .arg(cmd.as_str().unwrap_or(""))
                .output()
                .unwrap();
        }
    }
}
```

tcoff91

I feel like Rust has a different development culture than Java and Java devs are more likely to want to build some abstraction that does everything and loads classes from the network into the runtime.

d3nj4l

That's not the same thing. You can call a shell command from any language. The log4j problem was that you could load arbitrary classes from the internet into the memory of the current process, which is a much more severe problem.

MattPalmer1086

The log4j vulnerability was due to Java code, not in the JVM. But I get that people mostly conflate the two.

mimd

I know this article is more a business-case presentation than a full demonstration of the field, but the TR also misses some points.

Why remove the references in the TR to Frama-C, CBMC, etc. from the opinion report? They are easier to adopt than heavier tooling like Coq. I'm always surprised to see those tools ignored or downplayed when it comes to these discussions. I do agree with the TR's sentiment that we need to improve accessibility and understanding of these tools, but this is not a great showing for that sentiment.

Additionally, both articles miss that compiler-modified languages/builds are a path, such as -fbounds-safety. They will be part of the solution, and frankly, likely the biggest part at the rate we are going. E.g. current stack defenses for C/C++/Rust, unaddressed in safe language design, are compiler-based. The compiler-extension path is not particularly different from CHERI, which requires a recompile with some modifications, and the goal of both approaches is to allow maintainers to band-aid dependencies with minimal effort.

The TR handwaves away the question of the complexity of developing formal-method tools for Rust/unsafe Rust and C++. I.e., Rust really only has two tools at the moment: Miri and Kani (which is a CBMC wrapper). Heavier tools are in various states of atrophying/development. And C++ support from the C family of formal tools, such as Frama-C, is mostly experimental. It's not assured that, with the continued rate of language development in both languages and the complexity of the analysis, the tools for these languages will come forth to cover this gap anytime soon.

I do not think the claim in the TR that the current unsafe/safe separation will result in only requiring formal analysis of the unsafe sections is true, as logical errors, which are normal in safe Rust, can cross the boundaries to unsafe and cause errors, thus necessitating whole-program analysis to resolve whether an unsafe section could result in errors. Perhaps it will decrease the constants, but not the complexity. If Rust further restricts things, perhaps more of the space could be covered to help create that scenario, but the costs might be high in usability and so on.

SOLAR_FIELDS

Is it a hot take to believe that no human is infallible and that only languages with memory-safety guarantees can offer the kind of safety the author seeks? With the advent of Rust, C and C++ programmers can no longer argue that the performance tradeoff is worth giving up safety.

There are, of course, other good reasons to choose C and C++ over Rust. And of course Rust has its own warts. Just pointing out that performance and memory safety are not necessarily mutually exclusive.

RossBencina

I'm not sure what your definition of performance parity is. Are you claiming that the existence of Rust proves that there is no performance penalty for memory safety? The penalty may be relatively small, but I am not aware of any proof that the penalty is non-existent. I am not even sure how you could prove such a thing. I could imagine that C and C++ implementations of exactly the same algorithms and data structures as are implemented in safe Rust might perform similarly, but what about all of the C and C++ implementations that are both correct and not implementable in safe Rust? Do they all perform only as well as or worse than Rust?

IshKebab

1. Those fast algorithms that can't be implemented in safe Rust are rare.

2. Even when they exist, Rust lets you use unsafe code but only where needed. It's still much better than having your entire program be unsafe.

3. In practice Rust versions of programs are as fast, if not faster than C/C++ ones.
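Point 2 can be illustrated with a toy sketch of my own (nothing beyond the standard library): the unsafe surface is a single audited expression behind a safe API, so the rest of the program stays in safe Rust.

```rust
// Safe wrapper around an unchecked access: callers get a safe function,
// and a reviewer only has to audit the one unsafe expression.
fn sum(v: &[u32]) -> u32 {
    let mut total = 0;
    for i in 0..v.len() {
        // SAFETY: i < v.len() by the loop bound, so the access is in range.
        total += unsafe { *v.get_unchecked(i) };
    }
    total
}
```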

tialaramex

As an example, a really fast sort can't be expressed in safe Rust.

However, the two sort algorithms in Rust are safe to use, as well as being faster than their equivalents in C++. In fact, even the previous sorts, the ones which were replaced by the current implementations, were both faster and safer than what you're getting in C++.

vvanders

That assumes that people know what they're doing in C/C++. I've seen just as many bloated codebases in C++, if not more, because the defaults for most compilers are not great and it's very easy for things to get out of hand with templates, excessive use of dynamic libraries (which inhibit LTO), or using shared_ptr for everything.

My experience is that Rust guides you towards defaults that tend to avoid those things, and for the cases where you really do need that fine-grained control, unsafe blocks with direct pointer access are available (and I've used them when needed).

Tanjreeve

Is there a name for a fallacy like "appeal to stupidity" or something where the argument against using a tool that's fit for the job boils down to "All developers are too dumb to use this/you need to read a manual/it's hard" etc etc?

kojolina

Nah, Rust also guides you to "death from a million paper cuts", aka RAII (aka everything is individually allocated and freed all over the place).

You need memory management to be painful, like in C, so that it forces people to go for better options like linear/static group allocations.

ultimaweapon

Once you know how Rust works, it is likely your Rust code will be faster than C/C++ with less effort. I can say this because I used C++ for a long time, since Visual C++ 6.0, and moved to Rust about 3 years ago.

One of the reasons is that you get whole-program optimization automatically in Rust, while in C/C++ you need to put the functions that need to be inlined in the header or enable LTO at link time. Bounds checking in Rust, which people keep using as an example of a performance problem, is not actually a problem. For example, if you need to access the same index multiple times, Rust will perform bounds checking only on the first access (e.g. https://play.rust-lang.org/?version=stable&mode=release&edit...).
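A common shape for that elision looks like the following sketch (my own, not the linked playground; whether the optimizer actually removes the checks is up to the compiler):

```rust
// After the length assertion the compiler can prove all three indexing
// operations are in range, so their bounds checks can be elided.
fn first_three_sum(v: &[i32]) -> i32 {
    assert!(v.len() >= 3);
    v[0] + v[1] + v[2]
}
```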

The borrow checker is your friend, not an enemy, once you know how to work with it.

jandrewrogers

This kind of assumes old and naive C++. There was a lot of that 20 years ago but a lot of that was replaced by languages with garbage collectors. New C++ applications today tend to be geared toward extreme performance/scale. The idioms are far beyond thinking much about anything you mention.

People seriously underestimate how capable and expressive modern C++ metaprogramming facilities are. Most don’t bother to learn it but it is one of the most powerful features of the language when it comes to both performance and safety. The absence of it is very noticeable when I use other systems languages. I’m not a huge fan of C++ but that is a killer feature.

swiftcoder

> not implementable in safe rust

This is moving the goalposts. "Safe rust" isn't a distinct language. The unsafe escape hatch is there to make sure that all programs can be implemented safely.

RossBencina

It is not moving the goalposts. The parent that I replied to said "c and c++ programmers can no longer argue that the performance tradeoff is worth giving up safety." If you don't limit to safe rust you are giving up safety.

jandrewrogers

Safe Rust often performs significantly worse than C++ for many kinds of code where you care a lot about performance. You can bring that performance closer together with unsafe Rust but at that point you might as well use C++ (which still seems to have better code gen with less code). Everyone has their anecdotes but, with the current state of languages and compilers, C++ still excels for performance engineering.

The performance tradeoff is not intrinsic. Rust’s weakness is that it struggles to express safety models sometimes used in high performance code that are outside its native safety model. C++ DGAF, for better and worse.

The hardcoded safety model combined with a somewhat broken async situation has led me to the conclusion that Rust is not a realistic C++ replacement for the kinds of code where C++ excels. I am hopeful something else will come along but there isn’t much on the horizon other than Zig, which I like in many regards but may turn out to be a bit too spartan (amazing C replacement though).

swiftcoder

> a somewhat broken async situation

Isn't Rust's async situation "somewhat broken" in exactly the same way C++'s async situation is?

rpigab

C++ often performs significantly worse than assembly for many kinds of code where you care a lot about performance. You can bring that performance closer together with bits of ASM in your C++, but at that point you might as well use ASM.

GoblinSlayer

The word is that C++'s performance comes from its asm-like SIMD integration, which can be less mature in other languages.

ultimaweapon

You are very unlikely to hit this bug in a real-world Rust project, while in C/C++ you can easily be hit by a memory safety bug.

weinzierl

Exactly, and also Miri catches all of these, so with a tiny bit of extra effort the world is in order again.

Moreover, if I remember correctly, they all are made possible by a single (long-standing) compiler bug that eventually will be fixed.

Previously discussed: https://news.ycombinator.com/item?id=39440808

I think this mindset is the big difference. We're not perfect, but we're working on it.

zyedidia

The bug used by that repository [1] isn't the only one that can be used to escape the Safe Rust type system. There are a couple others I've tried [2] [3], and the Rust issue tracker currently lists 92 unsoundness bugs (though only some of them are general-purpose escapes), and that's only the ones we know about.

These bugs are not really a problem in practice though as long as the developer is not malicious. However, they are a problem for supply chain security or any case where the Rust source is fully untrusted.

[1]: https://github.com/rust-lang/rust/issues/25860

[2]: https://github.com/rust-lang/rust/issues/57893

[3]: https://github.com/rust-lang/rust/issues/133361

lincpa

[dead]

userbinator

It's time to stop jailbreaking/rooting, fully take control away from the user and enforce DRM more strongly?

stavros

The ability to jailbreak should be a legal right. We shouldn't be relying on vulnerabilities just to own the devices we bought.

userbinator

Good luck getting those in power to agree to that.

Meanwhile, everything else passes under the guise of "safety and security".

mcpherrinm

It’s time to stop governments from hacking people’s phones, taking away their privacy?

Users should have control and trust in their devices. If they can be remotely compromised, they cannot get that.

userbinator

The governments will always have a way in if you're a target.

Meanwhile the rest of the population gets pushed towards authoritarian dystopia.

> Users should have control and trust in their devices.

I think that already went away once they started adding spyware ("telemetry" being the usual euphemism) and forced automatic updates ("remotely compromised", as you put it.)

mcpherrinm

I fundamentally don't buy the argument that our products need to be shitty so that we can break them.

pjc50

Nothing to do with the article and flame bait?

perching_aix

It's either one or the other, right? In the end people will be forced to either fix their governance or embrace the chaos. At least in the idea of "verifiable computing for thee but not for me" it's certainly not the "verifiable computing" part that I find to be the problem.

We're living the inbetween and people are beating the drums that drive us towards either endpoint. Not for no reason either. Turbulent times.

kojolina

Just bang out a bunch of C code, feed it to an AI: "Make this memory safe". Profit.

No need for rust, Ada, CHERI, SPARK, etc.

pjc50

You could also pray, that's about as likely to be effective.

GoblinSlayer

A rewrite isn't strictly necessary. It should be enough if AI can find errors, doesn't even need to be very precise.

saagarjha

Profit from your AI-powered security company, sure. But the exploit authors are profiting too.

BiteCode_dev

Now the billion-dollar question: how to make that work for the entire Linux kernel.

cjfd

If it is too big just zip it and feed chunks of the resulting zipfile to the AI. AI can do anything, right?

oneshtein

It will work if you convert zip chunks to base64 first, and use a large enough training set.

nottorp

Easy, triple the hardware requirements and don't talk to any hardware because if you do you'll have to mess with buffers in a non approved way.