Why Safety Profiles Failed

97 comments

·October 24, 2024

SubjectToChange

At this point I'm wondering if the purpose of safety profiles is simply to serve as a distraction. In other words, safety profiles are just something people can point to when the topic of memory safety comes up, that’s it. The objectives of the initiative always seemed hopelessly optimistic, if not absurd. In particular, I don't understand why littering a codebase with auto, const, constexpr, inline, [[nodiscard]], noexcept, etc is wonderful, yet lifetime annotations are somehow an intolerable tyranny.

rswail

It's a tick-the-box-for-compliance item like when Microsoft had a POSIX layer for Windows NT.

ameliaquining

I think maybe it's because lifetime annotations can get arbitrarily complicated. If you look at enough Rust code you'll definitely see some function signatures that make your head hurt, even if they're vastly outnumbered by simple ones. A guarantee that the comprehension complexity of that part of your code will always be below some low ceiling is tempting.

estebank

The thing is, if you were to make the same design in C++ the code might look "cleaner" because there is less code/fewer annotations, but the other side of that coin is that the developer also has less information about how things are meant to fit together. You not only lose the compiler having your back, you also don't have useful documentation, even if that documentation would be too complicated to grasp at once. Without that documentation you might be fooled into thinking that you do understand what's going on even if you don't in reality.

yellow_lead

That's a good point. There are many times in a C++ codebase where I'd see or write a seemingly innocuous function that carries so many assumptions about lifetimes, threads, etc. that it would make your brain hurt. Of course we try to remove those or add a comment, but it's still difficult to deal with.

whimsicalism

Rust has nothing on template meta programming and the type signatures you get there, though

nickitolas

Not to mention the error messages when you get something slightly wrong

crest

Give the proc macro fans a little more time...

jimbob45

I’ve spent a fair amount of time writing C++ but F12’ing any of the std data structures makes me feel like I’ve never seen C++ before in my life.

myworkinisgood

It's because those people are lying. They are letting their egos and panic get to their technical decision making.

thadt

It's deceptively easy to look at a number of examples and think: "If I can see that aliasing would be a problem in this function, then a computer should be able to see that too."

The article states "A C++ compiler can infer nothing about aliasing from a function declaration." Which is true, but assumes that the compiler only looks at the function declaration. In the examples given, an analyzer could look at the function bodies and propagate the aliasing requirements upward, attaching them to the function declaration in some internal data structure. Then the analyzer ensures that those functions are used correctly at every call site. Start at leaf functions and walk your way back up the program until you're done. If you run into a situation where there is an ambiguity, you throw an error and let the developer know. Do the same for lifetimes. Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?

I like not having to see and think about lifetimes and aliasing problems most of the time, and it would be nice if the compiler (or borrow checker) just kept track of those without requiring me to explicitly annotate them everywhere.
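
For concreteness, here is a small sketch of why the declaration alone is not enough. The function `add_twice` and both helpers are made up for illustration and do not come from the article: the signature is the same in both calls, yet the result depends on whether the two references alias.

```cpp
#include <cassert>

// Hypothetical example: the declaration alone does not say whether
// `total` and `step` may refer to the same object.
void add_twice(int& total, const int& step) {
    total += step;  // if step aliases total, this also changes step
    total += step;
}

// Distinct objects: 1 + 2 + 2.
int add_twice_distinct() {
    int total = 1;
    int step = 2;
    add_twice(total, step);
    return total;
}

// Same object passed for both parameters: 1 + 1 = 2, then 2 + 2 = 4.
int add_twice_aliased() {
    int value = 1;
    add_twice(value, value);
    return value;
}
```

An analyzer that only sees `void add_twice(int&, const int&)` cannot tell which of the two call patterns the author intended to allow; that information lives either in the body or in an annotation.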

seanbax

From P3465: "why this is a scalable compile-time solution, because it requires only function-local analysis"

From P1179: "This paper ... shows how to efficiently diagnose many common cases of dangling (use-after-free) in C++ code, using only local analysis to report them as deterministic readable errors at compile time."

Local analysis only. It's not looking in function definitions.

Whole program analysis is extremely complicated and costly to compute. It's not comparable to return type deduction or something like that.

SkiFire13

> Start at leaf functions and walk your way back up the program until you're done. If you run into a situation where there is an ambiguity, you throw an error and let the developer know.

This assumes no recursive functions, no virtual functions/function pointers, no external functions etc etc

> Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?

Aliasing is much trickier than type inference.

For example aliasing can change over time (i.e. some variables may alias at some point but not at a later point, while types are always the same) and you want any analysis to reflect it because you will likely rely on that.

Granularity is also much more important: does a pointer alias with every element of a vector or only one? The former is surely easier to represent, but it may unnecessarily propagate and result in errors.

So effectively you have an infinite domain of places that can alias, while type inference is limited to locals, parameters, functions, etc etc. And even then, aliasing is quadratic, because you want to know which pairs of places alias.

I hope you can see how this can quickly get impractical, both due to the complexity of the analysis and the fact that small imprecisions can result in very big false positives.
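
A tiny sketch of the "aliasing can change over time" point, using hypothetical names: the type of `cursor` is fixed for the whole function, but what it aliases depends on the program point.

```cpp
#include <cassert>

// Hypothetical illustration: whether `cursor` aliases `first` depends on
// the program point, unlike its type (int*), which never changes.
int flow_sensitive_aliasing() {
    int first = 1;
    int second = 10;

    int* cursor = &first;   // here cursor aliases first
    *cursor += 1;           // first == 2

    cursor = &second;       // from here on cursor aliases second instead
    *cursor += 1;           // second == 11, first is untouched

    return first * 100 + second;  // 2 * 100 + 11
}
```

An analysis that records a single "cursor may alias first" fact for the whole function would be too coarse for the second half of the body, which is the kind of imprecision that turns into false positives.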

myworkinisgood

You are more correct than you think you are!!!

ameliaquining

These considerations all seem so self-evident that I can't imagine the architects of Safety Profiles weren't aware of them; they are basically just the statement of the problem. And yet these smart people presumably thought they had some kind of solution to them. Why did they think that? What did this solution look like? I would be very interested to read more context on this.

steveklabnik

As always with different designs from smart people, it’s about priorities.

The profiles proposal focuses on a lack of annotations (I think there’s reasonable criticism that this isn’t achieved by it though…), and believing they can get 80% of the benefit for 20% of the effort (at least conceptually, obviously not those exact numbers). They aren’t shooting for full memory safety.

The Safe C++ proposal asks “how do we achieve 100% memory safety by default?”. And then asks what is needed to achieve that goal.

ameliaquining

What's with the "this model detects all possible errors" quote at the beginning of the post, then?

myworkinisgood

They are lying.

CJefferson

This article is really good, and covers many important issues.

There were many similar issues when it came to the earlier attempts to add concepts to C++ (which would improve template dispatch), although the outcome was more about improving C++ programmers' lives, not safety.

It turned out that trying to encapsulate everything C++ functions require, even those in the standard library, as a list of concepts was basically impossible. There are so many little corner cases in C++ that need representing as a concept that the list of 'concepts' a function needed often ended up being longer than the function itself.

alilleybrinker

The article makes the particularly good point that you generally can’t effectively add new inferences without constraining optionality in code somehow. Put another way, you can’t draw new conclusions without new available assumptions.

In Sean’s “Safe C++” proposal, he extends C++ to enable new code to embed new assumptions, then subsets that extension to permit drawing new conclusions for safety by eliminating code that would violate the path to those safety conclusions.

steveklabnik

Really glad to see this thorough examination of the weaknesses of profiles. Safe C++ is a really important project, and I hope the committee ends up making the right call here.

SubjectToChange

>...I hope the committee ends up making the right call here.

WG21 hasn't been able to solve the restrict type qualifier, or make a better alternative, in over twenty years. IMO, hoping that WG21 adequately solves Safe C++ is nothing more than wishful thinking, to put it charitably.

OskarS

Yeah, this one is so weird. You've been able to do that forever in C, and virtually all big compilers have this keyword in C++ as well, just named __restrict. Why is it so hard to get into the standard, at least for pointers? I can imagine that there are complex semantics with regards to references that are tricky to get right, but can't we at least have "restrict can only be used on raw pointer types, and it means the same thing as it does in C"?
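
For reference, a minimal sketch of what the vendor spelling looks like today. `__restrict` is a compiler extension rather than standard C++, and the function names are illustrative assumptions:

```cpp
#include <cassert>
#include <cstddef>

// `__restrict` is a vendor extension (GCC, Clang and MSVC accept this
// spelling), not standard C++. It is a promise made by the caller that
// `out` and `in` do not overlap; the compiler does not check it.
void add_into(float* __restrict out, const float* __restrict in, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] += in[i];
    }
}

// Usage with two distinct arrays, which satisfies the no-overlap promise.
float add_into_example() {
    float out[3] = {1.0f, 2.0f, 3.0f};
    const float in[3] = {10.0f, 20.0f, 30.0f};
    add_into(out, in, 3);
    return out[0] + out[1] + out[2];  // 11 + 22 + 33
}
```

Note that nothing stops a caller from passing overlapping pointers; the qualifier only licenses optimizations, it does not enforce the absence of aliasing.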

steveklabnik

I am intimately familiar with the dysfunctions of various language committees.

I never said it would be easy, or probable. But I’m also the kind who hopes for the best.

pjmlp

Given how C++0x concepts, C++20 contracts, and the ABI discussion went down, with key people involved in those processes leaving for other programming language communities, I'm not sure the right call will be made in the end.

This is a very political subject, and WG21 doesn't have a core team, rather everything goes through votes.

It suffices to have the wrong count in the room when it is time to vote.

thadt

I have a long standing debate with a friend about whether the future of C++ will be evolution or extinction.

Safe C++ looks excellent - its adoption would go a long way toward validating his steadfast belief that C++ can evolve to keep up with the world.

myworkinisgood

Profiles is the wunderkind and magic child of people in committee like Bjarne and GDR, who are lying their way to get their way, and people are too afraid to call them out on it.

biorach

Wild accusations without any backup... please don't.

Muromec

I'm not familiar with the politics there. What do they get by having their way?

wyager

> Safe C++ is a really important project

What makes you say this? It seems to me like we already have a lower-overhead approach to reach the same goal (a low-level language with substantially improved semantic specificity, memory safety, etc.); namely, we have Rust, which has already improved substantially over the safety properties of C++, and offers a better-designed platform for further safety research.

alilleybrinker

Not everything will be rewritten in Rust. I've broken down the arguments for why this is, and why it's a good thing, elsewhere [1].

Google's recent analysis of their own experiences transitioning toward memory safety provides even more evidence that you don't need to fully transition to get strong safety benefits. They incentivized moving new code to memory safe languages, and continued working to actively assure the existing memory unsafe code they had. In practice, they found that vulnerability density in a stable codebase decays exponentially as you continue to fix bugs. So you can reap the benefits of built-in memory safety for new code while driving down latent memory unsafety in existing code to great effect. [2]

[1]: https://www.alilleybrinker.com/blog/cpp-must-become-safer/

[2]: https://security.googleblog.com/2024/09/eliminating-memory-s...

lmm

Nah. The idea that sustained bugfixing could occur on a project that was not undergoing active development is purely wishful thinking, as is the idea that a project could continue to provide useful functionality without vulnerabilities becoming newly exposed. And the idea of a meaningfully safer C++ is something that has been tried and failed for 20+ years.

Eventually everything will be rewritten in Rust or successors thereof. It's the only approach that works, and the only approach that can work, and as the cost of bugs continues to increase, continuing to use memory-unsafe code will cease to be a viable option.

wyager

> Not everything will be rewritten in Rust.

Yeah, but it's also not going to be rewritten in safe C++.

steveklabnik

I am pro any movement towards memory safety. Sure, I won't stop writing Rust and start moving towards C++ for this. But not everyone is interested in introducing a second toolchain, for example. Also, as this paper mentions, Safe C++ can improve C++ <-> Rust interop, because Safe C++ can express some semantics Rust can understand. Right now, interop works but isn't very nice.

Basically, I want a variety of approaches, not a Rust monoculture.

nicoburns

> But not everyone is interested in introducing a second toolchain, for example.

Not that this invalidates your broader point about Safe C++, but this particular issue could also be solved by Rust shipping clang / a frontend that can also compile C and C++.

tptacek

This is a thread about a C++ language feature; it's probably most productive for us to stipulate for this thread that C++ will continue to exist. Practical lessons C++ can learn moving forward from Rust are a good reason to talk about Rust; "C++ should not be improved for safety because code can be rewritten in Rust" is less useful.

wyager

Actually, this subthread is about whether this is a "really important project"

SubjectToChange

Things like web browsers will continue to have millions of lines of C++ code regardless of how successful Rust becomes. It would be a huge improvement for everyone if such projects had a tractable path towards memory safety.

wyager

As this article discusses, it's not really viable that existing codebases will be able to benefit from safe C++ research without massive rewrites anyway

umanwizard

For new projects on mainstream architectures that don't have to depend on legacy C++ baggage, Rust is great (and, I think, practically always the better choice).

But, realistically, C++ will survive for as long as global technological civilization does. There are still people out there maintaining Fortran codebases.

(also, IDK if you already realized this, but it's funny that the person you're replying to is one of the most famous Rust boosters out there, in fact probably the most famous, at least on HN).

steveklabnik

I have realized this. Sean and I have talked about it.

I became a Rust fan because of its innovations in the space. That its innovations may spread elsewhere is a good thing, not a bad thing. If a language comes along that speaks to me more than Rust does, I’ll switch to that. I’m not a partisan, even if it may feel that way from the outside.

akira2501

Cool.

Do you mind if we have more than one approach?

wyager

Yeah, it does not matter to me, but that wasn't what we were talking about

mimd

I'm confused by lines such as "Profiles have to reject pointer arithmetic, because there’s no static analysis protection against indexing past the end of the allocation." Can't Frama-C etc. do that? Additionally, section 2.3 is narrower than what is implied by the words "safe" and "out-of-contract", and is more concerned with what C/C++ call "undefined behavior" requirements than with contract correctness. I.e., an integer that is defined to wrap can overflow and violate the requirement of the function contract, which I can also cause in a safe release build of Rust.
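
A minimal sketch of the wrapping point, written in C++ for consistency with the rest of the thread; the function names are hypothetical. Unsigned arithmetic is defined to wrap, so nothing here is undefined behavior, yet the naive version can return a value outside the range its contract promises:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical function whose contract is "return a value between lo and hi".
// The sum is reduced modulo 2^32 when stored in a 32-bit unsigned integer,
// which is well-defined, but the result can still break the contract.
std::uint32_t naive_midpoint(std::uint32_t lo, std::uint32_t hi) {
    const std::uint32_t sum = lo + hi;  // may wrap around
    return sum / 2;
}

// Same contract, written so the intermediate value cannot wrap
// (requires lo <= hi).
std::uint32_t safe_midpoint(std::uint32_t lo, std::uint32_t hi) {
    return lo + (hi - lo) / 2;
}
```

With `lo = 4000000000` and `hi = 4200000000`, the naive version returns a value smaller than `lo`, while the second version returns `4100000000`. No memory-safety rule is violated in either case, which is the distinction being drawn between "safe" and "correct".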

bjornsing

How is it supposed to do that (in the general case)? If I write a C++ program that will index out of bounds iff the Riemann hypothesis is true, then Frama-C would have to win the millennium prize to do its job. I bet it can’t.

favorited

I know Sean said on Twitter that he probably won't submit this to WG21, but I wish he would... It is a fantastic rebuttal of certain individuals' continued hand-waving about how C++ is safe enough as-is.

bfrog

This seems to be a common theme with many c++ developers honestly.

saagarjha

Some of them are unfortunately on language committees.

myworkinisgood

You mean certain individuals' lying?

j16sdiz

> A C++ compiler can infer nothing about aliasing from a function declaration.

True, but you don't solely rely on the declaration, do you? Lots of power comes from static analysis.

zahlman

What actually is this circle-lang site, and who runs it? The main page seems to just redirect to example.com, and I don't recognize the name of the author.

steveklabnik

Circle is a C++ compiler by Sean Baxter, with various extensions. One of those is an implementation of the Safe C++ proposal I’ve linked downthread.

loeg

Section 6 seems to propose adding essentially every Rust feature to C++? Am I reading that right? Why would someone use this new proposed C++-with-Rust-annotations in place of just Rust?

lmm

> Why would someone use this new proposed C++-with-Rust-annotations in place of just Rust?

They wouldn't. The point is, if you were serious about making a memory-safe C++, this is what you'd need to do.

ijustlovemath

Because the millions of lines of existing C++ aren't going anywhere. You need transition capability if you're ever gonna see widespread adoption. See: C++'s own adoption story; transpiling into C to get wider adoption into existing codebases.

leni536

Features C++ has that Rust doesn't:

* template specialisations

* function overloading

* I believe const generics is still not there in Rust, or it's necessarily more restricted.

In general metaprogramming facilities are more expressive in C++, with different other tradeoffs to Rust. But the tradeoffs don't include memory safety.
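
As a rough illustration of the first two bullets (all names below are invented for the example):

```cpp
#include <cassert>
#include <string>

// Primary template plus a full specialisation for bool.
template <typename T>
struct TypeName {
    static std::string get() { return "unknown"; }
};

template <>
struct TypeName<bool> {
    static std::string get() { return "bool"; }
};

// Function overloading: same name, resolved on the argument type.
std::string describe(int) { return "int"; }
std::string describe(double) { return "double"; }
std::string describe(const std::string&) { return "string"; }
```

Here `TypeName<bool>::get()` picks the specialisation while `TypeName<int>::get()` falls back to the primary template, and `describe(1)`, `describe(1.5)` and `describe(std::string("x"))` each select a different overload.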

steveklabnik

Here’s the actual proposal: https://safecpp.org/draft.html

It explains its own motivation.

SubjectToChange

>Why would someone use this new proposed C++-with-Rust-annotations in place of just Rust?

Simply making C++ compilers compatible with one another is a constant struggle. Making Rust work well with existing C++ code is even more difficult. Thus, it is far easier to make something like Clang understand and compile C++-specific annotations alongside legacy C++ code than making rustc understand C++ types. Moreover, teams of C++ programmers will have an easier time writing annotated C++ than they would learning an entirely new language. And it's important to recognize how deeply entrenched C++ is in many areas, especially when you consider things like OpenMP, OpenACC, CUDA, HIP/ROCm, Kokkos, etc etc etc.

o11c

"No mutable aliases" is a mistake; it prevents many useful programs.

Now that virtual address space is cheap, it's possible to recompile C (or presumably C++) with a fully-safe runtime (requiring annotation only around nasty things like `union sigval`), but this is an ABI break and has nontrivial overhead (note that AddressSanitizer has ~2x overhead and only catches some optimistic cases) unless you mandate additional annotation.

mananaysiempre

> AddressSanitizer has ~2x overhead

I’ve got some programs where the ASan overhead is 10× or more. Admittedly, they are somewhat peculiar—one’s an interpreter for a low-level bytecode, the other’s largely a do-nothing benchmark for measuring the overhead of a heartbeat scheduler. The point is, the overhead can vary a lot depending on e.g. how many mallocs your code does.

This does not contradict your point in any way, to be clear. I was just very surprised when I first hit that behaviour expecting ASan’s usual overhead of “not bad, definitely not Valgrind”, so I wanted to share it.

leni536

Don't deploy ASAN builds to production, it's a debugging tool. It might very well introduce attack vectors on its own, it's not designed to be a hardening feature.

stouset

> it prevents many useful programs

Every set of constraints prevents many useful programs. If those useful programs can still be specified in slightly different ways but it prevents many more broken programs, those constraints may be a net improvement on the status quo.

tialaramex

> "No mutable aliases" is a mistake; it prevents many useful programs.

Does it? You didn't list any. It certainly prevents writing a tremendous number of programs which are nonsense.

o11c

The entirety of Rust's `std::cell` is a confession that yes, we really do need mutable aliases. We just pretend the aliases aren't mutable except for a nanosecond around the actual mutation.

steveklabnik

It’s more than that, they disable the aliasing based optimizations, and provide APIs that restrict how and when you can mutate in order to make sure data races don’t happen.

Controlled mutable aliasing is fine. Uncontrolled is dangerous.

alilleybrinker

Alternatively, Rust's cell types are proof that you usually don't need mutable aliasing, and you can have it at hand when you need it while reaping the benefits of stronger static guarantees without it most of the time.

JoshTriplett

Cells still don't allow simultaneous mutable aliases; they just allow the partitioning of regions of mutable access to occur at runtime rather than compile time.
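
To illustrate the runtime-partitioning idea, here is a rough single-threaded C++ analogue. It is an assumption-laden sketch with invented names (`CheckedCell`, `MutGuard`), not how Rust's cell types are implemented:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical, single-threaded analogue of a runtime-checked cell.
// Exclusive access is tracked with a flag that is checked at runtime.
class CheckedCell {
    int value_;
    bool borrowed_ = false;

public:
    explicit CheckedCell(int value) : value_(value) {}

    // RAII guard: holds the exclusive borrow until it is destroyed.
    class MutGuard {
        CheckedCell& cell_;

    public:
        explicit MutGuard(CheckedCell& cell) : cell_(cell) {
            if (cell_.borrowed_) {
                throw std::logic_error("already mutably borrowed");
            }
            cell_.borrowed_ = true;
        }
        ~MutGuard() { cell_.borrowed_ = false; }
        MutGuard(const MutGuard&) = delete;
        MutGuard& operator=(const MutGuard&) = delete;

        int& get() { return cell_.value_; }
    };
};

// Two overlapping exclusive borrows: the second one is rejected at runtime.
bool overlapping_borrow_is_rejected() {
    CheckedCell cell(1);
    CheckedCell::MutGuard first(cell);
    first.get() = 2;
    try {
        CheckedCell::MutGuard second(cell);
        (void)second;
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}

// Non-overlapping borrows are fine: each guard releases the flag on exit.
int sequential_borrows() {
    CheckedCell cell(1);
    {
        CheckedCell::MutGuard guard(cell);
        guard.get() += 1;
    }
    {
        CheckedCell::MutGuard guard(cell);
        guard.get() += 1;
    }
    CheckedCell::MutGuard guard(cell);
    return guard.get();
}
```

The regions of mutable access never overlap; the only difference from a compile-time check is that a violation shows up as a runtime error instead of a rejected program.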

andrewflnr

OP mentioned std::sort and the rest of std::algorithm as useful functions that use mutable aliasing.

morning-coffee

> "No mutable aliases" is a mistake; it prevents many useful programs.

Yes, it prevents many useful programs.

I think it also prevents many many many more useless broken incorrect programs from wreaking havoc or being used as exploit delivery vehicles.

tsimionescu

How would this fix memory safety issues like std::sort(vec1.begin(), vec2.end()) (where vec1 and vec2 are different vectors, of course)? Or strlen(malloc(100))?

gmueckl

These two examples come from bad library design, not bad language design. The first one was fixed with ranges. The second one would be fixed if C used an explicit string type in its standard library.
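
A minimal sketch of why a range-style interface removes the first problem, assuming a hypothetical wrapper `sort_whole`; C++20's `std::ranges::sort` accepts a whole range in the same spirit:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical wrapper in the spirit of a range-based interface: it takes
// the container itself, so begin and end cannot come from different vectors.
template <typename Container>
void sort_whole(Container& values) {
    std::sort(values.begin(), values.end());
}

// Convenience helper for the example: sorts a copy and returns it.
std::vector<int> sorted_copy(std::vector<int> values) {
    sort_whole(values);
    return values;
}
```

With this shape there is simply no way to spell the `vec1.begin(), vec2.end()` mistake at the call site, although the iterator-pair overloads remain available to anyone who reaches for them.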

o11c

With a safe runtime, a pointer is really something like a `struct { u32 allocation_base, allocation_offset;}`. (it may be worth doing fancy variable-bit-width math to allow many small allocations but only a few large ones; it's also likely worth it to have a dedicated "leaf" section of memory that is intended not to contain any pointers)

An implementation of `sort` would start with: `assert (begin.allocation_base == end.allocation_base)`. Most likely, this would be implicit when `end - begin` or `begin < end` is called (but not `!=`, which is well-defined between unrelated pointers).

If we ignore the uninitialized data (which is not the same kind of UB, and usually not interesting), the `strlen` loop would assert when, after not encountering a NUL, `s.allocation_offset` exceeds `100` (which is known in the allocation metadata).
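
A rough sketch of that scheme, under stated assumptions: `SafeHeap`, `FatPtr`, and the `checked_*` functions are hypothetical names, allocations are modelled as byte vectors, and failures are reported with exceptions rather than `assert` so the behavior is easy to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical fat pointer: which allocation, and where inside it.
struct FatPtr {
    std::uint32_t allocation_base;    // index of the allocation
    std::uint32_t allocation_offset;  // byte offset inside it
};

class SafeHeap {
    std::vector<std::vector<char>> allocations_;

public:
    // Allocates `size` zero-initialised bytes; returns a pointer to the start.
    FatPtr allocate(std::uint32_t size) {
        allocations_.emplace_back(size, '\0');
        return FatPtr{static_cast<std::uint32_t>(allocations_.size() - 1), 0};
    }

    // Bounds-checked access: the allocation metadata knows the size.
    char& at(FatPtr p) {
        std::vector<char>& bytes = allocations_.at(p.allocation_base);
        return bytes.at(p.allocation_offset);  // throws std::out_of_range
    }
};

// `end - begin` is only meaningful inside a single allocation.
std::ptrdiff_t checked_distance(FatPtr begin, FatPtr end) {
    if (begin.allocation_base != end.allocation_base) {
        throw std::logic_error("pointers belong to different allocations");
    }
    return static_cast<std::ptrdiff_t>(end.allocation_offset) -
           static_cast<std::ptrdiff_t>(begin.allocation_offset);
}

// strlen that fails instead of reading past the end of the allocation.
std::size_t checked_strlen(SafeHeap& heap, FatPtr s) {
    std::size_t length = 0;
    while (heap.at(s) != '\0') {
        ++s.allocation_offset;
        ++length;
    }
    return length;
}

// Fills a 4-byte allocation with non-NUL bytes and reports whether the
// overrun in checked_strlen is caught.
bool strlen_overrun_is_caught() {
    SafeHeap heap;
    FatPtr p = heap.allocate(4);
    for (std::uint32_t i = 0; i < 4; ++i) {
        heap.at(FatPtr{p.allocation_base, i}) = 'x';
    }
    try {
        checked_strlen(heap, p);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Mixing pointers from two allocations is rejected before any sorting starts.
bool mixed_allocations_are_rejected() {
    SafeHeap heap;
    FatPtr a = heap.allocate(4);
    FatPtr b = heap.allocate(4);
    try {
        checked_distance(a, b);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

This only models spatial checks. It says nothing about use-after-free or about the cost of widening every pointer, which is where the ABI break and overhead mentioned earlier come from.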

murderfs

Virtual address space is cheap, but changing it is massively expensive. If you have to do a TLB shootdown on every free, you're likely going to have worse performance than just using ASan.

o11c

Dealing with malloc/free is trivial and cheap - just give every allocated object a couple of reference counts.

The hard part is figuring out which words of memory should be treated as pointers, so that you know when to alter the reference counts.

Most C programs don't rely on all the weird guarantees that C mandates (relying on asm, which is also problematic, is probably more common), but for the ones that do it is quite problematic.

steveklabnik

The borrow checker works irrespective of the heap. Memory safety involves all pointers, not just ones that own a heap allocation.

acbits

https://github.com/acbits/reftrack-plugin

I wrote a compiler extension just for this issue since there wasn't any.

akira2501

> just give every allocated object a couple of reference counts.

Works great with a single thread.

prophesi

For those without a dark mode extension:

body {
  background-color: #1f1f1f;
  color: #efefef;
}

.sourceCode {
  background-color: #3f3f3f;
}