Why Safety Profiles Failed

232 comments

·October 24, 2024

SubjectToChange

At this point I'm wondering if the purpose of safety profiles is simply to serve as a distraction. In other words, safety profiles are just something people can point to when the topic of memory safety comes up, that’s it. The objectives of the initiative always seemed hopelessly optimistic, if not absurd. In particular, I don't understand why littering a codebase with auto, const, constexpr, inline, [[nodiscard]], noexcept, etc is wonderful, yet lifetime annotations are somehow an intolerable tyranny.

ameliaquining

I think maybe it's because lifetime annotations can get arbitrarily complicated. If you look at enough Rust code you'll definitely see some function signatures that make your head hurt, even if they're vastly outnumbered by simple ones. A guarantee that the comprehension complexity of that part of your code will always be below some low ceiling is tempting.

estebank

The thing is, if you were to make the same design in C++ the code might look "cleaner" because there is less code/fewer annotations, but the other side of that coin is that the developer also has less information about how things are meant to fit together. You not only lose the compiler having your back, you also don't have useful documentation, even if that documentation would be too complicated to grasp at once. Without that documentation you might be fooled into thinking that you do understand what's going on even if you don't in reality.

yellow_lead

That's a good point. There's many times in a C++ codebase, where I'd see or write a seemingly innocuous function, but it has so many assumptions about lifetimes, threads, etc that it would make your brain hurt. Of course we try to remove those or add a comment, but it's still difficult to deal with.

saghm

What's interesting to me about this is that from what I understand, lifetime annotations are not present in Rust because of a desire to include information for the use of developers, but because without them the compiler would need to brute-force checking all potential combinations of lifetimes to determine whether one of them is valid. The heuristics[0] that the compiler uses to avoid requiring explicit annotations in practice cover most common situations, but outside of those, the compiler only acts as a verifier for a given set of lifetimes the user specifies rather than attempting to find a solution itself. In other words, all of the information the compiler would need to validate the program is already there; it just wouldn't be practical to do it.

[0]: https://doc.rust-lang.org/reference/lifetime-elision.html

pjmlp

Except those kind of annotations already exist, but have proven not to be enough withough language semantic changes, SAL is a requirement in Microsoft's own code since Windows XP SP2.

https://learn.microsoft.com/en-us/cpp/code-quality/understan...

whimsicalism

Rust has nothing on template meta programming and the type signatures you get there, though

jimbob45

I’ve spent a fair amount of time writing C++ but F12’ing any of the std data structures makes me feel like I’ve never seen C++ before in my life.

nickitolas

Not to mention the error messages when you get something slightly wrong

crest

Give the proc macro fans a little more time...

fmbb

Yes. Lifetimes are complicated. Complicated codes make them even harder.

Not annotating is not making anything easier.

HelloNurse

What are the "arbitrarily complicated" cases of lifetime annotations? They cannot grow beyond one lifetime (and up to one compilation error) per variable or parameter or function return value.

ameliaquining

Mostly involving structs. Someone at work once posted the following, as a slightly-modified example of real code that they'd actually written:

  pub struct Step<'a, 'b> {
      pub name: &'a str,
      pub stage: &'b str,
      pub is_last: bool,
  }

  struct Request<'a, 'b, 'c, 'd, 'e> {
      step: &'a Step<'d, 'e>,
      destination: &'c mut [u8],
      size: &'b Cell<Option<usize>>,
  }

To be sure, they were seeking advice on how to simplify it, but I imagine those with a more worse-is-better technical sensibility arguing that a language simply should not allow code like that to ever be written.

I also hear that higher-ranked trait bounds can get scary even within a single function signature, but I haven't had cause to actually work with them.

myworkinisgood

[flagged]

thadt

It's deceptively easy to look at a number of examples and think: "If I can see that aliasing would be a problem in this function, then a computer should be able to see that too."

The article states "A C++ compiler can infer nothing about aliasing from a function declaration." Which is true, but assumes that the compiler only looks at the function declaration. In the examples given, an analyzer could look at the function bodies and propagate the aliasing requirements upward, attaching them to the function declaration in some internal data structure. Then the analyzer ensures that those functions are used correctly at every call site. Start at leaf functions and walk your way back up the program until you're done. If you run into a situation where there is an ambiguity, you throw an error and let the developer know. Do the same for lifetimes. Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?

I like not having to see and think about lifetimes and aliasing problems most of the time, and it would be nice if the compiler (or borrow checker) just kept track of those without requiring me to explicitly annotate them everywhere.

seanbax

From P3465: "why this is a scalable compile-time solution, because it requires only function-local analysis"

From P1179: "This paper ... shows how to efficiently diagnose many common cases of dangling (use-after-free) in C++ code, using only local analysis to report them as deterministic readable errors at compile time."

Local analysis only. It's not looking in function definitions.

Whole program analysis is extremely complicated and costly to compute. It's not comparable to return type deduction or something like that.

SkiFire13

> Start at leaf functions and walk your way back up the program until you're done. If you run into a situation where there is an ambiguity, you throw an error and let the developer know.

This assumes no recursive functions, no virtual functions/function pointers, no external functions etc etc

> Heck, we just got 'auto' type inference working in C++11, shouldn't we be able to do this too?

Aliasing is much trickier than type inference.

For example aliasing can change over time (i.e. some variables may alias at some point but not at a later point, while types are always the same) and you want any analysis to reflect it because you will likely rely on that.

Granularity is also much more important: does a pointer alias with every element of a vector or only one? The former is surely easier to represent, but it may unnecessary propagate and result in errors.

So effectively you have an infinite domain of places that can alias, while type inference is limited to locals, parameters, functions, etc etc. And even then, aliasing is quadratic, because you want to know which pairs of places alias.

I hope you can see how this can quickly get impractical, both due to the complexity of the analysis and the fact that small imprecisions can result in very big false positives.

rswail

It's a tick-the-box-for-compliance item like when Microsoft had a POSIX layer for Windows NT.

pjmlp

Microsoft eventually learned that keeping full POSIX support would have been a better outcome in today's server room if they had done it properly instead.

Likewise, pushing half solutions like profiles that are still pretty much a paper idea, other than what already exists in static analysers, might decrease C++'s relevance in some domains, and eventually those pushing for them might find themselves in the position that adopting Safe C++ (circle's design) would have been a much better decision.

The problem with ISO driven languages, is who's around in the room when voting takes place.

badmintonbaseba

Adopting what static analyzers do is a no-go, as they rely on non-local reasoning, even across translation units for lifetime and aliasing analysis. Their output highly depend on what they can see, and they generally can't see the source code for the whole program. I also doubt that they promise any kind of stability in their output across versions.

This is a not a jab against static analyzers, by all means use them, but I don't think they are a good fit as part of the language.

klodolph

We can dream of what it would be like with full POSIX support on Windows, but it was a pipe dream to begin with. There are some major differences between Windows and POSIX semantics for things like processes and files. The differences are severe enough that Windows and POSIX processes can’t coexist. The big issue with files is that on POSIX, you can conceptually think of a file as an inode, with zero or more paths pointing to it. On Windows, you conceptually think of a file as the path itself, and you can create mandatory locks. There are other differences. Maybe you could address these given enough time, but WSL’s solution is to basically isolate Windows and Linux, which makes a ton of sense.

WalterBright

Since const can be cast away, it's useless for checking.

SubjectToChange

const can be cast away, auto can have some really nasty behavior, constexpr doesn't have to do anything, inline can be ignored, [[nodiscard]] can be discarded, exceptions can be thrown in noexcept functions, etc. Almost everything in C++ can be bypassed in one way or another.

WalterBright

D can cast away const, but not in @safe code. Though we are considering revising this so it can only be done in @system code.

myworkinisgood

[flagged]

CoastalCoder

> You are more correct than you think you are!!!

Your comment will be more interesting if you expand upon it.

ameliaquining

These considerations all seem so self-evident that I can't imagine the architects of Safety Profiles weren't aware of them; they are basically just the statement of the problem. And yet these smart people presumably thought they had some kind of solution to them. Why did they think that? What did this solution look like? I would be very interested to read more context on this.

steveklabnik

As always with different designs from smart people, it’s about priorities.

The profiles proposal focuses on a lack of annotations (I think there’s reasonable criticism that this isn’t achieved by it though…), and believing they can get 80% of the benefit for 20% of the effort (at least conceptually, obviously not those exact numbers). They aren’t shooting for full memory safety.

The Safe C++ proposal asks “how do we achieve 100% memory safety by default?”. And then asks what is needed to achieve that goal.

ameliaquining

What's with the "this model detects all possible errors" quote at the beginning of the post, then?

steveklabnik

That’s a claim about dangling pointers and ownership. Profiles do not solve aliasing or concurrency, as two examples of things that Safe C++ does that are important for memory safety.

tazjin

> Why did they think that? What did this solution look like?

I don't think they did think that. Having listened to a few podcasts with the safety profile advocates I've gotten the impression that their answer to any question about "right, but how would you actually do that?" is "well, we'll see, and in general there's other problems to think about, too!".

CoastalCoder

I wonder if the unstated issue here is:

C++ is so complex that it's hard to think through all the implications of design proposals like this.

So practically speaking, the only way to prove a design change is to implement it and get lots of people to take it for a test drive.

But it's hard to find enough people willing to do that in earnest, so the only real way to test the idea is to make it part of the language standard.

pjmlp

That is how we end up with stuff like GC added in C++11, and removed in C++23, because it was worthless for the only two C++ dialects that actually use a GC, namely Unreal C++ and C++/CLI.

So no one made use of it.

null

[deleted]

myworkinisgood

[flagged]

alilleybrinker

The article makes the particularly good point that you generally can’t effectively add new inferences without constraining optionality in code somehow. Put another way, you can’t draw new conclusions without new available assumptions.

In Sean’s “Safe C++” proposal, he extends C++ to enable new code to embed new assumptions, then subsets that extension to permit drawing new conclusions for safety by eliminating code that would violate the path to those safety conclusions.

steveklabnik

Really glad to see this thorough examination of the weaknesses of profiles. Safe C++ is a really important project, and I hope the committee ends up making the right call here.

SubjectToChange

>...I hope the committee ends up making the right call here.

WG21 hasn't been able to solve the restrict type qualifier, or make a better alternative, in over twenty years. IMO, hoping that WG21 adequately solves Safe C++ is nothing more than wishful thinking, to put it charitably.

OskarS

Yeah, this one is so weird. You've been able to do that forever in C, and virtually all big compilers have this keyword in C++ as well, just named __restrict. Why is it so hard to get into the standard, at least for pointers? I can imagine that there are complex semantics with regards to references that are tricky to get right, but can't we at least have "'restrict" can only be used on raw pointer types, and it means the same thing as it does in C"?

steveklabnik

I am intimately familiar with the dysfunctions of various language committees.

I never said it would be easy, or probable. But I’m also the kind who hopes for the best.

pjmlp

Given how C++0x concepts, C++20 contracts, ABI discussion went down, where key people involved on those processes left to other programming language communities, not sure if the right call will be done in the end.

This is a very political subject, and WG21 doesn't have a core team, rather everything goes through votes.

It suffices to have the wrong count in the room when it is time to vote.

thadt

I have a long standing debate with a friend about whether the future of C++ will be evolution or extinction.

Safe C++ looks excellent - its adoption would go a long way toward validating his steadfast belief that C++ can evolve to keep up with the world.

myworkinisgood

[flagged]

biorach

Wild accusations without any backup... please don't.

sitkack

History has been written. What makes you the future will be different?

Muromec

I'm not familiar with the politics there. What do they get by having their way?

wyager

> Safe C++ is a really important project

What makes you say this? It seems to me like we already have a lower-overhead approach to reach the same goal (a low-level language with substantially improved semantic specificity, memory safety, etc.); namely, we have Rust, which has already improved substantially over the safety properties of C++, and offers a better-designed platform for further safety research.

alilleybrinker

Not everything will be rewritten in Rust. I've broken down the arguments for why this is, and why it's a good thing, elsewhere [1].

Google's recent analysis on their own experiences transitioning toward memory safety provide even more evidence that you don't need to fully transition to get strong safety benefits. They incentivized moving new code to memory safe languages, and continued working to actively assure the existing memory unsafe code they had. In practice, they found that vulnerability density in a stable codebase decays exponentially as you continue to fix bugs. So you can reap the benefits of built-in memory safety for new code while driving down latent memory unsafety in existing code to great effect. [2]

[1]: https://www.alilleybrinker.com/blog/cpp-must-become-safer/

[2]: https://security.googleblog.com/2024/09/eliminating-memory-s...

lmm

Nah. The idea that sustained bugfixing could occur on a project that was not undergoing active development is purely wishful thinking, as is the idea that a project could continue to provide useful functionality without vulnerabilities becoming newly exposed. And the idea of a meaningfully safer C++ is something that has been tried and failed for 20+ years.

Eventually everything will be rewritten in Rust or successors thereof. It's the only approach that works, and the only approach that can work, and as the cost of bugs continues to increase, continuing to use memory-unsafe code will cease to be a viable option.

wyager

> Not everything will be rewritten in Rust.

Yeah, but it's also not going to be rewritten in safe C++.

steveklabnik

I am pro any movement towards memory safety. Sure, I won't stop writing Rust and start moving towards C++ for this. But not everyone is interested in introducing a second toolchain, for example. Also, as this paper mentions, Safe C++ can improve C++ <-> Rust interop, because Safe C++ can express some semantics Rust can understand. Right now, interop works but isn't very nice.

Basically, I want a variety of approaches, not a Rust monoculture.

nicoburns

> But not everyone is interested in introducing a second toolchain, for example.

Not that this invalidates your broader point about Safe C++, but this particular issue could also be solved by Rust shipping clang / a frontend that can also compile C and C++.

tptacek

This is a thread about a C++ language feature; it's probably most productive for us to stipulate for this thread that C++ will continue to exist. Practical lessons C++ can learn moving forward from Rust are a good reason to talk about Rust; "C++ should not be improved for safety because code can be rewritten in Rust" is less useful.

pjmlp

Especially because many of us security minded folks do reach out for C++ as there are domains where it is the only sane option (I don't consider C a sane alternative), so anything that improves C++ safety is very much welcomed.

Improved C++'s safety means that the C++ code underlying several JVM implementations, CLR, V8, GCC and LLVM, CUDA, Unreal, Godot, Unity,... also gets a way to be improved, without a full rewrite, which while possible might not be economically feasible.

wyager

Actually, this subthread is about whether this is a "really important project"

umanwizard

For new projects on mainstream architectures that don't have to depend on legacy C++ baggage, Rust is great (and, I think, practically always the better choice).

But, realistically, C++ will survive for as long as global technological civilization does. There are still people out there maintaining Fortran codebases.

(also, IDK if you already realized this, but it's funny that the person you're replying to is one of the most famous Rust boosters out there, in fact probably the most famous, at least on HN).

steveklabnik

I have realized this. Sean and I have talked about it.

I became a Rust fan because of its innovations in the space. That its innovations may spread elsewhere is a good thing, not a bad thing. If a language comes along that speaks to me more than Rust does, I’ll switch to that. I’m not a partisan, even if it may feel that way from the outside.

SubjectToChange

Things like web browsers will continue to have millions of lines of C++ code regardless of how successful Rust becomes. It would be a huge improvement for everyone if such projects had a tractable path towards memory safety

wyager

As this article discusses, it's not really viable that existing codebases will be able to benefit from safe C++ research without massive rewrites anyway

jmull

> It seems to me like we already have a lower-overhead approach ... Rust

Rewriting all the existing C++ code in Rust is extremely high-cost. Practically speaking, that means it won't happen in many, many cases.

I think we want to find a more efficient way to achieve memory safety in C++.

Not to mention, Rust's safety model isn't that great. It does memory safety, which is good, but it's overly restrictive, disallowing various safe patterns. I suspect there are better safe alternatives out there for most cases, or at least could be. It would make sense to consider the alternatives before anyone rewrites something in Rust.

zozbot234

> It does memory safety, which is good, but it's overly restrictive, disallowing various safe patterns

The "safe" patterns Rust disallows tend to not account for safe modularity - as in, they impose complex, hard-to-verify requirements on outside code if "safety" is to be preserved. This kind of thing is essentially what the "unsafe" feature in Rust is intended to address.

steveklabnik

Folks who want to propose alternatives should do so! Right now, you’ve only got the two: profiles and Safe C++. There are also two that aren’t proposals. It have a semblance of a plan: “graft Hylo semantics instead of Rust semantics” and “scpptool.” Realistically, unless something else concrete and not “could be” is found at the eleventh hour, this is the reality of the possible choices.

akira2501

Cool.

Do you mind if we have more than one approach?

wyager

Yeah, it does not matter to me, but that wasn't what we were talking about

CJefferson

This article is really good, and covers many important issues.

There were many similar issues when it came to the earlier attempts to add concepts to C++ (which would improve template dispatch), although the outcome was more about improving C++ programmer's lives, not safety.

It turned out trying to encapsulate all the things C++ functions, even in the standard library, as a list of concepts, was basically impossible. There are so many little corner-cases in C++ which need representing as a concept, the list of 'concepts' a function needed often ended up being longer than the function itself.

favorited

I know Sean said on Twitter that he probably won't submit this to WG21, but I wish he would... It is a fantastic rebuttal of certain individual's continued hand-waving about how C++ is safe enough as-is.

bfrog

This seems to be a common theme with many c++ developers honestly.

OvbiousError

Most C++ devs don't care that much either way I'd say, it's a vocal minority that does. I really don't understand the nay-sayers though, been a C++ dev for over 15 years, I'd give an arm and a leg for 1. faster language evolution (cries in pattern matching) and 2. a way to enforce safe code. Having to use std::variant when I want to use a sum type in 2024 is just so backwards it's hard to express. Still love the language though :p

pjmlp

C++ became my favourite language after Object Pascal, as it provided similar safety levels, added with the portability.

I never been that big into C, although I do know it relatively well, as much as anyone can claim to do so, because it is a key language in anything UNIX/POSIX and Windows anyway.

One of the appealing things back then were the C++ frameworks that were provided alongside C++ compilers, pre-ISO C++98, all of them with more security consideration than what ended up landing on the standard library, e.g. bounds checking by default on collection types.

Nowadays I rather spend my time in other languages, and reach out to C++ on a per-need basis, as other language communities take the security discussion more seriously.

However, likewise I still love the language itself, and is one of those that I usually reach for in side projects, where I can freely turn to 100% all the safety features available to me, without the usual drama from some C++ circles.

saagarjha

Some of them are unfortunately on language committees.

myworkinisgood

[flagged]

rurban

Additionally I still cannot understand why they didn't make iterators safe from the very beginning. In the alias examples some iterators must alias and some not. With safe iterators the checks would be trivial, as just the base pointers need to be compared. This could be done even at compile-time, when all iterators bases are known at compile-time.

Their argument then was that iterators are just simple pointers, not a struct of two values base + cur. You don't want to pass two values in two registers, or even on the stack. Ok, but then call them iterators, call them mere pointers. With safe iterators, you could even add the end or size, and don't need to pass begin() and end() to a function to iterate over a container or range. Same for ranges.

A iterator should have just have been a range (with a base), so all checks could be done safely, the API would look sane, and the calls could be optimized for some values to be known at compile-time. Now we have the unsafe iterators, with the aliasing mess, plus ranges, which are still unsafe and ill-designed. Thanksfully I'm not in the library working group, because I would have had heart attacks long time ago over their incompetence.

My CTL (the STL in C) uses safe iterators, and is still comparable in performance and size to C++ containers. Wrong aliasing and API usage is detected, in many cases also at compile-time.

sirwhinesalot

We're talking about a commitee that still releases "safety improving" constructs like std::span without any bounds checking. Don't think about it too much.

munificent

The C++ committee and standard library folks are in a hard spot.

They have two goals:

1. Make primitives in the language as safe as they can.

2. Be as fast as corresponding completely unsafe C code.

These goals are obviously in opposition. Sometimes, if you're lucky, you can improve safety completely at compile time and after the safety is proven, the compiler eliminates everything with no overhead. But often you can't. And when you can't, C++ folks tend to prioritize 2 over 1.

You could definitely argue that that's the wrong choice. At the same time, that choice is arguably the soul of C++. Making a different choice there would fundamentally change the identity of the language.

But I suspect that the larger issue here is cultural. Every organization has some foundational experiences that help define the group's identity and culture. For C++, the fact that the language was able to succeed at all instead of withering away like so many other C competitors did is because it ruthlessly prioritized performance and C compatibility over all other factors.

Back in the early days of C++, C programmers wouldn't sacrifice an ounce of performance to get onto a "better" language. Their identity as close-to-the-metal programmers was based in part on being able to squeeze more out of a CPU than anyone else could. And, certainly, at the time, that really was valuable when computers were three orders of magnitude slower than they are today.

That culture still pervades C++ where everyone is afraid of a performance death of a thousand cuts.

So the language has sort of wedged itself into an untenable space where it refuses to be any slower than completely guardrail-less machine code, but where it's also trying to be safer.

I suspect that long-term, it's an evolutionary dead end. Given the state of hardware (fast) and computer security failures (catastrophically harmful), it's worth paying some amount of runtime cost for safer languages. If you need to pay an extra buck or two for a slightly faster chip, but you don't leak national security secrets and go to jail, or leak personal health information and get sued for millions... buy the damn chip.

pjmlp

Ironically, in the early days of C, it was a good as Modula-2 or Pascal dialects "squeeze more out of a CPU than anyone else could".

All that squeezing was made possible by tons of inline Assembly extensions, that Modula-2 and Pascal dialects also had.

It only took off squeezing, when C compiler writers decided to turn to 11 the way UB gets exploited in the optimizer, with the consequences that we have to suffer 20 years later.

tialaramex

The primary goal of WG21 is and always has been compatibility, and particularly compatibility with existing C++ (though compatibility with C is important too).

That's why C++ 11 move is not very good. The safe "destructive" move you see in Rust wasn't some novelty that had never been imagined previously, it isn't slower, or more complicated, it's exactly what programmers wanted at the time, however C++ could not deliver it compatibly so they got the C++ 11 move (which is more expensive and leaves a trail of empty husk objects behind) instead.

You're correct that the big issue is culture. Rust's safety culture is why Rust is safe, Rust's safety technology merely† enables that culture to thrive and produce software with excellent performance. The "Safe C++" proposal would grant C++ the same technology but cannot gift it the same culture.

However, I think in many and perhaps even most cases you're wrong to think C++ is preferring better performance over safety, instead, the committee has learned to associate unsafe outcomes with performance and has falsely concluded that unsafe outcomes somehow enable or engender performance when that's often not so. The ISO documents do not specify a faster language, they specify a less safe language and they just hope that's faster.

In practice this has a perverse effect. Knowing the language is so unsafe, programmers write paranoid software in an attempt to handle the many risks haunting them. So you will find some Rust code which has six run-time checks to deliver safety - from safe library code, and then the comparable C++ code has fourteen run-time checks written by the coder, but they missed two, so it's still unsafe but it's also slower.

I read a piece of Rust documentation for an unsafe method defined on the integers the other day which stuck with me for these conversations. The documentation points out that instead of laboriously checking if you're in a case where the unsafe code would be correct but faster, and if so calling the unsafe function, you can just call the safe function - which already does that for you.

† It's very impressive technology, but I say "merely" here only to emphasise that the technology is worth nothing without the culture. The technology has no problem with me labelling unsafe things (functions, traits, attributes now) as safe, it's just a label, the choice to ensure they're labelled unsafe is cultural.

zozbot234

Rust is unique in being both safe and nearly as fast as idiomatic C/C++. This is a key differentiator between Rust and languages that rely on obligate GC or obligate reference counting for safety, including Golang, Swift, Ocaml etc.

adgjlsfhk1

it's not clear to me that GC is actually slower than manual memory management (as long as you allow for immutable objects. allocation/free is slow so most high performance programs don't have useless critical path allocations anyway.

null

[deleted]

jcelerier

> My CTL (the STL in C) uses safe iterators, and is still comparable in performance and size to C++ containers

I wonder, what's "comparable" there ? Because for instance MSVC, libstdc++ and libc++ supports some kind of safe iterators but they are definitely not useable for production due to the heavy performance cost incurred.

randomNumber7

This is a great article and shows why its so hard to program in C++. When you do not deeply understand the reasons why those examples behave as they do, your code is potentially dangerous.

WalterBright

> What are sort’s preconditions? 1. The first and last iterators must point at elements from the same container. 2. first must not indicate an element that appears after last. 3. first and last may not be dangling iterators.

This is a fundamental problem in C++ where a range is specified by the starting point and the ending point. This is because iterators in C++ are abstractions of a pointer.

D took a different approach. A range in D is an abstraction of an array. An array is specified by its starting point and its length. This inherently solves points one and two (not sure about three).

Sort then has a prototype of:

    Range sort(Range);

loeg

Section 6 seems to propose adding essentially every Rust feature to C++? Am I reading that right? Why would someone use this new proposed C++-with-Rust-annotations in place of just Rust?

ijustlovemath

Because the millions of lines of existing C++ aren't going anywhere. You need transition capability if you're ever gonna see widespread adoption. See: C++'s own adoption story; transpiling into C to get wider adoption into existing codebases.

leni536

Features C++ has that Rust doesn't:

* template specialisations

* function overloading

* I believe const generics is still not there in Rust, or its necessarily more restricted.

In general metaprogramming facilities are more expressive in C++, with different other tradeoffs to Rust. But the tradeoffs don't include memory safety.

ynik

The main big philosophical difference regarding templates is that Rust wants to guarantee that generic instantiation always succeeds; whereas C++ is happy with instantiation-time compiler errors. The C++ approach does make life a fair bit easier and can maybe even avoid some of the lifetime annotation burden in some cases: in Rust, a generic function may need a `where T: 'static` constraint; in C++ with lifetimes it could be fine without any annotations as long as it's never instantiated with structs containing pointers/references.

Template specializations are not in Rust because they have some surprisingly tricky interactions with lifetimes. It's not clear lifetimes can be added to C++ without having the same issue causing safety holes with templates. At least I think this might be an issue if you want to compile a function instance like `void foo<std::string_view>()` only once, instead of once for each different string data lifetime.

tialaramex

You definitely can't have all of "Non-type template parameters" (the C++ equivalent of const generics) in Rust because some of it is unsound. You can certainly have more than you get today, it's much less frantically demanded but I should like to be able to have an enum of Hats and then make Goose<Hat::TopHat> Goose<Hats::Beret> Goose<Hats::Fedora> and so on, which is sound but cannot exist today.

For function overloading this serves two purposes in C++ and I think Rust chooses a better option for both:

First, when there are similar features with different parameters, overloading lets you pretend the feature set was smaller but making them a single function. So e.g. C++ offers a single sort function but Rust distinguishes sort, sort_by and sort_by_key

Obviously all three have the same underlying implementation, but I feel the distinct names helps us understand when reading code what's important. If they're all named sort you may not notice that one of these calls is actually quite different.

Secondly, this provides a type of polymorphism, "Ad hoc polymorphism". For example if we ask whether name.contains('A') in C++ the contains function is overloaded to accept both char (a single byte 'A' the number 65) and several ways to represent strings in C++

In Rust name.contains('A') still works, but for a different reason 'A' is still a char, this time that's a Unicode Scalar Value, but the reason it works here is that char implements the Pattern trait, which is a trait for things which can be matched against part of a string. So name.contains(char::is_uppercase) works, name.contains(|ch| { /* arbitrary predicate for the character */ }) works, name.contains("Alex") works, and a third party crate could have it work for regular expressions or anything else.

I believe this more extensible alternative is strictly superior while also granting improved semantic value.

aw1621107

> You definitely can't have all of "Non-type template parameters" (the C++ equivalent of const generics) in Rust because some of it is unsound.

Just to clarify, does this mean NTTP in C++ is unsound as-is, or that trying to port C++ NTTP as-is to Rust would result in something unsound?

steveklabnik

Here’s the actual proposal: https://safecpp.org/draft.html

It explains its own motivation.

lmm

> Why would someone use this new proposed C++-with-Rust-annotations in place of just Rust?

They wouldn't. The point is, if you were serious about making a memory-safe C++, this is what you'd need to do.

SubjectToChange

>Why would someone use this new proposed C++-with-Rust-annotations in place of just Rust?

Simply making C++ compilers compatible with one another is a constant struggle. Making Rust work well with existing C++ code is even more difficult. Thus, it is far easier to make something like Clang understand and compile C++-specific annotations alongside legacy C++ code than making rustc understand C++ types. Moreover, teams of C++ programmers will have an easier time writing annotated C++ than they would learning an entirely new language. And it's important to recognize how deeply entrenched C++ is in many areas, especially when you consider things like OpenMP, OpenACC, CUDA, HIP/ROCm, Kokkos, etc etc etc.

j16sdiz

> A C++ compiler can infer nothing about aliasing from a function declaration.

True. but you don't solely rely on the declaration, do you? lots of power comes from static analysis.

steveklabnik

It’s important to only rely on the declaration, for a few reasons. One of the simpler ones is that the body may be in another translation unit.

seanbax

You do solely rely on the declaration.

From P3465: "why this is a scalable compile-time solution, because it requires only function-local analysis"

Profiles uses local analysis, as does borrow checking. Whole program analysis is something you don't want to mess with.

MathMonkeyMan

Why is `f3` safe?

My thought (which is apparently wrong) is that the `const int& x` refers to memory that might be freed during `vector::push_back`. Then, when we go to construct the new element that is a copy of `x`, it might be invalid to read `x`. No?

Is this related to how a reference-to-const on the stack can extend the lifetime of a temporary to which it's bound? I didn't think that function parameters had this property (i.e. an implicit copy).

seanbax

If the vector has to be resized in push_back, the implementation copy-constructs a new object from x. It resizes the buffer. Then it move-constructs the copy into the buffer. The cost is only a move construct per resize... So it's on the order of log2(capacity). Very efficient. But don't get used to it, because there are plenty of other C++ libraries that will break under aliasing.

embe42

I think the implementation in C++ does not do realloc on push_back, but allocates new memory, creates the new element and only then deallocates it.

I believe the reason f3 is safe is that the standard says that the iterators/references are invalidated after push_back – so push_back needs to be carefully written to accept aliasing pointers.

I am pretty sure if I were writing my own push_back it would do something like "reserve(size() + 1), copy element into the new place", and it would have different aliasing requirements...

(For me this is a good example of how subtle such things are)

tialaramex

I know this wasn't your main point here but with the C++ std::vector's reserve you must not do this because it will destroy the amortized constant time growth. Rust's Vec can Vec::reserve the extra space because it has both APIs here.

The difference is what these APIs do when we're making small incremental growth choices, Vec::reserve_exact and the std::vector reserve both grow to the exact size asked for, so if we do this for each entry inserted eight times we grow 1, 2, 3, 4, 5, 6, 7, 8 -- we're always paying to grow.

However when Vec::reserve needs to grow it either doubles, or grows to the exact size if bigger than a double. So for the same pattern we grow 1, 2, 4, no growth, 8, no growth, no growth, no growth.

There's no way to fix C++ std::vector without providing a new API, it's just a design goof in this otherwise very normal growable array type. You can somewhat hack around it, but in practice people will just advise you to never bother using reserve except once up front in C++, whereas you will get a performance win in Rust by using reserve to reserve space.

MathMonkeyMan

> allocates new memory, creates the new element and only then deallocates it.

Ah, I'll have to check the standard. Thanks.

null

[deleted]