Taming the UB Monsters in C++
122 comments
· March 31, 2025 · roca
Voultapher
At this point I'm wondering if C++ leadership is either willfully ignorant or genuinely in denial.
I know several people on various C++ committees, and by and large their opinion is: we evolve the language and library to give existing projects incremental improvements without asking them to rewrite anything, but if you are starting a new project, C++ is often a subpar option. From that perspective I get why they'd be hesitant about efforts like Circle. Circle and co. ask developers to rewrite their code in something that looks very different from normal C++ (whatever normal C++ even is, given the multitude of dialects out there), can't seamlessly interop with existing code, and needs a new, incompatible standard library that as of now doesn't even exist. At which point, honestly, just rewrite it in Rust instead of going through the painful exercise of using something that's 10+ years behind where Rust is today in terms of DX, tooling and ecosystem.
But all that doesn't explain why at the very top, even mentioning Rust as an alternative seems taboo, idk.
pjmlp
There is also a strange dynamic going on, and it has worked against C++.
In the early ISO days, the people sent to ISO were employees of compiler vendors, and existing practice was the key factor in adding things to the standard.
Eventually committee dynamics took over, and nowadays among the contributors to WG21, and to a lesser extent WG14 (which still stays closer to the existing-practice spirit), you have hundreds of people wanting to leave their historical mark on the ISO standard without having written a single line of compiler code validating their proposal, which they then fight through the whole voting process, leaving the compiler vendors to sort out the mess of how to implement their beloved feature.
Those of us who really like C++ are also kind of lost as to how things turned out this way.
jcranmer
WG21 is well down the path of actively being hostile to the implementers of C++. There was a recent proposal where all 4 implementers [1] stood up and said "no", and the committee still voted it in, ignoring their feedback.
[1] C++ has only 4 implementations these days, Clang, EDG, GCC, and MSVC; everything keeping up with the standard is a fork of one of these projects.
gpderetta
Proof of implementation should be a requirement for every proposal (allegedly it is, but in practice...).
Which would limit most "outsider" proposals to library features, which would be a good thing, I guess.
stingraycharles
Well, you kind of have your answer right there: it’s a language designed by committee, not by “the industry”.
This has been my biggest problem, and I say this as someone who has been on and off developing C++ for over 2 decades.
At the same time, it’s a safe bet to say that C++ will still be around in another 2 decades.
zombot
> even mentioning Rust as an alternative seems taboo, idk.
Rust is even framed as an "attack on C++" by Stroustrup himself [1]. No wonder it's taboo.
[1] https://www.theregister.com/2025/03/02/c_creator_calls_for_a...
unboundedjiure
Seems like a bit of a sensationalist deduction from what looks like a pretty levelheaded response. It's not a call to war, but a call to improve the C++ standard.
pron
C++ developers worry about all UB, but we don't worry about every kind to the same extent. MITRE's CWE Top 25 [1] lists out-of-bounds write as the second most dangerous weakness, out-of-bounds read as the sixth, and use after free as number eight (null-pointer dereference and integer overflow are at nos. 21 and 23 respectively). All UB is worrisome, but not equally so. Taking care of out-of-bounds is easier than UAF and at the same time more important. Priorities matter.
[1]: https://cwe.mitre.org/top25/archive/2024/2024_cwe_top25.html
Wumpnot
Interesting. For all the whinging about C and C++, this shows that most of these apply to all languages, and the ones that relate to C or C++ are actually pretty easy to prevent in C++ (less so in C) by enabling hardening modes and using smart pointers.
roca
I don't know where that ranking comes from. It also matters that attackers adapt: UAF exploitation is harder than out of bounds, but it is well understood, and attackers can switch to it, so shutting off one source of UB isn't as effective in practice as you might expect.
pron
> I don't know where that ranking comes from.
It comes from MITRE (https://en.wikipedia.org/wiki/Mitre_Corporation), and the methodology is explained on the website (roughly, the score is relative prevalence times relative average vulnerability severity).
> and attackers can switch to it, so shutting off one source of UB isn't as effective in practice as you might expect.
If that's how things work, you could say the same about all the other weaknesses that have nothing to do with UB.
pjmlp
While it is great that this is happening, it seems a bit like arriving too late to the party.
There is also the whole issue that the standard working groups and the folks who actually work on C and C++ compilers nowadays form a Venn diagram with a very thin intersection.
So it remains to be seen how much of this will actually land in compilers, and in what form.
Nevertheless, there are many tools written in C++ that will most likely never be rewritten into something else, so any improvement is welcome.
Kelteseth
> too late to the party
And even if this were all available today, you would need to wait 10 years for all third-party libraries to catch up. Just look at the current rate of C++ version adoption. Many projects only migrated to C++17 this year, and that version is 8 years old by now...
Edit: I just used PCL, which got updated from C++14 to 17, or look at https://vfxplatform.com/ where everything is C++17, and the list goes on...
dgellow
According to Herb’s article most of this is already available in compilers
steveklabnik
Depending on how you look at it, that raises the question of "if this is so effective, and it's already available, why hasn't the situation improved?"
pjmlp
As many of us point out, even those of us who actually appreciate C++ despite its warts: safety culture, or rather the lack thereof.
I think it was much better back when it was C++ vs C in the C++ARM days, with great compiler-provided frameworks; eventually, past C++98, there seems to have been an inversion, as C++ gradually took over domains where C ruled.
It's the same mindset that reaches for unsafe language constructs, regardless of the programming language, without any kind of profiler data, because of course it is faster and every μs counts.
Wumpnot
1. who says it hasn't?
2. most of the vulnerable code is C, which is obviously much harder to harden, and the Rust Evangelism Strike Force loves to pretend that C++ is the same as C, so no matter the improvements to C++, they will just point at C.
3. I think many simply didn't know about these hardening modes. MSVC has had this for 10-15 years, but I still encounter people who don't know about it... somehow.
AlotOfReading
In mainstream compilers, yes. Most of these options aren't available in the compilers used for, e.g., safety-critical code.
pjmlp
Partially.
uecker
In C, we made clear in C23 that there is no true time-travel UB, and we have eliminated about 30% of the UB in the core language for C2Y as part of an ongoing effort that will eliminate most of the simple cases of UB. We also have plans to add an opt-in memory safety mode providing full spatial and temporal memory safety.
Having said this, I think the sudden and almost exclusive focus on memory safety is weird. As a long-term Linux user, this is not my main problem. This is what people building app stores and non-free content distribution systems need, and they are now re-engineering the world according to their needs. There are a lot of things compromising my online safety and freedom, and memory safety issues are certainly not very high on that list.
Finally, what Rust achieves, and what a memory-safe mode in C will hopefully also achieve in the future, is also just an incremental improvement. As long as there is unsafe code, and in practice there will be a lot of unsafe code, there is no perfect memory safety.
ultimaweapon
Most C/C++ users don't understand how Rust achieves memory safety because they don't know Rust well enough. They always underestimate Rust's memory safety. The truth is that Rust can give you nearly 100% memory safety. The point of unsafe code in Rust is to isolate unsafe operations and provide a safe interface to them. As long as you write that unsafe code correctly, the rest of your safe code will never have memory safety problems.
pjmlp
They also conflate the unsafe keyword with Rust, when its use in systems programming languages predates C by a decade.
Here is the most recent version of the NEWP manual:
https://public.support.unisys.com/framework/publicterms.aspx...
Which started as ESPOL in the early 1960s:
https://en.wikipedia.org/wiki/Executive_Systems_Problem_Orie...
Binaries with unsafe code blocks are tainted, and must be whitelisted by admins to be allowed to execute in the first place.
This was then followed by several languages using unsafe code blocks or pseudo-packages like SYSTEM, unsafe, unchecked, ..., until finally Rust came to be.
But since most C and C++ users aren't language nerds, not even reading their own ISO specification, they are unaware of the whole safety history since JOVIAL, and naturally assume the whole idea of unsafe code blocks is all about Rust.
uecker
I understand this perfectly. The point is that 1) memory safety is a small part of the overall picture, and 2) in practice people will not build perfectly safe abstractions that are then used by 100% memory-safe code; they will create a mess.
Georgelemental
> In practice people will not build perfectly safe abstractions that are then used by 100% memory-safe code
Yes, in practice they quite commonly will. `unsafe` is rare, so it’s feasible to spend lots of extra effort validating it.
bluGill
I've seen Rust code where everything was in unsafe because they thought it was needed (for maybe 1% of it, it probably was).
unboundedjiure
I've also seen Java code that recursively fetches one row from the database at a time, a million times per request, with 5 ms latency per row. It's possible to willingly abuse almost any system that a reasonable person would deem perfectly performant or safe for the purpose under default conditions.
The question should less be about whether it's possible to try to abuse the system and more what it looks like in a very reasonable everyday scenario.
galangalalgol
Can you clarify why memory safety isn't your main problem? Do you just mean that UB isn't your biggest problem, or that memory safety isn't your biggest source of UB? The latter sounds unlikely, and the former is interesting as all the code I give away for free gets used by people who very much care if it has UB.
uecker
When we talk to companies that are in the business of helping other companies that were hacked, the stories we hear are never that they were hacked via some 0-day in the Linux kernel or in server software caused by some memory-safety issue. Instead, they were hacked because someone did not install updates for some component on the web server, password authentication was used, the same passwords were reused on different servers, etc., or because of some bug in some overly complex Microsoft infrastructure component.
adrianN
> because some did not install updates of some component on the webserver
But how many of those updates fix memory issues?
lelanthran
> Instead, they are hacked because some did not install updates of some component on the webserver, password authentication was used, the same passwords are used on different servers, etc. or some bug in some overly complex Microsoft infrastructure part.
Last I checked (a few months ago) 8 out of 10 breaches were due to human error.
As far as reducing breaches go, you'll get more bang for your buck by ensuring employees are up to date on their routine security awareness training.
Your employees are much much easier to hack than your computers. "Choice of language" is a blip in the stats.
bluGill
Most security breaches find vulnerabilities in humans, not their computers. Yes, if you find a software (or hardware) vulnerability that you can exploit, you can get many, many computers at once, but those are much harder to find. However, the vast majority are not from that source.
There is a lot we can do in UX to make human vulnerabilities less common, but no language change will help.
unboundedjiure
Are you sure about that? In the course of auditing a lot of codebases in my lifetime, I've found loads of ways to bypass authentication, spoof identity, or cause denial of service in every one of them. These are very big, widely used applications with large user bases.
While unauthorized people waltzing into company premises isn't unheard of, it's been way rarer than the serious bugs or security flaws I find. Traditional phone and email scams happen more often, but their impact has been much less severe thanks to very limited user privileges.
Sharlin
I’m not sure how you can even try to juxtapose technical issues like memory safety with sociopolitical problems like corporate interests being in conflict with the interests of the common people. The former can be alleviated with technical solutions. There’s nothing a programming language can do about the latter.
Also, I question your claim that memory unsafety is not of great importance to regular computer users. Perhaps not if your computer is airgapped from the internet and never gets any unvetted software installed. Otherwise, have you missed the primary cause of the majority of CVEs issued in the past decades? Do you not think that the main technical problem behind countless security vulnerabilities, ones that have very concretely affected tens and hundreds of millions of people, deserves the attention that it’s finally starting to get?
Google has reported that the mere act of stopping writing new code in memory-unsafe languages has made the fraction of mem safety vulnerabilities drop from >80% to ~20% in a few years. This is because bugs have a half-life, and once you stop introducing new bugs, the total count starts going down asymptotically as existing bugs get fixed.
Finally, since you inevitably mentioned Rust, memory safety is indeed a necessary but not sufficient condition in software reliability. Luckily, Rust also happens to greatly decrease the odds of logic bugs getting in, thanks to its modern (i.e. features first introduced in ML in the 70s) type system that actually tries to help you get things right.
C is never going to have those parts, the “if it compiles, it is correct by construction” assurance. C++ has janky, half-assed, non-orthogonal, poorly-composing, inconsistently designed versions of a lot of that stuff, but it also has all of the cruft, and that cruft is still what is taught to people before the less-bad parts. And because C++ is larger than most people’s brain capacity, most people can’t even get to the less-bad parts, never mind keeping up with new standard versions.
uecker
Technology and society are never separate things. The question of why something is seen as important or not, and what gets funded, very much depends on societal questions. I have been using Linux for almost 30 years. I have never been hacked, nor do I personally know anybody who was hacked, because of a memory safety issue in any open-source component. I know such people and companies exist; I just know many more who are affected by other issues. I know many people affected by bugs in Microsoft software, including myself. I am also affected by websites spying on me, by email not being encrypted by default, etc. A lot of things could be done to make my safety and security better. That you cite Google actually demonstrates the problem: memory safety is much more their concern than mine.
And C definitely will have memory safety. Stay tuned. (And I also like to have memory safety.) I do not care about C++ nor do I care about Rust. Both languages are far too complex for my taste.
bluGill
> Otherwise, have you missed the primary cause of the majority of CVEs issued in the past decades
Only because CVEs are never issued when humans are compromised. While that is probably the correct action on their part it means your argument is flawed as you don't account for human vulnerabilities which are much more common. Yes memory safety is a big problem and we should do something - but as an industry we need to not ignore the largest problem. There is a lot we can do in UX to prevent most security vulnerabilities, and putting too much emphasis on memory can take away from potentially more productive paths.
IshKebab
> I think the sudden and almost exclusive focus on memory safety is weird.
They're clearly panicking about people switching to Rust. I don't think it's surprising. Too little too late though; you can't just ignore people's concerns for decades and then as soon as a viable competitor comes along say "oh wait, actually we will listen to you, come baaack!".
> There are lot of things compromising my online safety and freedom, and certainly memory safety issues are not very high on this list.
Out of the things that programming languages can solve, memory safety should be very high on your list. This has been proven repeatedly.
uecker
I don't see people panicking in my vicinity. I see some parts of the industry pouring money into Rust and pushing for memory safety. I agree that memory safety is nice, but some of the most reliable and safe software I use on a daily basis is written in C. I am personally much more scared of supply chain issues introduced by Rust, or other issues introduced by increased complexity (which I still think is the main root cause of security issues).
galangalalgol
If you only need what is in the C standard library, the number of Rust crates you use will be tiny, and all of them from authors within the foundation. The hash gets stored in your repo, so any rebuild where a dependency repo got compromised and tried to modify an existing version will fail. If you are using a C library that isn't part of the standard library, then you will probably pull in 10x to 20x the number of dependencies in Rust, but not substantially more authors. The risk is real, but if you treat it like C and minimize your dependencies, it isn't any riskier, probably less. If you get tempted and grab everything and the kitchen sink, you can still be reasonably safe by using the oldest versions of crates you can compile with that don't have any CVEs. That is made easier with tools like cargo-deny and cargo-audit.
All that said, I would love a language that had the same guarantees and performance without the complexity, but I don't see how that could work. There is definitely extra stuff in Rust, but the core capabilities come from the type system. Getting the same safety any other way would probably require a purely functional language, which has performance costs in every implementation I am aware of, along with requiring a runtime. If you can afford that, then we don't need a new C; we have those languages.
agwa
> It’s working: The price of zero-day exploits has already increased.
The only thing they've shipped is no UB in constexpr code - i.e. code that wouldn't have been reachable by attackers in the first place. How could that possibly be the reason for the price of zero-day exploits increasing?
mafuy
I think I'm misunderstanding something. The post sounds like UB has already been mostly eliminated from recent versions of C++. But to my knowledge even something as simple as `INT_MAX + 1` is still UB. Is that false?
nmeofthestate
I think you are misunderstanding the post - it specifically says that there's "a metric ton of work" to be done to address UB, not that it's mostly a solved problem.
saagarjha
No, it’s true. None of these efforts significantly change the prevalence of UB in C++.
bluGill
Stop using `INT_MAX + 1` as an example! It is the worst possible example you can give (though easy to understand). Such code is essentially never a memory safety issue and is not what most people worry about with UB.
In the vast majority of cases it doesn't matter what `INT_MAX + 1` does; your code is wrong either way. Sure, there are a few encryption cases where it is fine, but in the vast majority of cases your code has a bug no matter what the result is. If the variable netWorth is at INT_MAX, there is no result of adding 1 that is correct. If the variable employeeId is at INT_MAX, any result of adding 1 is going to collide with an existing employee.
Meanwhile, if you define the behavior of INT_MAX + 1, you force the compiler to add overflow checks to addition operations even though most of the time you won't overflow, and thus you have needlessly slowed down the code.
UB causes real problems in the real world, but INT_MAX+1 is not one of those places where it causes problems.
Maxatar
This is a very worrisome perspective about undefined behavior. It suggests that the issue with undefined behavior is that there is a bug in your code and it's the bug that is the problem. But that's not (entirely) the case: the issue with undefined behavior is that compilers exploit it in ways that propagate this behavior in an entirely unbounded fashion, which can result in bugs not only at the very moment the undefined behavior happens, but even before it, i.e. the infamous time-travelling undefined behavior [1].
Getting rid of undefined behavior will not get rid of bugs, and no one thinks that memory safe languages somehow are bug free and certainly C++ code will not be bug free even if undefined behavior is replaced with runtime checks. What eliminating undefined behavior does is it places predictable boundaries on both the region of memory and the region of time that the bug can affect.
[1] https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=63...
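A minimal sketch of the phenomenon (hypothetical code in the spirit of the linked post; whether a given compiler actually performs the deletion depends on the compiler and flags):

```cpp
#include <cstdio>

int process(int *p) {
    if (p == nullptr) {
        std::puts("warning: null input");  // a side effect that occurs *before* the UB
    }
    // Unconditional dereference: UB whenever p is null. Since every execution that
    // takes the branch above also reaches this UB, the compiler may treat that path
    // as impossible and delete the branch, warning and all. The later UB seemingly
    // reaches back in time and erases an earlier, well-defined side effect.
    return *p;
}
```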
bluGill
The article already says we need to get rid of (or at least greatly restrict) time travel optimizations.
Your code has a bug if addition overflows and there is no point in defining how it works.
AlotOfReading
INT_MAX+1 is actually a great example of UB because it demonstrates how UB is a problem even if there's a completely reasonable runtime behavior. One of the big reasons UB is problematic is that it invalidates the semantic guarantees the standard makes about all your other code. That signed integer overflow completely invalidates any sort of formal proofs or error handling you might otherwise have.
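For instance, here is a sketch of a "reasonable" guard that the optimizer is entitled to delete precisely because signed overflow is UB (mainstream compilers can and do remove checks of this shape at higher optimization levels unless something like -fwrapv is used):

```cpp
#include <stdexcept>

int next_id(int id) {
    // Intended overflow guard. Because signed overflow is UB, the compiler may
    // assume id + 1 never wraps, conclude this condition is always false, and
    // drop the check along with the error handling it was supposed to trigger.
    if (id + 1 < id) {
        throw std::overflow_error("id space exhausted");
    }
    return id + 1;
}
```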
virtualritz
There is no end to what a(n old) white man thinks he can do. ;)
As an old white man who, after over a quarter century of C++, switched to Rust about seven years ago, I understand the fallacy at the root of Herb's piece all too well.
gw2
A question to security experts reading this thread:
What is your opinion on deploying C++ codebases with mitigations like CFI and bounds checking? Let's say I have a large C++ codebase which I am unwilling to rewrite in Rust. But I:
* Enable STL bounds checking using appropriate flags (like `-D_GLIBCXX_ASSERTIONS`).
* Enable mitigations like CFI and shadow stacks.
How much less safe is "C++ w/ mitigations" than Rust? How much of the "70% CVE" statistic is relevant to such a C++ codebase?
(I've asked this in an earlier thread and also in other forums, but I never really got a response that does not boil down to "only Rust is safe, suck it up!". It also doesn't help that every other thread about C++ is about its memory unsafety...)
pjmlp
It helps; however, a certain culture mindset is also required.
Back in the old Usenet flamewars, C developers would say that coding in languages like Object Pascal, Modula-2, Ada, ... was like programming in a straitjacket, and we used to call their style cowboy programming.
When C++ came onto the scene with its improved type system, it seemed like a way we could have the best of both worlds: better safety and a UNIX/C-like ecosystem.
However, this eventually changed as more and more people started to adopt C++, and thanks to its C subset, many C++ projects are actually mostly C code compiled with a C++ compiler.
So hardened runtimes help a lot, as does using static analysers like clang-tidy, VC++ /analyze, Sonar, PVS-Studio, the CLion analysers, ...
However, many of them have existed for the last 30 years; I was using Parasoft in 1999.
The biggest problem is culture: thinking that such tools are only required by those who aren't good enough to program C or C++, that naturally those issues only happen to others, that we are all good drivers.
rwmj
STL bounds checking isn't bounds checking. Your code (or other libraries you use) can still have simple pointer arithmetic that goes outside bounds.
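For example, a minimal sketch: hardened-library assertions guard the container's own interface, but raw pointer arithmetic derived from it is unchecked:

```cpp
#include <vector>

int past_the_end(const std::vector<int>& v) {
    const int* p = v.data();
    // Out-of-bounds read via plain pointer arithmetic: no library assertion
    // (e.g. _GLIBCXX_ASSERTIONS) has anything to check here.
    return p[v.size()];
}
```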
But the larger problem is that bounds checking (even ASAN) isn't as good as statically checking code, i.e. your code with bounds checking still crashes at run time, which can be a denial-of-service attack, whereas with static checking your code would never have compiled in the first place.
Nevertheless if you don't want to rewrite the world, then using these mitigations is much better than not using them. I would also add fuzzing to the mix.
gpderetta
DoS is vastly better than an RCE. And safe code can still panic.
But as you mention, unfortunately enabling bounds checking in the STL wouldn't catch a lot of pointer manipulation.
It would still be better than the status quo.
UncleMeat
For the first one, a lot of this depends on how modern your codebase is. STL bounds checks work great (and have remarkably low overhead) if the vast majority of your code is working with standard library types. Maybe all of the code that might have been a c-style array in the past is now using std::vector, std::span, or std::array and so you've got built in lengths. Not perfect, of course, since you can still have all sorts of spatial safety issues with custom iterator implementations or whatever, but great.
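For concreteness, a rough before/after sketch (hypothetical function names) of what standard library types buy: once the length travels with the data, a hardened library has something to enforce.

```cpp
#include <cstddef>
#include <span>

int sum_c_style(const int* data, std::size_t n);   // nothing ties n to data

int sum_modern(std::span<const int> data) {
    int total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        total += data[i];   // indexed access is checkable in hardened builds
    }
    return total;
}
```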
But my hunch is that the vast majority of C++ codebases aren't using std::span or std::array everywhere because there is just a lot of much older code. And there's no comparable option for handling lifetime bugs.
Tools like CFI, hardware memory tagging, or pointer authentication help, but skilled exploit creators have been defeating techniques like these for a while, so they don't offer the "at least I know this entire class of issue is prevented" confidence that bounds checks inserted into library types do.
The general industry recommendation is "if you are starting something new that has security implications, please seriously explore Rust" and "if you have a legacy C++ codebase that is too expensive to port please seriously explore these mitigation techniques and understand their limitations."
IshKebab
Good question. If I had to bet I'd say something like half of the 70% would be prevented. Yeah it wouldn't really help with lifetime issues or type confusion but a huge proportion of that 70% is simple out-of-bounds memory accesses.
But don't forget lots of open source code is written in C and this barely helps there.
rfoo
My two cents, I'm wearing my exploit writer's hat, but my current day job is SWE on legacy/"modern-ish" C++ codebases.
> Enable STL bounds checking using appropriate flags
This rarely helps. Most of the nice-to-exploit bugs were in older code, which wasn't using STL containers. Or it is even just written in C. However, if enabling these flags does not hurt you, please still do so, as it makes a non-zero contribution.
> Enable mitigations like CFI and shadow stacks.
Shadow stacks are meh. CFI helps a bit more; however, there are some caveats depending on which CFI implementation you are talking about, i.e. how strong it is, for example, whether it is typed or not. But in the best case it still just makes the bug chain one bug longer and maybe completely kills some bugs, which isn't enough to make your exploits impossible. It just raises the bar (that's important too, though). It also depends on the specific scenario. For example, for a browser renderer without sandboxing / site isolation etc., CFI alone makes almost no impact, as in this case achieving arbitrary R/W is usually easier than taking over $rip, and it's obvious you can do a data-only attack to get UXSS, which is a serious enough threat. On the other hand, if it's a server and you are mainly dealing with remote attackers and there's inherently no good leak primitive etc., a soup of various mitigations could make a real difference.
So, all in all, it's hard to tell without your project details.
> How much of the "70% CVE" statistic is relevant to such a C++ codebase?
Uh, I'd guess, half or more of that. But still, it just raises the bar.
gw2
First of all, thanks for your response.
> This rarely helps. Most of the nice-to-exploit bugs were in older codes, which weren't using STL containers.
While I agree with this, isn't modifying that code to use STL containers much cheaper than rewriting it in an entirely new language?
> Shadow stack is meh.
Are you referring to the idea of shadow stacks in general or a particular implementation of them?
> For example, for browser renderer without sandbox / site-isolation etc
I may be wrong, but I think you are referring to JIT bugs leading to arbitrary script execution in JS engines. I don't think memory safety can do anything about it because those bugs happen in the binding layer between the C++ code and JS scripts. Binding code would have to use unsafe code anyway. (In general, script injection has nothing to do with memory safety, see Log4j)
> Uh, I'd guess, half or more of that.
I mean, if you are after RCEs, don't CFI and shadow stacks halt the program instead of letting the CPU jump to the injected code?
Now, let me get more specific - can you name one widespread C++ exploit that:
* would have happened even if the above mentioned mitigations were employed.
* would not have happened in a memory safe language?
rfoo
All good questions.
> is not modifying those code to use STL containers much cheaper
That's right. However, I'd add that most exploited bugs these days (in high-profile targets) are temporal memory safety (i.e. lifetime) bugs. The remaining spatial (out of bound) bugs are mostly in long forgotten dependencies.
> Are you referring to the idea of shadow stacks in general or a particular implementation of them?
The idea. Shadow stack (assuming perfect hardware assisted implementation) is a good backward-edge control flow integrity idea, and ruins one of the common ways to take over $rip (write a ROP chain to stack), but that's it. Besides making exploitation harder, both forward-edge and backward-edge CFI also kill some bugs. However, IMO we are long past non-linear stack buffer overflow days, once in a while there may still be news about one, but it could be news because it is an outlier. Hence, compared to CFI, the bugs shadow stack kills are pretty irrelevant now.
> JIT bugs leading to arbitrary script execution in JS engines
Not necessarily JIT bugs. It could also be a UAF where people went down a bloody path to convert it into an `ArrayBuffer` with base address = 0 and size = 0x7FFFFFFFFFFFFFFF, accessible from JavaScript. Chrome killed this specific primitive. But there are more; I'm not going to talk about them here.
You may have a slight confusion here. In the case of a browser renderer, people start with arbitrary JavaScript execution; the goal is to do what JavaScript (on this page!) can't do, via memory corruption, including, but not limited to, executing arbitrary native code. For example, for a few years, being able to access Chrome-specific JS APIs to send arbitrary IPC messages to the browser process (outside the renderer sandbox) was one `bool` flag in .bss away from JavaScript. If we managed to get arbitrary R/W (that is, being able to read/write all memory within the renderer process from JavaScript, see my ArrayBuffer example above), we just flip that flag and run our sandbox escape against the browser process in JavaScript; who needs native code execution?
Or, say you do want native code execution. For a few years in V8, the native code that WASM gets compiled to was RWX in memory, so you just used your arbitrary R/W to write to it. You can kill that too, but then people start coming up with bizarre tricks like overwriting your WASM code cache when it is loaded from disk and before it is made R/X, and there are enough fish in the pool that you likely can't patch them all.
> I mean, if you are after RCEs, don't CFI and shadow stacks halt the program instead of letting the CPU jumping to the injected code?
Yeah. But as I said, nowadays people usually use temporal memory safety bugs, and they want arbitrary R/W before they attempt to take over $rip. Don't get me wrong, this is because of the success of CFI and similar mitigations! So they did work, they just can't stop people from popping your phones.
> can you name one widespread C++ exploit that:
I just google'd "Chrome in the wild UAF" and casually found this in the first page: https://securelist.com/the-zero-day-exploits-of-operation-wi...
I assume "in the wild exploited" fits your "widespread" requirement.
Granted, it's five years old, but if you are okay with non-ITW bugs I can come up with a lot of more recent ones (in my mind).
This is a UAF, so it would not have happened in a memory-safe language. While back then the exploited chrome.exe may not have had CFG enabled (it was enabled in late 2021, IIRC), I don't see how the exploit path could have been blocked by CFI.
dist-epoch
The C++ people were kind of ignoring the safety problems and Rust, but when Microsoft suddenly announced that all new code should try Rust first, it's like they suddenly woke up and realized this is not a fad and that the gun is pointing at their heads.
like_any_other
As amusing as it is to imagine language developers getting executed when their language falls out of favor, I think most C++ people are happy about Rust. The problem are large codebases already in C++ that won't get rewritten, so we have to do the best we can, within the constraints of C++.
IshKebab
Depends what you mean by "C++ people". I think most of those C++ people who are happy about Rust would say they are now Rust people who may be forced to use C++ sometimes.
bluGill
No, I'm still a C++ person, because while Rust is intriguing, I have so much existing C++ that it would cost billions of dollars to rewrite in Rust, and it will take years to get more than a trivial amount of Rust. For the vast majority of new features, the cost of implementing them in Rust is far higher than the cost of doing it in C++.
gpderetta
MS is a large contributor to the C++ standardization effort.
pjmlp
It is, while at the same time they have changed their point of view on allowing C++ for new projects at Microsoft.
I also think Herb Sutter leaving his role at Microsoft might have been related to this.
From "Microsoft Azure security evolution: Embrace secure multitenancy, Confidential Compute, and Rust"
https://azure.microsoft.com/en-us/blog/microsoft-azure-secur...
"Decades of vulnerabilities have proven how difficult it is to prevent memory-corrupting bugs when using C/C++. While garbage-collected languages like C# or Java have proven more resilient to these issues, there are scenarios where they cannot be used. For such cases, we’re betting on Rust as the alternative to C/C++. Rust is a modern language designed to compete with the performance C/C++, but with memory safety and thread safety guarantees built into the language. While we are not able to rewrite everything in Rust overnight, we’ve already adopted Rust in some of the most critical components of Azure’s infrastructure. We expect our adoption of Rust to expand substantially over time."
From "Windows security and resiliency: Protecting your business"
https://blogs.windows.com/windowsexperience/2024/11/19/windo...
"And, in alignment with the Secure Future Initiative, we are adopting safer programming languages, gradually moving functionality from C++ implementation to Rust."
Finally,
"Microsoft is Getting Rusty: A Review of Successes and Challenges - Mark Russinovich"
otabdeveloper4
Pretty much everything Microsoft does is guaranteed to be a fad. They announce a new language paradigm shift every 10 years. (And then abandon it 10 years later.)
saagarjha
The problem here is that these don’t actually solve the problems that attackers use to exploit software written in C++. Nobody cares that constexpr code can’t have UB (I thought it already couldn’t…?). Attackers will take your object and double free it, and there’s nothing in here that will stop them from doing that. Fixing this in C++ is actually very difficult, and unfortunately the committee doesn’t want to do it because it would change the language too much. So we’re only getting the minor improvements, which are nice, but nowhere near what is necessary to “tame UB”.
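A minimal, hypothetical sketch of the lifetime bug class meant here, the kind of code that remains legal and unchecked:

```cpp
struct Session { int id; };

void oops(Session* s) {
    delete s;      // lifetime ends here...
    delete s;      // ...double free: UB
}

int read_after_free(Session* s) {
    delete s;
    return s->id;  // use after free: UB, and a classic exploit primitive
}
```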
qalmakka
This is nice and all, but the main issue with UB and C++ IMHO has never been that "nice", modern codebases are problematic - modern C++ in the hands of competent people is very nice, really. The problem is that 90% of all C++ development ATM is done either on legacy codebases full of ancient crap or with old frameworks and libraries that spam non-standard containers, raw pointers, allocate manually, ...
In my experience, introducing modern C++ in a legacy codebase is not that much easier compared to adding Rust to it. It's probably safe to argue that C++03 stands to C++26 almost like K&R C stood to the original C++
techbrovanguard
> Tech pundits still seem to commonly assume that UB is so fundamentally entangled in C++’s specification and programs that C++ will never be able to address enough UB to really matter.
- denial ← you are here
- anger
- bargaining
- depression
- acceptance
Cope, seethe, mald, etc.
kookamamie
At depression they'll figure out the codebase is full of const-casts and null-dereferences.
I completely agree that this is essentially trying to polish a turd. The train left the station decades ago.
steve_gh
But you are never going to rewrite the gazillion or so lines of C++ out there, currently being used in all sorts of production systems.
But if you have a better compiler that points out more of the problematic UB areas in your codebase, then you have somewhere to make a start towards reducing the issues and the attack surface.
The perfect is often the enemy of the good.
(edit - typo)
lambdaone
I don't doubt that most of the gazillion or so lines of legacy C++ will never be rewritten. But critical infrastructure, and there's a lot of it, most certainly needs to be either rewritten in safer languages or somehow proved correct, and starting new projects in C++ just seems to me to be an unwise move when there are mature, safer alternatives like Rust.
Human civilization is now so totally dependent on fragile, buggy software, and active threats against that software increasing so rapidly, that we will look back on this era as we do on the eras of exploding steam engines, collapsing medieval cathedrals, cities that were built out of flammable materials, or earthquake-unsafe buildings in fault zones.
This doesn't mean that safer C++ isn't a good idea; but it's also clear that C++ is unlikely ever to become a safe language; it's too riddled with holes, and the codebase built on those holes too vast, for all the problems to be fixed.
lmm
> But you are never going to rewrite the gazillion or so lines of C++ out there, and currently being used in all sorts of production systems.
We are, because we will have to, and the momentum is already gathering. Foundational tools and libraries are already being rewritten. More will follow.
> But if you have a beter compiler that points out more of the problem UB areas in your codebase, then you have somewhere you can make a start towards reducing the issues and attack surface.
Sure. But fixing those is going to be harder and less effective than rewriting.
raverbashing
Seriously
wtf someone comes up with "X is UB" and even worse, "Since it's UB this gives a license to do whatever the f we want, including something that's clearly not at all what the dev intended"
No wonder the languages being developed to solve real problems by people with real jobs are moving forward
masklinn
> Since it's UB this gives a license to do whatever the f we want, including something that's clearly not at all what the dev intended
That’s really not how it works.
Compilers rather work in terms of UB being a constraint on the program, which they can then leverage for optimisations. All the misbehaviour is emergent behaviour from the compiler assuming UB doesn’t happen (because that’s what UB is).
Of note, Rust very much has UB, and hitting it is as bad as in C++, but the “safe” subset of the language is defined such that you should not be able to hit UB from there at all (such feasibility is what “soundness” is about, and why “unsoundness” is one of the few things justifying breaking BC: it undermines the entire point of the language).
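A classic illustration of UB acting as a constraint the optimiser may lean on (a sketch; actual codegen varies by compiler and flags):

```cpp
int table[4];

bool exists_in_table(int v) {
    // table[4] would be out of bounds (UB), so the compiler may assume the loop
    // never reaches i == 4 without having already returned, i.e. that some earlier
    // element matched, and may legally compile this function to "return true".
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v) return true;
    }
    return false;
}
```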
There isn't anything new here to defend against lifetime-related UB. For that it simply references https://arxiv.org/pdf/2503.21145, which is just a summary of existing dynamic mitigations --- which don't fix UB at the language level, impose performance tradeoffs, and in the case of pointer integrity, require hardware support that excludes e.g. x86.
Look at it this way --- mature products like Chrome are already doing all of that wherever they can. If it was enough, they wouldn't worry about C++ UB anymore. But they do.