
How to Secure Existing C and C++ Software Without Memory Safety [pdf]

pizlonator

I think this paper overestimates the benefit of what I call isoheaps (partitioning allocations by type). I wrote the WebKit isoheap implementation so it’s something I care about a lot.

Isoheaps can mostly neutralize use-after-free bugs. But that's all they do. Moreover, they don't scale well. If you isoheap a select set of stuff then it's fine, but if you try to deploy isoheaps to every allocation you get massive memory overhead (2x or more) and substantial time overhead too. I know because I tried that.

If an attacker finds a type confusion or heap buffer overflow then isoheaps won't prevent the attacker from controlling heap layout. All it takes is that they can confuse an int with a ptr and it's game over. If they can read ptr values as ints then they can figure out how the heap is laid out (no matter how weirdly you laid it out). If they can also write ptr values as ints then they control the whole heap. At that point it doesn't even matter if you have control flow integrity.

To defeat attackers you really need some kind of 100% solution where you can prove that the attacker can’t use a bug with one pointer access to control the whole heap.

Yoric

Yes, having coded quite a few years in C++ (on the Firefox codebase) before migrating to Rust, I believe that many C and C++ developers mistakenly assume that

1/ memory safety is the unachievable holy grail of safety ;

2/ there is a magic bullet somewhere that can bring about the benefits of memory safety, without any of the costs (real or expected).

In practice, the first assumption is wrong because memory-safety is just where safety _starts_. Once you have memory-safety and type-safety, you can start building stuff. If you have already expended all your cognitive budget on reaching this point, you have lost.

As for the magic bullets, all those I've seen suggested are of the better-than-nothing variety rather than the it-just-works variety they're often touted as. Doesn't mean that there won't ever be a solution, but I'm not holding my breath.

And of course, I've seen people claim more than once that AI will solve code safety & security. So far, that's not quite what's written on the wall.

gpderetta

Well, GC is very close to that magic bullet (comparatively, spatial safety via bounds checking is easy). It does have some costs of course, especially in a language like C++ that is GC-hostile.

pebal

C++ isn't hostile toward garbage collection — it's more the programmers using C++ who are. C++ is the only language that can have an optional, totally pause-less, concurrent GC engine (SGCL). No other programming language, not even Java, offers such a collector.

Yoric

I feel that the lack of GC is one of the key differentiators C++ has left. If a group of C++ developers were to adopt a GC, they'd be well on their way to abandoning C++.

pjc50

Short paper, so can be easily summarized. The claim is that security can be improved by these compiler and hardware assisted measures:

    - Stack Integrity
    - Control-Flow Integrity
    - Heap Data Integrity
    - Pointer Integrity and Unforgeability
They cite the deployment of these measures on recent Apple hardware as evidence of their effectiveness.

pastage

Exciting! I did not know about HW pointer authentication, and it is ten years old now on ARM. The referenced papers are nice, but honestly the LLVM docs are a lot better.

https://clang.llvm.org/docs/PointerAuthentication.html

The papers: https://www.usenix.org/conference/usenixsecurity23/presentat... https://www.qualcomm.com/content/dam/qcomm-martech/dm-assets...

lambdaone

From the cited paper:

"These four types of integrity, do not establish memory safety, but merely attempt to contain the effects of its absence; therefore, attackers will still be able to change software behavior by corrupting memory."

and the paper then goes on to say, about Apple's implementation of the cited techniques:

"This intuition is borne out by experience: in part as a result of Apple’s deployment of these defenses since 2019, the incidence of RCE attacks on Apple client software has decreased significantly—despite strong attack pressure—and the market value of such attacks risen sharply."

"Decreased significantly" is not "eliminated"; indeed, you could paraphrase this as "the combination of these techniques has already been shown to be insufficient for security guarantees".

Which is not to say that these mitigations are a bad idea; but I think their benefits are significantly over-sold in the paper.

commandersaki

I said this in another comment, but an easy way to measure the efficacy is to look at the economy surrounding the zero-day markets. Long story short, over the years it has become increasingly expensive to both produce/supply and acquire exploits. This is attributable to the increasing mitigations and countermeasures, which include the ones in this document.

MattPalmer1086

I don't think it is significantly oversold, although it is definitely overselling a bit:

"More disconcertingly, in some cases, attacks may be still be possible despite the above protections."

"We should prepare for a case where these defenses will fail to protect against a specific vulnerability in some specific software".

My main concern with the paper is that there is no careful analysis showing that the 4 techniques they propose are really sufficient to cover the majority of RCE exploits. Having said that, I don't dispute that having them would raise the bar a lot.

gizmo

The memory protection strategies this paper argues for are fine. If we can recompile legacy software to gain better protection against stack and heap exploits that's a clear win.

As the paper points out memory safety is not a great concern on phones because applications are sandboxed. And that's correct. If an application is stuck in a sandbox it doesn't matter what that process does within its own process space. Smartphones taught us what we already knew: process isolation works.

Then the paper observes that memory safety is still a significant problem on the server. But instead of pointing out the root cause -- the absence of sandboxing -- the authors argue that applications should instead be rewritten in go or rust! This is absurd. The kernel already provides strong memory protection guarantees for each process. The kernel also provides hard guarantees for access to devices and the file system. But server software doesn't take advantage of any of these guarantees. When a server process intermixes data of multiple customers and privilege levels then any tiny programming mistake (regardless of memory safety) can result in privilege escalation or catastrophic data leaks. What use is memory safety when your go program returns the wrong user's data because of an off-by-one error? You don't need a root exploit if your process already has "root access" to the database server.

If we want to be serious about writing secure software on the server we have to start taking advantage of the process isolation the kernel provides. The kernel can enforce that a web request from user A cannot return data from user B because the process simply cannot open any files that belong to the wrong user. This completely eliminates all memory safety concerns. But today software on the server emulates what the kernel already does with threading, scheduling, and memory protection, except poorly and in userspace and without any hardware guarantees. Effectively all code runs as root in ring 0. And we're surprised that security continues to plague our industry?

jchw

> Then the paper observes that memory safety is still a significant problem on the server. But instead of pointing out the root cause -- the absence of sandboxing -- the authors argue that applications should instead be rewritten in go or rust! This is absurd. The kernel already provides strong memory protection guarantees for each process. The kernel also provides hard guarantees for access to devices and the file system. But server software doesn't take advantage of any of these guarantees. When a server process intermixes data of multiple customers and privilege levels then any tiny programming mistake (regardless of memory safety) can result in privilege escalation or catastrophic data leaks. What use is memory safety when your go program returns the wrong user's data because of an off-by-one error? You don't need a root exploit if your process already has "root access" to the database server.

Yes, because servers are inherently multi-tenant, you can't avoid the risks. Process isolation can't help you, even if you had the resources to fork off a process for every single request. If you have a database pool in your process, you can go and access other people's data. There is never going to be a case where having an RCE on a server isn't a serious issue.

Also, neither process isolation nor memory safety can guarantee you don't simply return the wrong data for a given customer, so that point is really neither here nor there.

(I'd also argue that memory safety clearly matters on mobile platforms still anyway. Many of the exploit chains that break kernel protections still rely on exploiting memory bugs in userland first before they can climb their way up. There's also other risks to this. Getting an RCE into someone's Signal process is an extremely dangerous event for the user.)

graemep

> Also, neither process isolation nor memory safety can guarantee you don't simply return the wrong data for a given customer, so that point is really neither here nor there.

The fact that there can be security issues at the application level is no reason to add memory safety issues to them!

You may well have the resources to fork a process per customer, not have a database pool, not cache, etc. It's a trade-off between resources needed and performance.

> Also, neither process isolation nor memory safety can guarantee you don't simply return the wrong data for a given customer, so that point is really neither here nor there.

You should fix the problems you can, even if there are problems you cannot.

jchw

> The fact that there can be security issues at the application level is no reason to add memory safety issues to them!

> You may well have the resources to fork a process per customer, not have a database pool, not cache, etc. It's a trade-off between resources needed and performance.

This feels like one of those weird things that wind up happening in discussions about a lot of things, including, say, renewable energy. People get so hung up on certain details that the discussion winds up going in a direction that really doesn't make sense, and it could only get there incrementally, because it would've been very obvious if you followed the entire chain.

Please bear with me. Consider the primary problem in the first place:

The problem is that we have a bunch of things that are not memory safe, already. Rewriting all of those things is going to take a lot of time and effort.

All of these ideas about isolating requests are totally fine, honestly. If you can afford those sorts of trade-offs in your design, then go hog wild. I find many of these architectural decisions to be a bit pointless for reasons I'm happy to dig into in more detail if people really care, but the important thing to understand is that I'm not telling you to not try to isolate requests. I'm just saying that in practice, there are no "legacy" servers that apply this degree of request isolation. They were architected with the assumption that the process would not get RCE'd, and therefore it would require a full rearchitecting to make them safe in the face of that sort of bug.

But when we talk about architectures where we avoid connection pooling and enforce security policy entirely in the database layer, that essentially means writing new servers. And if you're writing new servers, why in the world would you go through all of this effort when you don't have to? I'll grant you that the Rust borrow checker has a decently steep learning curve, but I don't expect it to be a serious barrier to experienced C++ programmers if they are really trying to learn and not just looking for reasons to not have to. C++ is not an easy language, either.

It would be absolutely insane to perform all of this per-request isolation voodoo and then not start on a clean slate with a memory safe language in the first place. You may as well take both at that point. Fuck it, run each request inside a small KVM context! And in fact, you can do that. Microsoft did that[1], but note that even in their press release for Hyperlight, they are showing it being used in Rust code, because that's the logical order of operations: If you're truly serious about security, and you're building something from scratch, start by preventing vulnerabilities as far left in the process as possible; and you can't go further left than having a programming language that can prevent certain bugs entirely by-design.

> You should fix the problems you can, even if there are problems you cannot.

That goes both ways.

[1]: https://opensource.microsoft.com/blog/2024/11/07/introducing...

gizmo

Databases also have access controls! Which developers don't use.

If you have 10,000 tenants on the same server you can simply have 10,000 database users. And that simple precaution will provide nearly perfect protection against cross-tenant data leaks.

jchw

The database was merely an example, there are other shared resources that are going to run into the same problem, like caches. Still, it's weird to imply that database users are meant to map to application users without any kind of justification other than "you can do it" but OK, let's assume that we can. Let's consider Postgres, which will have absolutely no problem creating 10,000 users (not sure about 100,000 or 1,000,000, but we can put that aside anyhow.) You basically have two potential options here:

- The most logical option is to continue to use database pooling as you currently do, and authenticate as a single user. Then, when handling a user request, you can impersonate a specific database user. The only problem is, if you do this, the protection you get is entirely discretionary: the connection is still authenticated to a user that can do more, and all you have to do is "reset role" and go on your way (see the sketch after these two options). So you can do this, but it doesn't help you with server exploits.

- The other way to handle this is by having each request get a separate database connection which is actually authenticated to a specific user. That will work and provide the database level guarantees. However, for obvious reasons, you definitely can't share a global database pool with this approach. That's a problem, because each Postgres connection will cost 5-10 MiB or so. If you had 10,000 active users, you would spend 50-100 GiB on just per-connection resources on your database server box. This solution scales horribly even when the scale isn't that crazy.
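To make the first option's weakness concrete, here is a minimal hypothetical sketch (the tokio_postgres crate, connection string, and role names are my own assumptions, not anything from the thread):

  // The pooled connection stays authenticated as the privileged application
  // user; SET ROLE is purely discretionary and can be undone by whoever
  // controls the process.
  use tokio_postgres::NoTls;

  #[tokio::main]
  async fn main() -> Result<(), tokio_postgres::Error> {
      // Assumed connection string; app_pool owns broad privileges.
      let (client, conn) =
          tokio_postgres::connect("host=localhost user=app_pool dbname=app", NoTls).await?;
      tokio::spawn(conn); // drive the connection in the background

      // Per-request impersonation of a tenant...
      client.batch_execute("SET ROLE tenant_42").await?;
      // ...but an attacker with RCE in this process can simply undo it
      // and get the pool user's full privileges back:
      client.batch_execute("RESET ROLE").await?;
      Ok(())
  }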

And this is all assuming you can actually get the guarantees you need just using the database layer. To that I say, good luck. You'll basically need to do most of your logic in the database instead of the application layer and make use of features like row-level security. You can do this, but it's an extremely limiting architecture, not least because databases are hard to scale except vertically. If you run into any scenario where you outgrow a single database cluster, everything here goes out the window.

Needless to say, nobody does this, and they're totally right. Having a database assist in things like authorization and visibility is not a terrible idea or anything, but all of this taken together is just not very persuasive.

And besides, Postgres itself, along with basically all of the other major databases, is also not necessarily memory-safe. Having external parties have access to your database connection pretty much puts you back at square one for defenses against potential unknown memory safety bugs, making this entire exercise a bit pointless...

pornel

The kernel knows about system-local users, but not the remote ones. Servers may need to access data of multiple users at once, so it's not as simple as some setuid+chroot CGI for every cookie received. Kernels like Linux are not designed for that.

Maybe it would be more feasible with some capability-based kernel, but you'd inherently have a lot of logic around user accounts, privileges, and queries. You end up involving the kernel in what is row-level database security. That adds a lot of complexity to the kernel, which also makes the isolation itself have more of the attack surface.

OTOH you can write your logic in a memory-safe language today. The VM/runtime/guaranteed-safe-subset is your "kernel" that protects the process from getting hijacked — an off-by-one error can't cause arbitrary code execution. The VM/runtime itself can still have vulnerabilities, but that just becomes analogous to kernel vulnerabilities.

MisterTea

> That adds a lot of complexity to the kernel, which also makes the isolation itself have more of the attack surface.

Not if you remove auth from the kernel: https://doc.cat-v.org/plan_9/4th_edition/papers/auth The Plan 9 kernel is very small and portable, which demonstrates that you don't need complexity to do distributed auth properly. The current OS hegemony is incredibly dated design-wise because their kernels were all designed to run on a single machine.

> OTOH you can write your logic in a memory-safe language today.

Memory safety is not security.

pornel

> Not if you remove auth from the kernel

The factotum looks very much like a microservice or a database with stored procedures handling access control, but of course Plan 9 makes it a file system instead of some RPC. It's a sensible design, but if IPC is the solution, then you don't even need Plan 9 for it.

> Memory safety is not security.

I didn't say it was. However, it is an isolation barrier for the memory-safe code. It's roughly equivalent to process isolation, but in userland. Instead of an MMU you have bounds checks in software.

Kernels implement process isolation cheaply with the help of hardware, but that isn't the only way to achieve the same effect. It can be emulated in software. When the code is memory safe, it can't be made to execute arbitrary logic that isn't in the program's code. If the program attempts some out-of-bounds access, it will be caught with userland checks instead of a page fault, but in either case it won't end up with an illegal memory access.
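A tiny illustration of that last point (my example, not part of the paper):

  // An out-of-bounds index is caught by a userland bounds check instead of
  // turning into an arbitrary read past the end of the buffer.
  fn main() {
      let buf = vec![0u8; 16];
      let i = 42; // imagine this index came from untrusted input
      match buf.get(i) {
          Some(b) => println!("byte: {b}"),
          None => println!("index {i} is out of bounds"), // checked, not a wild read
      }
      // Direct indexing would panic rather than corrupt memory:
      // let _b = buf[i];
  }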

vacuity

> Maybe it would be more feasible with some capability-based kernel, but you'd inherently have a lot of logic around user accounts, privileges, and queries. You end up involving the kernel in what is row-level database security. That adds a lot of complexity to the kernel, which also makes the isolation itself have more of the attack surface.

Microkernels/exokernels sacrifice some performance to bring reliable kernels that allow for a reliable userspace.

deepsun

> doesn't matter what that process does within its own process space

Re. phones -- you assume that a process hacks another process. But there might be a vulnerability within the process itself, corrupting its own memory. Sandboxing doesn't help.

gizmo

What matters is that a random app cannot access sensitive data like your passwords, sessions, email. On iOS you can run anything from the app store and it's fine. On Windows any .exe you run can cause havoc.

deepsun

My point is that memory corruption can happen in the password app itself, not in a random app.

VWWHFSfQ

> memory safety is not a great concern on phones because applications are sandboxed. And that's correct. If an application is stuck in a sandbox it doesn't matter what that process does within its own process space. Smartphones taught us what we already knew: process isolation works.

I thought we learned that this doesn't work after the iOS 0-click exploit chain running "sandboxed" app code in the kernel.

IshKebab

Good job sandbox escapes and local root exploits never exist!

gizmo

1. The point is you don't need root when the unprivileged process already has access to all data. There is nothing more to be gained.

2. Good luck breaking out of your AWS instance into the hypervisor.

IshKebab

> Good luck breaking out of your AWS instance into the hypervisor.

https://www.itnews.com.au/news/xen-patches-critical-guest-pr...

VWWHFSfQ

> 2. Good luck breaking out of your AWS instance into the hypervisor.

It doesn't take luck. Just skill, resources, and motivation. Like we already saw happen to the iOS "sandbox".

cadamsdotcom

> For high assurance, these foundations must be rewritten in memory-safe languages like Go and Rust [10]; however, history and estimates suggest this will take a decade or more [31].

The world runs on legacy code. CISA is correct that rewrites are needed for critical software [1][2] but we know how rewrites tend to go, and ROI on a rewrite is zero for most software, so it will take far more than a decade if it happens at all. So score one for pragmatism with this paper! Hope CISA folks see it and update their guidance.

[1] https://www.cisa.gov/news-events/news/urgent-need-memory-saf... [2] https://www.cisa.gov/resources-tools/resources/case-memory-s...

Xylakant

I don't see what updates you expect on CISA guidance, because the very documents you reference already acknowledge that rewriting all code is not a viable strategy in the general case and that other options for improvement exist.

For example, they recommend evaluating hardware-backed solutions such as CHERI:

> There are, however, a few areas that every software company should investigate. First, there are some promising memory safety mitigations in hardware. The Capability Hardware Enhanced RISC Instructions (CHERI ) research project uses modified processors to give memory unsafe languages like C and C++ protection against many widely exploited vulnerabilities. Another hardware assisted technology comes in the form of memory tagging extensions (MTE) that are available in some systems. While some of these hardware-based mitigations are still making the journey from research to shipping products, many observers believe they will become important parts of an overall strategy to eliminate memory safety vulnerabilities.

And they acknowledge that strategies will need to be adjusted for every case:

> Different products will require different investment strategies to mitigate memory unsafe code. The balance between C/C++ mitigations, hardware mitigations, and memory safe programming languages may even differ between products from the same company. No one approach will solve all problems for all products.

However, and that's where the meat of the papers is: they require you to acknowledge that there is a problem and do something about it:

> The one thing software manufacturers cannot do, however, is ignore the problem. The software industry must not kick the can down the road another decade through inaction.

and the least you can do is make it a priority and make a plan:

> CISA urges software manufacturers to make it a top-level company goal to reduce and eventually eliminate memory safety vulnerabilities from their product lines. To demonstrate such a commitment, companies can publish a “memory safety roadmap” that includes information about how they are modifying their software development lifecycle (SDLC) to accomplish this goal.

It's clearly not the case that these papers say "Rewrite all in Rust, now!". They do strongly advocate in favor of using memory-safe languages for future development, and I believe that's the rational stance to take, but they appear well grounded in their stance on existing software.

IshKebab

Google has shown that you get the biggest benefit by writing new code in memory-safe languages, so it's not like security doesn't drastically improve until you've rewritten everything.

berratype

I don't know if the ROI on rewrites is zero; following Prossimo's work, I'm seeing lots of performance and maintainability improvements over existing software.

DyslexicAtheist

> Hope CISA folks see it and update their guidance.

cope > hope @ CISA right now:

https://www.darkreading.com/cyberattacks-data-breaches/cisa-...

0xEF

I wish HN readers were not so afraid to openly discuss this more, but it is a much bigger issue than even what that article lets on (which is a good article, I might add). Cuts to critical agencies like this aren't about efficiency at all; they're about hamstringing roadblocks that might be in your way. That means we can expect one of two things in the near future of the US:

1. Enemies of the US gov't exploit the weakness, possibly conspiring with the people who created the weakness to do so.

2. A near or complete collapse of the government as, lo and behold, it is discovered that none of them actually knew what they were doing, regardless of the confidence in which they said otherwise.

Either way, we, the people trying to keep industries moving and bring home food to put on the table, will suffer.

saagarjha

There is actually an interesting niche that one can carve out when dealing with an attacker who has a memory corruption primitive but this paper is a bit too simple to explore that space. Preventing RCE is too broad of a goal; attackers on the platforms listed continue to bypass implementations of the mitigations presented and achieve some form of RCE. The paper suggests these are because of implementation issues, and some are clearly bugs in the implementation, but many are actually completely novel and unaddressed workarounds that require a redesign of the mitigation itself. For example, “heap isolation” can be done by moving allocations away from each other such that a linear overflow will run into a guard page and trap. Is it an implementation bug or a fundamental problem that an attacker can then poke bytes directly into a target allocation rather than linearly overwriting things? Control flow integrity has been implemented but attackers then find that, in a large application, calling whole functions in a sequence can lead to the results they want. Is this a problem with CFI or that specific implementation of CFI? One of the reasons that memory safety is useful is that it’s a lot easier to agree on what it is and how to achieve it, and with that what security properties it should have. Defining the security properties of mitigations is quite a bit harder. That isn’t to say that they’re not useful, or can’t be analyzed, but generally the result is not actually denial of RCE.

MattPalmer1086

This looks really useful. Doesn't fix the problem of memory corruption but mostly seems to limit the ability to convert that into remote code execution. And all the techniques are already in widespread use, just not the default or used together.

I would not be surprised if attackers still manage to find sneaky ways to bypass all 4 protections, but it would certainly raise the bar significantly.

lambdaone

It's already been demonstrated to be insufficient; otherwise Apple software would now be impregnable. That's not to say these protections are a bad idea; they should be universal as they substantially reduce the existing attack surface - but the paper massively over-sells them as a panacea.

pron

Most server software these days is already written in memory-safe languages yet still has security vulnerabilities. If you reduce the vulnerabilities due to memory safety issues by, say, 90%, then there's nothing special about them anymore. On the other hand, memory-safe languages also don't fully eliminate memory safety issues as they depend on operations or components that may not themselves be fully memory-safe (BTW, Rust has this problem more than other memory-safe languages).

So a "panacea" in this context doesn't mean making the stack memory-safe (which nobody does, anyway) but rather making the vulnerabilities due to memory safety no more common or dangerous than other causes of vulnerabilities (and remember that rewriting software in a new language also introduces new vulnerability risks even when it reduces others). Whether these techniques actually accomplish that or not is another matter (the paper makes empirical claims without much evidence, just as some Rust fans do).

TheDong

> BTW, Rust has this problem more than other memory-safe languages

Do you have a citation there?

I've run into a ton of memory safety issues in Go because two goroutines concurrently modifying a map or pointer is a data race, which leads to memory unsafety... and Go makes it wildly easy to write such data races. You need to manually add mutexes everywhere for Go to be memory safe.

With Rust, on the other hand, I've yet to run into an actual memory safety issue. Like, I'm at several hundred in Go, and 0 in Rust.

I'm curious why my experience is so different.
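For comparison, a minimal sketch (mine, not from this thread) of what safe Rust makes you write for the shared-map case; the unsynchronized version simply doesn't compile:

  // Two threads can only mutate the HashMap through something like
  // Arc<Mutex<...>>, so the concurrent-map corruption described above
  // can't be expressed in safe Rust.
  use std::collections::HashMap;
  use std::sync::{Arc, Mutex};
  use std::thread;

  fn main() {
      let map = Arc::new(Mutex::new(HashMap::<String, i64>::new()));

      let handles: Vec<_> = (0..2)
          .map(|t| {
              let map = Arc::clone(&map);
              thread::spawn(move || {
                  // Each writer must take the lock before touching the map.
                  map.lock().unwrap().insert(format!("thread-{t}"), t);
              })
          })
          .collect();

      for h in handles {
          h.join().unwrap();
      }
  }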

kobebrookskC3

what vulnerabilities does rust introduce over c/c++?

linux_security

Was discussing this paper with a few colleagues who work in this area, and concluded that this paper seems like an odd combination of:

- The author citing their own research. (Ok, all researchers do this.)

- Mildly scolding the industry for not having applied their research. It's "pragmatic" after all.

The elephant in the room is that these approaches have been widely deployed and their track record is pretty questionable. iPhone widely deploys PAC and kalloc_type. Chrome applies CFI and PartitionAlloc. Android applies CFI and Scudo. Yet memory safety exploitation still regularly occurs against these targets. Is it harder because of these technologies? Probably. But if they're so effective, why are attackers still regularly successful at exploiting memory safety bugs? And what's the cost of applying these? Does my phone's battery die sooner? Is it slower? So now your phone/browser are slower AND still exploitable.

sebstefan

>A Pragmatic Security Goal

>Remote Code Execution (RCE) attacks where attackers exploit memory-corruption bugs to achieve complete control are a very important class of potentially-devastating attacks. Such attacks can be hugely disruptive, even simply in the effects and economic cost of their remediation [26]. Furthermore, the risk of such attacks is of special, critical concern for server-side platform foundations [10]. Greatly reducing the risk of RCE attacks in C and C++ software, despite the presence of memory-corruption bugs, would be a valuable milestone in software security especially if such attacks could be almost completely prevented. We can, therefore, aim for the ambitious, pragmatic goal of preventing most, or nearly all, possibilities of RCE attacks in existing C and C++ software without memory safety. Given the urgency of the situation, we should only consider existing, practical security mechanisms that can be rapidly deployed at scale.

I don't know if it's obvious to anyone else that this is AI-written or if it's just me/if I'm mistaken

readingnews

I am not sure, and it may be this person's culture/background, but I do know that at a college/uni, your advisors/reviewers would tell you not to do the adjective/drama stuff, as it adds no real value to a scientific/technical paper.

e.g. potentially-devastating, hugely disruptive, special critical, greatly reducing, valuable milestone, almost completely, ambitious pragmatic, most or nearly all, existing practical.

dgellow

It’s not obvious to me. I cannot say one way or the other

nickpsecurity

I've always favored a large public/private investment into open-source tools like Coverity, PVS Check, and RV-Match. Put extra effort into suppressing false positives and autofixing simple problems. Companies like Apple had enough money to straight up buy the vendors of these tools.

I'd also say, like CPAChecker and Why3, they should be designed in a flexible way where different languages can easily be added. Also, new passes for analyzers. Then, just keep running it on all the C/C++ code in low-false-positive mode.

On top of this, there have been techniques to efficiently do total memory safety. Softbound + CETS was an example. We should invest in more of those techniques. Then, combine the analyzers with those tools to only do runtime checks on what couldn't be proven.

mre

> However, their use is the exception, not the rule, and their use—in particular in combination—requires security expertise and investment that is not common. For them to provide real-world, large-scale improvements in the security outcomes of using C and C++ software, there remains significant work to be done. In particular, to provide security benefits at scale, for most software, these protections must be made an integral, easy-to-use part of the world-wide software development lifecycle. This is a big change and will require a team effort.

That's the core problem.

The mechanisms mentioned are primarily attack detection and mitigation techniques rather than prevention mechanisms. Bugs can't be exploited as easily, but they still exist in the codebase. We're essentially continuing to ship faulty software while hoping that tooling will protect us from the worst consequences.

Couldn't one argue that containers and virtual machines also protect us from exploiting some of these memory safety bugs? They provide isolation boundaries that limit the impact of exploits, yet we still consider them insufficient alone.

It's definitely a step in the right direction, though.

The paper mentions Rust, so I wanted to highlight a few reasons why we still need it for people who might mistakenly think this approach makes Rust unnecessary:

  - Rust's ownership system prevents memory safety issues at compile time rather than trying to mitigate their effects at runtime  
  - Rust completely eliminates null pointer dereferencing  
  - Rust prevents data races in concurrent code, which the paper's approach doesn't address at all  
  - Automatic bounds checking for all array and collection accesses prevents buffer overflows by design  
  - Lifetimes ensure pointers are never dangling, unlike the paper's approach, which merely tries to make dangling pointers harder to exploit (a small sketch follows below)
So, we still need Rust, and we should continue migrating more code to it (and similar languages that might emerge in the future). The big idea is to shift bug detection to the left: from production to development.
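As a small illustration of the ownership/lifetime point in the list above (a sketch of my own, not from the paper):

  // Use-after-move is rejected at compile time; there is nothing left
  // for a runtime mitigation to catch because the program never builds.
  fn consume(s: String) -> usize {
      s.len() // `consume` takes ownership of `s`
  }

  fn main() {
      let secret = String::from("tenant data");
      let n = consume(secret);
      // println!("{secret}"); // error[E0382]: borrow of moved value: `secret`
      println!("consumed {n} bytes");
  }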

commandersaki

> We're essentially continuing to ship faulty software while hoping that tooling will protect us from the worst consequences.

Yet one way to measure how effective these mitigations and countermeasures are is to look at the cost of the zero-day market. The trend continues to go upwards into the stupidly expensive realm, due to attacks now needing multiple exploits chained together. However, I'm not discounting that software now being developed in memory-safe languages also contributes to this.

Here is one of the references indicating this in the article: https://techcrunch.com/2024/04/06/price-of-zero-day-exploits...

tgv

Or Java, or Scala, or Go, or whathaveyou. This is about existing software.

unscaled

While all of the languages you mention are memory safe (as is almost every programming language released after 1990), none of them solve all of the safety problems mentioned above, in particular, the two points:

  - Rust completely eliminates null pointer dereferencing  
  - Rust prevents data races in concurrent code, which the paper's approach doesn't address at all  
Scala comes closest to solving these points, since it has optional features (in Scala 3) to enable null safety or you could build some castles in the sky (with Scala 2) to avoid using null and make NPEs more unlikely. The same goes for concurrency bugs: you can use alternative concurrency models that make data races harder (e.g. encapsulate all your state inside Akka actors).

With Go and Java, no dice. These languages lack the expressive power (by design! since they are touted as "simple" languages) to do anything that will greatly reduce these types of bugs without resorting to external tools (e.g. Java static analyzers + annotations, or race condition checkers).

In short, Rust is one of the only mainstream languages that absolutely guarantees safety from race conditions, and complete null safety (most other languages that provide good null safety mechanisms like C#, Kotlin and TypeScript are unsound due to reliance on underlying legacy platforms).
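For the null-safety point, a minimal sketch (my example): absence is a separate type, so the compiler forces the "no value" case to be handled before anything can be dereferenced.

  fn find_user(id: u32) -> Option<String> {
      // A hypothetical lookup; `None` stands in for what would be null elsewhere.
      if id == 42 { Some("alice".to_string()) } else { None }
  }

  fn main() {
      match find_user(7) {
          Some(name) => println!("found {name}"),
          None => println!("no such user"), // the compiler won't let this arm be forgotten
      }
  }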

tgv

Nil dereferencing in those languages doesn't make them unsafe. It throws an exception or panics. And Java has some non-null annotation, IIRC.

Still, none of this is relevant to existing software written in C. This is not about a rewrite.

And if it were, Rust doesn't offer perfect safety, as many tasks almost demand unsafe code, whereas that doesn't happen in Go, Scala, etc. Every situation requires its own approach.

eklavya

How is NPE a safety issue?

pron

> We're essentially continuing to ship faulty software while hoping that tooling will protect us from the worst consequences.

It is the consequences that give rise to the cost of a bug or the value of preventing it.

> Couldn't one argue that containers and virtual machines also protect us from exploiting some of these memory safety bugs? They provide isolation boundaries that limit the impact of exploits, yet we still consider them insufficient alone.

No, that's not the same because sandboxing only limits the impact of vulnerabilities to whatever the program is allowed to do in the first place, and that is, indeed, insufficient. The mechanisms here reduce the impact of vulnerabilities to less than what the program is allowed to do. To what extent they succeed is another matter, but the two are not at all comparable.

> The big idea is to shift bug detection to the left: from production to development.

This has been the idea behind automated tests since the practice first gained popularity. But it's important to understand that it works due to a complicated calculus of costs. In principle, there are ways to eliminate virtually all bugs with various formal methods, yet no one is proposing that this should be the primary approach in most situations because despite "shifting left" it is not cost effective.

Everyone may pick their language based on their aesthetic preference and attraction to certain features, but we should avoid sweeping statements about software correctness and the best means to achieve it. Once there are costs involved (and all languages that prevent certain classes of bugs exact some cost) the cost/benefit calculus becomes very complex, with lots of variables. Practical software correctness is an extremely complicated topic, and it's rare we can make sweeping universal statements. What's important is to obtain empirical data and study it carefully.

For example, in the 1970s, the prediction by the relevant experts was that software would not scale without formal proof. Twenty years later, those predictions were proven wrong [1] as unsound (i.e. without absolute guarantees) software development techniques, such as code review and automated tests, proved far more effective than anticipated, while sound techniques proved much harder to scale beyond a relatively narrow class of program properties.

Note that this paper also makes some empirical claims without much evidence, so I take its claims about the effectiveness of these approaches with the same scepticism as I do the claims about the effectiveness of Rust's approach.

[1]: https://6826.csail.mit.edu/2020/papers/noproof.pdf

sramsay

> Everyone may pick their language based on their aesthetic preference and attraction to certain features, but we should avoid sweeping statements about software correctness and the best means to achieve it. Once there are costs involved (and all languages that prevent certain classes of bugs exact some cost) the cost/benefit calculus becomes very complex, with lots of variables. Practical software correctness is an extremely complicated topic, and it's rare we can make sweeping universal statements.

Thank you. I feel like this perspective is forever being lost in these discussions -- as if gaining the highest possible level of assurance with respect to security in a critical system were a simple matter of choosing a "safe language" or flipping some switch. Or conversely, avoiding languages that are "unsafe."

It is never this simple. Never. And when engineers start talking this way in particular circumstances, I begin to wonder if they really understand the problem at hand.

kobebrookskC3

does making software scale? i see exploits for android/ios even though they spend millions if not billions on securing it. which unsound techniques make exploits unfeasible? i'm not even looking for a guarantee, just an absence in practice.

pron

Well, software today is much bigger and of higher quality than was thought possible in the seventies. That's not to say that it can scale indefinitely, but my point was that unsound methodologies (i.e. not formal proofs) work much better than expected, and the software correctness world has moved from the seventies' "soundness is the only way" to "software correctness is a complex game of costs and benefits, a combination of sound and unsound techniques is needed, and we don't know of an approach that is universally better than others; there are too many variables".

tzs

OT: Are there any memory safe languages that are fast and support goto?

I'm writing something that needs to implement some tax computations and I want to implement them to follow as closely as possible the forms that are used to report those computations to the government. That way it is easy to be sure they are correct and easy to update them if the rules change.

The way those forms work is something like this:

  1. Enter your Foo: _________
  2. Enter your Bar: _________
  3. Add line 1 and line 2: ________
  4: Enter your Spam: _______
  5: Enter the smaller of line 1 and 4: _____
  6: If line 5 is less than $1000 skip to line 9
  7: Enter the smaller of line 2 and $5000: _____
  8: If line 7 is greater than line 4 skip to 13
  ...
With goto you can write code that exactly follows the form:

  Line1: L1 = Foo;
  Line2: L2 = Bar;
  Line3: L3 = L1 + L2;
  Line4: L4 = Spam;
  Line5: L5 = min(L1, L4);
  Line6: if (L5 < 1000) goto Line9;
  Line7: L7 = min(L2, 5000);
  Line8: if (L7 > L4) goto Line13;
  ...
For some forms an

  if (X) goto Y
    ....
  Y:
can be replaced by

  if (!X) {
     ...
  }
because nothing before that has a goto into the body of the if statement. But some forms do have things jumping into places like that. Also jumping out of what would be such a body into the body of something later.

Writing those without goto tends to require duplicating code. The duplication in the source code could be eliminated with a macro system but don't most memory safe languages also frown on macro systems?

Putting the duplicate code in separate functions could also work but often those sections of code refer to things earlier in the form so some of the functions might need a lot of arguments. However the code then doesn't look much like the paper form so it is harder to see that it is correct or to update it when the form changes in different years.

ameliaquining

Rust has macros. It also has labeled blocks that you can break out of, which are similar to goto except with more nesting required. You could plausibly reduce the nesting with a macro though.

In most languages I'd just solve this by ending each block with a call to the next block. The "too many arguments required" problem can be addressed with closures.
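For what it's worth, here's a rough sketch of the labeled-block version of the form upthread (my translation; the names and thresholds come from the example above, and the sample inputs in main are made up):

  fn worksheet(foo: i64, bar: i64, spam: i64) -> i64 {
      let l1 = foo;              // line 1
      let l2 = bar;              // line 2
      let _l3 = l1 + l2;         // line 3
      let l4 = spam;             // line 4
      let l5 = l1.min(l4);       // line 5
      let mut l7 = 0;            // line 7 (may be skipped)
      'line13: {
          'line9: {
              if l5 < 1000 { break 'line9; }  // line 6: skip to line 9
              l7 = l2.min(5000);              // line 7
              if l7 > l4 { break 'line13; }   // line 8: skip to line 13
              // ... lines that only run when neither jump is taken
          }
          // line 9 onward (reached in order, or via the line-6 jump)
      }
      // line 13 onward (reached in order, or via the line-8 jump)
      l7 // placeholder for whatever the form ultimately reports
  }

  fn main() {
      println!("{}", worksheet(1_200, 7_000, 3_000));
  }

Forward jumps out of nested scopes map onto labeled breaks like this; what doesn't map is jumping into the middle of a block, which is where you still end up duplicating a bit of code or writing a small state machine.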

steveklabnik

> OT: Are there any memory safe languages that that are fast and support goto?

Irreducible control flow is a pain for static analysis.

> Writing those without goto tends to require duplicating code

This feels like a regular old state machine to me, which obviously is nice to write with goto, but isn't required.

returningfory2

Maybe you could build a domain-specific language.