There is no memory safety without thread safety

chadaustin

Every time this conversation comes up, I'm reminded of my team at Dropbox, where it was a rite of passage for new engineers to introduce a segfault in our Go server by not synchronizing writes to a data structure.

Swift has (had?) the same issue and I had to write a program to illustrate that Swift is (was?) perfectly happy to segfault under shared access to data structures.

Go has never been memory-safe (in the Rust and Java sense) and it's wild to me that it got branded as such.

tptacek

Right, the issue here is that the "Rust and Java sense" of memory safety is not the actual meaning of the term. People talk as if "memory safety" was a PLT axiom. It's not; it's a software security term of art.

This is just two groups of people talking past each other.

It's not as if Go programmers are unaware of the distinction you're talking about. It's literally the premise of the language; it's the basis for "share by communicating, don't communicate by sharing". Obviously, that didn't work out, and modern Go does a lot of sharing and needs a lot of synchronization. But: everybody understands that.

moefh

I agree that there are two groups here talking past each other. I think it would help a lot to clarify this:

> the issue here is that the "Rust and Java sense" of memory safety is not the actual meaning of the term

So what is the actual meaning? Is it simply "there are no cases of actual exploited bugs in the wild"?

Because in another comment you wrote:

> a term of art was created to describe something complicated; in this case, "memory safety", to describe the property of programming languages that don't admit to memory corruption vulnerabilities, such as stack and heap overflows, use-after-frees, and type confusions. Later, people uninvolved with the popularization of the term took the term and tried to define it from first principles, arriving at a place different than the term of art.

But type confusion is exactly what has been demonstrated in the post's example. So what kind of memory safety does Go actually provide, in the term of art sense?

tptacek

It's a contrived type confusion bug. It reads address 42 because that address is hardcoded, and it does something that ordinary code doesn't do.

If you were engaged to do a software security assessment for an established firm that used Go (or Python, or any of the other mainstream languages that do shared-memory concurrency and don't have Rust's type system), and you said "this code is memory-unsafe", showing them this example, you would not be taken seriously.

If people want to make PLT arguments about Rust's correctness advantages, I will step out of the way and let them do that. But this article makes a security claim, and that claim is in the practical sense false.

socalgal2

> But: everybody understands that.

Everybody does not understand that, otherwise there would be zero of these issues in shipping code.

This is the problem with the C++ crowd hoping to save their language. Maybe they'll finally figure out some --disallow-all-ub-and-be-memory-safe-and-thread-safe flag, but at the moment it's still insanely trivial to make a mistake and return a reference to some value on the stack, or hit any number of other issues.

The answer can not be "just write flawless code and you'll never have these issues", but at the moment that's all C++ (and, going by this article, Go) has.

tptacek

Again: if you want to make that claim about correctness bugs, that's fine, I get it. But if you're trying to claim that naive Go code has memory safety security bugs: no, that is simply not true.

blub

This comment highlights a very important philosophical difference between the Rust community and the communities of other languages:

- in other languages, it’s understood that perhaps the language is vulnerable to certain errors and one should attempt to mitigate them. But more importantly, those errors are one class of bug and bugs can happen. Set up infra to detect and recover.

- in Rust the code must be safe, must be written in a certain way, must be proven correct to the largest extent possible at compile time.

This leads to the very serious, solemn attitude typical of Rust developers. But the reality is that most people just don’t care that much about a particular type of error as opposed to other errors.

dev_l1x_be

> But: everybody understands that.

I had to convince Go people that you can segfault with Go. Or do you mean the language designers when you say "everybody"?

pclmulqdq

You can segfault in Rust, too - there's a whole subset of the language marked "unsafe" that people ignore when making "safe language" arguments. The question is how difficult is it to have a segfault, and in Go it's honestly pretty hard. It's arguably harder in Rust but it's not impossible.

Ygg2

> People talk as if "memory safety" was a PLT axiom. It's not; it's a software security term of art.

It's been in usage for PLT for at least twenty years[1]. You are at least two decades late to the party.

    Software is memory-safe if (a) it never references a memory location outside the address space allocated by or for that entity, and (b) it never executes instructions outside the code area created by the compiler and linker within that address space.
[1] https://llvm.org/pubs/2003-05-05-LCTES03-CodeSafety.pdf

zbentley

Not GP, but that definition seems not to be the one in use when describing languages like Rust--or even tools like valgrind. Those tools value a definition of "memory safety" that is a superset (a big one) of the definition referenced in that paper: safety as preventing incorrect memory accesses within a program, regardless of whether those accesses are out of bounds/segmentation violations.

Wowfunhappy

...by that definition, can a C program be memory safe as long as it doesn't have any relevant bugs, despite the choice of language? (I realize that in practice, most people are not aware of every bug that exists in their program.)

blub

The problem Rust has is that it’s not enough to be memory safe, because lots of languages are memory safe and have been for decades.

Hence the focus on fearless concurrency or other small-scale idioms like match in an attempt to present Rust as an overall better language compared to other safe languages like Go, which is proving to be a solid competitor and is much easier to learn and understand.

zozbot234

Except that Swift also has safe concurrency now. It's not just Rust. Golang is actually a very nice language for problems where you're inherently dependent on high-performance GC and concurrency, so there's no need to present it as "better" for everything. Nevertheless its concurrency model is far from foolproof when compared to e.g. Swift.

junebash

Swift is in the process of fixing this, but it’s a slow and painful transition; there’s an awful lot of unsafe code in the wild that wasn’t unsafe until recently.

cosmic_cheese

One of the biggest hurdles is just getting all the iOS/macOS/etc APIs up to speed with the thread safety improvements. It won’t make refactoring all that application code any easier, but as things stand even if you’ve done that, you’re going to run into problems anywhere your code makes contact with UI code because there’s a lot of AppKit and UIKit that have yet to make the transition.

RetpolineDrama

Swift 6 is only painful if you wrote a ton of terrible Swift 5, and even then Swift 5 has had modes where you could gracefully adopt the Swift 6 safety mechanisms for a long time (years?)

~130k LoC Swift app was converted from 5 -> 6 for us in about 3 days.

jamil7

Yes and no, our app is considerably larger than 130k LoC. While we’ve migrated some modules there are some parts that do a lot of multithreaded work that we probably will never migrate because they’d need to essentially be rewritten and the tradeoff isn’t really worth it for us.

isodev

It's also painful if you wrote good Swift 5 code but now suddenly you need to closely follow Apple's progress on porting their own frameworks, filling your code base with #if and control flow just to make the compiler happy.

ardit33

It is still incomplete and a mess. I don't think they thought through the actual main case Swift is used for (iOS apps), and instead built a hypothetical generic model which is failing for most clients. Hence lots of workarounds and ways to get around it (the actor system). The isolated/nonisolated types are a bit contrived and cause real productivity loss, when the old way was really just "everything UI on the main thread; for anything that takes time, use a dispatch queue and call back to main when done".

Swift is starting to look more like old Java Beans (if you are old enough to remember those; most Swift developers are too young). It's making some of the same mistakes.

Anyway: https://forums.swift.org/t/has-swifts-concurrency-model-gone... Common problems all devs face: https://www.massicotte.org/problematic-patterns

Anyways, they are trying to reinvent 'safe concurrency' while almost throwing the baby out with the bathwater, making Swift even more complex and harder to get into.

There's a ways to go. For simple apps, the new concurrency is easy to adopt. But for anything more than trivial, it becomes a lot of work, to the point that it might not be worth it.

pjmlp

Their goal was always to evolve to the point of being able to fully replace C, Objective-C, and C++ with Swift; it has been in their documentation and plenty of WWDC sessions since the early days.

isodev

You're getting downvoted but I fully agree. The problem with Swift's safety has now moved to the tooling. While your code doesn't fail so often at runtime (it still does, because the underlying system SDKs are not all migrated), the compiler itself often fails. Even with the latest developer snapshot of Swift 6.2, it's quite easy to make it panic with just... "weird syntax".

A much bigger problem I think is the way concurrency settings are provided via flags. It's no longer possible to know what a piece of code does without knowing the exact build settings. For example, depending on Xcode project flags, a snippet may always run on the main loop, not at all, or on a dedicated actor altogether.

A piece of code in a library (SPM) can build just fine in one project but fail to build in another project due to concurrency settings. The amount of overhead makes this very much unusable in a production / high pressure environment.

CJefferson

Before Rust, I'd reached the personal conclusion that large-scale thread-safe software was almost impossible -- certainly it required the highest levels of software engineering. Multi-process code was a much more reasonable option for mere mortals.

Rust on the other hand solves that. There is code you can't write easily in Rust, but just yesterday I took a Rust iteration, changed 'iter()' to 'par_iter()', and given that it compiled I had high confidence it was going to work (which it did).

potato-peeler

I am curious. Generally, basic structures like map are not thread safe and care has to be taken while modifying them. This is pretty well documented in the Go spec. In your case at Dropbox, what was essentially going on?

tsimionescu

I think the surprise here is that failing to synchronize writes leads to a SEGFAULT, not a panic or an error. This is the point GP was making, that Go is not fully memory safe in the presence of unsynchronized concurrent writes. By contrast, in Java or C#, unsynchronized writes will either throw an exception (if you're lucky and they get detected) or let the program continue with some unexpected values (possibly ones that violate some invariants). Getting a SEGFAULT can only happen if you're explicitly using native code, raw memory access APIs, or found a bug in the runtime.
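
For the curious, here's a minimal sketch of the kind of race being described (the same mechanism as the article's example; the type names are made up). Two goroutines race on a plain interface variable, and a torn read can pair one type's method table with the other type's data, so a method ends up dereferencing the integer 42 as an address:

    // Hypothetical demo, not production code. It runs until it crashes (the
    // interleaving is timing-dependent); `go run -race` flags it immediately.
    package main

    type accessor interface{ get() int }

    type direct struct{ val int } // first field is a plain int

    func (d *direct) get() int { return d.val }

    type indirect struct{ ptr *int } // first field is a pointer

    func (i *indirect) get() int { return *i.ptr }

    func main() {
      x := 0
      var shared accessor = &direct{val: 42}

      // Writer: keeps flipping the dynamic type behind the interface.
      go func() {
        for {
          shared = &direct{val: 42}
          shared = &indirect{ptr: &x}
        }
      }()

      // Reader: a torn read of the two-word interface value can pair
      // indirect's method table with direct's data pointer, so get()
      // dereferences the integer 42 as an address and the program segfaults.
      for {
        shared.get()
      }
    }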

elcritch

Segfault sounds better than running with inconsistent data.

maxlybbert

I thought the same thing. Maybe the point of the story isn’t “we were surprised to learn you had to synchronize access” but instead “we all thought we were careful, but each of us made this mistake no matter how careful we tried to be.”

nine_k

In Java, there are separate synchronized collections, because acquiring a lock takes time. Normally one uses thread-unsafe collections. Java also gives a very ergonomic way to run any fragment under a lock (the `synchronized` operator).

Rust avoids all this entirely, by using its type system.

layer8

Java has separate synchronized collections only because that was initially the default, until people realized that it doesn't help for the common cases of check-and-modify operations or of having consistency invariants with state outside a single collection (besides the performance impact). In practice, synchronized collections are rarely useful, and instead accesses are synchronized externally.

noisem4ker

Golang has a synchronized map:

https://pkg.go.dev/sync#Map
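
For anyone who hasn't used it, a minimal sketch (with the documented caveat that sync.Map is aimed mainly at mostly-read or disjoint-key workloads; a plain map guarded by a sync.Mutex is often the better default):

    package main

    import (
      "fmt"
      "sync"
    )

    func main() {
      var counts sync.Map // safe for concurrent use without extra locking
      var wg sync.WaitGroup

      for i := 0; i < 4; i++ {
        wg.Add(1)
        go func(id int) {
          defer wg.Done()
          counts.Store(fmt.Sprintf("worker-%d", id), id*10)
        }(i)
      }
      wg.Wait()

      counts.Range(func(k, v any) bool {
        fmt.Println(k, v)
        return true // keep iterating
      })
    }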

Thaxll

I have a hard time believing that it's common to create a SEGFAULT in Go. I've worked with the language for a very long time and don't remember a single time where I've seen that (and I've seen many data races).

Not synchronizing writes on most data structures does not create a SEGFAULT; you have to be in a very specific condition to create one, and those conditions are extremely rare and unusual (from the programmer's perspective).

In the OP's blog, to trigger one he's doing one of those conditions in an infinite loop.

https://research.swtch.com/gorace

commandersaki

You really have to go hunting for a segfault in Go. The critical sentence in the OP article is: "in practice, of course, safety is not binary, it is a spectrum, and on that spectrum Go is much closer to a typical safe language than to C." The OP just has a vested interest in proving the safety of languages and is making a big deal where in practice there is none. People are not making loads of unsafe programs in Go, nor deploying them as such, because it would be pretty quickly detected. This is much different from C and C++.

rowanG077

I'm pretty surprised by some other comments in this thread saying this is a rare occurrence in go. In my experience it's not rare at all.

commandersaki

To put things in perspective, I posit to you: how many memory unsafe things can you do in Go that aren't a variant of the same thing?

Or put another way what is the likelihood that a go program is memory unsafe?

tapirl

Listen, your team just didn't have sufficient review capacity at that time.

chc4

This is one of the things I'm also watching Zig on, like a slow-moving car crash: they claim they are memory safe (or at least "good enough" memory safe if you use the safe optimization level, which is its own discussion), but they don't have the equivalent of Rust's Send/Sync types. It just so happens that in practice no one was writing enough concurrent Zig code to get bitten by it a lot, I guess... except that now they're working on bringing back first-class async support to the language, which will run futures on other threads, and presumably a lot of feet are going to be fired at once that lands.

ameliaquining

IIUC even single-threaded Zig programs built with ReleaseSafe are not guaranteed to be free of memory corruption vulnerabilities; for example, dereferencing a pointer to a local variable that's no longer alive is undefined behavior in all optimization modes.

skeezyboy

well just dont do it then

ameliaquining

That's also the standard advice in C and C++, and yet, people screw it up frequently enough to merit a CWE category: https://cwe.mitre.org/data/definitions/562.html

cibyr

Zig's claims of memory safety are a bad joke. Sure, it's easier to avoid memory safety bugs in Zig than it is in C, but that's also true of C++ (which nobody claims is a memory safe language).

jchw

This comes up now and again, somewhat akin to the Rust soundness hole issue. To be fair, it is a legitimate issue, and you could definitely cause it by accident, which is more than I can say about the Rust soundness hole(s?), which as far as I know are basically incomprehensible and about as likely to come across naturally as guessing someone's private key.

That said in many years of using Go in production I don't think I've ever come across a situation where the exact requirements to cause this bug have occurred.

Uber has talked a lot about bugs in Go code. This article is useful for understanding what some of the practical problems facing Go developers actually wind up being, particularly the table at the bottom summarizing how common each issue is.

https://www.uber.com/en-US/blog/data-race-patterns-in-go/

They don't have a specific category that would cover this issue, because most of the time concurrent map or slice accesses are on the same slice and this needs you to exhibit a torn read.

So why doesn't it come up more in practice? I dunno. Honestly beats me. I guess people are paranoid enough to avoid this particular pitfall most of the time, kind of like the Technology Connections theory on Americans and extension cords/powerstrips[1]. Re-assigning variables that are known to be used concurrently is obvious enough to be a problem and the language has atomics, channels, mutex locks so I think most people just don't wind up doing that in a concurrent context (or at least certainly not on purpose.) The race detector will definitely find it.
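
(For reference, one shape the "atomics" option takes in modern Go, as a hedged sketch with made-up names; atomic.Pointer needs Go 1.19+. The idea is to publish a whole value through a single atomically swapped pointer, so there is nothing to tear.)

    package main

    import (
      "fmt"
      "sync/atomic"
    )

    type config struct {
      name string
      n    int
    }

    func main() {
      var current atomic.Pointer[config]
      current.Store(&config{name: "a", n: 1})

      go func() {
        for i := 0; ; i++ {
          // Writers build a fresh value and publish it with one atomic
          // pointer store; readers never see a half-updated value.
          current.Store(&config{name: "b", n: i})
        }
      }()

      for i := 0; i < 5; i++ {
        c := current.Load() // the pointer is read atomically
        fmt.Println(c.name, c.n)
      }
    }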

For some performance hit, though, the torn reads problem could just be fixed. I think they should probably do it, but I'm not losing sweat over all of the Go code in production. It hasn't really been a big issue.

[1]: https://www.youtube.com/watch?v=K_q-xnYRugQ

bombela

It took months to finally solve a data race in Go. No race detector would see anything. Nobody understood what was happening.

It ultimately resulted in a loop counter overflowing, which recomputed the same thing a billion times (but always the same!). So the visible effect was a request would randomly take 3 min instead of 100ms.

I ended up using perf in production, which indirectly led me to understand the data race.

I was called in to help the team because of my experience debugging the weirdest things as a platform dev.

Because of this I was exposed to so many races in Go that, from my biased point of view, I want Rust everywhere instead.

But I guess I am putting myself out of a job? ;)

norir

It is very unfortunate that we use fixed width numbers by default in most programming languages and that common ops will silently overflow. Smarter compilers can work with richer numeric primitives and either automatically promote machine words to big numbers or throw an error on overflow.

People talk a lot about the productivity gains of AI, but fixing problems like this at the language level could have an even bigger impact on productivity, while being far less sensational. Think about how much productivity is lost due to obscure but detectable bugs like this one. I don't think Rust is a good answer (it doesn't check overflow by default), but at least it points a little bit in the vaguely correct direction.
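
For what it's worth, in Go (where the bug above happened) signed integer overflow isn't an error at all; it's defined to wrap, and it does so silently. A tiny illustration:

    package main

    import (
      "fmt"
      "math"
    )

    func main() {
      var counter int32 = math.MaxInt32
      counter++                             // wraps silently: no panic, no error
      fmt.Println(counter)                  // prints -2147483648
      fmt.Println(counter == math.MinInt32) // true
    }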

recursivecaveat

The situation with numbers in basically every widely used programming language is kind of an indictment of our industry. Silent overflow for incorrect results, no convenient facilities for units, lossy casts everywhere. It's one of those things where standing in 1975 you'd think surely we'll spend some of the next 40 years of performance gains to give ourselves nice, correct numbers to work with, but we never did.

astrange

Swift traps on overflow, which I think is the correct solution. You shouldn't make all your numbers infinitely-ranged, that turns all O(1) operations into O(N) in time and memory, and introduces a lot of possibilities for remote DoS.

devnullbrain

Rust checks overflow by default in debug builds

jchw

I think the true answer is that the moment you have to do tricky concurrency in Go, it becomes less desirable. I think that Go is still better at tricky concurrency than C, though there are some downsides too (I think it's a bit easier to sneak in a torn read issue in Go due to the presence of fat pointers and slice headers everywhere.)

Go is really good at easy concurrency tasks, like things that have almost no shared memory at all, "shared-nothing" architectures, like a typical web server. Share some resources like database handles with a sync.Pool and call it a day. Go lets you write "async" code as if it were sync with no function coloring, making it decidedly nicer than basically anything in its performance class for this use case.

Rust, on the other hand, has to contend with function coloring and a myriad of seriously hard engineering tasks to deal with async issues. Async Rust gets better every year, but personally I still (as of last month at least) think it's quite a mess. Rust is absolutely excellent for traditional concurrency, though. Anything where you would've used a mutex lock, Rust is just way better than everything else. It's beautiful.

But I struggle to be as productive in Rust as I am in Go, because Rust, the standard library, and its ecosystem gives the programmer so much to worry about. It sometimes reminds me of C++ in that regard, though it's nowhere near as extremely bad (because at least there's a coherent build system and package manager.) And frankly, a lot of software I write is just boring, and Go does fine for a lot of that. I try Rust periodically for things, and romantically it feels like it's the closest language to "the future", but I think the future might still have a place for languages like Go.

dev_l1x_be

> But I struggle to be as productive

You should factor TCO into productivity. Can you write Python/Go etc. faster? Sure! Can you operate these in production with the same TCO as Rust? Absolutely not. Most of the time the person debugging production issues and data races is different than the one who wrote the code. This gives the illusion of productivity being better with Python/Go.

After spending 20+ years around production systems both as a systems and a software engineer I think that Rust is here for reducing the TCO by moving the mental burden to write data race free software from production to development.

bombela

It wasn't really tricky concurrency. Somebody just made the mistake of sharing a pointer across goroutines. It was quite indirect. It boils down to a function taking a param and holding onto it. `go` is used at some point, closing over this pointer. And now we have a data race in waiting.

zozbot234

> And frankly, a lot of software I write is just boring, and Go does fine for a lot of that. I try Rust periodically for things, and romantically it feels like it's the closest language to "the future", but I think the future might still have a place for languages like Go.

It's not so much about being "boring" or not; Rust does just fine at writing boring code once you get familiar with the boilerplate patterns (Real-world experience has shown that Rust is not really at a disadvantage wrt. productivity or iteration speed).

There is a case for Golang and similar languages, but it has to do with software domains where there literally is no viable alternative to GC, such as when dealing with arbitrary, "spaghetti" reference graphs. Most programs aren't going to look like that though, and starting with Rust will yield a higher quality solution overall.

wavemode

> It ultimately resulted in a loop counter overflowing, which recomputed the same thing a billion of time (but always the same!). So the visible effect was a request would randomly take 3 min instead of 100ms.

This means that multiple goroutines were writing to the same local variable. I've never worked on a Go team where code that is structured in such a way would be considered normal or pass code review without good justification.

bombela

It happens all the time sadly.

It's not because people intentionally write this way. A function takes a parameter (a Go slice for example) and calls another function, and so on. Deep down a function copies the pointer to the slice (via closure for example). And then a goroutine is spawned with this closure.

The most obvious mistakes are caught quickly. But sharing a memory address between two threads can happen very indirectly.
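
A hedged sketch of how indirect it can get (names invented): neither function looks like it's "sharing state", yet the caller and the spawned goroutine end up holding the same backing array, and `go run -race` flags it.

    package main

    import (
      "fmt"
      "time"
    )

    // logLater looks harmless, but the closure keeps the caller's slice alive
    // and reads it later on another goroutine.
    func logLater(entry []byte) {
      go func() {
        time.Sleep(10 * time.Millisecond)
        fmt.Printf("logged: %s\n", entry)
      }()
    }

    func main() {
      buf := []byte("request 1")
      logLater(buf)
      copy(buf, []byte("request 2")) // caller reuses the buffer: data race
      time.Sleep(50 * time.Millisecond)
    }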

And somehow in Go, everybody feels incredibly comfortable spawning millions of coroutines/threads.

Thaxll

Rust does have loop counter overflow.

jason_oster

This is irrelevant on 64-bit platforms [^1] [^2]. For platforms with smaller `usize`, enable overflow-checks in your release builds.

[^1]: https://www.reddit.com/r/ProgrammerTIL/comments/4tspsn/c_it_...

[^2]: https://stackoverflow.com/questions/69375375/is-it-safe-to-a...

swiftcoder

Theoretically you can construct a loop counter that overflows, but I don't think there is any reasonable way to do it accidentally?

Within safe rust you would likely need to be using an explicit .wrapping_add() on your counter, and explicitly constructing a for loop that wasn't range-based...

ameliaquining

I think it's also worth noting that Rust's maintainers acknowledge its various soundness holes as bugs that need to be fixed. It's just that some of them, like https://github.com/rust-lang/rust/issues/25860 (which I assume you're referring to), need major refactors of certain parts of the compiler in order to fix, so it's taking a while.

ralfj

Yeah, I can totally believe that this is not a big issue in practice.

But I think terms like "memory safety" should have a reasonably strict meaning, and languages that go the extra mile of actually preventing memory corruption even in concurrent programs (which is basically everything typically considered "memory safe" except Go) should not be put into the same bucket as languages that decide not to go through this hassle.

sethammons

That Uber article is fantastic. I believe Go fixed the first example recently.

We had a rule at my last gig: avoid anonymous functions and always recover from them.

qcnguy

What do Uber mean in that article when they say that Go programs "expose 8x more concurrency compared to Java microservices"? They're using the word concurrency as if it were a countable noun.

wmf

If the Java version creates 4 concurrent tasks (could be threads, fibers, futures, etc.) but the Go version creates 32 goroutines, that's 8x the concurrency.

camgunz

I feel like I'm defending Go constantly these days. I don't even like Go!

Go can already ensure "consistency of multi-word values": use whatever synchronization you want. If you don't, and you put a race into your code, weird shit will happen because torn reads/writes are fuckin weird. You might say "Go shouldn't let you do that", but I appreciate that Go lets me make the tradeoff myself, with a factoring of my choosing. You might not, and that's fine.
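
(For concreteness, a rough sketch of what that synchronization can look like for a multi-word value like an interface; names are made up and this is just one possible factoring. Every read and write of the two-word value happens under the lock, so no goroutine can observe a half-written pair.)

    package main

    import (
      "fmt"
      "sync"
    )

    type label int

    func (l label) String() string { return fmt.Sprintf("label(%d)", int(l)) }

    type box struct {
      mu  sync.Mutex
      val fmt.Stringer // two-word interface value, guarded by mu
    }

    func (b *box) set(v fmt.Stringer) {
      b.mu.Lock()
      defer b.mu.Unlock()
      b.val = v
    }

    func (b *box) get() fmt.Stringer {
      b.mu.Lock()
      defer b.mu.Unlock()
      return b.val
    }

    func main() {
      b := &box{val: label(0)}
      var wg sync.WaitGroup
      for i := 1; i <= 4; i++ {
        wg.Add(1)
        go func(n int) {
          defer wg.Done()
          b.set(label(n))
          fmt.Println(b.get())
        }(i)
      }
      wg.Wait()
    }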

But like, this effort to blow data races up to the level of C/C++ memory safety issues (this is what is intended by invoking "memory safety") is polemic. They're nowhere near the same problem or danger level. You can't walk 5 feet through a C/C++ codebase w/o seeing a memory safety issue. There are... zero Go CVEs resulting from this? QED.

EDIT:

I knew I remembered this blog. Here's a thing I read that I thought was perfectly reasonable: https://www.ralfj.de/blog/2021/11/18/ub-good-idea.html. Quote:

"To sum up: most of the time, ensuring Well-Defined Behavior is the responsibility of the type system, but as language designers we should not rule out the idea of sharing that responsibility with the programmer."

dcsommer

Unsafety in a language is fine as long as it is clearly demarcated. The problem with Go's approach is that there is no clear demarcation of the unsafety, making reasoning about it much more difficult.

camgunz

The "go" keyword is that demarcation

bobbylarrybobby

“go” being a necessary keyword even for benign operations makes its use as an unsafety marker pointless; you end up needing to audit your entire codebase anyway. The whole point of demarcation is that you have a small surface area to go over with a fine-toothed comb.

tptacek

This is a canard.

What's happening here, as happens so often in other situations, is that a term of art was created to describe something complicated; in this case, "memory safety", to describe the property of programming languages that don't admit to memory corruption vulnerabilities, such as stack and heap overflows, use-after-frees, and type confusions. Later, people uninvolved with the popularization of the term took the term and tried to define it from first principles, arriving at a place different than the term of art. We saw the same thing happen with "zero trust networking".

The fact is that Go doesn't admit memory corruption vulnerabilities, and the way you know that is the fact that there are practically zero exploits for memory corruption vulnerabilities targeting pure Go programs, despite the popularity of the language.

Another way to reach the same conclusion is to note that this post's argument proves far too much; by the definition used by this author, most other higher-level languages (the author exempts Java, but really only Java) also fail to be memory safe.

Is Rust "safer" in some senses than Go? Almost certainly. Pure functional languages are safer still. "Safety" as a general concept in programming languages is a spectrum. But "memory safety" isn't; it's a threshold test. If you want to claim that a language is memory-unsafe, POC || GTFO.

kllrnohj

> in this case, "memory safety", to describe the property of programming languages that don't admit to memory corruption vulnerabilities, such as [..] type confusions

> The fact is that Go doesn't admit memory corruption vulnerabilities

Except it does. This is exactly the example in the article. Type confusion causes it to treat an integer as a pointer & dereference it. This can then trivially result in memory corruption depending on the value of the integer. In the example the value "42" is used so that it crashes with a nice segfault thanks to lower-page guarding, but that's just for ease of demonstration. There's nothing magical about the choice of 42 - it could just as easily have been any number in the valid address space.

dboreham

Everyone knows that there's something very magical about the choice of 42.

Sharlin

> to describe the property of programming languages that don't admit to memory corruption vulnerabilities, such as stack and heap overflows, use-after-frees, and type confusions.

And data races allow all of that. There cannot be memory-safe languages supporting multi-threading that admit data races that lead to UB. If Go does admit data races it is not memory-safe. If a program can end up in a state that the language specification does not recognize (such as termination by SIGSEGV), it’s not memory safe. This is the only reasonable definition of memory safety.

tptacek

If that were the case, you'd be able to support the argument with evidence.

chowells

You mean like the program in the article where code that never dereferences a non-pointer causes the runtime to dereference a non-pointer? That seems like evidence to me.

ralfj

> Another way to reach the same conclusion is to note that this post's argument proves far too much; by the definition used by this author, most other higher-level languages (the author exempts Java, but really only Java) also fail to be memory safe.

This is wrong.

I explicitly exempt Java, OCaml, C#, JavaScript, and WebAssembly. And I implicitly exempt everyone else when I say that Go is the only language I know of that has this problem.

(I won't reply to the rest since we're already discussing that at https://news.ycombinator.com/item?id=44678566 )

jstarks

> If you want to claim that a language is memory-unsafe, POC || GTFO.

There's a POC right in the post, demonstrating type confusion due to a torn read of a fat pointer. I think it could have just as easily been an out-of-bounds write via a torn read of a slice. I don't see how you can seriously call this memory safe, even by a conservative definition.

Did you mean POC against a real program? Is that your bar?

tptacek

You need a non-contrived example of a memory-corrupting data race that gives attackers the ability to control memory, through type confusion or a memory lifecycle bug or something like it. You don't have to write the exploit but you have to be able to tell the story of how the exploit would actually work --- "I ran this code and it segfaulted" is not enough. It isn't even enough for C code!

codys

The post is a demonstration of a class of problems: causing Go to treat an integer field as a pointer and access the memory behind that pointer, without using Go's documented "unsafe.Pointer" (or other documented-as-unsafe operations).

We're talking about programming languages being memory safe (like fly.io does on its security page [1]), not about other specific applications.

It may be helpful to think of this as talking about the security of the programming language implementation. We're talking about inputs to that implementation that are considered valid and not using "unsafe" marked bits (though I do note that the Go project itself isn't very clear on if they claim to be memory-safe). Then we want to evaluate whether the programming language implementation fulfills what people think it fulfills; ie: "being a memory safe programming language" by producing programs under some constraints (ie: no unsafe) that are themselves memory-safe.

The example we see in the OP is demonstrating a break in the expectations for the behavior of the programming language implementation if we expected the programming language implementation to produce programs that are memory safe (again under some conditions of not using "unsafe" bits).

[1]: https://fly.io/docs/security/security-at-fly-io/#application...

Mawr

> The fact is that Go doesn't admit memory corruption vulnerabilities, and the way you know that is the fact that there are practically zero exploits for memory corruption vulnerabilities targeting pure Go programs, despite the popularity of the language.

Another way to word it: If "Go is memory unsafe" is such a revelation after it's been around for 13 years, it's more likely that such a statement is somehow wrong than that nobody's picked up on such a supposedly impactful safety issue in all this time.

As such, the burden of proof that addresses why nobody's run into any serious safety issues in the last 13 years is on the OP. It's not enough to show some theoretical program that exhibits the issue; clearly that is not enough to cause real problems.

zozbot234

There's no "revelation" here, it's always been well known among experts that Go is not fully memory safe for concurrent code, same for previous versions of Swift. OP has simply spelled out the argument clearly and made it easier to understand for average developers.

tptacek

It's made what would be a valid point using misleading terminology and framing that suggests these are security issues, which they simply are not.

"One could easily turn this example into a function that casts an integer to a pointer, and then cause arbitrary memory corruption."

No, one couldn't! One has contrived a program that hardcodes precisely the condition one wants to achieve. In doing so, one hasn't even demonstrated even one of the two predicates for a memory corruption vulnerability (attacker control of the data, and attacker ability to place controlled data somewhere advantageous to the attacker).

What the author is doing is demonstrating correctness advantages of Rust using inappropriate security framing.

weinzierl

"What's happening here, as happens so often in other situations, is that a term of art was created to describe something complicated; [..] Later, people uninvolved with the popularization of the term took the term and tried to define it from first principles, arriving at a place different than the term of art."

Happens all the time in math and physics but having centuries of experience with this issue we usually just slap the name of a person on the name of the concept. That is why we have Gaussian Curvature and Riemann Integrals. Maybe we should speak of Jung Memory Safety too.

Thinking about it, the opposite also happens. In the early 19th century "group" had a specific meaning, today it has a much broader meaning with the original meaning preserved under the term "Galois Group".

Or even simpler: For the longest time seconds were defined as fraction of a day and varied in length. Now we have a precise and constant definition and still call them seconds and not ISO seconds.

lenkite

How does Java "fail" to be memory safe by the definition used by the author ? Please give an example.

elktown

The older I get, the more I see these kinds of threads the way I see politics: exaggerate your "opponents'" weaknesses, underplay or ignore their strengths, and so on. So if something, no matter how disproportionate, can be construed to be, or be associated with, a current zeitgeist with a negative sentiment, it's an opportunity to gain ground.

I really don't understand why people get so obsessed with their tools that it turns into a political battleground. It's a means to an end. Not the end itself.

FiloSottile

I have never seen real Go code (i.e. not code written purposefully to be exploitable) that was exploitable due to a data race.

This doesn’t prove a negative, but is probably a good hint that this risk is not something worth prioritizing for Go applications from a security point of view.

Compare this with C/C++ where 60-75% of real world vulnerabilities are memory safety vulnerabilities. Memory safety is definitely a spectrum, and I’d argue there are diminishing returns.

stouset

Maintenance in general is a burden much greater than CVEs. Exploits are bad, certainly, but a bug not being exploitable is still a bug that needs to be fixed.

With maintenance being a "large" integer multiple of initial development, anything that brings that factor down is probably worth it, even if it comes at an incremental cost in getting your thing out the door.

9rx

> but a bug not being exploitable is still a bug that needs to be fixed.

Does it? Not every bug needs to be fixed. I've never seen a data race bug in documented behaviour make it past initial development.

I have seen data races in undocumented behaviour in production, but as it isn't documented, your program doesn't have to do that! It doesn't matter if it fails. It wasn't a concern of your program in the first place.

That is still a problem if an attacker uses undocumented behaviour to find an exploit, but when it is benign... Oh well. Who cares?

null

[deleted]

LtWorf

I have! What do i win?

EE84M3i

Was it open source? Would be interested to know more.

LtWorf

Yeah, reading binary files in Go with an mmap library, where the whole file is based on offsets pointing to other sections of the file. A damaged file or a programming error means a segfault.

crawshaw

Memory safety is a big deal because many of the CVEs against C programs are memory safety bugs. Thread safety is not a major source of CVEs against Go programs.

It’s a nice theoretical argument but doesn’t hold up in practice.

nine_k

A typical memory safety issue in a C program is likely to generate an RCE. A thread-safety issue that leads to a segfault can likely only lead to a DoS attack, unpleasant but much less dangerous. A race condition can theoretically lead to more powerful attacks, but triggering it should be much harder.

SkiFire13

A thread-safety issue does not always lead to a segfault. Here it did because the address written was 42, but if you somehow manage to obtain the address of some valid value then you could read from that instead, and not cause an immediate segfault.

I agree with the sentiment that data races are generally harder to exploit, but it _is possible_ to do.

okanat

It depends on what threads can do. Threads share memory with other threads and you can corrupt the data structure to force the other thread to do an unsafe / invalid operation.

It can be as simple as changing the size of a vector from one thread while the other one accesses it. When executed sequentially, the operations are safe. With concurrency all bets are off. Even with Go. Hence the argument in TFA.
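
A hedged Go rendering of that "resize while another thread reads" case (demo only, made-up names; it may or may not crash on a given run, but `go run -race` will flag it):

    package main

    func main() {
      data := []int{1, 2, 3}

      go func() {
        for i := 0; i < 1_000_000; i++ {
          data = append(data, i) // may reallocate and rewrite the 3-word slice header
        }
      }()

      sum := 0
      for i := 0; i < 1_000_000; i++ {
        if n := len(data); n > 0 {
          sum += data[n-1] // a torn header (new len, old array) can read out of bounds
        }
      }
      _ = sum
    }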

crawshaw

All bets aren’t off, we empirically measure the safety of software based on exploits. C memory handling is most of its exploits.

Show me the exploits based on Go parallelism. This issue has been discussed publicly for 10 years yet the exploits have not appeared. That’s why it's a nice theoretical argument but does not hold up in practice.

stouset

A CVE is worse, but a threading bug resulting in corrupted data or a crash is still a bug that needs someone to triage, understand, and fix.

crawshaw

But it's not why I stopped writing C programs. It's just a bug and I create and fix a dozen bugs every day. Security is the only argument for memory safety that moves mountains.

kllrnohj

This isn't arguing about exploit risks of the language but simply whether or not it meets the definition of memory safe. Go doesn't satisfy the definition, so it's not memory safe. It's quite black & white here.

Nice strawman though

qcnguy

The point being made is sound, but I can never escape the feeling that most concurrency discussion in programming language theory is ignoring the elephant in the room. The concurrency bugs that matter in most apps are all happening inside the database due to lack of proper locking, transactions or transactional isolation. PL theory ignores this, and so things like Rust's approach to race freedom end up not mattering much outside of places like kernels. A Rust app can avoid use of unsafe entirely and still be riddled with race conditions, because all the data that matters is in an RDBMS and someone forgot a FOR UPDATE in their SELECT clause.
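
(For readers who haven't been bitten by this: a sketch of the class of bug being described, using Go's database/sql with Postgres-style placeholders; the schema and function names are invented. Without the FOR UPDATE, two concurrent requests can both read the same balance and both write back a stale value.)

    package example

    import (
      "context"
      "database/sql"
      "fmt"
    )

    func withdraw(ctx context.Context, db *sql.DB, id, amount int64) error {
      tx, err := db.BeginTx(ctx, nil)
      if err != nil {
        return err
      }
      defer tx.Rollback() // no-op after a successful Commit

      var balance int64
      // FOR UPDATE locks the row until the transaction ends, serializing
      // concurrent withdrawals against the same account. Forget it and the
      // read-modify-write below silently races.
      err = tx.QueryRowContext(ctx,
        "SELECT balance FROM accounts WHERE id = $1 FOR UPDATE", id,
      ).Scan(&balance)
      if err != nil {
        return err
      }
      if balance < amount {
        return fmt.Errorf("insufficient funds")
      }
      if _, err := tx.ExecContext(ctx,
        "UPDATE accounts SET balance = $1 WHERE id = $2", balance-amount, id,
      ); err != nil {
        return err
      }
      return tx.Commit()
    }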

layer8

What’s worse, even if you use proper transactions for everything, it’s hard to reason about visibility and data races when performing SQL across tables, or multiple dependent SQL statements within a transaction.

norir

The sad thing is that most languages with threads have a default of global variables and unrestricted shared memory access. This is the source of the vast majority of data corruption and races. Processes are generally a better concurrency model than threads, but they are unfortunately too heavyweight for many use cases. If we defaulted to message passing all required data to each thread (either by always copying or tracking ownership to elide unnecessary copying), most of these kinds of problems would go away.

In the meantime, we thankfully have agency and are free to choose not to use global variables and shared memory even if the platform offers them to us.

kibwen

> The sad thing is that most languages with threads have a default of global variables and unrestricted shared memory access. This is the source of the vast majority of data corruption and races. Processes are generally a better concurrency model than threads

Modern languages have the option of representing thread-safety in the type system, e.g. what Rust does, where working with threads is a dream (especially when you get to use structured concurrency via thread::scope).

People tend to forget that Rust's original goal was not "let's make a memory-safe systems language", it was "let's make a thread-safe systems language", and memory safety just came along for the ride.

tialaramex

Originally Rust was something altogether different. Graydon has written about that extensively. Graydon wanted tail calls, reflection, more "natural" arithmetic with Python-style automatic big numbers, decimal for financial work, and so on.

The Rust we have from 1.0 onwards is not what Graydon wanted at all. Would Graydon's language have been broadly popular? Probably not, we'll never know.

kibwen

Even in pre-1.0 Rust, concurrency was a primary goal; there's a reason that Graydon listed Newsqueak, Alef, Limbo, and Erlang in the long list of influences for proto-Rust.

nine_k

While at it, I suppose it's straightforward to implement arbitrary-precision integers and decimals in today's Rust; there are several crates for that. There's also a `tailcall` crate that apparently implements TCO [1].

[1]: https://docs.rs/tailcall/latest/tailcall/

zozbot234

Message passing can easily lead to more logical errors (such as race conditions and/or deadlocks) than sharing memory directly with properly synchronized access. It's not a silver bullet.

umpalumpaaa

100%.

Some more modern languages - eg. Swift – have "sendable" value types that are inherently thread safe. In my experience some developers tend to equate "sendable" / thread safe data structures with a silver bullet. But you still have to think about what you do in a broader sense… You still have to assemble your thread safe data structures in a way that makes sense, you have to identify what "transactions" you have in your mental model and you still have to think about data consistency.

advisedwang

Wow that's a really big gotcha in go!

To be fair though, go has a big emphasis on using its communication primitives instead of directly sharing memory between goroutines [1].

[1] https://go.dev/blog/codelab-share

TheDong

Even if you use channels to send things between goroutines, go makes it very hard to do so safely because it doesn't have the idea of sendable types, ownership, read-only references, and so on.

For example, is the following program safe, or does it race?

    package main

    import (
      "bytes"
      "fmt"
    )

    func processData(lines <-chan []byte) {
      for line := range lines {
        fmt.Printf("processing line: %v\n", line)
      }
    }

    func main() {
      lines := make(chan []byte)
      go processData(lines)

      var buf bytes.Buffer
      for range 3 {
        buf.WriteString("mock data, assume this got read into the buffer from a file or something")
        lines <- buf.Bytes()
        buf.Reset()
      }
    }
The answer is of course that it's a data race. Why?

Because `buf.Bytes()` returns the underlying memory, and then `Reset` lets you re-use the same backing memory, and so "processData" and "main" are both writing to the same data at the same time.

In rust, this would not compile because it is two mutable references to the same data, you'd either have to send ownership across the channel, or send a copy.

In Go, it's confusing. If you use `bytes.Buffer.ReadBytes('\n')` you get a copy back, so you can send it. Same for `bytes.Buffer.String()`.

But if you use `bytes.Buffer.Bytes()` you get something you can't pass across a channel safely, unless you also never use that bytes.Buffer again.

Channels in rust solve this problem because rust understands "sending" and ownership. Go does not have those things, and so they just give you a new tool to shoot yourself in the foot that is slower than mutexes, and based on my experience with new gophers, also more difficult to use correctly.

Mawr

> In Go, it's confusing. If you use `bytes.Buffer.ReadBytes('\n')` you get a copy back, so you can send it. Same for `bytes.Buffer.String()`.

>

> But if you use `bytes.Buffer.Bytes()`

If you're experienced, it's pretty obvious that a `bytes.Buffer` will simply return its underlying storage if you call `.Bytes()` on it, but will have to allocate and return a new object if you call say `.String()` on it.

> unless you also never use that bytes.Buffer again.

I'm afraid that's concurrency 101. It's exactly the same in Go as in any language before it, you must make sure to define object lifetimes once you start passing them around in concurrent fashion.

Channels are nice in that they model certain common concurrency patterns really well - pipelines of processing. You don't have to annotate everything with mutexes and you get backpressure for free. But they are not supposed to be the final solution to all things concurrency and they certainly aren't supposed to make data races impossible.

> Even if you use channels to send things between goroutines, go makes it very hard to do so safely

Really? Because it seems really easy to me. The consumer of the channel needs some data to operate on? Ok, is it only for reading? Then send a copy. For writing too? No problem, send a reference and never touch that reference on our side of the fence again until the consumer is done executing.

Seems about as hard to understand to me as the reason why my friend is upset when I ate the cake I gave to him as a gift. I gave it to him and subsequently treated it as my own!

Such issues only arise if you try to apply concurrency to a problem willy-nilly, without rethinking your data model to fit into a concurrent context.

Now, would the Rust approach be better here? Sure, but not if that means using Rust ;) Rust's fancy concurrency guarantees come with the whole package that is Rust, which as a language is usually wildly inappropriate for the problem at hand. But if I could opt into Rust-like protections for specific Go data structures, that'd be great.

hu3

That code would never pass a human pull request review. It doesn't even pass AI code review with a simple "review this code" prompt: https://chatgpt.com/share/68829f14-c004-8001-ac20-4dc1796c76...

"2. Shared buffer causes race/data reuse You're writing to buf, getting buf.Bytes(), and sending it to the channel. But buf.Bytes() returns a slice backed by the same memory, which you then Reset(). This causes line in processData to read the reset or reused buffer."

I mean, you're basically passing a pointer to another thread to processData() and then promptly trying to do stuff with the same pointer.

nemothekid

If you are familiar with the internals of bytes/buffer you would catch this. But it would be great for the compiler to catch this instead of a human reviewer. In Rust, this code wouldn't even compile. And I'd argue even in C++, this mistake would be clearer to see in just the code.

TheDong

> I mean, you're basically passing a pointer to another thread to processData()

And yet, "bytes.Buffer.ReadBytes(delim)" returns a copy of the underlying data which would be safe in this context.

The type system does not make it obvious when this is safe or not, and passing pointers you own across channels is fine and common.

> That code would never pass a human pull request review

Yes, that was a simplified example that a human or AI could spot.

When you actually see this in the wild, it's not a minimal example, it's a small bug in hundreds of lines of code.

I've seen this often enough that it obviously does actually happen, and does pass human code review.

Mawr

Wow who knew concurrency is hard!

This isn't anything special, if you want to start dealing with concurrency you're going to have to know about race conditions and such. There is no language that can ever address that because your program will always be interacting with the outside world.

zozbot234

Real-world golang programs share memory all the time, because the "share by communicating" pattern leads to pervasive logical problems, i.e. "safe" race conditions and "safe" deadlocks.

jrockway

I am not sure sync.Mutex fixes either of these problems. Press C-\ on a random Go server that's been up for a while and you'll probably find 3000 goroutines stuck on a Lock() call that's never going to return. At least you can time out channel operations:

    select {
    case <-ctx.Done():
      return context.Cause(ctx)
    case msg := <-ch:
      ...
    }

aatd86

Why does it segfault? Just because the integer used isn't a cleverly chosen value that would be valid when used as an address?

Just wondering.

Realistically that would be quite rare, since it is obvious that this is unprotected shared mutable access. But it's interesting that such a conversion can happen without unsafe. If it segfaults all the time, though, then we still have memory safety I guess.

The article is interesting but I wish it would try to provide ideas for solutions then.

codys

Curiously, Go itself is unclear about its memory safety on go.dev. It has a few references to memory safety in the FAQ (https://go.dev/doc/faq#Do_Go_programs_link_with_Cpp_programs, https://go.dev/doc/faq#unions) implying that Go is memory safe, but never defines what those FAQ questions mean with their statements about "memory safety". There is a 2012 presentation by Rob Pike (https://go.dev/talks/2012/splash.slide#49) where it is stated that go is "Not purely memory safe", seeming to disagree with the more recent FAQ. What is meant by "purely memory safe" is also not defined. The Go documentation for the race detector talks about whether operations are "safe" when mutexes aren't added, but doesn't clarify what "safe" actually means (https://go.dev/doc/articles/race_detector#Unprotected_global...). The git record is similarly unclear.

In contrast to the Go project itself, external users of Go frequently make strong claims about Go's memory safety. fly.io calls Go a "memory-safe programming language" in their security documentation (https://fly.io/docs/security/security-at-fly-io/#application...). They don't indicate what a "memory-safe programming language" is. The owners of "memorysafety.org" also list Go as a memory safe language (https://www.memorysafety.org/docs/memory-safety/). This latter link doesn't have a concrete definition of memory safety either, but is kind enough to provide a non-exhaustive list of example issues, one of which ("Out of Bounds Reads and Writes") is shown by the article in this post to be something not guaranteed by Go, indicating memorysafety.org may wish to update their list.

It seems like at the very least Go and others could make it more clear what they mean by memory safety, and the existence of this kind of error in Go indicates that they likely should avoid calling Go memory safe without qualification.

ralfj

> Curiously, Go itself is unclear about its memory safety on go.dev.

Yeah... I was actually surprised by that when I did the research for the article. I had to go to Wikipedia to find a reference for "Go is considered memory-safe".

Maybe they didn't think much about it, or maybe they enjoy the ambiguity. IMO it'd be more honest to just clearly state this. I don't mind Go making different trade-offs than my favorite language, but I do mind them not being upfront about the consequences of their choices.

phire

The definition kind of changed.

At the time Go was created, it met one common definition of "memory safety", which was essentially "have a garbage collector". And compared to c/c++, it is much safer.

ralfj

> it met one common definition of "memory safety", which was essentially "have a garbage collector"

This is the first time I hear that being suggested as ever having been the definition of memory safety. Do you have a source for this?

Given that except for Go every single language gets this right (to my knowledge), I am kind of doubtful that this is a consequence of the term changing its meaning.

codys

That seems contradicted by Rob Pike's 2012 statement in the linked presentation, which is one of the places where Go is called "not purely memory safe". That would have been early, and Go was not called memory safe then. It seems like calling Go memory safe is a more recent thing rather than a historical thing.

phire

Keep in mind that the 2012 presentation dates to 10 months after Rust's first release, and its version of "Memory Safety" was collecting quite a bit of attention. I'd argue the definition was already changing by this point. It's also possible that Go was already discovering that their version of "Memory Safety" just wasn't safe enough.

If you go back to the original 2009 announcement talk, "Memory Safety" is listed as an explicit goal, with no carveouts:

"Safety is critical. It's critical that the language be type-safe and that it be memory-safe."

"It is important that a program not be able to derive a bad address and just use it; That a program that compiles is type-safe and memory-safe. That is a critical part of making robust software, and that's just fundamental."

https://youtu.be/rKnDgT73v8s?t=463