Go Data Structures (2009)

84 comments

·February 5, 2025

zellyn

The most important thing to know is that Go slices are exactly how you would implement a dynamically growable slice in C if I asked you to. No less, but importantly, no more: a pointer to a piece of memory, and a current length and capacity.

The classic slice mistake is to forget that append doesn't necessarily return a completely new slice backed by a disjoint piece of memory, especially easy if you've been using a language where variables are immutable by default.

The problem is that it will often accidentally work; if you over-run the backing store, you will get a new piece of memory:

    a := make([]int, 1, 1) // a is backed by a one-element-long piece of memory
    a[0] = 42
    b := append(a, 17) // this must allocate a new piece of memory
    c := append(a, 37) // this must also allocate a new piece of memory

    // At this point a == [42], b == [42, 17], c == [42, 37]

The problem is when the existing capacity is sufficient:

    a := make([]int, 1, 10) // a is backed by a ten-element-long piece of memory
    a[0] = 42
    b := append(a, 17) // no need to allocate: just use the existing memory
    c := append(a, 37) // no need to allocate: just (re)use the existing memory

    // At this point a == [42], b == [42, 37], c == [42, 37]
    //                          ^^^^^^^^^^^^^

ustad

Go's slice behavior reveals a concerning trade-off in language design where performance was prioritized over predictability. Consider this trap:

    a := make([]int, 1, 10)
    a[0] = 42
    b := append(a, 17)
    c := append(a, 37)
    // Surprise! b and c both equal [42, 37]

The fact that append()'s behavior changes based on a hidden property (capacity) violates what I consider a core principle of language design: fundamental operations should be predictable and easy to reason about without having to think about implementation details.

While I understand the performance benefits of reusing memory, language fundamentals shouldn't require developers to constantly think about backing array capacity to avoid bugs. This is especially problematic because it "usually works" until it doesn't - the worst kind of bug.

Modern languages have shown us better ways. Rust makes ownership explicit. Immutable-by-default languages prevent such surprises entirely. Even if Go wanted to maintain performance, they could have made this behavior more explicit - perhaps with separate "append_safe" and "append_reuse" functions.

Programming language fundamentals should be like good UI design - they should do what you expect them to do. When core operations have surprising edge cases, it creates cognitive overhead that ripples through all code written in that language.

never_inline

Usually the idiom is using the same variable on LHS.

``` a = append(a, elem) ```

If you assign it to anything else, `a` might still remain with the older slice memory which will cause worse problems than equality comparison.

So the kind of code you wrote is hardly ever written in production.

The advantage of having this is avoiding one more heap allocation for slice, which is a good tradeoff if you ask me.

deergomoo

In production almost every

`foo, err := callSomeFn()`

is followed by

``` if err != nil { return err } ```

too; it’s the ones that get missed that are the problem because convention will do nothing to prevent silly mistakes.

Like many things in Go, there are steps the language could take but chooses not to. For example, Swift implements copy-on-write for variable-length arrays. I’m not qualified to comment on whether the Go team is right or wrong in their decisions, but these aren’t unsolved problems.

ternaryoperator

Honestly, I couldn't believe that this code works as you stated it does. But I ran it in the Playground[17] and sure enough! That is exactly what it produces. Ugh!

[37] https://go.dev/play/p/rO7rlvK7Dna

lolpanda

oh i never thought about y = append(x, ...) in which scenario would make sense to have different x and y?

neillyons

wow.

kccqzy

I'd argue that the Go example is worse than the equivalent in C, because Go has a garbage collector and C does not. In C, precisely because of the lack of a garbage collector, people need to explicitly and carefully document who is responsible for deallocating the memory. This makes things clear who is owning that memory. And this by extension leads to an intuition on whether the append is acceptable on the original slice or not. If a C function returns such a slice and is documented that the caller should free it, you can call realloc and add to it; if the C function documents that the caller should not free it, you naturally wouldn't even try to append to it: you would copy it first.

In Go, this garbage collector frees people from thinking about freeing memory. But it doesn't free people from thinking about owned vs borrowed memory. And that's where bugs can happen.

zellyn

I happen to like the way Go did it, but now that generics and iterators have landed, it should be possible to create an always-make-a-copy vector type (although it'll need to use `get` and `set` like Java Lists, not [42] like slices/arrays).

It's not like you could have appended twice to the same List in Java and expected to get two disjoint arrays, unless you copy first, and `slices.Clone()` makes that easy in Go now too.

kbolino

This isn't really a practical problem.

If you created the slice, you control it and you can modify its elements, append to it, etc.

If you didn't create the slice, you don't control it. If you want to modify it or append to it, you should copy it first.

This has reflected how I've seen Go used professionally by people experienced in the language. The language could offer more help (but probably never will), but it's not hardly a regression from C. The real regression from C is the lack of const pointers.

pjmlp

And enums,

   type Colours (Red, Blue, Green)

   //....
   c := Colours.Blue
   a := pred(c)
   b := succ(c)
   
   for x := range Colours { 
      //....
   }

1976's Pascal style is unthinkable to ever land in Go, but we have iota and const, lets not complain.

kccqzy

The practical problem is about transferring the control. Without it, you end up doing way too much defensive copying.

BoingBoomTschak

That's not the "classic slice mistake", that's slices being one of the stupidest things Go ever decided upon. In C++, that'd be the equivalent of merging std::vector<T> and std::span<T, std::dynamic_extent> into a single structure for "reasons" ("simplicity"?).

https://build-your-own.org/blog/20241125_go_slice_surprise/ goes a bit further, but this is really the point where I decided I wouldn't be able to stand Go.

zellyn

In the period when Go was ready for production use and Rust wasn't, I couldn't wait for Rust to be ready, so that the response to everyone hating Go for not being a C++ equivalent could just be, "I think you want Rust."

I think you want Rust.

wavemode

It's not clear to me why I would need a "dynamically growable slice". That's not a slice at that point, it's an vector aka ArrayList.

If all I need is a slice of memory, I'd rather only have to pass around the two pieces of data (pointer and length) than all three.

tialaramex

I think of this as a failed experiment. Like coal powered cars or when we thought maybe transatlantic travel by airship was the Right Thing. It may be hard to know what'll happen without trying.

The growable array type (what you call "vector") is a venerable data type, although it does still have parameters you might reasonable tweak e.g. what's the correct growth factor? Doubling is popular but e.g. Zig chooses new_size = (old_size * 1.5) + 8 and you'll see disagreement about what API should be offered and why - e.g. C++ std::vector has the wrong reservation API

But this thing clearly isn't a mistake, it's intentionally a hybrid, and if it had been a huge success maybe everybody else would scramble to copy it.

slashdev

Indeed, this is how rust handles it.

A mutable String or Vec has all three fields, but the borrowed slice or str has just pointer and length.

Go chooses to be simpler by not having those kinds of separations, which also makes it a little less efficient under some circumstances.

zellyn

Yes. "dynamically growable slice" : "vector" :: "tomato" : "to-mah-to"

tialaramex

The growable array type is an owning container, whereas a slice (in C++ and C# span) is a reference type, it doesn't actually own anything.

This type isn't either of those things, it's a hybrid, and Go tries to get away with that because it's a garbage collected language so people don't need to care as much about ownership.

jas39

Very disturbing how append sometimes decouples, sometimes not.

it’s a good read, but i think it should focus more on some of the common mistakes that people make when slicing a slice.

rednafi

Slicing a slice is full of gotchas. I tend to forget all the rules and avoid it whenever I can.

adonovan

A slice operation s[i:] seems like it should be little more than an ADD instruction for a registerized slice where i is known to be in bounds, but a surprising little detail is that when i==cap(s) we really don't want to create an empty slice whose data pointer points one past the end of the original array, as this could keep some arbitrary object live. So the compiler generates a special branch-free sequence (NEG;SUB;ASR;AND) to compute the correct pointer increment, ((i - cap) >> 63) & (8 * i).

https://go.dev/play/p/J2U4djvMVoY

rendaw

Appending a slice is also full of gotchas. Sometimes it modifies the slice in place, sometimes it reallocates and copies.

pstuart

Only really a gotcha if you pass a slice into a function and expect to see modifications in that slice after the function completes. It's helpful to remember that Go passes by value, not reference.

That can be addressed by passing the slice as a pointer: https://go.dev/play/p/h9Cg8qL9kNL

bborud

It is a suprisingly hard thing to implement well. I have no idea how many times I have implemented slice-like things in C (back in the 1990-2000s when I mostly wrote C) and it was one of those things I never managed to be happy with.

pjmlp

Something that even Mesa and Modula-2 already supported by 1980's, maybe one day C will eventually get proper slices.

neonsunset

It can be done but that requires a better, more expressive type system.

fmbb

Can you link some article about these common mistakes? Sounds like good learnings can be had.

WolfCop

(2009)

rednafi

It's Go we're talking about. Other than 64-bit the dominant word size, nothing much has changed.

4ad

The interface layout has changed since the article (although this specific article doesn't mention interfaces, a later article in the series does). Additionally Go now has generics.

It's true that little has changed, but very little is changing in the data representation of any language, really. Even ones that are evolving rapidly.

Cthulhu_

This is a truth that's easily overlooked; most languages are several levels beyond basic types to the point that people forget about the low level constructs involved. This is one reason why I like Go, it exposes and educates on fairly low-level mechanisms that are not unfamiliar to anyone who's studied computer science, but at the same time you don't have to worry too much about the lower level stuff like memory, pointers, zeroing, etc. I think it's a good tradeoff.

mseepgood

[flagged]

Cthulhu_

HN is not primarily a news site though, despite the name.

null

[deleted]