A compact bitset implementation used in Ocarina of Time save files
30 comments
·July 4, 2025orlp
> The most novel aspect of OoT bitsets is that the first 4 bits in the 16-bit coordinate IDs index which bit is set. For example, 0xA5 shows that the 5th bit is set in the 10th word of the array of 16-bit integers. This only works in the 16-bit representation! 32 bit words would need 5 bits to index the bit, which wouldn't map cleanly to a nibble for debugging.
There is nothing novel about this really. It's neat that it works with hexadecimal printing to directly read off the sub-limb index but honestly who cares about that.
Outside of that observation there's no advantage to 16-bit limbs and this is just a bog-standard bitset where the first k bits indicate the position within the 2^k bit limb, and the remaining bits give you the limb index.
jb55
updated for the hacker news crowd
https://github.com/jb55/oot_bitset#:~:text=Hacker%20news%20f...
lpribis
Haha, good to take the classic HN condescension on the chin ;)
gurkwart
To give some counterweight to the general sentiment here: I love this one! Thanks for sharing. Similar to yourself, I've been using bitsets for decades (mostly std::bitset, in addition to some older third-party libs and self-made implementations), and still found this very kool.
It's not about the bitset, duh. It's about how to organize and think about your flags. The small visualization grid is fantastic for debugging, and the `word:bit` structure lends itself directly to organize your data in categories and display them as such.
jamesu
Reminds me of the extensive use of bitsets in Starfox Adventures. Not only was object spawning tied to flags, many dynamic effects were too so if you prodded the flags in memory often a lot of funky things would happen like explosions getting triggered and cutscenes.
Const-me
In C++, std::vector<bool> does pretty much the same thing under the hood.
Sadly, C++ standard library API designers made it impossible to directly access the array of integers stored inside std::vector<bool>. This made serialization/de-serialization of these vectors very inefficient, sometimes prohibitively so.
abcd_f
How's this notable? It’s literally the most straight-forward, trivial bitset implementation.
rixed
After reading your comment, I was expecting some kind of RLE over a bit set. But no, not even that?
So I guess, it is notable because it shows how far coding has been industrialized; ie has been dumbed down so much that a mere bit field can be presented as some esoteric trick.
At this speed AGI is indeed around the corner :-D
pixelpoet
Reminds me of the kerfuffle about https://en.wikipedia.org/wiki/Tai%27s_model, where someone managed to get published what is basically the Trapezoidal Rule for numerical integration, and of course had to name it after themselves.
perching_aix
The world is big and rediscoveries are not always going to be so graceful. If stuff is harder / more annoying to find than to figure it out yourself, you're downright reasonable for intentionally attempting so even.
If I remember correctly, in the "Tia's rule" case specifically, her coworkers even defended her for the whole thing - implying of course that she was being ridiculed for this, making neither side of this story look very pretty if you ask me.
Also why people don't "RTFM". Yes, I can read through the pages of prose you wrote how you implemented e.g. bog standard role based auth. Or I can just brute force try using all the related looking commands and see what happens...
bangonkeyboard
> Tai disagreed that Tai's model is simply the trapezoidal rule, on the basis that her model uses the summed areas of rectangles and triangles rather than trapezoids.
Oh my god.
convolvatron
omg, its so obvious! making computers smart is hard, but making people think less requires no effort at all. I can't believe I missed the plan so thoroughly.
TOGoS
The neat thing here is that they did it without involving three package managers, two Docker containers, and an LLM.
armada651
> The most novel aspect of OoT bitsets is that the first 4 bits in the 16-bit coordinate IDs index which bit is set.
Am I the only one who was trouble parsing this sentence correctly?
coldpie
No, haha, it took me a while, too. "Index" is operating as a verb here.
ethan_smith
It means the first 4 bits of a 16-bit ID determine which bit position (0-15) to set in the bitset, essentially using part of the ID itself as an index into the bitset.
curiousgal
The coordinates IDS are 16-bit long. The first 4 bits index (verb) which bit is set.
a_e_k
Darn. By "compact", I thought there'd be some clever sparsity or compression to get below 1 bit/flag in common cases.
bangonkeyboard
Apparently "compact" here refers to the five lines of actual code.
Just use <bitstring.h>.
tacker2000
How is this in any way novel? People have been doing this since the dawn of time.
jb55
I will remove the novel wording just for the hacker news geniuses. I have been programming for 26 years and have never encountered this pattern, nor could I find it in any libraries, which is why I decided to wrap it up in a library.
If a simple bitset like this exists in a library somewhere I would love to see where! Most implementations I've seen over-complicate it for simple use cases like this.
sumtechguy
You may enjoy this page then. https://graphics.stanford.edu/~seander/bithacks.html
Bunches of bit twiddling things people like to do in different places for either speed/space.
jb55
I have had the experience of explaining to coworkers how bitwise operators even work more than once. I think sometimes people overestimate the average programmers knowledge when it comes to bit operations. modern programming is so detached from having to use that for day to day work.
I am aware of the bithacks page, but I just found encoding the bit coordinate in the ID itself so clever.
brcmthrowaway
So there is no compression?
dazzawazza
No. This might seem useless to non game coders but it's a very common pattern in videogames. The state of the world is frequently 1000s of yes/no data points.
It's not rocket science but this seems like a decent enough implementation of it. The number of data points rarely goes beyond thousands so compression doesn't have much value.
sandinmyjoints
(From a non-game coder) How do teams keep track of which events need to be tracked in state, what their names are, etc.? (Not how as in how is this possible, but rather, what kinds of processes have evolved for this.) Like are state events typically planned out ahead of time, or do devs realize which ones they need as they go? What if I want to add a new event to state, how can I check if someone else has added it -- do I just scan all the names in the code and hope that it was named something that I will recognize?
kvdveer
No compression. In the olden days, storage was scarce. While you could compres the bitset, the worst case compression of that bitset would be larger than the uncompressed size. You likely have no way to guarantee the worst case is impossible, so you'd have to reserve extra space. In the end you'd store more data, in a slower to use format.
These days storage bandwidth is the main limiter, so now compression makes sense. When storage size is the main limiter, compression is ironically not very helpful.
I used to work at a Company where I did parking software and it was based on Borland C++ Builder 6 codebase. Up to 2012, when I left, we were using Bit flags to indicate configuration settings. We would store these settings in a config file as binary and recall it back when the application started up. We stored a ton of configuration in each flag and it was easy to add more.
But in retrospect, it wasn't super great if the disc or installation somehow got corrupted. You would lose your entire configuration and parking operators do not have the best IT.
This way of storing configuration was designed back in the 90s at this Company and it survived over 20 years.