
You could have invented Fenwick trees

28 comments · January 25, 2025

code_biologist

Written by Brent Yorgey — I'll definitely give it a read when I have time. When I was learning Haskell, I found his Typeclassopedia [1] to be the best explanation of Monads, Monoids, Applicative and such. His pedagogical approach was delightful in contrast to all the "monad is like a burrito" tutorials.

[1] https://wiki.haskell.org/wikiupload/e/e9/Typeclassopedia.pdf

vh311

I used them in my paper: https://arxiv.org/pdf/1701.07072

They appear quite naturally in the fermion-qubit mappings we looked at. I worked the data structure out at the time, and only then found out about Fenwick's work.

So I guess I also agree with the title :)

guimplen

While it might be OK to colloquially use the term "Fenwick trees", since it has become the common name, calling it "Fenwick's work" and not mentioning Ryabko is a clear misattribution.

vh311

Sure, by what I wrote above I meant "Fenwick's paper" more than "Fenwick's work" :)

That said, as far as I remember, I did not know about Ryabko. So while it might have been a misattribution, it's far from a clear one.

Did he publish it, btw? If so, do you have a reference?

guimplen

All the references are on the wiki page: https://en.wikipedia.org/wiki/Fenwick_tree

cosmic_quanta

I always feel warm inside when I hear of concepts in separate fields connecting in some way. Very cool

DannyBee

Fenwick trees are really interesting. I was implementing graph layout algorithms a few weeks ago, and it turns out they exactly answer the question of how many edges cross between two layers of a bipartite graph drawn with straight lines [1].

Or, in English: if you have a bunch of graph nodes and edges, and lay the nodes out in horizontal or vertical rows in some fashion, with straight lines between them representing the edges, Fenwick trees tell you how many of the lines you draw run into each other [2].

This is normally an expensive thing to compute (the naive alternative is a standard line-intersection test for every pair of edges). However, it turns out Fenwick trees directly encode and give you the answer.

So this is what ELK and Dagre and friends use to compute the number of edges that cross between layers as they do layout.

[1] see "Simple and Efficient Bilayer Cross Counting"

[2] In general, the graph is more visually pleasing to most people if you minimize the number of lines that cross. So most layout algorithms spend time rearranging the nodes to minimize edge crossings. It's NP-hard to make it optimal, so heuristics are used.
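
To make the trick concrete, here is a minimal sketch of bilayer cross counting in Python (my own code and names, not ELK's or Dagre's): sort the edges by upper endpoint, and the crossings are exactly the inversions in the sequence of lower endpoints, which a Fenwick tree counts one edge at a time.

  def count_crossings(edges, lower_size):
      """edges: (upper_pos, lower_pos) pairs with 0-based positions.
      Straight-line edges (a, b) and (c, d) cross iff a < c and b > d
      (or vice versa), i.e. iff they form an inversion."""
      edges = sorted(edges)                # by upper pos, ties by lower pos
      tree = [0] * (lower_size + 1)        # 1-based Fenwick tree over lower positions

      def add(i):                          # one more edge ends at lower position i
          i += 1
          while i < len(tree):
              tree[i] += 1
              i += i & -i

      def prefix(i):                       # inserted edges with lower position <= i
          i += 1
          total = 0
          while i > 0:
              total += tree[i]
              i -= i & -i
          return total

      crossings = 0
      for seen, (_, low) in enumerate(edges):
          crossings += seen - prefix(low)  # earlier edges ending strictly right of low
          add(low)
      return crossings

  assert count_crossings([(0, 1), (1, 0)], lower_size=2) == 1  # an "X" crosses once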

sundarurfriend

Tangential to the algorithm itself, this is about naming:

> Fenwick trees, also known as binary indexed trees

Every time I read something like this, I'm reminded of https://willcrichton.net/notes/naming-conventions-that-need-... . "Fenwick tree" makes it seem like some unknown strange entity, while "binary indexed tree" immediately makes it a lot more accessible and gives you a better handle on the concept too.

mbb70

I'm reminded of a town in New Mexico named after the railroad paymaster because everyone said "I'm going to see Gallup to get paid" -> "I'm going to Gallup to get paid" -> "I'm going to Gallup".

Short, memorable names will always dominate precise, descriptive ones.

Finding short, memorable, precise and descriptive names is why "naming things is hard".

zeroonetwothree

Meh, I agree in spirit but in practice it’s quite hard to get enough unique names using descriptive terms. I also think there’s some benefit in learning history that comes from naming things after people.

sundarurfriend

In practice, the most it gets is people thinking "I guess some guy named Fenwick was involved somewhere along the way", if even that. I think there's a lot of benefit to learning the history of how things were discovered, but that should be done consciously by making it part of the teaching material. The names make barely any difference.

It's not trivial to come up with descriptive names, but the difficulty is often overstated because people underestimate its importance. The names don't have to be perfectly descriptive, we don't need twenty-syllable names like organic chemical compounds, but every step taken towards making a name descriptive helps a lot. The reality is that names get chosen basically arbitrarily in journals or textbooks, with hardly any thought given to how they will affect generations of students, working professionals, and even future researchers in remembering the concept and looking it up in their mind among the hundreds or thousands of other things they've learnt.

SerCe

I remember trying to memorise the implementation by following the e-maxx guides [1] (back then it was http://e-maxx.ru/) for use in competitive programming contests. They're so simple once you understand them, yet it's pretty easy to forget the details if you haven't written one in a while.

[1]: https://cp-algorithms.com/data_structures/fenwick.html
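
For reference, the whole thing fits in a few lines. A sketch from memory in the style of the cp-algorithms page above (not their exact code):

  class Fenwick:
      """Point update / prefix sum in O(log n), 1-based indices."""

      def __init__(self, n):
          self.tree = [0] * (n + 1)    # tree[i] covers the range (i - (i & -i), i]

      def add(self, i, delta):         # a[i] += delta
          while i < len(self.tree):
              self.tree[i] += delta
              i += i & -i              # climb to the next node covering i

      def prefix_sum(self, i):         # sum of a[1..i]
          s = 0
          while i > 0:
              s += self.tree[i]
              i -= i & -i              # strip the lowest set bit
          return s

      def range_sum(self, lo, hi):     # sum of a[lo..hi]
          return self.prefix_sum(hi) - self.prefix_sum(lo - 1)

  f = Fenwick(8)
  f.add(3, 5)
  f.add(7, 2)
  assert f.range_sum(1, 7) == 7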

celeritascelery

In the same way that B-trees have emerged as a more cache-friendly version of binary trees, it seems like you could create a Fenwick tree that stores multiple “children” at the same location. It would probably be less algorithmically beautiful, but would be faster overall.

kadoban

I am not sure that is worth trying. In practice Fenwick trees are array-encoded: in memory they're just an array you index into. B-trees are used largely because you can't do that with a dynamic binary tree.

celeritascelery

They are array encoded, but you still will be jumping all over the array to do your operations. Putting multiple “nodes” at the same location will mean fewer jumps, and hence fewer cache misses.

o11c

That's what the Eytzinger ordering is for. Using 1-based indexing like the article, a single lookup will only hit the following nodes:

  1
  2 or 3 (let's assume 2)
  4 or 5 (let's assume 5)
  10 or 11
Remember these are all adjacent. So the nodes near the root stay in cache (regardless of whether a path through them in particular was taken), and the other nodes are in cache if you've recently looked up a similar key.

There won't be any further improvement from using a B-tree, which only scatters the memory further. (If anything, you might consider using a higher-base Eytzinger tree for SIMD reasons, but I've never actually seen anyone do this.)
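
For the curious, a minimal sketch of a 1-based Eytzinger search in Python (my own code, illustrating the index walk above): from node i you descend to 2i or 2i + 1, so a lookup touches e.g. 1 -> 2 -> 5 -> 10, and the first few levels of the array stay hot in cache.

  def eytzinger(sorted_vals):
      """Permute sorted values into 1-based Eytzinger (BFS) order."""
      out = [None] * (len(sorted_vals) + 1)
      it = iter(sorted_vals)
      def fill(i):                     # in-order walk of the implicit tree
          if i < len(out):
              fill(2 * i)
              out[i] = next(it)
              fill(2 * i + 1)
      fill(1)
      return out

  def contains(tree, x):
      """Children of node i live at indices 2i and 2i + 1."""
      i = 1
      while i < len(tree):
          if tree[i] == x:
              return True
          i = 2 * i + (x > tree[i])    # go left (2i) or right (2i + 1)
      return False

  t = eytzinger(list(range(100)))
  assert contains(t, 42) and not contains(t, 1000)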

dreamcompiler

Skip lists also have [statistically] sublinear times for insertions and range queries. Both are O(log n).

Fenwick trees seem to use much less memory while not being quite as general-purpose as skip lists.

SethTro

Totally agree with the title. I discovered Fenwick trees as part of solving a Project Euler problem; then, from the problem forum, I found out someone had invented them in the '90s and other people had imagined them much earlier.

plasticeagle

There are several things like that you can accidentally re-invent. I re-invented the trie and Markov chaining whilst at university. What I didn't do is give them a thorough and rigorous description in any kind of publication.

That's quite a big part of the difference, I think.

bowsamic

There’s also a huge difference between accidentally implementing an application of something vs spotting that it is a general principle

kevmo314

I used to think inventing new things was the hard part. Now I'm finding that realizing solutions are novel and noteworthy is a lot harder.

nayuki

When I wrote my implementation, the math felt unintuitive. https://www.nayuki.io/page/binary-indexed-tree
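
For anyone else stuck on the index arithmetic: everything reduces to the lowest set bit, i & -i. A prefix query strips it repeatedly; a point update adds it repeatedly. A quick illustration (my code, not nayuki's):

  def query_path(i):                   # indices visited while summing a[1..i]
      path = []
      while i > 0:
          path.append(i)
          i -= i & -i                  # drop the lowest set bit
      return path

  def update_path(i, n):               # indices visited when a[i] changes
      path = []
      while i <= n:
          path.append(i)
          i += i & -i                  # add the lowest set bit
      return path

  print(query_path(13))                # [13, 12, 8]: 1101 -> 1100 -> 1000
  print(update_path(5, 16))            # [5, 6, 8, 16]: every node covering index 5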


agnishom

This is the beauty of some of the best algorithms. It always feels like "I could have written that paper"

zeroonetwothree

It seems like algorithms from after 2000 or so don't have that property as much. For example: https://web.eecs.umich.edu/~pettie/papers/jacm-optmsf.pdf