Undergraduate Disproves 40-Year-Old Conjecture, Invents New Kind of Hash Table

115 comments

·March 17, 2025

taylodl

> Krapivin was not held back by the conventional wisdom for the simple reason that he was unaware of it. “I did this without knowing about Yao’s conjecture,” he said.

This idea resonates with me, especially in the context of modern physics. The extensive learning required to make advancements often means that one's thinking is heavily influenced by established theories and teachings, potentially stifling true innovation.

rmah

If I may be so bold, it resonates (with you, with me, with most people) because the idea that one can make revolutionary discoveries that contradicts the status quo as an outsider is an inherently attractive idea. It scratches many emotional itches: tweaking the nose of "the man"; suspicion of institutions; being an unknown hero (before the discovery); discovering one's inherent, but unrecognized, greatness; and more.

But the reality of the matter is that for every Krapivin, there are thousands of other people who tried but did not. And more importantly, there are hundreds of others who did the work, gained the foundational knowledge, and THEN made a discovery that changed how we think about the world even if only a tiny bit. While it's not as romantic, it's the way reality works. Mostly.

It sounds like Krapivin is continuing his education. And hopefully his hard work will hone his brilliance and help him to make even more discoveries in the future.

forbiddenvoid

We can have both. I think we need both. Some problems are better suited to solutions coming from outsiders. Most are not. We don't know which are which.

jl6

Physics skunkworks where the inductees are purposefully shielded from status quo ideas, as a hedge against local maxima.

Pxtl

Yeah, everybody wants to be the Karate Kid who binged karate studies in one montage and then won the tournament instead of training in the dojo for years and still having to cheat to come close to winning (and still failing) like the chumps at Cobra Kai.

phkahler

>> Yeah, everybody wants to be the Karate Kid who binged karate studies in one montage and then won the tournament instead of training in the dojo for years and still having to cheat to come close to winning (and still failing) like the chumps at Cobra Kai.

Somehow I think you missed the lesson of that movie - if there is one. It's been a long time and I can't put it into words well, but I suspect it was about someone with character and humility vs brute force and cheating. Something like that, but far different from underdog find shortcut to success.

campbel

> because the idea that one can make revolutionary discoveries that contradicts the status quo as an outsider is an inherently attractive idea

This might be satisfying if you have a grievance with the "institution" but to OP's point its also interesting because we often limit our areas of exploration to what is conventionally accepted. If you are ignorant of convention you are more likely to retread ground, but less likely to be bounded by it. As you say, conventional wisdom is conventional because its pretty dang good, so this doesn't often pay off.

SirYandi

I find that thought reassuring. There is less of a gap between oneself and the greats than one might think.

casey2

This exact thread was posted the last time this article was posted.

Is this site all bots. Hello? any real humans?!?!

bloqs

Just old farts repeating the same 4 perspective responses to each other without remembering nor caring for the last time they did it

refulgentis

Been here for 15 years, there's been a slight eternal summer / "YCombinator is a Thing Now" effect and a quite dramatic effect of agreeableness being rewarded --- when people hear "agreeableness", they can misread it as kindness --- on HN, a common manifestation of agreeableness is the not-even-wrong idea that physics is stuck.

fragmede

second chance pool? https://news.ycombinator.com/item?id=26998308

forbiddenvoid

That was the first thought I had, too! Glitch in the Matrix, anyone?

jfengel

All of the advances of modern physics were made by people who were well trained and well acquainted with the state of the art.

There is a myth that Einstein was an outsider. He had a degree in physics. He was working as a patent clerk because he couldn't find a job as a high school teacher, not because he didn't know physics.

gaugefield

One of his earliest great works is one that indicated wrong in the foundational aspect of the theory, namely incompatibility with E&M Maxwell equations and Galilean Transformations and E&M equations not being invariant under those transformations. The principle of symmetry is one of the foundation issues in physics. He also had th wisdom of understanding of the physics of new transformation, Lorentz transformation, which we know today as Special Relativity.

Yes, of course he was well-trained and had the enough background, but also the problem at the time was the type of problem that was solvable (i.e. no limitation in terms of tech) and that required new framework with new understanding.

vincekerrazzi

Potential digression here, this is why I absolutely insist on high performing software teams to have at least one junior. You need someone asking questions like this.

r00f

Would it work in this case? Had he asked about this in a team, he would be told about the existing approach and that'd be it. I've been on the both sides, as junior questioning common practices and as senior answering such questions - and it always resulted in transfer of common knowledge, not some breakthrough.

I totally support searching for the new truths, but we must not forget why the phrase "do not roll your own crypto" exists. It is ok, or maybe it even MUST be done by students and researchers, but I am not so sure about juniors working on production systems. Still fine if you work in R&D department

KineticLensman

Yes, just like evil geniuses should have their plans for world domination reviewed by a 5 year old

jsty

Of course not - that would be ridiculous - it's clearly a job for a Mini-Me! ;)

socks

unfortunately, I have a feeling that in the age of LLMs, this junior on the team will have no impetus to actually put in effort and _think_ about such a problem

falconertc

We survived the age of StackOverflow. I don't see why LLM's will be the death of critical thinking where all else has failed so far.

gopher_space

It’s nice to have someone around who doesn’t understand that I’ve given them hard work. Big, open ended tasks like “invent a photogrammetry pipeline”

geodel

I think a presence of even single senior can dampen the questioning part. So I prefer all junior team to make some big path breaking changes.

andrewflnr

Since people are getting a little squirrelly, I think it's important to point out that this discovery was only in contradiction with a conjecture, not something anyone pretended to prove. Conjectures exist to be falsified.

graemep

Somewhat similar to Plank's principle:

"A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die and a new generation grows up that is familiar with it "

smithkl42

Or more simply: "Science progresses one funeral at a time."

oneshtein

There is Manga and anime about this: https://en.wikipedia.org/wiki/Orb:_On_the_Movements_of_the_E...

eviks

This is just a phrase in an article, there is no reason to believe it reflects the reality in any way. For example, the guy faced a sceptical advisor's response to his proposal. You could say "Krapivin was held back by the conventional scepticism of his boss" (which is a real thing vs the mythical power of conventional wisdom). Yet he wasn't.

snarf21

One question I had in college was if learning inhibits our ability to learn new things. We can gain knowledge but is that the same as learning? I'm not sure where the line sits between questioning and obstinance.

Looking back at Theranos, lots of "experts" (at Stanford, etc.) told her her idea was impossible given the small amount of blood but she convinced people she found a way. It was all a fraud but is there a way to do blood testing at that scale? Do we just need people to keep failing and see if anyone ever gets there? Should academia stick to things that aren't known to be "impossible"? I don't have any answers but find it interesting to contemplate.

andrewflnr

Picking Theranos as an example in favor of ignoring conventional wisdom is really a bold choice. I know there are better examples, based on actual results instead of wishful thinking.

fragmede

Avoiding more recent (aka controversial) examples, the best search engines of the early Internet required users to trawl through pages and pages of search results, with the useful answers on page 8 or so. Conventional wisdom was this was as good as it was gonna get, and that's just how search is. That was, of course, until Google came around and totally changed the game.

plorkyeran

Your example of a time where ignoring the "experts" was a good thing was a time when the experts were correct?

9rx

Do you have a better example to set the context up to ask whether blood testing at that scale could ever become possible if enough people kept trying to crack that nut, contrary to the general consensus of infeasibility?

viccis

Martha Argerich has a similar anecdote when it came to learning a very difficult piano piece. Ravel's Gaspard de la nuit is notoriously difficult, even for accomplished pianists. She learned it (along with another tough one) in 5 days because her teacher told her to learn it. She claims that she was helped by the fact that she was unaware that it was supposed to be difficult.

mNovak

I was hoping they'd have a discussion of the algorithm itself. Quanta is usually good about making these sorts of things approachable.

In any case, the full paper is here [1] if you don't want to scroll through the Wired article.

[1] https://arxiv.org/abs/2501.02305

antognini

Quanta did have an article about this discovery: https://www.quantamagazine.org/undergraduate-upends-a-40-yea...

rolandog

Related HN discussion:

https://news.ycombinator.com/item?id=43002511

mNovak

Right, the Wired article is just a 1:1 reprint of the Quanta article. Both lack a substantive description of the algorithms in question

vlovich123

If my reading of the paper was correct, this kind of hash table is incredibly complicated to resize because resizing would invalidate all previously handed out pointers unless you only do chaining?

The other challenge is that computing N hash functions is quite expensive so in practice I think this ends up slower in real world terms despite the big-O speed up?

Does this actually end up beating open addressed hash tables? It seems more like a neat hypothetical CS result that wouldn’t have any actual real world application? That’s cool in and of itself but I’m curious if there’s any real world reason to do this.

ForTheKidz

Still worth it for situations where you can memoize hashes and reuse them. Basically, any situation where you want to intern strings.

augusto-moura

Also worth it for the cases were you already know the maximum size for the table.

vlovich123

You realize that the number of hashes you have to memoize is variable and unknown ahead of time? And depending on the key, you’re adding a decent amount of overhead on memoization vs just computing the hash on the fly?

I’m going to go ahead and challenge you to actually try implementing this hash table and use it for anything where a normal highly optimized hash table won’t beat this (eg try to beat hashbrown from rust / abseil swisstable in c++)

hrdwdmrbl

Likely why the article said that it wouldn’t lead to any immediate applications.

Cause from the headline, one would get all excited about the applications

hullo

original discussion: https://news.ycombinator.com/item?id=43002511

keybored

Top comment there: https://news.ycombinator.com/item?id=43005231

Second-to-top comment here: https://news.ycombinator.com/item?id=43388872

Shows the wide range of ideas we produce here.

null

[deleted]

hinkley

Based on the last two convos I saw on this, I feel like there’s a simpler algorithm here waiting to break out.

Underneath this is a sort of b-tree scented data structure that avoids the bimodal distribution on insertion times, which is important for a concurrent hash table.

herf

If it was his discovery, would be nice if they'd give him first author on the paper's author list (Farach-Colton, Krapivin, Kuszmaul). Though I understand if the proofs were not done by him.

ziddoap

I believe alphabetical is the norm in computer science, is it not?

treefarmer

It is not, other than sometimes in the case of equal contribution. The first and sometimes second authors are the most important, and the last author is often the advisor/senior researcher supervising the work.

senderista

This is not accurate; it depends on the subfield. As a rule, the more theoretical the subfield, the more likely that alphabetical order is used. See e.g. papers from a theoretical conference like STOC vs. a systems conference like HotOS.

abdullahkhalids

If you look at the papers of the third author [1], almost all of them seem to be alphabetical by last name.

[1] https://arxiv.org/search/cs?searchtype=author&query=Kuszmaul...

ziddoap

Interesting! I didn't realize it varied between sub-disciplines of CS, I guess.

Theoretical computer science and cryptography both typically do alphabetical. Maybe because of their adjacency to pure math?

tylerhou

Alphabetical is the norm in algorithms theory. It is not the norm in other subfields that I can think of, even theoretical fields like programming language theory.

austin-cheney

This article strongly reminded me of Heckle Diff, which was first published almost 47 years ago. https://dl.acm.org/doi/10.1145/359460.359467

The performance consideration Paul Heckle identified was in consideration of index access in arrays versus hash tables. Hash tables are accessed randomly, or pseudo-randomly, until the desired index is found where as indexes in an array are accessed in index order.

diyseguy

I wonder if there is a memory consumption tradeoff for this new data structure? Based on a few initial implementations I see in github, looks like it may be significant? Still a nice discovery.

jsbg

what makes the new memory consumption significant? from the paper they break the initial array into log(n) arrays of size 1, 2, 4, 8...

avinassh

I watched the video, but I didn't understand why they create a 'funnel' at each step. Why not create an equal size array? is it because it would take more memory?

yencabulator

Earlier: https://news.ycombinator.com/item?id=43002511