Acronymy (Can we define every word as an acronym?)
6 comments
October 25, 2025
jihadjihad
congress [0]
> collection of non governors really exhibiting self service
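For anyone who wants to check a definition like this mechanically, here is a minimal sketch. The helper name `is_acronym_of` is made up for illustration and is not from the site; it only tests that the first letters of the definition spell the word.

```python
def is_acronym_of(word: str, definition: str) -> bool:
    """Return True if the initials of `definition` spell `word` (case-insensitive)."""
    # Take the first character of each whitespace-separated word.
    initials = "".join(part[0] for part in definition.lower().split())
    return initials == word.lower()


if __name__ == "__main__":
    print(is_acronym_of(
        "congress",
        "collection of non governors really exhibiting self service",
    ))
```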
clueless
This is pretty cool. Here's another question: how much language compression would we get if we collapsed every group of related words into a single synonym? Here's what ChatGPT came up with:
> Assume an English-like active vocabulary of V = 50,000 word types (a rough stand-in for “distinct words” in common use). A realistic guess is a ~30% reduction for a more aggressive, embedding-style collapse on typical English text, i.e. collapsing words that point in similar directions in vector space: happy, glad, pleased, delighted → happy
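A toy sketch of what such a collapse could look like, assuming a hand-written synonym map in place of real embedding clusters. The `CANONICAL` table and function names are hypothetical, and the number it prints applies only to the sample sentence. It says nothing about the ~30% figure.

```python
# Hypothetical synonym map; a real system would derive groups from
# embedding similarity rather than a hand-written table.
CANONICAL = {
    "glad": "happy",
    "pleased": "happy",
    "delighted": "happy",
}


def collapse(tokens):
    """Replace each token with its canonical synonym, if one is listed."""
    return [CANONICAL.get(tok, tok) for tok in tokens]


def type_counts(tokens):
    """Return (distinct word types before collapse, distinct types after)."""
    return len(set(tokens)), len(set(collapse(tokens)))


if __name__ == "__main__":
    sample = "i was glad and she was pleased and he was delighted".split()
    before, after = type_counts(sample)
    print(f"{before} types -> {after} types "
          f"({1 - after / before:.0%} fewer distinct words)")
```

Note this counts distinct word types, not bytes saved, so it is only a rough proxy for "compression".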
Nzen
If you want to see examples of this in practice, I recommend reading Randall Munroe's Thing Explainer [0] or some Simple English Wikipedia articles [1].
[0] https://xkcd.com/thing-explainer/
[1] https://simple.wikipedia.org/wiki/Rabbit (versus https://en.wikipedia.org/wiki/Rabbit)
This is the type of important work that transformer LLMs are actually really good at, I think.