The LLMentalist Effect
32 comments · February 8, 2025 · Terr_
bloomingkales
And only to your eyes and those you force your vision onto. The rest of the universe never sees it. You don’t exist to much of the universe (if a tree falls and no one is around to hear it, you understand what I mean).
So you simultaneously exist and don’t exist. Sorry about this, your post took me on this tangent.
JKCalhoun
> 1) The tech industry has accidentally invented the initial stages of a completely new kind of mind, based on completely unknown principles...
> 2) The intelligence illusion is in the mind of the user and not in the LLM itself.
I've felt as though there is something in between. Maybe:
3) The tech industry invented the initial stages of a kind of mind that, though it misses the mark, approaches something not too dissimilar to how an aspect of human intelligence works.
> By using validation statements, … the chatbot and the psychic both give the impression of being able to make extremely specific answers, but those answers are in fact statistically generic.
"Mr. Geller, can you write some Python code for me to convert a 1-bit .bmp file to a hexadecimal string?"
Sorry, even if you think the underlying mechanisms have some sort of analog, there's real value in LLMs; not so with psychics doing "cold readings".
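For what it's worth, one literal reading of that request is only a couple of verifiable lines; a minimal sketch, treating "convert to a hexadecimal string" as dumping the raw file bytes (the function name is made up):

    from pathlib import Path

    def bmp_to_hex(path: str) -> str:
        # Read the raw bytes of the 1-bit .bmp (headers, palette, padded pixel
        # rows and all) and render them as one hexadecimal string.
        return Path(path).read_bytes().hex()

    # e.g. print(bmp_to_hex("glyph.bmp"))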
ianbicking
Yeah, the basic premise is off because LLM responses are regularly tested against ground truth (like running the code they produce), and LLMs don't get to carefully select which requests they fulfill. To the contrary, they fulfill requests even when they are objectively incapable of answering correctly, such as incomplete or impossible questions.
I do think there is a degree of mentalist-like behavior that happens, maybe especially because of the RLHF step, where the LLM is encouraged to respond in ways that seem more truthful or compelling than is justified by its ability. We appreciate the LLM bestowing confidence on us, and rank an answer more highly if it gives us that confidence... not unlike the person who goes to a spiritualist wanting to receive comforting news of a loved one who has passed. It's an important attribute of LLMs to be aware of, but not the complete explanation the author is looking for.
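A toy version of that "tested against ground truth" loop might look like this (a sketch with invented names, not anyone's actual harness; it assumes the model's snippet defines a solve() function):

    def passes_ground_truth(candidate_src, cases):
        # Execute model-produced code and grade it against known answers,
        # rather than by how confident the prose around it sounds.
        namespace = {}
        exec(candidate_src, namespace)  # assumed contract: the snippet defines solve()
        solve = namespace["solve"]
        return all(solve(x) == expected for x, expected in cases)

    # e.g. a "sort this list" request graded against Python's own sorted()
    cases = [([3, 1, 2], [1, 2, 3]), ([5, 5, 0], [0, 5, 5])]
    print(passes_ground_truth("def solve(xs): return sorted(xs)", cases))  # True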
dosinga
This feels rather forced. The article seems to claim both that LLMs don't actually work (it is all an illusion) and that of course the LLMs know everything, since they stole all our work from the last 20 years by scraping the internet and underpaying people to produce content. If it were a con, it wouldn't have to do that. Or in other words, a psychic who had actually memorized the biographies of every person ever wouldn't need their cons.
pona-a
Why would it have to be one or the other? Yes, it's been proven that LLMs do create world models; how good those models are is a separate matter. There still could be goal misalignment, especially when it comes to RLHF.
If the model's internal world model says it likely does not know how to solve a coding question, but the RLHF stage had reviewers rate refusals lower, that in turn forces its hand toward the tricks it knows it can pull based on its model of human reviewers: implement only the surface-level boilerplate and pass that off as a solution, write its code in APL to obfuscate its lack of understanding, or keep misinterpreting the problem into a simpler one.
A psychic who had read ten thousand biographies might start to recall them, or he might fill in the blanks with a generous dose of BS, or more likely do both in equal measure.
swaraj
You should try the ARC-AGI puzzles yourself, and then tell me you think these things aren't intelligent.
https://arcprize.org/blog/openai-o1-results-arc-prize
I wouldn't say it's full AGI or anything yet, but these things can definitely think in a very broad sense of the word.
EagnaIonat
I was hoping the article would talk about how an LLM can resonate with users by using those techniques, or would run some experiments to prove the point. But it is not even that.
There is nothing of substance in this and it feels like the author has a grudge against LLMs.
manmal
Well, they have a book to sell at the bottom of the article.
jbay808
I was interested in this question so I trained NanoGPT from scratch to sort lists of random numbers. It didn't take long to succeed with arbitrary reliability, even given only an infinitesimal fraction of the space of random and sorted lists as training data. Since I can evaluate the correctness of a sort arbitrarily, I could be certain that I wasn't projecting my own beliefs onto its response, and reading more into the output than was actually there.
That settled this question for me.
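For concreteness, the data side of that kind of experiment fits in a few lines; a hypothetical sketch (not necessarily jbay808's exact setup), with NanoGPT then trained character-by-character on the resulting file:

    import random

    def make_example(max_len=8, max_val=99):
        # One training line: an unsorted list, a separator, and its sorted form.
        xs = [random.randint(0, max_val) for _ in range(random.randint(2, max_len))]
        return f"{' '.join(map(str, xs))} -> {' '.join(map(str, sorted(xs)))}"

    # Only a vanishing fraction of all possible lists ever appears here, yet the
    # trained model has to sort held-out lists at evaluation time.
    with open("sort_train.txt", "w") as f:
        for _ in range(100_000):
            f.write(make_example() + "\n")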
dartos
I don’t really understand what you’re testing for?
Language, as a problem, doesn’t have a discrete solution like the question of whether a list is sorted or not.
Seems weird to compare one to the other, unless I’m misunderstanding something.
What’s more, the entire notion of a sorted list was provided to the LLM by how you organized your training data.
I don't know the details of your experiment, but did you note whether the lists were sorted ascending or descending?
Did you compare which kind of sorting was most common in the output and in the training set?
Your bias might have snuck in without you knowing.
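That particular check is cheap to run on both the training file and the model's outputs; a sketch, assuming the same "input -> output" line format as in the data-generation sketch above:

    def direction(nums):
        # Classify a list as ascending, descending, or neither.
        if nums == sorted(nums):
            return "ascending"
        if nums == sorted(nums, reverse=True):
            return "descending"
        return "neither"

    def count_directions(lines):
        # Tally the sort direction of the right-hand side of each "a b c -> a b c" line.
        tally = {"ascending": 0, "descending": 0, "neither": 0}
        for line in lines:
            rhs = [int(tok) for tok in line.split("->")[1].split()]
            tally[direction(rhs)] += 1
        return tally

    # e.g. count_directions(open("sort_train.txt")) vs. the same tally over model samples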
IshKebab
A large number of commenters are under the illusion that LLMs are "just" stochastic parrots and can't generalise to inputs not seen in their training data. He was proving that that isn't the case.
tossandthrow
The commenter is merely saying that LLMs are indeed able to approximate arbitrary functions, exemplified here by sorting.
It is nothing new and has been well established in the literature since the 90s.
The shared article really is not worth the read and mostly reveals an author who does not know what he writes about.
manmal
Have you considered that the nature of numeric characters is just so predictable that they can be sorted without actually understanding their numerical value?
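One cheap way to probe that is to include lists where character-level (lexicographic) order and numerical order disagree, since a model matching on digit characters alone would drift toward the former; a toy illustration:

    nums = [9, 10, 2, 100]

    print(sorted(nums))            # numerical order:     [2, 9, 10, 100]
    print(sorted(map(str, nums)))  # lexicographic order: ['10', '100', '2', '9']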
pama
This is from 2023 and is clearly dated. It is mildly interesting to notice how quickly things have changed since then. Nowadays models can solve original math puzzles much of the time, and it is harder to argue they cannot reason when we have access to R1, o1, and o3-mini.
Terr_
> Nowadays models can solve original math puzzles much of the time
Isn't that usually by not even trying, and delegating the work to regular programs?
bbor
In what way is your mathematical talent truly you, but a Python tool called by an LLM-centric agent not truly that agent?
Terr_
For starters, it means you should not take the success of the math and ascribe it to an advance in the LLM, or whatever phrase is actually being used to describe the new fancy target of hype and investment.
An LLM is, at best, a possible future component of the speculative future being sold today.
How might future generations visualize this? I'm imagining some ancient Greeks, who have invented an inefficient reciprocating pump, which they declare is a heart and that means they've basically built a person. (At the time, they believed the brain was just there to cool the blood.) Look! The fluid being pumped can move a lever: It's waving to us.
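The delegation in question looks roughly like this inside an agent loop; a schematic sketch with made-up names (real tool-calling APIs differ), in which ordinary deterministic code, not the model, does the arithmetic:

    import ast
    import operator as op

    # The "tool": a small, ordinary expression evaluator.
    OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
           ast.Div: op.truediv, ast.Pow: op.pow}

    def calc(expr: str):
        def ev(node):
            if isinstance(node, ast.Constant):
                return node.value
            if isinstance(node, ast.BinOp):
                return OPS[type(node.op)](ev(node.left), ev(node.right))
            raise ValueError("unsupported expression")
        return ev(ast.parse(expr, mode="eval").body)

    # The model's entire contribution: text naming a tool and an argument.
    model_output = {"tool": "calc", "argument": "3**7 + 11*13"}
    print(calc(model_output["argument"]))  # 2330, computed by the tool, not the model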
twobitshifter
Lost me here - “LLMs are not brains and do not meaningfully share any of the mechanisms that animals or people use to reason or think.“
“the initial stages of a completely new kind of mind, based on completely unknown principles, using completely unknown processes that have no parallel in the biological world.”
We just call it a neural network because we wanted to confuse biology with math for the hell of it?
“There is no reason to believe that it thinks or reasons—indeed, every AI researcher and vendor to date has repeatedly emphasised that these models don’t think.”
I mean, just look at the Nobel Prize winners for counterexamples to all of this: https://www.cnn.com/2024/10/08/science/nobel-prize-physics-h...
I don't understand the denialism around replicating minds and thoughts with technology - that has been the entire point from the start.
exclipy
Yeah I was expecting the article to give an argument to back up this claim by talking about the mechanisms behind LLMs and the mechanisms behind human thought and demonstrating a lack of overlap.
But I don't see any discussion of multilayer perceptrons or multi-head attention.
Instead, the rest of the article is just saying "it's a con" with a lot of words.
prideout
I lost interest fairly quickly because the entire article seems to rely on a certain definition of "intelligent" that is not made clear in the beginning.
karmakaze
AlphaGo also doesn't reason. That doesn't mean it can't do things that humans do by reasoning. It doesn't make sense to make these comparisons. It's like saying that planes don't really fly because they aren't flapping their wings.
Edit: Don't conflate mechanisms with capabilities.
olddustytrail
> One of the issues during this research—one that has perplexed me—has been that many people are convinced that language models, or specifically chat-based language models, are intelligent.
Different people have different definitions of intelligence. Mine doesn't require thinking or any kind of sentience, so I can consider LLMs to be intelligent simply because they provide intelligent-seeming answers to questions.
If you have a different definition, then of course you will disagree.
It's not rocket science. Just agree on a definition beforehand.
IshKebab
> But there isn’t any mechanism inherent in large language models (LLMs) that would seem to enable this
Stopped reading here. What is the mechanism in humans that enables intelligence? You don't know? Didn't think so. So how do you know LLMs don't have the required mechanism?
There's another layer here: Humans are being encouraged to confuse a fictional character with the real-world "author" system.
I can create a mad-libs program which dynamically reassembles stories involving a kind and compassionate Santa Claus, but that does not mean the program shares those qualities. I have not digitally reified the spirit of Christmas. Not even if excited human kids contribute some of the words that go into it and shape its direction.
P.S.: This "LLM just makes the document bigger" framing is also very useful for understanding how prompt injection and hallucinations are constant core behaviors, which we just ignore except when they inconvenience us. The assistant-bot in the story can be twisted or can vanish so abruptly because it's just something in a digital daydream.
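The mad-libs point above is easy to make literal; a toy illustration, nothing more:

    import random

    TEMPLATE = "Santa Claus {verb} every child a {adjective} gift, because he is so {trait}."
    WORDS = {
        "verb": ["brings", "promises", "hand-delivers"],
        "adjective": ["wonderful", "thoughtful", "carefully chosen"],
        "trait": ["kind", "compassionate", "generous"],
    }

    def tell_story(seed_words=None):
        # Kids can contribute some of the words, but the kindness in the output
        # belongs to the character in the text, not to this program.
        choices = {slot: random.choice(options) for slot, options in WORDS.items()}
        choices.update(seed_words or {})
        return TEMPLATE.format(**choices)

    print(tell_story({"adjective": "hand-made"}))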