Large Language Models Think Too Fast to Explore Effectively
12 comments
· January 31, 2025
xerox13ster
cab11150904
God I hate potheads. It’s like they beg you to dislike them.
Jimmc414
Maps well to Kahneman's "Thinking Fast and Slow" framework
system 1 thinking for early layer processing of uncertainty in LLMs. quick, intuitive decisions, focuses on uncertainty, happens in early transformer layers.
system 2 thinking for later layer processing of empowerment (selecting elements that maximize future possibilities). strategic, deliberate evaluation, considering long-term possibilities, happens in later layers.
system 1 = 4o/llama 3.1
system 1 + system 2 = o1/r1 reasoning models
empowerment calculation seems possibly oversimplified - assumes a static value for each element rather than a dynamic, context-dependent one
interesting that higher temperatures improved performance slightly for system 1 models although they still made decisions before empowerment information could influence them
edit: removed the word "novel". The paper shows early-layer processing of uncertainty vs later-layer processing of empowerment.
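A minimal sketch of what temperature does to a choice distribution, assuming plain softmax sampling over candidate scores. The logits and function names are made up for illustration and nothing here is taken from the paper; it only shows that a higher temperature spreads probability across more options, which is one plausible reading of why it nudges exploration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; larger means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical scores for three candidate choices (assumed, not from the paper).
logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, temperature=0.5)
hot = softmax_with_temperature(logits, temperature=2.0)

# The hot distribution puts less mass on the top choice and has higher entropy.
print(cold[0] > hot[0], entropy(hot) > entropy(cold))
```

Temperature only rescales scores that have already been computed, so it changes how often lower-ranked options get picked without changing which information went into ranking them.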
hinkley
Some people have been trying to claim this book is based on faulty research. But I definitely know people who don’t believe they work the way the book describes, and the only sane explanation for their behavior is that either they do, or I know an inordinate number of pathological liars.
And I know some neurodiverse people who absolutely believe this is why they are like this. And that’s always a problem with studies. Some things are 10% of the time, it works every time. If you test the general public for mental acuity and caffeine you’re going to see undiagnosed ADHD people benefitting, or find they’re already consuming alarming quantities of the stuff (self medication).
fudged71
Stanovich proposes a three tier model http://keithstanovich.com/Site/Research_on_Reasoning_files/S...
svnt
Predicting language tokens but without logic probably does not map well at all to system 1 thinking, except where fast=conversation.
Jimmc414
the system 1/2 analogy is obviously imperfect for token prediction, but the paper does provide evidence that uncertainty processing in early layers (layer 2) affects decisions before empowerment processing in later layers (layer 72) can influence them. quick decisions based on uncertainty before deeper strategic evaluation does echo how fast thinking can override slow thinking, even if the mechanisms are substantially different.
optimalsolver
> quick, intuitive decisions, focuses on what's novel/unknown
Wouldn't it be the exact opposite? That is, novel stimuli require more extensive processing at higher levels of cognition.
hinkley
Requires != receives. Knee-jerk reactions to novel circumstances get people into trouble all the time. Particularly in impulsive people, who find their actions sometimes don’t align with their values, to the point they can’t be friends with people who are particularly judgemental about actions versus words and won’t accept apologies.
It is a lot of work to retrain your lizard brain to not react in unpleasant ways when the shit hits the fan or you’re exhausted, and the default answer to everything is “no”. There are ways I show up in a crisis that are exemplary, and others that I’m not proud of. Understanding why doesn’t fix it. It’s just step 1, like in AA.
null
Maybe this is a really really, really, really, really dumb idea:
What if we started refining and reinforcing these extremely fast-thinking models with text written exclusively by stoners?
Stoners think slower, but they have wider, more branching thoughts. I find I do some of my most seemingly creative and explorative thinking when I am high.
There are supposedly effects of time, season, and energy in LLMs: you can say it’s June and it performs worse, or say it’s January and it performs better. Or you can say you’ve had a good night’s rest, so do better, and it does. That would be because human writing is affected by our mental state, and our mental state is affected by physiological factors.
So then, I wonder what happens if you do RLHF and refinement on text from people under the influence of marijuana.