Large language models think too fast to explore effectively

43 comments

·January 31, 2025

xerox13ster

Maybe this is a really really, really, really, really dumb idea:

What if we started refining and reinforcing these extremely fast-thinking models with text written exclusively by stoners?

Stoners think slower, but they have more wide, branching thoughts. I find I do some of my most seemingly creative and explorative thinking when I am high.

If there are effects of time, season, and energy in LLMs (you can say it’s June and it performs worse, or it’s January and it performs better, or you can say it has had a good night’s rest and it does better), it’s because human writing is affected by our mental state, and our mental state is affected by physiological factors.

So then, I wonder what happens if you do RLHF and refinement on text from people under the influence of marijuana.

1970-01-01

That just sounds too much like a low budget movie plot.

One day the stoners witness the birth of AI but forget which system they were logged into. And they're presenting it to the CEO on Monday morning. Dude, where's my LLM?

catlifeonmars

I didn’t read the paper (yet), but the abstract implies that the “thinking too fast” phenomenon is not a consequence of the training data but of the architecture of the models:

> Representational analysis of the models with Sparse Autoencoders revealed that uncertainty and choices are represented at earlier transformer blocks, while empowerment values are processed later

My read is that it’s not so much the content of the training input, but how the models are designed and trained.

digdugdirk

Someone get this person a pile of VC money, stat.

m3kw9

Like train it like “ umm….. uhhhhh… emmmmmm….. “ right?

jeffhuys

Depends on how “trained” in being stoned they are - after 10 years, I could eloquently talk to you, provided I feel comfortable with you knowing I’m stoned.

The brain can adapt and use the THC to its advantage!

dr_dshiv

I have a hard time prompting if I’m not stoned— I just get way better results

User23

Whoah. That's deep. Deep learning! Yeah man, you're totally on to something!

cab11150904

God I hate potheads. It’s like they beg you to dislike them.

codr7

Hate is a choice.

mandmandam

I've noticed that visceral hatred of stoners is a common trait with people who define their success in life almost entirely by their career, and other visible external factors - marriage, kids, house size.

... Maybe stoners know something you don't. Maybe part of you sees that, and seethes.

miningape

Usually just people of the older generation who contrast themselves to the hippie movement of the 60s and 70s who they call stoners. Younger generations don't have the same cultural context so they just see stoners as chill people that smoke instead of drink.

_bin_

marriage and kids are crazy things to stack up against house size

a bigger house and being a stoner are both fairly unfulfilling. a wife and kids are the most meaningful and joyful things a man can have. i think comparing those to a house and suggesting they're unimportant, or no more important than getting stoned, is a great example of why so many of us dislike stoners.

jeffhuys

What makes you dislike them so much?

Jimmc414

Maps well to Kahneman's "Thinking, Fast and Slow" framework

system 1 thinking for early layer processing of uncertainty in LLMs. quick, intuitive decisions, focuses on uncertainty, happens in early transformer layers.

system 2 thinking for later layer processing of empowerment (selecting elements that maximize future possibilities). strategic, deliberate evaluation, considering long-term possibilities, happens in later layers.

system 1 = 4o/llama 3.1

system 1 + system 2 = o1/r1 reasoning models

empowerment calculation seems possibly oversimplified - assumes a static value for elements over a dynamic context-dependent empowerment

interesting that higher temperatures improved performance slightly for system 1 models although they still made decisions before empowerment information could influence them

edit: removed the word "novel". The paper shows early-layer processing of uncertainty vs later-layer processing of empowerment.
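For anyone unsure what the temperature knob does here, a minimal sketch of temperature-scaled softmax. The logits are made up for illustration and nothing below comes from the paper; it only shows that a higher temperature flattens the distribution over choices, so less-favored options get sampled more often.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate choices (hypothetical values).
logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper, greedier
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter, more exploratory
```

Temperature only reshapes the final distribution, which fits the observation that it helps a little without changing when the decision gets made inside the network.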

nonameiguess

I think it goes beyond that. When we say animals learn and humans in particular generate new knowledge by exploration, it's not just system I versus system II thinking. You might legitimately not be able to answer a question just by thinking at all. You have to go out in the world and find the answer. It might be as simple as walking around a building to see what's on the other side, or it might be as complicated as spending a few thousand years learning to build particle accelerators and sensors that can infer what is happening at a subatomic level in these, but you can't just "generate" an answer with sheer brainpower.

This is a lot older than Kahneman. It's the classical split between rationalists and empiricists.

AI maximalists and x-risk death cult fanatics are effectively being rationalists. They don't believe a sufficiently intelligent entity will be bound by the need to do science. It can, by sheer force of thought, figure out the answer to any question, develop a perfect course of action to achieve any goal, as quickly as electricity can traverse its own internal circuits.

It's interesting to see the constraints on how they can even study this in the paper, though. You can only ever ask a computing agent to do what it even can do. If you want it to explore, it needs a toy world that maybe digitally encodes a reasonable stripped-down representation of the real world, but without a body, sensors, and full autonomy, it can't ever truly explore the real real world.

Qwertious

>it can't ever truly explore the real real world.

Would you believe that computers are made out of atoms, and that computer programs modify those atoms? Via electrons, I believe.

hinkley

Some people have been trying to claim this book is based on faulty research. But I definitely know people who don’t believe they work the way the book describes, and the only sane explanation for their behavior is that either they do, or I know an inordinate number of pathological liars.

And I know some neurodiverse people who absolutely believe this is why they are like this. And that’s always a problem with studies. Some things are 10% of the time, it works every time. If you test the general public for mental acuity and caffeine you’re going to see undiagnosed ADHD people benefitting, or find they’re already consuming alarming quantities of the stuff (self medication).

fudged71

Stanovich proposes a three tier model http://keithstanovich.com/Site/Research_on_Reasoning_files/S...

nradov

Modeling an analog system like human cognition into any number of discrete tiers is inherently kind of arbitrary and unscientific. I doubt you could ever prove experimentally that all human thinking works through exactly two or three or ten or whatever number of tiers. But it's at least possible that a three-tier model facilitates building AI software which is "good enough" for many practical use cases.

User23

Funny you should say that because an American guy did that a hundred years ago and nailed it.

He divided reasoning into the two categories corollarial and theorematic.

optimalsolver

> quick, intuitive decisions, focuses on what's novel/unknown

Wouldn't it be the exact opposite? That is, novel stimuli require more extensive processing at higher levels of cognition.

hinkley

Requires != receives. Knee-jerk reactions to novel circumstances get people into trouble all the time. Particularly in impulsive people, who find their actions sometimes don’t align with their values, to the point they can’t be friends with people who are particularly judgemental about actions versus words and won’t accept apologies.

It is a lot of work to retrain your lizard brain to not react in unpleasant ways when the shit hits the fan or you’re exhausted, and the default answer to everything is “no”. There are ways I show up in a crisis that are exemplary, and others that I’m not proud of. Understanding why doesn’t fix it. It’s just step 1, like in AA.

svnt

Predicting language tokens but without logic probably does not map well at all to system 1 thinking, except where fast=conversation.

Jimmc414

the system 1/2 analogy is obviously imperfect for token prediction, but the paper does provide evidence that uncertainty processing in early layers (layer 2) affects decisions before empowerment processing in later layers (layer 72) can influence them. quick decisions based on uncertainty before deeper strategic evaluation does echo how fast thinking can override slow thinking, even if the mechanisms are substantially different.
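To make the two signals concrete, here is a toy sketch of how uncertainty and empowerment can disagree about what to try next. The recipe table, the scoring functions, and the counts are all hypothetical and are not the paper's definitions: uncertainty is a simple count-based novelty bonus, and empowerment is a static count of distinct results an element can help create.

```python
from collections import Counter

# Hypothetical toy recipe table: pair of elements -> resulting element.
RECIPES = {
    frozenset(["water", "fire"]): "steam",
    frozenset(["water", "earth"]): "mud",
    frozenset(["fire", "earth"]): "lava",
    frozenset(["water", "air"]): "rain",
}

def empowerment(element):
    """Static empowerment: how many distinct results the element can help create."""
    return len({result for pair, result in RECIPES.items() if element in pair})

def uncertainty(element, tried_counts):
    """Count-based novelty bonus: elements tried less often score higher."""
    return 1.0 / (1 + tried_counts[element])

# Made-up history of how often each element has been tried so far.
tried = Counter({"water": 5, "fire": 1, "earth": 0, "air": 0})
inventory = ["water", "fire", "earth", "air"]

by_uncertainty = max(inventory, key=lambda e: uncertainty(e, tried))
by_empowerment = max(inventory, key=empowerment)
```

Here the uncertainty-driven pick is an untried element, while the empowerment-driven pick is the element that opens the most combinations. A chooser that commits on the first signal never gets to weigh the second.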

kadushka

From the abstract:

Results show most LLMs underperform compared to humans, except for the o1 model

GaggiX

It's the only reasoning model they evaluated to be honest.

hulitu

> Large language models think

Really ? Can they ask pertinent questions ?

brotchie

Open question for LLMs: do creativity and new ideas come from a process, or are they a laddered emergent capability?

What I mean by this: is the process of coming up with novel ideas a single capability that has to be trained and reinforced?

Or is it a ladder of capabilities of increasing complexity, such that a model that could figure out General Relativity from scratch would not be able to continue the process and perhaps come up with a viable “theory of everything”?

One thing I’ve wanted to do, I’m sure somebody has tried it, is build a dataset to RL a model to be more creative: Get a human expert in a field, have them ask a reasoning model some open questions, and then have the expert look at 20 outputs and rank them by creativity / insight. Have the expert iterate and see how much new “insight” they can mine from the model.

Do this across many fields, and then train a model on these rankings.
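A rough sketch of what one record of that dataset could look like, assuming the expert's ranking gets expanded into pairwise preferences (a common input format for preference-based training, though not the only option). The function name and field names are made up for illustration.

```python
from itertools import combinations

def ranking_to_pairs(prompt, ranked_outputs):
    """Expand a best-to-worst ranking into (prompt, chosen, rejected) records."""
    # combinations() keeps input order, so the first item of each pair ranks higher.
    return [
        {"prompt": prompt, "chosen": better, "rejected": worse}
        for better, worse in combinations(ranked_outputs, 2)
    ]

# Hypothetical expert ranking of three model outputs, most creative first.
pairs = ranking_to_pairs("open question", ["idea A", "idea B", "idea C"])
```

With 20 ranked outputs per question, this yields 190 preference pairs, so a modest number of expert sessions goes a long way.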

Perhaps creativity is a different way of moving in latent space which is “ablated” from existing models because they’re tuned to be “correct” rather than “creative.”

Also curious what techniques there are to sample a reasoning model to deliberately perturb its internal state into more creative realms. Though there’s a fine line between insight and hallucination.

In some respects creativity is hallucination. As a human, you’re effectively internally postulating creative ideas (“hallucinations”), and then one of them “hits” and fires a whole bunch of neurons which indicate: “ok that wild idea actually has grounding and strong connections to the existing knowledge in your brain.”

visarga

I think creativity comes from search, inspiration or discoveries from the environment. Creativity is basically searching and stumbling into novel ideas.

Creativity doesn't come from the brain itself, the brain is just exploring and accumulating experience. Experience builds on itself. The role of the environment is both to spark and to invalidate ideas.

For example, AlphaZero, with just a search-and-learn strategy, could beat humans at our own games. It's not the magic of the model, but of the search loop.
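A bare-bones illustration of the "search loop" idea: propose a variation, let the environment score it, keep it only if it is better. To be clear, this is plain random hill climbing on a made-up scoring function, not AlphaZero's method (which pairs tree search with a learned network); the hidden target and step sizes are hypothetical.

```python
import random

def environment_score(x):
    """Stand-in environment: rewards guesses near a hidden target (hypothetical)."""
    return -abs(x - 42)

def search_loop(steps=2000, seed=0):
    rng = random.Random(seed)
    best = 0
    best_score = environment_score(best)
    for _ in range(steps):
        candidate = best + rng.randint(-5, 5)  # propose a nearby variation
        score = environment_score(candidate)   # the environment validates or rejects it
        if score > best_score:                 # keep only what the environment rewards
            best, best_score = candidate, score
    return best

result = search_loop()
```

The proposer here is as dumb as it gets, and the loop still finds the target, because the environment does the invalidating.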

otabdeveloper4

Large language models don't think at all. Please stop with the crude marketing bullshit.

int_19h

Nobody cares if they "really think" or just "pretend to think", so long as the resulting capability to reason is real (which it is).
