Robust autonomy emerges from self-play

15 comments

·February 7, 2025

nine_k

Can there be "smart toys" for models that help them self-improve in a particularly efficient way?

visarga

Yes, the smart toys are search, code execution, simulations and games.

grandma_tea

Can you expand on that? Efficient in what way?

mitthrowaway2

Something about dreams that fascinates me is that I usually am genuinely surprised by events that occur in dreams. I interact with other characters whose motivation I cannot understand and whose actions I cannot fully anticipate. It feels like there's a foreign entity acting as DM.

This isn't fake surprise. Sometimes I'll wake up and think, "who on earth were those guys and what were they trying to do? And yet their actions make sense..." or, "who came up with that punchline? It's legitimately funny and I never saw it coming, so it can't have been me..."

And yet I know it's all being generated by my own brain somehow. Through some kind of privileged access level.

And then I think about the bicameral brain structure. Does our brain have two halves so that it can function in a self-play training mode during sleep? Is each half of my brain experiencing the same dream from the opposite point of view?

Apologies for the tangent; this is almost totally unrelated to the article and probably something well known to neuroscience for decades. But still, it fascinates me, and the more we learn about the effectiveness of self-play in AI, the more I wonder.

jes5199

I think you may have hit upon a novel combination of ideas here. There is something called "social simulation theory" regarding the purpose of dreams, but I don't think it has a neuroanatomical description included.

vjerancrnjak

It’s a very conscious part of sleep. So who knows what’s actually going on elsewhere? And if you ever made it conscious, would it just be an interpretation by this temporary conscious machinery, inspecting what was previously running without it and without labels?

It is similar to solving problems. You want most of it to happen in unconsciousness, otherwise it’s too slow.

Things are learned when they are natural, without thought.

svnt

I don’t think this requires two halves, although it certainly seems possible that is what is happening.

I believe it only requires that your sensory and post-sensory systems be unpredictably generative when feeding to your subjective sense-making/observer. This could be provided for within a coherent whole brain.

The28thDuck

The concept of being able to simulate 42 years of “experience” in one hour seems so foreign to me. Something about it creeps me out.

ThrowawayTestr

Don't watch the White Christmas Black Mirror episode.

awinter-py

[flagged]

esafak

"Guys, do you think we'll get away with it?"

bloomingkales

Well, it's time I handle my daily biological self-supervised learning.