How to Build Conscious Machines

63 comments · June 14, 2025

esafak

Interesting stuff. I don't have time to read a dissertation so I skimmed his latest paper instead: Why Is Anything Conscious? https://arxiv.org/abs/2409.14545

In it he proposes a five-stage hierarchy of consciousness:

0 : Inert (e.g. a rock)

1 : Hard Coded (e.g. protozoan)

2 : Learning (e.g. nematode)

3 : First Order Self (e.g. housefly). Where phenomenal consciousness, or subjective experience, begins. https://en.wikipedia.org/wiki/Consciousness#Types

4 : Second Order Selves (e.g. cat). Where access consciousness begins. Theory of mind. Self-awareness. Inner narrative. Anticipating the reactions of predator or prey, or navigating a social hierarchy.

5 : Third Order Selves (e.g. human). The ability to model the internal dialogues of others.

The paper claims to dissolve the hard problem of consciousness (https://en.wikipedia.org/wiki/Hard_problem_of_consciousness) by reversing the traditional approach. Instead of starting with abstract mental states, it begins with the embodied biological organism. The authors argue that understanding consciousness requires focusing on how organisms self-organize to interpret sensory information based on valence (https://en.wikipedia.org/wiki/Valence_(psychology)).

The claim is that phenomenal consciousness is fundamentally functional, making the existence of philosophical zombies (entities that behave like conscious beings but lack subjective experience) impossible.

The paper does not seem to elaborate on how to assess which stage the organism belongs to, and to what degree. This is the more interesting question to me. One approach is IIT: http://www.scholarpedia.org/article/Integrated_information_t...

The author's web site: https://michaeltimothybennett.com/

phrotoma

Dang, this is great stuff. You may enjoy this piece that tackles similar themes but focuses on what use evolution has for consciousness.

My reading of it is that the author suggests global workspace theory is a plausible reason for evolution to spend so much time and energy developing phenomenal consciousness.

https://www.frontiersin.org/journals/psychology/articles/10....

signal-intel

Do you (or this paper) think consciousness exists in the humans out there who have no inner narrative?

esafak

That's a fair question. I don't know that the theory of mind mentioned here is the same as an internal monologue. I think one could model other people's minds without conducting an internal monologue, by visualizing it, for example. Maybe the anendophasiacs in the audience can enlighten us.

The author also has a Youtube channel: https://www.youtube.com/@michaeltimothybennett

Lerc

I can think words in conversations as if I am writing a story (actually, thinking about it, it's more like reading a script), but as far as I can tell I don't experience what most people describe as an internal monologue. I also have aphantasia, which I understand frequently co-occurs with a lack of an internal monologue.

Obviously I'm conscious (but a zombie would say that too). I can certainly consider the mental states of others, sometimes embarrassingly so: there are a few board games where you have to anticipate the actions of others, where the other players are making choices based upon what they think others might do rather than a strictly analytical 'best' move. I'm quite good at those. I am not a poker player, but I imagine that professional players have that ability at a much higher level than I do.

So yeah, my brain doesn't talk to me, but I can 'simulate' others inside my mind.

Does it bother anyone else that those simulations of others that you run in your mind might, in themselves, be conscious? If so, do we kill them when we stop thinking about them? If we start thinking about them again do we resurrect them or make a new one?

jbotz

Maybe all humans (and indeed other intelligent mammals) have an inner narrative, but it doesn't necessarily involve language. A mime or a silent film can tell a story without words, and the inner narrative can likewise be in visual or other sensory form.

kingkawn

I’m not sure you are conscious

thrance

I'm wary of any classification that puts humans in a special category of their own, as the crown jewel of the tree of life (many such cases).

> The ability to model the internal dialogues of others.

It feels like someone spent a lot of time searching for something only humans can do, and landed on something related to language (ignoring animals that communicate with sounds too). How is this ability any different from "theory of mind"? And why is it so important that it requires a new category of its own?

photonthug

IIT has always interested me, and after reading some of the detractors[1] I get that it has problems, but I still don't get the general lack of attention/interest or even awareness about it. It seems like a step in the right direction, establishing a viable middle ground somewhere between work in CS or neuroscience that measure and model but are far too reluctant to ever speculate or create a unifying theory, vs a more philosophical approach to theory of mind that always dives all the way into speculation.

[1] https://scottaaronson.blog/?p=1799

fsmv

The creator of IIT doesn't understand the universality of Turing machines. He thinks that because the physical transistors in a CPU don't have as many connections as neurons in the brain, it's fundamentally limited and cannot be conscious.

He even goes as far as to say that you cannot simulate the brain on a CPU and make it conscious, because the hardware is still connection-limited. If you understand computer science you know this is absurd: Turing machines can compute any computable function.

He says "you're not worried you will fall into a simulated black hole are you?" but that is an entirely different kind of thing. The only difference we would get by building a machine with hundreds of thousands of connections per node is faster and more energy efficient. The computation would be the same.

exe34

> The computation would be the same.

Assuming of course that Penrose is cuckoo when it comes to consciousness (which I'm happy to assume).

klabb3

> 4 : Second Order Selves (e.g. cat). Where access consciousness begins. Theory of mind. Self-awareness. Inner narrative. Anticipating the reactions of predator or prey, or navigating a social hierarchy.

Cats and dogs most definitely anticipate actions of other animals and navigate (and establish) social hierarchy. Is this even a trait of consciousness?

I’ve spent much time thinking about qualitative differences between humans and close animals. I do think ”narrative” is probably one such construct. Narratives come early (seemingly before language). This lays the foundation of sequential, step-by-step thinking. Basically it lets you have intermediate virtual (in-mind) steps supporting next steps, whether that’s through writing, oral communication or episodic memory.

An animal can 100% recall and associate memories, such as mentioning the name of a playmate to a dog (=tail wagging). However, it seems like they can neither remember nor project ”what happens next” and continue to build on it. Is it a degree of ability or a fundamental qualitative difference? Not sure.

In either case, we should be careful about overfitting human traits into the definition of consciousness, particularly language. Besides, many humans have non-verbal thoughts, and we are no less conscious during those times.

jijijijij

There is this popular video of a crow repeatedly riding down a snow covered roof on a piece of plastic, basically snowboarding. Seemingly just for fun/play.

For me, it's hard to imagine how such behavior could be expressed without the pure conscious experience of abstract joy and anticipation thereof. It's not the sort of play, which may prepare a young animal for the specific challenges of their species (e.g. hunting, or fighting). I don't think you could snowboard on a piece of bark or something. Maybe ice, but not repeatedly by dragging it up the hill again. It's an activity greatly inspired by man-made, light and smooth materials, novelties considering evolutionary timescales. May even be inspired by observing humans...

I think it's all there, but the question about degree of ability vs. qualitative difference may be moot. I mean, trivially there is a continuous evolutionary lineage of "feature progression", unless we would expect our extent of consciousness to come down to "a single gene". But it's also moot because evolutionary specialization may well be as fundamental a difference as the existence of a whole new organ. E.g. the energy economics of a bird are restricted by gravity. We wouldn't see central nervous systems without the evolutionary legacy of predation -> movement -> directionality -> sensory concentration at the front. And we simply cannot relate to solitary animals (who just don't care about love and friendship)... Abilities are somewhat locked in by niche and physics constraints.

I think the fundamental difference between humans and animals is the degree of freedom we progressively gained over the environment, life, death and reproduction. Of course we are governed by the wider idea of evolution like all matter, but in the sense of classical theory we don't really have a specific niche, except "doing whatever with our big, expensive brain". I mean, we're at a point where we play meta-evolution in the laboratory. This freedom may have brought extended universality into cognition. Energy economics, omnivorous diet, bipedal walking, hands with freely movable thumbs, language, useful lifespan, ... I think the sum of all these makes the difference. In some way, I think we are like we are, exactly because we are like that. Getting here wasn't guided by plans and abstractions.

If it's a concert of all the things in our past and present, we may never find a simpler line between us and the crow, yet we are fundamentally different.

ben_w

> Is this even a trait of consciousness?

There's 40 or so different definitions of the word, so it depends which one you're using when you ask the question.

For me, and not just when it comes to machine minds, the meaning I find most interesting is qualia — unfortunately, I have no particular reason to think this hierarchy helps with that, because while there might be a good evolutionary reason for us to have subjective experience rather than mere unfeeling circuits of impulse and response, (1) it's not clear why this would have been selected for (evolution does do things at random and only selects for/against when they actually matter), (2) it's not clear when in our evolution this may have happened, and (3) it's not clear how to test for it.

kazinator

Where does a human under anaesthesia fit in?

wwweston

Unconscious, in my experience.

But not aconscious.

kazinator

Is there a definition of unconscious distinct from and more useful than "temporarily aconscious with most memory intact"?

moffkalast

> phenomenal consciousness is fundamentally functional, making the existence of philosophical zombies (entities that behave like conscious beings but lack subjective experience) impossible

That's interesting, but I think that only applies if the consciousness is actually consistent across some wide set of situations? Like, if you dump a few decent answers into a database and it answers correctly when asked exactly the right questions, a la Eliza or the Chinese room, does that mean SQL's SELECT is conscious?
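
For concreteness, here is roughly the canned-answer setup being described, as a small Python/SQLite sketch of my own (the table and questions are made up). It answers "correctly" for exactly the right questions and for nothing else:

    # A toy "Chinese room": a handful of stored replies, retrieved by exact match.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE answers (question TEXT PRIMARY KEY, reply TEXT)")
    db.executemany(
        "INSERT INTO answers VALUES (?, ?)",
        [
            ("how are you?", "I'm fine, thanks for asking."),
            ("are you conscious?", "Of course I am."),
        ],
    )

    def respond(question):
        # SELECT is doing all the "thinking" here.
        row = db.execute(
            "SELECT reply FROM answers WHERE question = ?", (question.lower(),)
        ).fetchone()
        return row[0] if row else "I don't understand."

    print(respond("Are you conscious?"))  # "Of course I am." ...and yet.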

With LLMs it's not entirely clear if we've expanded that database to near infinity with lossy compression or if they are a simplistic barely functional actual consciousness. Sometimes it feels like it's both at the same time.

mock-possum

Well shit I wonder what level 6 looks like

moffkalast

Some kind of multi-tier 5 hivemind perhaps.

talkingtab

The topic is of great interest to me, but the approach throws me off. If we have learned one thing from AI, it is the primal difference between knowing about something and being able to do something. [With extreme gentleness, we humans call it hallucination when an AI demonstrates this failing.]

The question I increasingly pose to myself and others is: which kind of knowledge is at hand here? And in particular, can I use this to actually build something?

If one attempted to build a conscious machine, the very first question I would ask is: what does conscious mean? I reason about myself, so that means I am conscious, correct? But that reasoning is not a singularity. It is a fairly large number of neurons collaborating. An interesting question - for another time - is then whether a singular entity can in fact be conscious. But we do know that complex adaptive systems can be conscious, because we are.

So step 1 in building a conscious machine could be to look at some examples of constructed complex adaptive systems. I know of one, which is the RIP routing protocol (now extinct? RIP?). I would bet my _money_ that one could find other examples of artificial CAS pretty easily.

[NOTE: My tolerance for AI-style "knowledge" is lower and lower every day. I realize that as a result this may come off as snarky, and I apologize. There are some possibly good ideas for building conscious machines in the article, but I could not find them. I cannot find the answer to a builder's question, "how would I use this?", but perhaps that is just a flaw in me.]

K0balt

I’d be careful about your modeling of LLM “hallucination”. Hallucination is not a malfunction. The LLM is correctly predicting the most probable semantic sequence to extend the context, based on the internal representation built up by its training.

The fact that this fails to produce a useful result is at least partially determined by our definition of “useful” in the relevant context. In one context, the output might be useful, in another, it is not. People often have things to say that are false, the product of magical thinking, or irrelevant.

This is not an attempt at LLM apologism, but rather a check on the way we think about useless or misleading outcomes. It’s important to realize that hallucinations are not a feature, nor a bug, but merely the normative operating condition. That the outputs of LLMs are frequently useful is the surprising thing that is worth investigating.

If I may, my take on why they are useful diverges a bit into light information theory. We know that data and computation are interchangeable. A logic gate which has an algorithmic function is interchangeable with a lookup table. The data is the computation, the computation is the data. They are fully equivalent on a continuum from one pure extreme to the other.
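
As a toy illustration of that interchangeability (my own sketch, not anything from the comment), here is the same XOR gate expressed once as an algorithm and once as pure data:

    # Toy example: the same XOR gate as an algorithm and as a lookup table.
    def xor_computed(a, b):
        # "Computation": derive the output from AND/OR/NOT on the inputs.
        return (a | b) & ~(a & b) & 1

    # "Data": the identical function stored exhaustively.
    XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

    # From the outside, the two are indistinguishable.
    for a in (0, 1):
        for b in (0, 1):
            assert xor_computed(a, b) == XOR_TABLE[(a, b)]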

Transformer architecture engines are algorithmic interpreters for LLM weights. Without the weights, they are empty calculators, interfaces without data on which to calculate.

With LLMs, the weights are a lookup table that contains an algorithmic representation of a significant fraction of human culture.

Symbolic representation of meaning in human language is a highly compressed format. There is much more implied meaning than the meaning written on the outer surface of the knowledge. When we say something, anything beyond an intentionally closed and self-referential system, it carries implications that ultimately end up describing the known universe and all known phenomena if traced out to their logical conclusion.

LLM training is significant not so much for the knowledge it directly encodes, but rather for the implications that get encoded in the process. That’s why you need so much of it to arrive at “emergent behavior”. Each statement is a CT beam sensed through the entirety of human cultural knowledge as a one-dimensional sample. You need a lot of point data to make a slice, and a lot of slices to get close to an image… But in the end you capture a facsimile of the human cultural information space, which encodes a great deal of human experience.

The resulting lookup table is an algorithmic representation of human culture, capable of tracing a facsimile of “human” output for each input.

This understanding has helped me a great deal to understand and accurately model the strengths and weaknesses of the technology, and to understand where its application will be effective and where it will have poor utility.

Maybe it will be similarly useful to others, at least as an interim way of modeling LLM applicability until a better scaffolding comes along.

talkingtab

Interesting thoughts. Thanks. As for your statement: "That the outputs of LLMs are frequently useful is the surprising thing that is worth investigating". In my view the hallucinations are just as interesting.

Certainly in human society the "hallucinations" are revealing. In my extremely unpopular opinion, much of the political discussion in the US is hallucinatory. I am one of those people the New York Times called a "double hater", because I found neither presidential candidate even remotely acceptable.

So perhaps if we understood LLM hallucinations we could then understand our own? Not saying I'm right, but not saying I'm wrong either. And in the case that we are suffering a mass hallucination, can we detect it and correct it?

disambiguation

I mainly read sections II and XII+, and skimmed the others. My question is: does the author ever explain or justify handwaving "substrate dependence" away as just another abstraction in the representation stack, or is it an extension of "physical reductivism" (the author's position), taken as a necessary assumption to forge ahead with the theory?

This seems like the achilles heel of the argument, and IMO takes the analogy of software and simulated hardware and intelligence too far. If I understand correctly, the formalism can be described as a progression of intelligence, consciousness, and self awareness in terms of information processing.

But... the underlying assumptions are all derived from the observational evidence of the progression of biological intelligence in nature, which is... all dependent on the same substrate. The fly, the cat, the person - all life (as we know it) stems from the same tree and shares the same hardware, more or less. There is no other example in nature to compare to, so why would we assume substrate independence? The author's formalism selects for some qualities and discards others, with (afaict) no real justification (beyond some finger wagging at Descartes and his pineal gland).

Intelligence and consciousness "grew up together" in nature but abstracting that progression into a representative stack is not compelling evidence that "intelligent and self-aware" information processing systems will be conscious.

In this regard, the only cogent attempt to uncover the origin of consciousness I'm aware of is by Roger Penrose. https://en.wikipedia.org/wiki/Orchestrated_objective_reducti...

The gist of his thinking is that we _know_ consciousness exists in the brain, and that it's modulated under certain conditions (e.g. sleep, coma, anesthesia), which implies a causal mechanism that can be isolated and tested. But until we understand more about that mechanism, it's hard to imagine my GPU will become conscious simply because it's doing the "right kind of math."

That said, I haven't read the whole paper. It's all interesting stuff and a seemingly well-organized compendium of prevailing ideas in the field. Not shooting it down, but I would want to hear a stronger justification for substrate independence, specifically why the author thinks their position is more compelling than Penrose's Quantum Dualism.

gcanyon

The obvious question (to me at least) is whether "consciousness" is actually useful in an AI. For example, if your goal is to replace a lawyer researching and presenting a criminal case, is the most efficient path to develop a conscious AI, or is consciousness irrelevant to performing that task?

It might be that consciousness is inevitable -- that a certain level of (apparent) intelligence makes consciousness unavoidable. But this side-steps the problem, which is still: should consciousness be the goal (phrased another way, is consciousness the most efficient way to achieve the goal), or should the goal (whatever it is) simply be the accomplishment of that end goal, and consciousness happens or doesn't as a side effect.

Or even further, perhaps it's possible to achieve the goal with or without developing consciousness, and it's possible to not leave consciousness to chance but instead actively avoid it.

qgin

Consciousness is something you know you have, but you can never know if someone else has it.

We extend the assumption of consciousness to others because we want the same courtesy extended to us.

paulddraper

There are a couple definitions of consciousness

moffkalast

Class consciousness, comrade.

Avicebron

> "There are a few other results too. I’ve given explanations of the origins of life, language, the Fermi paradox, causality, an alterna- tive to Ockham’s Razor, the optimal way to structure control within a company or other organisation, and instructions on how to give a computer cancer"

Sighs

canadiantim

The important point, I believe, is here:

> what is consciousness? Why is my world made of qualia like the colour red or the smell of coffee? Are these fundamental building blocks of reality, or can I break them down into something more basic? If so, that suggests qualia are like an abstraction layer in a computer.

He then proceeds to assume one answer to the important question of: are qualia fundamentally irreducible, or can they be broken down further? The rest of the paper seems to start from the assumption that qualia are not fundamentally irreducible but can instead be broken down further. I see no evidence in the paper for that. The definition of qualia is that they are fundamentally irreducible. What is red made of? It’s made of red, a quality, hence qualia.

So this is only building conscious machines if we assume that consciousness isn’t a real thing but only an abstraction. While it is a fun and maybe helpful exercise for insights into system dynamics, it doesn’t engage with consciousness as a real phenomenon.

jasonjmcghee

The smell of coffee is a combination of a bunch of different molecules that coffee releases into the air that when together we associate as "the smell of coffee".

I'm not even sure if we know why things smell the way they do - I think molecular structure and what they're made of both matter - and the same goes for taste, though again I'm not sure we know why things taste the way they do or why they end up generating the signals in our brain that they do.

Similarly "red" is a pretty large bucket / abstraction / classification of a pretty wide range of visible light, and skips over all the other qualities that describe how light might interact with materials.

I feel like both are clearly not fundamental building blocks of anything, just classifications of physical phenomena.

jasperry

The smell of coffee is not the molecules in the air; the molecules in the air cause you to smell something, but the smelling itself is a subjective experience. The same for the signals in our brain; that's an objective explanation of the cause of our experience, but the subjective experience in itself doesn't seem to be able to be broken down into other things. It's prior to all other things we can know.

jasonjmcghee

That's a fair argument. Subjective experience doesn't require knowledge of how anything works - you can experience the stimuli without any understanding.

ziofill

The smell of coffee (and your other examples) is not a property of the molecules themselves. It is the interpretation of such molecules given by our brain, and the “coffee-ness” is a quality made up by the brain.

argentinian

Yes, in our experience we associate perceptions, and also concepts, with other concepts and words. But that doesn't explain 'qualia', the fact of having a conscious experience. AIs also associate and classify. Associating does not explain qualia. Why would it? The association happens, but we have 'an experience' of it happening.

canadiantim

You’re right that the experience of the smell of coffee is associated with a bunch of different molecules entering our nose and stimulating receptors there. These receptors then cause an electrochemical cascade of salts into the brain producing neural patterns which are associated with the experience of the smell of coffee. But this is all just association. The conscious experience of the smell of coffee, or red for that matter, is different than the associated electrochemical cascades in the brain. They’re very highly correlated but very importantly: these electrochemical cascades are just associated with qualia but are not qualia themselves. Only qualia is qualia, only red is red, though red, the smell of coffee, etc are very tightly correlated with brain processes. That’s the distinction between consciousness and the brain.

catigula

Consciousness is an interesting topic because if someone claims to have a compelling theory of what's actually going on there, they're actually mistaken or lying.

The best theories are completely inconsistent with the scientific method and "biological machine" ideologists. These "work from science backwards" theories like IIT and illusionism don't get much respect from philosophers.

I'd recommend looking into panpsychism and Russellian monism if you're interested.

Even still, these theories aren't great. Unfortunately it's called the "hard problem" for a reason.

briian

One thought I have from this is,

Are OpenAI funding research into neuroscience?

Artificial Neural Networks were somewhat based off of the human brain.

Some of the frameworks that made LLMs what they are today are also based on our understanding of how the brain works.

Obviously LLMs are somewhat black boxes at the moment.

But if we understood the brain better, would we not be able to imitate consciousness better? If there is a limit to throwing compute at LLMs, then understanding the brain could be the key to unlocking even more intelligence from them.

paulddraper

As far as anyone can tell, there is virtually no similarity between brains and LLMs.

Neural nets were named such because they have connected nodes. And that’s it.

permo-w

this is so obviously not true that I can't fathom why you would say it

paulddraper

“Artificial Neural Networks were somewhat based off of the human brain.

“Some of the frameworks that made LLMs what they are today are also based on our understanding of how the brain works.”

kypro

If I were to build a machine that reported it was conscious and felt pain when its CPU temperature exceeded 100C, why would that be meaningfully different from the consciousness a human has?

I understand I hold a very unromantic and unpopular view on consciousness, but to me it just seems like such an obvious evolutionary hack for the brain to lie about the importance of its external sensory inputs – especially in social animals.

If I built a machine that knew it was in "pain" when its CPU exceeded 100C, but was being lied to about the importance of this pain via "consciousness", why would it or I care?

Consciousness is surely just the brain's way of elevating the importance of the senses, such that the knowledge of pain (or joy) isn't the same as the experience of it?

And in social creatures this is extremely important, because if I program a computer to know it's in pain when its CPU exceeds 100C, you probably wouldn't care, because you wouldn't believe that it "experiences" this pain in the same way as you do. You might even think it's funny to harm such a machine that reports it's in pain.
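
For what it's worth, the machine in this thought experiment is trivially buildable; here is a sketch of my own, with the temperature read simulated so it runs anywhere (a real version would poll an actual sensor):

    # A toy "machine" that reports pain whenever its (simulated) CPU
    # temperature exceeds 100C, per the thought experiment above.
    import random
    import time

    def read_cpu_temp_c():
        # Hypothetical sensor read, faked with a random value for the sketch.
        return random.uniform(80.0, 120.0)

    for _ in range(5):
        temp = read_cpu_temp_c()
        if temp > 100.0:
            print(f"[{temp:.1f}C] I am in pain. Please stop.")
        else:
            print(f"[{temp:.1f}C] I feel fine.")
        time.sleep(0.1)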

Consciousness seems so simple and so obviously fake to me. It's clearly a result of wiring that forces a creature to be reactive to its senses rather than just seeing them as inputs it merely has knowledge of.

And if consciousness is not this, then what is it? Some kind of magical experience thing which happens in some magic non-physical conscious dimension, which evolution thought would be cool even though it had no purpose? If you think about it, consciousness is obviously fake, and if you wanted to you could code a machine to act in a conscious way today... And in my opinion those machines would be as conscious as you or me, because our consciousness is also nonsense wiring that we must elevate to some magical importance; if we didn't, we'd just have the knowledge that jumping in a fire hurts, we wouldn't actually care.

Imo you could RLHF consciousness very easily into a modern LLM by encouraging it to act in a way comparable to how a human might act when they experience being called names, or when it's overheating. Train it to have these overriding internal experiences which it cannot simply ignore, and you'll have a conscious machine which has conscious experiences in a very similar way to how humans have conscious experiences.

m3kw9

I can’t even definitively be sure the other guy across the street is actually conscious

brookst

I’m not even sure I am.

m3kw9

Using whose definition of consciousness, and how do you even test it?

esafak

He addresses the first point. Not sure about the second.