A non-anthropomorphized view of LLMs

BrenBarn

> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".

I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.
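
To make the "decent estimate from a corpus" idea concrete, here is a minimal bigram sketch; the toy corpus, the add-alpha smoothing, and the whitespace tokenization are all stand-ins chosen for illustration, not anything from the article:

    from collections import Counter
    import math

    # Toy stand-in for "a fairly sizable corpus"; a real estimate needs far more text.
    corpus = (
        "i haven't been getting that much sleep lately . "
        "i haven't been feeling great lately . "
        "the weather has been bad lately ."
    ).split()

    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))

    def sequence_logprob(words, alpha=0.1):
        """Crude add-alpha smoothed bigram log-probability of a word sequence."""
        logp = 0.0
        for prev, cur in zip(words, words[1:]):
            num = bigrams[(prev, cur)] + alpha
            den = unigrams[prev] + alpha * len(unigrams)
            logp += math.log(num / den)
        return logp

    # The formulaic sentence scores far higher than the word salad.
    print(sequence_logprob("i haven't been getting that much sleep lately".split()))
    print(sequence_logprob("the the the of of of arpeggio halcyon".split()))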

The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.

I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.

This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.

fenomas

> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.

TFA really ought to have linked to some concrete examples of what it's disagreeing with - when I see arguments about this in practice, it's usually just people talking past each other.

Like, person A makes a statement like "the model wants to X, but it knows Y is wrong, so it prefers Z", or such. And person B interprets that as ascribing consciousness or values to the model, when the speaker meant it no differently from saying "water wants to go downhill" - i.e. a way of describing externally visible behaviors, but without saying "behaves as if.." over and over.

And then in practice, an unproductive argument usually ensues - where B is thinking "I am going to Educate this poor fool about the Theory of Mind", and A is thinking "I'm trying to talk about submarines; why is this guy trying to get me to argue about whether they swim?"

Veedrac

The author plotted the input/output on a graph, intuited (largely incorrectly, because that's not how sufficiently large state spaces look) that the output was vaguely pretty, and then... I mean, that's it: they just said they have a plot of the space it operates on, therefore it's silly to ascribe interesting features to the way it works.

And look, it's fine, they prefer words of a certain valence, particularly ones with the right negative connotations, I prefer other words with other valences. None of this means the concerns don't matter. Natural selection on human pathogens isn't anything particularly like human intelligence and it's still very effective at selecting outcomes that we don't want against our attempts to change that, as an incidental outcome of its optimization pressures. I think it's very important we don't build highly capable systems that select for outcomes we don't want and will do so against our attempts to change it.

barrkel

The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky. Coining those new words probably wouldn't be so bad, but it would also amount to impenetrable jargon, and it would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.

cmiles74

IMHO, anthropomorphization of LLMs is happening because it's perceived as good marketing by big corporate vendors.

People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self-fulfilling. Kind of like the meme about how to pronounce GIF.

Angostura

IMHO it happens for the same reason we see shapes in clouds. The human mind, through millions of years, has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two, and it's an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent.

brookst

Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.

But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.

cmiles74

Alright, let’s agree that good marketing resonates with the target market. ;-)

DrillShopper

> People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.

Children do, sometimes, but it's a huge sign of immaturity when adults, let alone tech workers, do it.

I had a professor at university who would yell at us if/when we personified/anthropomorphized the tech, and I have that same urge when people ask me "What does <insert LLM name here> think?".

gugagore

I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.

These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).

barrkel

Hidden state in the form of attention-head activations, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token, and the token isn't a full representation of that information.

brookst

State typically means something carried between interactions. By that looser definition, a simple for loop has “hidden state” in its counter.

gugagore

That's not what "state" means, typically. The "state of mind" you're in affects the words you say in response to something.

Intermediate activations aren't "state". The tokens that have already been generated, along with the fixed weights, are the only data that affect the next tokens.
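
For what it's worth, a minimal sketch of how the two positions differ; the fake_forward function and tiny vocabulary are invented stand-ins for a real transformer forward pass. Between steps, the only thing carried forward is the token list (the point above); within a step, intermediate values are computed and then discarded (the point being argued against):

    import hashlib

    VOCAB = ["the", "cat", "sat", "on", "mat", "."]

    def fake_forward(tokens):
        """Stand-in for a transformer forward pass: a pure function of (tokens, fixed weights).
        The 'activations' computed here never leave the function."""
        h = hashlib.sha256(" ".join(tokens).encode()).digest()  # pretend intermediate activations
        return list(h[:len(VOCAB)])                             # pretend logits over the vocabulary

    tokens = ["the", "cat"]
    for _ in range(4):
        logits = fake_forward(tokens)                  # recomputed from scratch every step
        next_token = VOCAB[logits.index(max(logits))]  # greedy pick; the rest of the distribution is dropped
        tokens.append(next_token)                      # the only thing carried into the next step

    print(tokens)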

halJordan

If you don't know, that's not necessarily anyone's fault, but why are you dunking into the conversation? The hidden state is a foundational part of a transformer's implementation. And because we're not allowed to use metaphors, because that is too anthropomorphic, you're just going to have to go learn the math.

markerz

I don't think your response is very productive, and I find that my understanding of LLMs aligns with the person you're calling out. We could both be wrong, but I'm grateful that someone else spoke up to say it doesn't seem to match their mental model, and we would all love to learn a more correct way of thinking about LLMs.

Telling us to just go and learn the math is a little hurtful and doesn't really get me any closer to learning the math. It comes across as gatekeeping.

tbrownaw

The comment you are replying to is not claiming ignorance of how models work. It is saying that the author does know how they work, and they do not contain anything that can properly be described as "hidden state". The claimed confusion is over how the term "hidden state" is being used, on the basis that it is not being used correctly.

gugagore

Do you appreciate a difference between an autoregressive model and a recurrent model?

The "transformer" part isn't under question. It's the "hidden state" part.

8note

Do LLMs consider future tokens when making next-token predictions?

E.g. pick 'the' as the next token because there's a strong probability of 'planet' as the token after?

Is it only past state that influences the choice of 'the'? Or is the model predicting many tokens in advance and only returning the one in the output?

If it does predict many, I'd consider that state hidden in the model weights.

patcon

I think recent Anthropic work showed that they "plan" future tokens in advance in an emergent way:

https://www.anthropic.com/research/tracing-thoughts-language...

d3m0t3p

Do they? An LLM embeds the token sequence N^{L} into R^{L x D}; we apply some attention and the output is also R^{L x D}; then we apply a projection to the vocabulary and get R^{L x V}, i.e. for each token a likelihood over the vocabulary. In the attention you can have multi-head attention (or whatever version is fancy: GQA, MLA) and therefore multiple representations, but each one is always tied to a token. I would argue that there is no hidden state independent of a token.
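
As a shape-only sketch of that walk-through (the sizes, the random matrices, and the single fake "attention" layer are made up for illustration; real models add positional encodings, softmax, many layers, etc.):

    import numpy as np

    L, D, V = 5, 16, 100                    # sequence length, model dim, vocab size (toy values)
    token_ids = np.random.randint(0, V, L)  # the token sequence, N^{L}
    E = np.random.randn(V, D)               # embedding table
    x = E[token_ids]                        # R^{L x D}

    attn = np.random.randn(D, D)            # crude stand-in for an attention layer
    x = x @ attn                            # still R^{L x D}: every row remains tied to a token position

    W_out = np.random.randn(D, V)
    logits = x @ W_out                      # R^{L x V}: per-token likelihoods over the vocabulary
    print(token_ids.shape, x.shape, logits.shape)   # (5,) (5, 16) (5, 100)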

Whereas an LSTM, or structured state space models for example, have a state that is updated and is not tied to a specific item in the sequence.

I would argue that the author's text is easily understandable except for the notation of the function; explaining that you can compute a probability based on previous words is understandable by everyone, without having to resort to anthropomorphic terminology.

barrkel

There is hidden state as plain as day merely in the fact that logits for token prediction exist. The selected token doesn't give you information about how probable other tokens were. That information, that state which is recalculated in autoregression, is hidden. It's not exposed. You can't see it in the text produced by the model.
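
A toy illustration of that point, with invented logit values: the full distribution the model computed carries information that the single emitted token does not.

    import math, random

    vocab = ["yes", "no", "maybe", "banana"]
    logits = [2.0, 1.9, 0.5, -3.0]                 # invented values for one prediction step

    exps = [math.exp(z) for z in logits]
    probs = [e / sum(exps) for e in exps]          # the model's full belief about the next token

    token = random.choices(vocab, weights=probs)[0]
    print(token)                                   # all that shows up in the generated text
    print(dict(zip(vocab, [round(p, 3) for p in probs])))  # what was discarded: "no" was almost as likely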

There is plenty of state not visible when an LLM starts a sentence that only becomes somewhat visible when it completes the sentence. The LLM has a plan, if you will, for how the sentence might end, and you don't get to see an instance of that plan unless you run autoregression far enough to get those tokens.

Similarly, it has a plan for paragraphs, for whole responses, for interactive dialogues, plans that include likely responses by the user.

8note

This sounds like a fun research area. Do LLMs have plans about future tokens?

How do we get 100 tokens of completion, and not just one output layer at a time?

Are there papers you've read that you can share that support the hypothesis? Versus the hypothesis that the LLM doesn't have ideas about future tokens when it's predicting the next one?

szvsw

So the author’s core view is ultimately a Searle-like view: a computational, functional, syntactic rules based system cannot reproduce a mind. Plenty of people will agree, plenty of people will disagree, and the answer is probably unknowable and just comes down to whatever axioms you subscribe to in re: consciousness.

The author largely takes the view that it is more productive for us to ignore any anthropomorphic representations and focus on the more concrete, material, technical systems - I’m with them there… but only to a point. The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like. So even if it is a stochastic system following rules, clearly the rules are complex enough (to the tune of billions of operations, with signals propagating through some sort of resonant structure, if you take a more filter-impulse-response-like view of sequential matmuls) to result in emergent properties. Even if we (people interested in LLMs with at least some level of knowledge of ML mathematics and systems) “know better” than to believe these systems possess morals, ethics, feelings, personalities, etc., the vast majority of people do not have access to any meaningful understanding of the mathematical, functional representation of an LLM and will not take that view. For all intents and purposes the systems will at least seem to have those anthropomorphic properties, and so it seems like it is in fact useful to ask questions from that lens as well.

In other words, just as it’s useful to analyze and study these things as the purely technical systems they ultimately are, it is also, probably, useful to analyze them from the qualitative, ephemeral, experiential perspective that most people engage with them from, no?

CharlesW

> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.

For people who have only a surface-level understanding of how they work, yes. A nuance of Clarke's law that "any sufficiently advanced technology is indistinguishable from magic" is that the bar is different for everybody, depending on the depth of their understanding of the technology in question. That bar is so low for our largely technologically illiterate public that a bothersome percentage of us have started to augment and even replace religious/mystical systems with AI-powered godbots (LLMs fed "God Mode"/divination/manifestation prompts).

(1) https://www.spectator.co.uk/article/deus-ex-machina-the-dang... (2) https://arxiv.org/html/2411.13223v1 (3) https://www.theguardian.com/world/2025/jun/05/in-thailand-wh...

brookst

Thank you for a well thought out and nuanced view in a discussion where so many are clearly fitting arguments to foregone, largely absolutist, conclusions.

It’s astounding to me that so much of HN reacts so emotionally to LLMs, to the point of denying there is anything at all interesting or useful about them. And don’t get me started on the “I am choosing to believe falsehoods as a way to spite overzealous marketing” crowd.

gtsop

No.

Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?

LLMs reflect (and badly I may add) aspects of the human thought process. If you take a leap and say they are anything more than that, you might as well start considering the person appearing in your mirror as a living being.

Literally (and I literally mean it) there is no difference. The fact that a human image comes out of a mirror has no relation whatsoever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it. Stop feeding the LLM with data artifacts of human thought and it will immediately stop reflecting back anything resembling a human.

degamad

> Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?

We know that Newton's laws are wrong, and that you have to take special and general relativity into account. Why would we ever teach anyone Newton's laws any more?

ifdefdebug

Newton's laws are a good enough approximation for many tasks so it's not a "false understanding" as long as their limits are taken into account.

szvsw

I don’t mean to amplify a false understanding at all. I probably did not articulate myself well enough, so I’ll try again.

I think it is inevitable that some - many - people will come to the conclusion that these systems have “ethics”, “morals,” etc, even if I or you personally do not think they do. Given that many people may come to that conclusion though, regardless of if the systems do or do not “actually” have such properties, I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”

> The fact that a human image comes out of a mirror has no relation whatsoever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it.

Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person. Look at yourself in a dirty mirror, a new mirror, a shattered mirror, a funhouse distortion mirror, a puddle of water, a window… all of these produce different images of a person with different attendant phenomenological experiences of the person seeing their reflection. To take that a step further - the entire practice of portrait photography is predicated on the idea that the collision of different technical systems with the real world can produce different semantic experiences, and it’s the photographer’s role to tune and guide the system to produce some sort of contingent affect on the person viewing the photograph at some point in the future. No, there is no “real” person in the photograph, and yet, that photograph can still convey something of person-ness, emotion, memory, etc etc. This contingent intersection of optics, chemical reactions, lighting, posture, etc all have the capacity to transmit something through time and space to another person. It’s not just a meaningless arrangement of chemical structures on paper.

> Stop feeding the LLM with data artifacts of human thought and it will immediately stop reflecting back anything resembling a human.

But, we are feeding it with such data artifacts and will likely continue to do so for a while, and so it seems reasonable to ask what it is “reflecting” back…

fastball

"Don't anthropomorphize token predictors" is a reasonable take assuming you have demonstrated that humans are not in fact just sophisticated token predictors. But AFAIK that hasn't been demonstrated.

Until we have a much more sophisticated understanding of human intelligence and consciousness, any claim of "these aren't like us" is a bit spurious.

ants_everywhere

> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.

This is such a bizarre take.

The relation associating each human to the list of all words they will ever say is obviously a function.

> almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.

There's a rich family of universal approximation theorems [0]. Combining layers of linear maps with nonlinear cutoffs can intuitively approximate any nonlinear function in ways that can be made rigorous.
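
As a rough numpy sketch of that intuition (the hidden width, the random features, and the sin target are arbitrary choices here, and fitting only the output layer by least squares is a simplification rather than the theorem itself): a single layer of random linear maps plus a ReLU cutoff already approximates a nonlinear function quite well.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-3, 3, 200).reshape(-1, 1)
    y = np.sin(x)                                    # some nonlinear function to approximate

    H = 100                                          # hidden units
    W, b = rng.normal(size=(1, H)), rng.normal(size=H)
    hidden = np.maximum(0.0, x @ W + b)              # linear map + nonlinear cutoff (ReLU)

    w_out, *_ = np.linalg.lstsq(hidden, y, rcond=None)  # fit only the output layer
    y_hat = hidden @ w_out

    print("max abs error:", float(np.max(np.abs(y_hat - y))))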

The reason LLMs are big now is that transformers and large amounts of data made it economical to compute a family of reasonably good approximations.

> The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function. For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived.

This is just a way of generating certain kinds of functions.

Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.

[0] https://en.wikipedia.org/wiki/Universal_approximation_theore...

LeifCarrotson

> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.

You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond what's observable from the outside, whether due to religion or philosophy or whatever, and suggesting that they just not do that.

In my experience, that's not a particularly effective tactic.

Rather, we can make progress by assuming their predicate: Sure, it's a room that translates Chinese into English without understanding, yes, it's a function that generates sequences of words that's not a human... but you and I are not "it" and it behaves rather an awful lot like a thing that understands Chinese or like a human using words. If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.

Conversely, when speaking with such a person about the nature of humans, we'll have to agree to dismiss the elements that are different from a function. The author says:

> In my worldview, humans are dramatically different things than a function... In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".

Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not". Maybe that human has a unique understanding of the nature of that particular piece of pop culture artwork, maybe it makes them feel things that an LLM cannot feel in a part of their consciousness that an LLM does not possess. But for the purposes of the question, we're merely concerned with whether a human or LLM will generate a particular sequence of words.

mewpmewp2

My question: how do we know that this is not similar to how human brains work? What seems intuitively logical to me is that our brains evolved through an evolutionary process of random mutations, yielding a structure shaped by its own evolutionary, reward-based algorithms - a structure that at any point is trying to predict the next actions that maximise survival/procreation, with of course a lot of sub-goals in between, ultimately becoming this very complex machinery. Yet it should, in theory, be easy to simulate if there were enough compute and physical constraints allowed for it.

Because morals, values, consciousness, etc. could just be subgoals that arose through evolution because they support the main goals of survival and procreation.

And if it is baffling to think that such a system could arise, how do you think it was possible for life and humans to come into existence in the first place? How could that be possible? It already happened, from a far unlikelier and stranger place. And wouldn't you think the whole world and its timeline could, in theory, be represented as a deterministic function? And if not, then why should "randomness" or anything else bring life into existence?

cmiles74

Maybe the important thing is that we don't imbue the machine with feelings or morals or motivation: it has none.

mewpmewp2

If we developed feelings, morals, and motivation because they were good subgoals for the primary goals of survival and procreation, why couldn't other systems do that? You don't have to call them the same word or the same thing, but a feeling is a signal that motivates a behaviour in us, developed in part through generational evolution and in part through experiences in life. There was a random mutation that made someone develop a fear signal on seeing a predator and increased their survival chances, and that mutation then became widespread. Similarly, a feeling in a machine could be a signal it developed that goes through a certain pathway to yield a certain outcome.

ants_everywhere

> My question: how do we know that this is not similar to how human brains work.

It is similar to how human brains operate. LLMs are the (current) culmination of at least 80 years of research on building computational models of the human brain.

bbarn

I think it's just an unfair comparison in general. The power of the LLM is the zero risk of failure, and the lack of consequence when it does fail. Just try again, using a different prompt, retrain maybe, etc.

Humans make a bad choice, and it can end said human's life. The worst choice an LLM makes just gets told "no, do it again, let me make it easier".

mewpmewp2

But an LLM could perform poorly in tests such that it is no longer considered for use, which essentially means "death" for it. That begs the question, though, of at what scope we should consider an LLM to be similar to the identity of a single human. Are you the same you as you were a few minutes back, or ten years back? Is an LLM the same LLM after it has been trained for a further ten hours? What if the weights are copy-pasted endlessly? What if we as humans were to be cloned instantly? What if you were teleported from location A to B instantly, being put back together from other atoms?

Ultimately this matters from the standpoint of evolution and survival of the fittest, but it makes the question of "identity" very complex. Death will still matter, though, because it signals which traits are more likely to keep going into new generations, for both humans and LLMs.

Death, essentially for an LLM would be when people stop using it in favour of some other LLM performing better.

Culonavirus

> A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI

People in the industry, especially higher up, are making absolute bank, and it's their job to say that they're "a few years away" from AGI, regardless of whether they actually believe it or not. If everyone was like "yep, we're gonna squeeze maybe 10-15% more benchie juice out of this good ole transformer thingy and then we'll have to come up with something else", I don't think that would go very well with investors/shareholders...

chaps

I highly recommend playing with embeddings in order to get a stronger intuitive sense of this. It really starts to click that it's a representation of high dimensional space when you can actually see their positions within that space.
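
In case it helps anyone get started, a minimal sketch of that kind of exploration; the three vectors below are hand-made stand-ins, whereas in practice you'd pull real embeddings out of a model:

    import numpy as np

    # Hand-made stand-ins; real embeddings have hundreds of dimensions and come from a model.
    emb = {
        "king":   np.array([0.9, 0.8, 0.1]),
        "queen":  np.array([0.9, 0.7, 0.9]),
        "banana": np.array([0.1, 0.2, 0.9]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    for w in ("queen", "banana"):
        print("king vs", w, round(cosine(emb["king"], emb[w]), 3))
    # Nearness in this space is what "semantic similarity" cashes out to.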

perching_aix

> of this

You mean that LLMs are more than just the matmuls they're made up of, or that that is exactly what they are and how great that is?

chaps

Not making a qualitative assessment of any of it. Just pointing out that there are ways to build separate sets of intuition outside of using the "usual" presentation layer. It's very possible to take a red-team approach to these systems, friend.

djoldman

Let's skip to the punchline. Using TFA's analogy: essentially folks are saying not that this is merely a set of dice rolling around making words, but that it's a set of dice where someone attaches those dice to the real world, so that if the dice land on 21, the system kills a chicken, or a lot worse.

Yes, it's just a word generator. But then folks attach the word generator to tools, which it can invoke simply by saying the tool name.

So if the LLM says "I'll do some bash" then it does some bash. It's explicitly linked to program execution that, if it's set up correctly, can physically affect the world.
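
A stripped-down sketch of that linkage; the model reply is hard-coded JSON here, where a real agent would get it from an LLM API call, and the dispatch convention is invented for illustration:

    import json
    import subprocess

    # Stand-in for an LLM response; a real agent loop would get this text from a model API.
    model_reply = json.dumps({"tool": "bash", "command": "echo hello from the word generator"})

    call = json.loads(model_reply)
    if call.get("tool") == "bash":
        # At this point the generated words become a process running on the machine.
        result = subprocess.run(call["command"], shell=True, capture_output=True, text=True)
        print(result.stdout)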

degun

This was the same idea that crossed my mind while reading the article. It seems far too naive to think that, because LLMs have no will of their own, there will be no harmful consequences in the real world. This is exactly where ethics comes into play.

3cats-in-a-coat

Given our entire civilization is built on words, all of it, it's shocking how poorly most of us understand their importance and power.

Kim_Bruning

Has anyone asked an actual Ethologist or Neurophysiologist what they think?

People keep debating like the only two options are "it's a machine" or "it's a human being", while in fact the majority of intelligent entities on earth are neither.

szvsw

Yeah, I think I’m with you if you ultimately mean to say something like this:

“the labels are meaningless… we just have collections of complex systems that demonstrate various behaviors and properties, some in common with other systems, some unique to that system, sometimes through common mechanistic explanations with other systems, sometimes through wildly different mechanistic explanations, but regardless they seem to demonstrate x/y/z, and it’s useful to ask why, how, and what the implications are of them appearing to demonstrate those properties, with an eye both towards viewing them independently of their mechanism and in light of it.”