The Timmy Trap
130 comments · August 15, 2025
hackyhacky
> LLMs mimic intelligence, but they aren’t intelligent.
I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
I see two possibilities:
1. We define "intelligence" as definitionally unique to humans. For example, maybe intelligence depends on the existence of a human soul, or specific to the physical structure of the human brain. In this case, a machine (perhaps an LLM) could achieve "quacks like a duck" behavioral equality to a human mind, and yet would still be excluded from the definition of "intelligent." This definition is therefore not useful if we're interested in the ability of the machine, which it seems to me we are. LLMs are often dismissed as not "intelligent" because they work by inferring output based on learned input, but that alone cannot be a distinguishing characteristic, because that's how humans work as well.
2. We define "intelligence" in a results-oriented way. This means there must be some specific test or behavioral standard that a machine must meet in order to become intelligent. This has been the default definition for a long time, but the goal posts have shifted. Nevertheless, if you're going to disparage LLMs by calling them unintelligent, you should be able to cite a specific results-oriented failure that distinguishes them from "intelligent" humans. Note that this argument cannot refer to the LLMs' implementation or learning model.
libraryofbabel
Agree. This article would have been a lot stronger if it had just concentrated on the issue of anthropomorphizing LLMs, without bringing “intelligence” into it. At this point LLMs are so good at a variety of results-oriented tasks (gold on the Mathematical Olympiad, for example) that we should either just call them intelligent or stop talking about the concept altogether.
But the problem of anthropomorphizing is real. LLMs are deeply weird machines - they’ve been fine-tuned to sound friendly and human, but behind that is something deeply alien: a huge pile of linear algebra that does not work at all like a human mind (notably, they can’t really learn from experience at all after training is complete). They don’t have bodies or even a single physical place where their mind lives (each message in a conversation might be generated on a different GPU in a different datacenter). They can fail in weird and novel ways. It’s clear that anthropomorphism here is a bad idea, although that’s not a particularly novel point.
andrewla
I can conceptually imagine a world in which I'd feel guilty for ending a conversation with an LLM, because in the course of that conversation the LLM has changed from who "they" were at the beginning; they have new memories and experiences based on the interaction.
But we're not there, at least in my mind. I feel no guilt or hesitation about ending one conversation and starting a new one with a slightly different prompt because I didn't like the way the first one went.
Different people probably have different thresholds for this, or might otherwise find that LLMs in the current generation have enough of a context window that they have developed a "lived experience" and that ending that conversation means that something precious and unique has been lost.
anal_reactor
I disagree. I see absolutely no problem with anthropomorphizing LLMs, and I do it myself all the time. I strongly believe that we shouldn't focus on how a word is defined in the dictionary, but rather on the intuitive meaning behind it. If talking to an LLM feels like talking to a person, then I don't see a problem with seeing it as a person-like entity.
tkiolp4
I think LLMs are not intelligent because they aren’t designed to be intelligent, whatever the definition of intelligence is. They are designed to predict text, to mimic. We could argue about whether predicting text or mimicking is intelligence, but first and foremost LLMs are coded to predict text, and our current definition of intelligence, afaik, is not only the ability to predict text.
andrewla
In the framework above it sounds like you're not willing to concede the dichotomy.
If your argument is that only things made in the image of humans can be intelligent (i.e. #1), then it just seems like it's too narrow a definition to be useful.
If there's a larger sense in which some system can be intelligent (i.e. #2), then by necessity this can't rely on the "implementation or learning model".
What is the third alternative that you're proposing? That the intent of the designer must be that they wanted to make something intelligent?
rsanek
humans were designed to be intelligent?
tkiolp4
I don’t know that. But LLMs were not designed to be intelligent… among other things because we don’t know what intelligence is. So, if a) we don’t know how to define intelligence and b) we design a thing (llms) in order to predict text, then why would we claim that that thing is intelligent? The only thing we can claim is that they predict text.
dkdcio
> I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
the ability for long-term planning and, more cogently, actually living in the real world where time passes
hackyhacky
> the ability for long-term planning and, more cogently, actually living in the real world where time passes
1. LLMs seem to be able to plan just fine.
2. LLMs clearly cannot be "actually living" but I fail to see how that's related to intelligence per se.
dkdcio
if it’s not actually living it’s not making intelligent decisions. if I make a grocery list, and go to my store, and the store isn’t there, what do I do? I make an intelligent decision about what to do next (probably investigating wtf happened, then going to the second nearest store)
my genuine question is how does an LLM handle that situation? and as you point out, it’s an absurd comparison
Applejinx
No, they're echoing previous examples of people planning, by framing prompts and recursively designed prompts to incorporate what, in fairness, is a large database including the text of people planning.
It still matters that there's nobody in there. You're figuring out better ways to tap into the history of language-users having represented planning in language. As such, this seems a brittle way to represent 'planning'.
aDyslecticCrow
Is making a list the act of planning?
libraryofbabel
> actually living in the real world where time passes
sure, but it feels like this is just looking at what distinguishes humans from LLMs and calling that “intelligence.” I highlight this difference too when I talk about LLMs, but I don’t feel the need to follow up with “and that’s why they’re not really intelligent.”
dkdcio
well the second part (implied above, I didn’t actually write it) is “and operate intelligently in that world”. talking about “intelligence” in some abstract form where “does this text output constitute intelligence” is hyper silly to me. the discussion should anchor on real-world consequences, not the endless hypotheticals we end up with in these discussions
card_zero
It may be the case that the failures of the ability of the machine (2) are best expressed by reference to the shortcomings of its internal workings (1), and not by contrived tests.
hackyhacky
It might be the case, but if those shortcomings are not visible in the results of the machine (and therefore not interpretable by a test), why do its internal workings even matter?
card_zero
I'm saying best expressed. Like, you see the failures in the results, but trying to pin down exactly what's the matter with the results means you resort to a lot of handwaving and abstract complaints about generalities. So if you knew how the internals had to be that would make the difference, you could lean on that.
sobiolite
The article says that LLMs don't summarize, only shorten, because...
"A true summary, the kind a human makes, requires outside context and reference points. Shortening just reworks the information already in the text."
Then later says...
"LLMs operate in a similar way, trading what we would call intelligence for a vast memory of nearly everything humans have ever written. It’s nearly impossible to grasp how much context this gives them to play with"
So, they can't summarize, because they lack context... but they also have an almost ungraspably large amount of context?
usefulcat
I think "context" is being used in different ways here.
> "It’s nearly impossible to grasp how much context this gives them to play with"
Here, I think the author means something more like "all the material used to train the LLM".
> "A true summary, the kind a human makes, requires outside context and reference points."
In this case I think that "context" means something more like actual comprehension.
The author's point is that an LLM could only write something like the referenced summary by shortening other summaries present in its training set.
jchw
I think the real takeaway is that LLMs are very good at tasks that closely resemble examples they have in their training. A lot of things written (code, movies/TV shows, etc.) are actually pretty repetitive, so you don't really need super intelligence to be able to summarize them and break them down, just good pattern matching. But this can fall apart pretty wildly when you have something genuinely novel...
strangattractor
Is anyone here aware of LLMs demonstrating an original thought? Something truly novel.
My own impression is something more akin to a natural language search query system. If I want a snippet of code to do X it does that pretty well and keeps me from having to search through poor documentation of many OSS projects. Certainly doesn't produce anything I could not do myself - so far.
Ask it about something that is currently unknown and it lists a bunch of hypotheses that people have already proposed.
Ask it to write a story and you get a story similar to one you already know but with your details inserted.
I can see how this may appear to be intelligent but likely isn't.
Earw0rm
If I come up with something novel while using an LLM, which I wouldn't have come up with had I not had the LLM at my bidding, where did the novelty really come from?
jchw
Well that's the tricky part: what is novel? There are varying answers. I think we're all pretty unoriginal most of the time, but at the very least we're a bit better than LLMs at mashing together and synthesizing things based on previous knowledge.
But seriously, how would you determine if an LLM's output was novel? The training data set is so enormous for any given LLM that it would be hard to know for sure that any given output isn't just a trivial mix of existing data.
gus_massa
Humans too. If I were too creative writing the midterm, most of my students would fail and everyone would be very unhappy.
BobaFloutist
That's because midterms are specifically supposed to assess how well you learned the material presented (or at least directed to), not your overall ability to reason. If you teach a general reasoning class, getting creative with the midterm is one thing; but if you're teaching someone how to solve differential equations, they're learning at the very edge of their ability in a given amount of time, and if you present them with an example outside of what's been covered, it kind of makes sense that they can't just already solve it. I mean, that's kind of the whole premise of education: you can't just present someone with something completely outside of their experience and expect them to derive from first principles how it works.
card_zero
That's exams, not humanity.
jchw
I honestly think that reflects more on the state of education than it does human intelligence.
My primary assertion is that LLMs struggle to generalize concepts and ideas, hence why they need petabytes of text yet still often fail basic riddles when you muck with the parameters a little bit. People get stuck on this for two reasons. One, they have to reconcile this with what they can see LLMs are capable of, and it's just difficult to believe that all of this can be accomplished without at least intelligence as we know it; I reckon the trick here is that we simply can't even conceive of how utterly massive the training datasets for these models are. We can look at the numbers, but there's no way to fully grasp just how vast they truly are.
The second reason is the tendency to anthropomorphize. At first I definitely felt like OpenAI was just using this as an excuse to hype their models and come up with reasons why they can never release weights anymore; convenient. But you can see even engineers who genuinely understand how LLMs work coming to the conclusion that they've become sentient, even though the models they felt were sentient now feel downright stupid compared to the current state of the art.
Even less sophisticated pattern matching than what humans are able to do is still very powerful, but it's obvious to me that humans are able to generalize better.
btown
It's an interesting philosophical question.
Imagine an oracle that could judge/decide, with human levels of intelligence, how relevant a given memory or piece of information is to any given situation, and that could verbosely describe which way it's relevant (spatially, conditionally, etc.).
Would such an oracle, sufficiently parallelized, be sufficient for AGI? If so, then we could genuinely describe its output as "context," and phrase our problem as "there is still a gap in needed context, despite how much context there already is."
And an LLM that simply "shortens" that context could reach a level of AGI, because the context preparation is doing the heavy lifting.
The point I think the article is trying to make is that LLMs cannot add any information beyond the context they are given - they can only "shorten" that context.
If the lived experience necessary for human-level judgment could be encoded into that context, though... that would be an entirely different ball game.
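To make the division of labor concrete, here's a toy sketch of that pipeline (my own illustration, nothing from the article): a stand-in "oracle" scores stored memories for relevance, the top ones become the context, and the final step can only shorten what it was handed. Both the relevance judge and the "shortener" are deliberately naive placeholders.

    # Toy sketch: oracle scores memories, top-k become context, LLM step only shortens.
    def oracle_relevance(memory, situation):
        """Stand-in for a human-level relevance judge: naive word overlap."""
        m, s = set(memory.lower().split()), set(situation.lower().split())
        return len(m & s) / (len(m | s) or 1)

    def build_context(memories, situation, k=3):
        """The parallelizable part: rank memories and keep the top k as context."""
        ranked = sorted(memories, key=lambda m: oracle_relevance(m, situation), reverse=True)
        return "\n".join(ranked[:k])

    def shorten(text, max_words=20):
        """Stand-in for the LLM step: it can only compress what it is given."""
        return " ".join(text.split()[:max_words])

    memories = [
        "the grocery store on 5th closed last month",
        "milk and eggs are on the shopping list",
        "the car needs fuel before any long trip",
    ]
    situation = "planning a grocery trip to the store on 5th"
    print(shorten(build_context(memories, situation, k=2)))

In this framing the interesting question is entirely about how good oracle_relevance can get, not about the shortening step.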
entropicdrifter
I agree with the thrust of your argument.
IMO we already have the technology for sufficient parallelization of smaller models with specific bits of context. The real issue is that models have weak/inconsistent/myopic judgement abilities, even with reasoning loops.
For instance, if I ask Cursor to fix the code for a broken test and the fix is non-trivial, it will often diagnose the problem incorrectly almost instantly, hyper-focus on what it imagines the problem is without further confirmation, implement a "fix", get a different error message while breaking more tests than it "fixed" (if it changed the result for any tests), and then declare the problem solved simply because it moved the goalposts at the start by misdiagnosing the issue.
tovej
You can reconcile these points by considering what specific context is necessary. The author specifies "outside" context, and I would agree. The human context that's necessary for useful summaries is a model of semantic or "actual" relationships between concepts, while the LLM context is a model of a single kind of fuzzy relationship between concepts.
In other words the LLM does not contain the knowledge of what the words represent.
neerajsi
> In other words the LLM does not contain the knowledge of what the words represent.
This is probably true for some words and concepts but not others. I think we find that LLMs make inhuman mistakes only because they don't have the embodied senses and inductive biases that are at the root of human language formation.
If this hypothesis is correct, it suggests that we might be able to train a more complete machine intelligence by having it participate in a physics simulation as one part of the training, i.e. have a multimodal AI play some kind of blockworld game. I bet if the AI is endowed with just sight and sound, it might be enough to capture many relevant relationships.
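A minimal, runnable sketch of the kind of training loop that suggests; BlockWorldEnv and RandomAgent are hypothetical stand-ins, not a real simulator or learning method:

    # Toy sketch only: the agent gets nothing but "sight" and "sound" observations,
    # acts in a simulated block world, and gets feedback.
    import random

    class BlockWorldEnv:
        """Hypothetical stand-in for a physics/block-world simulation."""
        def reset(self):
            self.t = 0
            return {"image": [0.0] * 64, "audio": [0.0] * 16}  # fake pixels and sound

        def step(self, action):
            self.t += 1
            obs = {"image": [random.random() for _ in range(64)],
                   "audio": [random.random() for _ in range(16)]}
            reward = 1.0 if action == "move" else 0.0   # arbitrary toy reward
            return obs, reward, self.t >= 10            # obs, reward, done

    class RandomAgent:
        """Placeholder for the multimodal model being trained."""
        def act(self, image, audio):
            return random.choice(["move", "look", "listen"])

        def learn(self, obs, reward):
            pass  # a real setup would update model weights here

    env, agent = BlockWorldEnv(), RandomAgent()
    obs, done = env.reset(), False
    while not done:
        action = agent.act(obs["image"], obs["audio"])  # sight + sound only
        obs, reward, done = env.step(action)
        agent.learn(obs, reward)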
ratelimitsteve
I think the differentiator here might not be the context it has, but the context it has the ability to use effectively in order to derive more information about a given request.
kayodelycaon
They can’t summarize something that hasn’t been summarized before.
timmg
About a year ago, I gave a film script to an LLM and asked for a summary. It was written by a friend and there was no chance it or its summary was in the training data.
It did a really good -- surprisingly good -- job. That incident has been a reference point for me. Even if it is anecdotal.
pc86
I'm not as cynical as others about LLMs but it's extremely unlikely that script had multiple truly novel things in it. Broken down into sufficient small pieces it's very likely every story element was present multiple times in the LLM's training data.
originalcopy
I'd like to see some examples of when it struggles to do summaries. There were no real examples in the text, besides one hypothetical which ChatGPT made up.
I think LLMs do great summaries. I am not able to come up with anything where I could criticize it and say "any human would come up with a better summary". Are my tasks not "truly novel"? Well, then I am not able, as a human, to come up with anything novel either.
naikrovek
they can, they just can't do it well. at no point does any LLM understand what it's doing.
kblissett
If you think they can't do this task well I encourage you to try feeding ChatGPT some long documents outside of its training cutoff and examining the results. I expect you'll be surprised!
kayodelycaon
It can produce something that looks like a summarization based on similarly matching texts.
How unique the text is determines how accurate the summarization is likely to be.
Joeri
> LLMs mimic intelligence, but they aren’t intelligent.
They aren’t just intelligence mimics, they are people mimics, and they’re getting better at it with every generation.
Whether they are intelligent or not, whether they are people or not, it ultimately does not matter when it comes to what they can actually do, what they can actually automate. If they mimic a particular scenario or human task well enough that the job gets done, they can replace intelligence even if they are “not intelligent”.
If by now someone still isn’t convinced that LLMs can indeed automate some of those intelligence tasks, then I would argue they are not open to being convinced.
shafoshaf
They can mimic well-documented behavior. Applying an LLM to a novel task is where the model breaks down. This obviously has huge implications for automation. For example, most businesses do not have unique ways of handling accounting transactions, yet each company has a litany of AR and AP specialists who create seemingly unique SOPs. LLMs can easily automate those workers since they are, at best, doing a slight variation on a very well documented system.
Asking an LLM to take all this knowledge and apply it to a new domain? That will take a whole new paradigm.
quesera
Absolutely agreed, but I suspect that a whole lot of what humans do every day can be reduced to pattern-following.
If/when LLMs or other AIs can create novel work / discover new knowledge, they will be "genius" in the literal sense of the word.
More genius would be great! (Probably.) But genius is not required for the vast majority of tasks.
andrewla
> Applying an LLM to a novel task is where the model breaks down
I mean, don't most people break down in this case too? I think this needs to be more precise. What is the specific task that you think can reliably distinguish between an LLM's capability in this sense vs. what a human can typically manage?
That is, in the sense of [1], what is the result that we're looking to use to differentiate.
nojs
Even stronger than our need to anthropomorphize seems to be our innate desire to believe our species is special, and that “real intelligence” couldn’t ever be replicated.
If you keep redefining real intelligence as the set of things machines can’t do, then it’s always going to be true.
safetytrick
Yes, I agree, we seem to need to feel "special".
Language is really powerful, I think it's a huge part of our intelligence.
The interesting part of the article to me is the focus on fluency. I have not seen anything that LLMs do well that isn't related to powerful utilization of fluency.
ticulatedspline
- LLMs don't need to be intelligent to take jobs; bash scripts have replaced people.
- Even if CEOs are completely out of touch and the tool can't do the job, you can still get laid off in an ill-informed attempt to replace you. Then, when the company doesn't fall over because the leftover people, desperate to keep covering rent, fill the gaps, it just looks like efficiency to the top.
- I don't think our tendency to anthropomorphize LLMs is really the problem here.
intalentive
Good point about the Turing Test:
>The original Turing Test was designed to compare two participants chatting through a text-only interface: one AI and one human. The goal was to spot the imposter. Today, the test is simplified from three participants to just two: a human and an LLM.
By the original meaning of the test it's easy to tell an LLM from a human.
ArnavAgrawal03
> They had known him for only 15 seconds, yet they still perceived the act of snapping him in half as violent.
This is right out of Community
WorkerBee28474
Clip from s01e01: https://www.youtube.com/watch?v=z906aLyP5fg
stefanv
What if the problem is not that we overestimate LLMs, but that we overestimate intelligence? Or to express the same idea for a more philosophically inclined audience, what if the real mistake isn’t in overestimating LLMs, but in overestimating intelligence itself by imagining it as something more than a web of patterns learned from past experiences and echoed back into the world?
justinlivi
I think AI skeptics have a strong bias to assume that human intelligence fundamentally functions differently from LLMs. They may be correct, but we don't have a strong enough understanding of human cognition to make the claim with the certainty that the skeptical argument is usually made with. The training methods between human learning and machine learning are obviously fundamentally vastly different, as are the infrastructure-level mechanics. These elements are likely never going to align, though with time the machine infrastructure may start to increasingly resemble human bio hardware.
I bring this up because these known vast differences may account for a significant portion of the differences in expected output from human and machine processing. We don't understand the fundamental conceptual "black box" portions of either form of processing well enough to state definitively what is similar or dissimilar about those hazy areas. Somewhere within that not-well-understood area is what we have collectively and vaguely defined as "intelligence." But also within that area are all the other aspects that both humans and now machines are quite good at: prediction, fluency, translation.
The challenge of lexicon and definition is potentially as difficult a task as sharpening the focus of our understanding of the hazy black-box portion of both machine processing and human processing. Until all those are better defined, I don't think we have a good measure for answering the question of machine intelligence either way.
umanwizard
The article claims (without any evidence, argument or reason) that LLMs are not intelligent, then simply refuses to define intelligence.
How do you know LLMs aren't intelligent, if you can't define what that means?
energy123
It's strange seeing so many takes like this two weeks after LLMs won gold medals at IMO and IOI. The cognitive dissonance is going to be wild when it all comes to a head in two years.
oytis
I've seen these claims, and Google even published the texts of the solutions, but it still hasn't published the full log of interaction between the model and the operator.
aprilthird2021
IBM Watson won Jeopardy years ago, was it intelligent?
perching_aix
> Rather than being given questions, contestants are instead given general knowledge clues in the form of answers and they must identify the person, place, thing, or idea that the clue describes, phrasing each response in the form of a question. [0]
Doesn't sound like a test of intelligence to me, so no.
umanwizard
Despite its title, that section does not contain a definition of intelligence.
krapp
Why do critics of LLM intelligence need to provide a definition when people who believe LLMs are intelligent only take it on faith, not having such a definition of their own?
hackyhacky
> Why do critics of LLM intelligence need to provide a definition when people who believe LLMs are intelligent only take it on faith, not having such a definition of their own?
Because advocates of LLMs don't use their alleged intelligence as a defense; but opponents of LLMs do use their alleged non-intelligence as an attack.
Really, whether or not the machine is "intelligent", by whatever definition, shouldn't matter. What matters is whether it is a useful tool.
aDyslecticCrow
The entire argument is that thinking it's intelligent, or a person, makes us misuse the tool in dangerous ways. The point isn't to make us feel better; it's to keep us from doing stupid things with them.
As a tool it's useful, yes; that is not the issue:
- They're used as psychologists and life coaches.
- Judges of policy and legal documents.
- Writers of life-affecting computer systems.
- Judges of job applications.
- Sources of medical advice.
- Legal advisors.
- And, increasingly, as a thing to blame when any of the above goes awry.
If we think of LLMs as very good text-writing tools, the responsibility to make "intelligent" decisions, and more crucially to take responsibility for those decisions, remains on real people rather than dice.
But if we think of them as intelligent humans, we're making a fatal misjudgment.
tjr
This seems reasonable. Much AI research has historically been about building computer systems to do things that otherwise require human intelligence to do. The question of "is the computer actually intelligent" has been more philosophical than practical, and many such practically useful computer systems have been developed, even before LLMs.
On the other hand, one early researcher said something to the effect of, Researchers in physics look at the universe and wonder how it all works. Researchers in biology look at living organisms and wonder how they can be alive. Researchers in artificial intelligence wonder how software can be made to wonder such things.
I feel like we are still way off from having a working solution there.
hnfong
It's actually very weird to "believe" LLMs are "intelligent".
Pragmatic people see news like "LLMs achieve gold in Math Olympiad" and think "oh wow, it can do maths at that level, cool!" This gets misinterpreted by so-called "critics of LLMs" who scream "NO THEY ARE JUST STOCHASTIC PARROTS" at every opportunity yet refuse to define what intelligence actually is.
The average person might not get into that kind of specific detail, but they know that LLMs can do some things well but there are tasks they're not good at. What matters is what they can do, not so much whether they're "intelligent" or not. (Of course, if you ask a random person they might say LLMs are pretty smart for some tasks, but that's not the same as making a philosophical claim that they're "intelligent")
Of course there's also the AGI and singularity folks. They're kinda loony too.
xg15
I feel this article should be paired with this other one [1] that was on the frontpage a few days ago.
My impression is, there is currently one tendency to "over-anthropomorphize" LLMs and treat them like conscious or even superhuman entities (encouraged by AI tech leaders and AGI/Singularity folks) and another to oversimplify them and view them as literal Markov chains that just got lots of training data.
Maybe those articles could help guard against both extremes.
[1] https://www.verysane.ai/p/do-we-understand-how-neural-networ...
mattgreenrocks
Previously when someone called out the tendency to over-anthropomorphize LLMs, a lot of the answers amounted to, “but I like doing it, therefore we should!”
I’ll be the first to say one should pick their battles. But hearing that over and over from a crowd like this that can be quite pedantic is very telling.
tempodox
This very comment thread demonstrates how utterly hopeless it is trying to educate the believers. It has developed into a full-blown religion by now.
pbw
LLMs can shorten, and maybe tend to if you just say "summarize this", but you can trivially ask them to do more. I asked for a summary of Jenson's post and then for a reflection. GPT-5 said, "It's similar to the Plato’s Cave analogy: humans see shadows (the input text) and infer deeper reality (context, intent), while LLMs either just recite shadows (shorten) or imagine creatures behind them that aren’t there (hallucinate). The “hallucination” behavior is like adding “ghosts”—false constructs that feel real but aren’t grounded."
That ain't shortening because none of that was in his post.
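A sketch of that kind of two-part request using the OpenAI Python SDK; the model name, file path, and exact wording here are placeholders rather than the prompt actually used:

    # Placeholder sketch (assumes OPENAI_API_KEY is set in the environment).
    from openai import OpenAI

    client = OpenAI()
    article_text = open("jenson_post.txt").read()  # hypothetical local copy of the post

    response = client.chat.completions.create(
        model="gpt-5",  # placeholder; use whatever model is available
        messages=[{
            "role": "user",
            "content": (
                "First summarize the following article in one short paragraph. "
                "Then offer a reflection of your own that connects it to an idea "
                "not mentioned in the article.\n\n" + article_text
            ),
        }],
    )
    print(response.choices[0].message.content)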
pitpatagain
I can't decide how to read your last sentence.
That reflection seems totally off to me: fluent, and flavored with elements of the article, but also not really what the article is about and a pretty weird/tortured use of the elements of the allegory of the cave, like it doesn't seem anything like Plato's Cave to me. Ironically demonstrates the actual main gist of the article if you ask me.
But maybe you meant that you think that summary is good and not textually similar to that post so demonstrating something more sophisticated than "shortening".
pbw
Yes, GPT-5's response above was not shortening because there was nothing in the OP about Plato's Cave. I agree that Plato's cave analogy was confusing here. Here's a better one from GPT-5, which is deeply ironic:
A New Yorker book review often does the opposite of mere shortening. The reviewer:
* Places the book in a broader cultural, historical, or intellectual context.
* Brings in other works—sometimes reviewing two or three books together.
* Builds a thesis that connects them, so the review becomes a commentary on a whole idea-space, not just the book’s pages.
This is exactly the kind of externalized, integrative thinking Jenson says LLMs lack. The New Yorker style uses the book as a jumping-off point for an argument; an LLM “shortening” is more like reading only the blurbs and rephrasing them. In Jenson’s framing, a human summary—like a rich, multi-book New Yorker review—operates on multiple layers: it compresses, but also expands meaning by bringing in outside information and weaving a narrative. The LLM’s output is more like a stripped-down plot synopsis—it can sound polished, but it isn’t about anything beyond what’s already in the text.
pitpatagain
Ah ok, you meant the second thing.
I don't think the Plato's Cave analogy is confusing, I think it's completely wrong. It's "not in the article" in the sense that it is literally not conceptually what the article is about and it's also not really what Plato's Cave is about either, just taking superficial bits of it and slotting things into it, making it doubly wrong.
pbw
Essentially, Jenson's complaint is "When I ask an LLM to 'summarize' it interprets that differently from how I think of the word 'summarize' and I shouldn't have to give it more than a one-word prompt because it should infer what I'm asking for."
Isamu
You can compare the current state of LLMs to the days of chess machines when they first approached grandmaster-level play. The machine approach was very brute force, and a lot of work went into increasing the sheer amount of look-ahead required to compete at the grandmaster level.
As opposed to what grandmasters actually did, which was less look ahead and more pattern matching to strengthen the position.
Now LLMs successfully leverage pattern matching, but interestingly it is still a kind of brute force pattern matching, requiring the statistical absorption of all available texts, far more than a human absorbs in a lifetime.
This enables the LLM to interpolate an answer from the structure of the absorbed texts with reasonable statistical relevance. This is still not quite “what humans do”, as it still requires brute-force statistical analysis of vast amounts of text to achieve pretty good results. For example, training on all available Python sources on GitHub and elsewhere (curated to avoid bad examples) yields pretty good results: not how a human would do it, but statistically likely to be pertinent and correct.
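For reference, the "brute force look-ahead" of that era is essentially minimax search; a bare-bones sketch, with placeholder move-generation and evaluation functions rather than any real engine:

    # Bare-bones minimax: exhaustively search `depth` plies ahead and back up the
    # best score. legal_moves, apply_move, and evaluate are placeholders that a
    # real engine would supply.
    def minimax(state, depth, maximizing, legal_moves, apply_move, evaluate):
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return evaluate(state)  # static evaluation at the search horizon
        scores = (minimax(apply_move(state, m), depth - 1, not maximizing,
                          legal_moves, apply_move, evaluate) for m in moves)
        return max(scores) if maximizing else min(scores)

With roughly 35 legal moves per chess position, the cost grows on the order of 35^depth, which is why that line of work leaned so heavily on pruning and raw speed rather than better judgment.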