A Definition of AGI
131 comments · October 26, 2025 · flkiwi
ben_w
Or even to come up with a definition of cognitive versatility and proficiency that is good enough to not get argued away once we have an AI which technically passes that specific definition.
The Turing Test was great until something that passed it (with an average human as interrogator) turned out to also not be able to count letters in a word — because only a special kind of human interrogator (the "scientist or QA" kind) could even think to ask that kind of question.
cbdevidal
Have any benchmarks been made that use this paper’s definition? I follow the ARC prize and Humanity’s Last Exam, but I don’t know how closely they would map to this paper’s methods.
Edit: Probably not, since it was published less than a week ago :-) I’ll be watching for benchmarks.
surgical_fire
There are some sycophants who claim that LLMs can operate at junior engineer level.
Try to reconcile that with your ideas (which I think are correct, for that matter).
ben_w
I'll simultaneously call all current ML models "stupid" and also say that SOTA LLMs can operate at junior (software) engineer level.
This is because I use "stupidity" to mean the number of examples an intelligence needs in order to learn something.
LLMs make up for being too stupid to live (literally: no living thing could survive if it needed so many examples) by going through each example faster than any living thing ever could — by as many orders of magnitude as there are between jogging and continental drift.
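(A quick back-of-envelope check of that comparison, with very approximate speeds, purely for illustration:)

    import math

    # Rough sanity check: jogging speed vs. continental drift (both approximate)
    jogging_m_per_s = 3.0                        # ~3 m/s
    drift_m_per_s = 0.03 / (365 * 24 * 3600)     # ~3 cm per year

    print(math.log10(jogging_m_per_s / drift_m_per_s))  # ~9.5, i.e. roughly nine to ten orders of magnitude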
jal278
The fundamental premise of this paper seems flawed -- it takes measures specifically designed around how human performance on a benchmark correlates with intelligence in the real world, then pretends it makes sense to judge a machine's intelligence on that same basis, even though machines excel at exactly these kinds of benchmarks in ways that fall apart in the messiness of the real world.
This paper, for example, uses the 'dual N-back test' as part of its evaluation. In humans it captures variation in working memory, which relates to 'g'; but it seems pretty meaningless when applied to transformers -- the task itself has nothing intrinsically to do with intelligence, and of course dual N-back should be easy for a transformer, which has complete recall over its large context window. (A rough sketch of the task is below.)
Human intelligence tests are designed to measure variation in human intelligence -- it's silly to take those same isolated benchmarks and pretend they mean the same thing when applied to machines. Obviously a machine doing well on an IQ test doesn't mean that it will be able to do what a high IQ person could do in the messy real world; it's a benchmark, and it's only a meaningful benchmark because in humans IQ measures are designed to correlate with long-term outcomes and abilities.
That is, in humans, performance on these isolated benchmarks is correlated with our ability to exist in the messy real-world, but for AI, that correlation doesn't exist -- because the tests weren't designed to measure 'intelligence' per se, but human intelligence in the context of human lives.
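(For anyone unfamiliar with the dual N-back task mentioned above, a minimal sketch of a single-stream n-back check; the 'dual' variant just runs two such streams, e.g. letters and grid positions, in parallel:)

    # Did the current stimulus appear exactly n steps earlier?
    def n_back_targets(stimuli, n):
        return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]

    print(n_back_targets(list("ABABCAC"), 2))
    # [False, False, True, True, False, False, True]

A human has to hold the last n stimuli in working memory; a transformer simply has the whole sequence sitting in its context.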
stared
There’s already a vague definition that AGI is an AI with all the cognitive capabilities of a human. Yes, it’s vague - people differ.
This paper promises to fix "the lack of a concrete definition for Artificial General Intelligence", yet it still relies on the vague notion of a "well-educated adult". That’s especially peculiar, since in many fields AI is already beyond the level of an adult.
You might say this is about "jaggedness", because AI clearly lacks quite a few skills:
> Application of this framework reveals a highly “jagged” cognitive profile in contemporary models.
But all intelligence, of any sort, is "jagged" when measured against a different set of problems or environments.
So, if that’s the case, this isn’t really a framework for AGI; it’s a framework for measuring AI along a particular set of dimensions. A more honest title might be: "A Framework for Measuring the Jaggedness of AI Against the Cattell–Horn–Carroll Theory". It wouldn't be nearly as sexy, though.
bee_rider
Huh. I haven’t read the paper yet. But, it seems like a weird idea—wouldn’t the standard of “well educated (I assume, modern) adult” preclude the vast majority of humans who ever lived from being considered general intelligences?
vidarh
And this is indeed a huge problem with a lot of the attacks on LLMs even as more limited AI - a lot of them are based on applying arbitrary standards without even trying to benchmark against people, and without people being willing to discuss where they draw the line for stating that a given subset of people do not possess general intelligence...
I think people get really uncomfortable even trying to tackle that, and realistically, for a huge set of AI tasks we need AI that is more intelligent than a huge subset of humans for it to be useful. But there are also a lot of tasks where that is not needed, and we "just" need "more human" failure modes.
catlifeonmars
I read this as a hypothetical well-educated adult. As in, given the same level of knowledge, the intelligence performs equally well.
I do agree that it’s a weird standard though. Many of our AI implementations exceed the level of knowledge of a well-educated adult (and still underperform with that advantage in many contexts).
Personally, I don’t think defining AGI is particularly useful. It is just a marketing term. Rather, it’s more useful to just speak about features/capabilities. Shorthand for a specific set of capabilities will arise naturally.
fjdjshsh
>But all intelligence, of any sort, is "jagged" when measured against a different set of problems or environments.
On the other hand, research on "common intelligence", AFAIK, shows that most measures of different types of intelligence correlate very highly, and some (apologies, I don't know the literature) have posited that we should think in terms of some "general common intelligence" to explain this.
The surprising thing about AI so far is how much more jagged it is compared with human intelligence.
pixl97
Human intelligence has had hundreds of thousands of years of evolution removing any 'fatal' variance. How 'too dumb' gets culled is obvious, but 'too smart' can get culled by social creatures too -- really, 'too different' in any way can.
Current AI is in its infancy and we're just throwing data at it in the same way evolution throws random change at our DNA and sees what sticks.
stared
I think you are talking about the correlation in humans between, say, verbal and mathematical intelligence. Still, it is a correlation, not equality - there are many widely acknowledged writers who suck at math, and mathematical prodigies who are not the best at writing.
If you go beyond human species (and well, computers are not even living organisms), it gets tricky. Adaptability (which is arguably a broader concept than intelligence) is very different for, say octopodes, corvids and slime molds.
It is certainly not a single line of proficiency or progress. Things look like lines only if we zoom a lot.
zkmon
The problem with these methods, I guess, is that they treat human intelligence as something detached from human biology. I think this is incorrect. Everything that goes on in the human mind is firmly rooted in the biological state of that human, and in the biological cycles that evolved over millennia.
Things like chess-playing skill of a machine could be bench-marked against that of a human, but the abstract feelings that drive reasoning and correlations inside a human mind are more biological than logical.
Workaccount2
There is no reason to believe that consciousness, sentience, or emotions require a biological base.
sim7c00
They do not, but a similar argument holds: the true nature of human cognition is not really known, so trying to define what a human-like intelligence would consist of can only be incomplete.
There are many parts of human cognition, psychology, etc., especially related to consciousness, that are known unknowns and/or completely unknown.
A mitigation for this issue would be to call it generally applicable intelligence or something, rather than human-like intelligence, implying it's not specialized AI but also not human-like. (I don't see why it would need to be human-like; even with all the right logic and intelligence, a human can still act counter to all of that. Humans do this every day: intuitive action, irrational action, etc.)
What we want is generally applicable intelligence, not human-like intelligence.
nebezb
I’m certainly not informed enough to have an intelligent conversation about this, but surely the emotions bit can’t be right?
My emotions are definitely a function of the chemical soup my brain is sitting in (or the opposite).
BugsJustFindMe
Your emotions are surely caused by the chemical soup, but chemical soup need not be the only way to arrive at emotions. It is possible for different mechanisms to achieve the same outcomes.
steve_adams_86
Is there more reason to believe otherwise? I'm not being contrarian, I'm genuinely curious what people think.
Lerc
That asks you to consider the statements:
There is reason to believe that consciousness, sentience, or emotions require a biological base.
Or
There is no reason to believe that consciousness, sentience, or emotions do not require a biological base.
The first is simple: if there is a reason, you can ask for it and evaluate its merits. Quantum stuff is often pointed to here, but the reasoning is unconvincing.
The second takes the form: there is no reason to believe that P does not require Q.
There are no proven reasons, but there are suspected ones. For instance, if the operation that neurons perform is what makes consciousness work, and that operation can be reproduced non-biologically, it would follow that non-biological consciousness is possible.
For any observable phenomenon in the brain the same thing can be asked. So far it seems reasonable to expect most of the observable processes could be replicated.
None of it acts as proof, but they probably rise to the bar of reasons.
ComplexSystems
What is the "irreplaceable" part of human biology that leads to consciousness? Microtubules? Whatever it is, we could presumably build something artificial that has it.
vhantz
What non-biological systems do we know of that have consciousness, sentience or emotions?
BugsJustFindMe
We have no known basis for even deciding that other than the (maybe right, maybe wrong) guess that consciousness requires a lot of organized moving complexity. Even with that guess, we don't know how much is needed or what kind.
zkmon
None of that comes from outside of your biology and chemistry.
runarberg
There is exactly one good reason, at least for consciousness and sentience, and that reason is anthropism: those concepts are only vaguely defined (or rather, defined by prototypes, a la Wittgenstein [or JavaScript before classes]).
We only have one good example of consciousness and sentience, and that is our own. We have good reason to suspect other entities (particularly other human individuals, but also other animals) have it as well, but we cannot access it, nor even confirm its existence. As a result, using these terms for non-human beings is confusing at best and never actually helpful.
Emotions are another matter: we can define them outside of our own experience, using behavioral states and their connection to patterns of stimuli. So we can certainly observe and describe the behavior of a non-biological entity as emotional. But given that emotion is a behavior regulator that evolved over millions of years, whether such a description would be useful is a whole other matter. I would be inclined to use a more general description of behavior patterns that includes emotion but also other behavior regulators.
dangus
What if our definition of those concepts is biological to begin with?
How does a computer with full AGI experience the feeling of butterflies in your stomach when your first love is requited?
How does a computer experience the tightening of your chest when you have a panic attack?
How does a computer experience the effects of chemicals like adrenaline or dopamine?
The A in AGI stands for “artificial” for good reason, IMO. A computer system can understand these concepts by description, or recognize some of them via computer vision, audio, or other sensors, but it seems as though it will always lack sufficient biological context to experience true consciousness.
Perhaps humans are just biological computers, but the “biological” part could be the most important part of that equation.
modeless
GPT-5 scores 58%? That seems way too high. GPT-5 is good but it is not that close to AGI.
Also, weird to see Gary Marcus and Yoshua Bengio on the same paper. Who really wrote this? Author lists are so performative now.
jonplackett
As anyone using AI knows - the first 90% is easy, the next 9% is much harder and the last 1% takes more time than the other 99%.
edulix
We have SAGI: Stupid Artificial General Intelligence. It's actually quite general, but works differently. In some areas it can be better or faster than a human, and in others it's more stupid.
Just like an airplane doesn't work exactly like a bird, but both can fly.
merksittich
I find the concept of low floor/high ceiling quite helpful, as for instance recently discussed in "When Will AI Transform the Economy?" [1] - actually more helpful than "jagged" intelligence used in TFA.
[1] https://andreinfante.substack.com/p/when-will-ai-transform-t...
quantum_state
Would propose to use the term Naive Artificial General Intelligence, in analogy to the widely used (by working mathematicians) and reasonably successful Naive Set Theory …
wizzwizz4
I was doing some naïve set theory the other day, and I found a proof of the Riemann hypothesis, by contradiction.
Assume the Riemann hypothesis is false. Then, consider the proposition "{a|a∉a}∈{a|a∉a}". By the law of the excluded middle, it suffices to consider each case separately. Assuming {a|a∉a}∈{a|a∉a}, we find {a|a∉a}∉{a|a∉a}, for a contradiction. Instead, assuming {a|a∉a}∉{a|a∉a}, we find {a|a∉a}∈{a|a∉a}, for a contradiction. Therefore, "the Riemann hypothesis is false" is false. By the law of the excluded middle, we have shown the Riemann hypothesis is true.
Naïve AGI is an apt analogy, in this regard, but I feel these systems are neither simple nor elegant enough to deserve the name naïve.
the_arun
It is a good analogy.
tcdent
Don't get me wrong, I am super excited about what AI is doing for technology. But this endless conversation about "what is AGI" is so boring.
It makes me think of every single public discussion that's ever been had about quantum, where you can't start the conversation unless you go through a quick 101 on what a qubit is.
As with any technology, there's not really a destination. There is only the process of improvement. The only real definitive point is when a technology becomes obsolete, though it is still kept alive through a celebration of its nostalgia.
AI will continue to improve. More workflows will become automated. And from our perspective, no matter how rapid the advancement is, we're still frogs in water.
bongodongobob
I agree. It's an interesting discussion for those who have never taken college-level philosophy classes, I suppose. What consciousness/thought is remains a massively open question. Seeing people in the comments offer what they think is a novel solution that was already posited like 400 years ago... Honestly, it's kind of sad seeing this stuff on a forum like this. These posts are for sure the worst of Hacker News.
bonoboTP
There are a bunch of these topics that everyone feels qualified to say something about. Consciousness, intelligence, education methods, nutrition, men vs women, economic systems etc.
It's a very emotional topic because people feel their self-image is threatened. It's a topic related to what it means to be human. Yeah, sure, it should be a separate question, but emotionally it is connected to it at a deep level. The prospect of job replacement and social transformation is quite a threatening one.
So I'm somewhat understanding of this. It's not merely an academic topic, because these things will be adopted in the real world among real people. So you can't simply make everyone shut up who is an outsider or just heard about this stuff incidentally in the news and has superficial points to make.
xnx
I like François Chollet's definition of AGI as a system that can efficiently acquire new skills outside its training data.
zulban
Not bad. Maybe.
But maybe that's ASI. Whereas I consider chatgpt 3 to be "baby AGI". That's why it became so popular so fast.
JumpCrisscross
> I consider chatgpt 3 to be "baby AGI". That's why it became so popular so fast
ChatGPT became popular because it was easy to use and amusing. (LLM UX until then had been crappy.)
Not sure AGI aspirations had anything to do with uptake.
moffkalast
So... AGI is a few shot performance metric?
jsheard
We'll know AGI has arrived when AGI researchers manage to go five minutes without publishing hallucinated citations.
artninja1988
Came from the Google Docs to BibTeX conversion apparently
bonoboTP
This looks like a knee-jerk reaction to the title instead of anything substantial.
nativeit
I’m gonna start referring to my own lies as “hallucinations”. I like the implication that I’m not lying, but rather speaking truthfully, sincerely, and confidently about things that never happened and/or don’t exist. Seems paradoxical, but this is what we’re effectively suggesting with “hallucinations”. LLMs necessarily lack things like imagination, or an ego that’s concerned with the appearance of being informed and factually correct, or awareness for how a lack of truth and honesty may affect users and society. In my (not-terribly-informed) opinion, I’d assert that precludes LLMs from even approximate levels of intelligence. They’re either quasi-intelligent entities who routinely lie to us, or they are complex machines that identify patterns and reconstruct plausible-sounding blocks of text without any awareness of abstract concepts like “truth”.
Edit: toned down the preachiness.
MichaelZuo
It does seem a bit ridiculous…
CamperBob2
So infallibility is one of the necessary criteria for AGI? It does seem like a valid question to raise.
Edit due to rate-limiting, which in turn appears to be due to the inexplicable downvoting of my question: since you (JumpCrisscross) are imputing a human-like motivation to the model, it sounds like you're on the side of those who argue that AGI has already been achieved?
JumpCrisscross
> infallibility
Lying != fallibility.
cjbarber
Some AGI definition variables I see:
Is it about jobs/tasks, or cognitive capabilities? The majority of the AI-valley seems to focus on the former; TFA focuses on the latter.
Can it do tasks, or jobs? Jobs are bundles of tasks. AI might be able to do 90% of tasks for a given job, but not the whole job.
If tasks, what counts as a task: Is it only specific things with clear success criteria? That's easier.
Is scaffolding allowed: Does it need to be able to do the tasks/jobs without scaffolding and human-written few-shot prompts?
Today's tasks/jobs only, or does it include future ones too? As tasks and jobs get automated, jobs evolve and get re-defined. So, being able to do the future jobs too is much harder.
Remote only, or in-person too: In-person too is a much higher bar.
What threshold of tasks/jobs: "most" is apparently typically understood to mean 80-95% (Mira Ariel). Automating 80% of tasks is different from 90%, 95%, and 99% (diminishing returns). And how are the tasks counted - by frequency, dollar-weighted, or by unique count of tasks?
Only economically valuable tasks/jobs, or does it include anything a human can do?
A high-order bit on many people's AGI timelines is which definition of AGI they're using, so clarifying the definition is nice.
AstroBen
Not only tasks, but you need to look at the net effect
If it does an hour of tasks, but creates an additional hour of work for the worker...
oidar
This is fine for a definition of AGI, but it's incomplete. It misses so many parts of cognition that make humans flexible and successful: for example, emotions, feelings, varied pattern recognition, proprioception, embodied awareness, social skills, and navigating ambiguous situations without algorithms. If the described 10 spectrums of intelligence were maxed by an LLM, it would still fall short.
pixl97
Eh, I don't like the idea of 'intelligence' of any type using humans as the baseline. It blinds us to our own limitations, and to things that may not be limits for other types of intelligence. The "AI won't kill us all because it doesn't have emotions" problem is one of these. For example, just because AI doesn't get angry doesn't mean it can't recognize your anger and manipulate you if given such a directive.
vardump
Whatever the definition may be, the goalposts are usually moved once AI reaches that point.
kelseyfrog
There are at least two distinct bases for AGI refutations: behaviorist and ontological. They often get muddled.
I can't begin to count the number of times I've encountered someone who holds an ontological belief for why AGI cannot exist and then, for some reason, formulates it as a behaviorist criterion. This muddying of argument results in what looks like a moving of the goalposts. I'd encourage folks to be clearer about whether they believe AGI is ontologically possible or impossible, in addition to any behaviorist claims.
zahlman
My experience has been more that the pro-AI people misunderstand where the goalposts were, and then complain when they're correctly pointed at.
The "Turing test" I always saw described in literature, and the examples of what passing output from a machine was imagined to look like, are nothing like what's claimed to pass nowadays. Honestly, a lot of the people claiming that contemporary chatbots pass come across like they would have thought ELIZA passed.
bonoboTP
Can you be more concrete? What kind of answer/conversation would you see as demonstrating passing the test that you think is currently not possible?
tsimionescu
Ones in which both the human test takers and the human counterparts are actively trying to prove to each other that they are actually human.
With today's chat bots, it's absolutely trivial to tell that you're not talking to a real human. They will never interrupt you, continue their own train of thought even though you're trying to change the conversation, go on a complete non sequitur, swear at you, etc. These are all things that the human "controls" should be doing to prove to the judges that they are indeed human.
LLMs are nowhere near beating the Turing test. They may fool some humans in some limited interactions, especially if the output is curated by a human. But if you're left alone to interact with the raw output for more than a few lines, and you're actively seeking to tell whether you're interacting with a human or an AI (instead of wanting to believe), there really is no chance you'd be tricked.
krige
Are you saying that we already have AGI, except those pesky goalpost movers keep denying the truth? Hm.
NitpickLawyer
I'd say yes, by at least one old definition made by someone who was at the time in a position to have a definition.
When deepmind was founded (2010) their definition was the following: AI is a system that learns to perform one thing; AGI is a system that learns to perform many things at the same time.
I would say that whatever we have today, "as a system", matches that definition. In other words, the "system" that is, say, gpt5/gemini3/etc. has learned to "do" (while "do" is debatable) a lot of tasks (read/write/play chess/code/etc.) "at the same time". And from a "pure" ML point of view, it learned those things from the "simple" core objective of next-token prediction (+ later enhancements, RL, etc.). That is pretty cool.
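(A toy sketch of that core objective, just for illustration -- the uniform "model" below is a stand-in, not anything real:)

    import math

    def toy_model(context, vocab="abc "):
        # Placeholder "model": a uniform distribution over a tiny vocabulary
        return {ch: 1.0 / len(vocab) for ch in vocab}

    def next_token_loss(model, tokens):
        # Next-token prediction: score the probability the model assigns to each actual next token
        loss = 0.0
        for i in range(1, len(tokens)):
            probs = model(tokens[:i])            # predicted distribution given the prefix
            loss -= math.log(probs[tokens[i]])   # cross-entropy on the true next token
        return loss / (len(tokens) - 1)

    print(next_token_loss(toy_model, "a cab"))   # average negative log-likelihood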
So I can see that as an argument for "yes".
But, even the person who had that definition has "moved the goalposts" of his own definition. From recent interviews, Hassabis has moved towards a definition that resembles the one from this paper linked here. So there's that. We are all moving the goalposts.
And it's not a recent thing. People did this back in the 80s. There's the famous "As soon as AI does something, it ceases to be AI" or paraphrased "AI is everything that hasn't been done yet".
bossyTeacher
> AGI is a system that learns to perform many things at the same time.
What counts as a "thing"? Because arguably some of the pre-transformer deep ANNs would also qualify as AGI, but no one would consider them intelligent (not in the human or animal sense of intelligence).
And you probably don't even need fancy neural networks. Get an RL algorithm and a properly mapped solution space and it will learn to do whatever you want, as long as the problem can be mapped.
wahnfrieden
Can you cite the Deepmind definition? No Google results for that.
darepublic
It doesn't play chess? It just parrots it very well.
vardump
No, just what has usually happened in the past with AI goalposts.
At first, just playing chess was considered to be a sign of intelligence. Of course, that was wrong, but not obvious at all in 1950.
derektank
It wasn't the best definition of AGI, but I think if you had asked an interested layman 5 years ago whether a system that can pass the Turing test was AGI, they would have said yes.
jltsiren
An interested but uninformed layman.
When I was in college ~25 years ago, I took a class on the philosophy of AI. People had come up with a lot of weird ideas about AI, but there was one almost universal conclusion: that the Turing test is not a good test for intelligence.
The least weird objection was that the premise of the Turing test is unscientific. It sees "this system is intelligent" as a logical statement and seeks to prove or disprove it in an abstract model. But if you perform an experiment to determine if a real-world system is intelligent, the right conclusion for the system passing the test is that the system may be intelligent, but a different experiment might show that it's not.
A4ET8a8uTh0_v2
I think, given some of the signs on the horizon, there is a level of MAD-type bluffing going around, but some of the actions by various power centers suggest it is either close, people think it's close, or it is there.
MattRix
Isn’t that the point of trying to define it in a more rigorous way, like this paper is doing?
rafram
Are you claiming that LLMs have achieved AGI?
moffkalast
Compared to everything that came before they are fairly general alright.
righthand
I agree: if our comprehension of intelligence and "life" is incomplete, so is our model of artificial intelligence.
bigyabai
The authors acknowledge that this is entirely possible. Their work is just grounded in theory, after all:
> we ground our methodology in Cattell-Horn-Carroll theory, the most empirically validated model of human cognition.
> defining AGI as matching the cognitive versatility and proficiency of a well-educated adult
I don't think people really realize what an extraordinary accomplishment it would be to have an artificial system matching the cognitive versatility and proficiency of an uneducated child, much less a well-educated adult. Hell, AI matching the intelligence of some nonhuman animals would be an epoch-defining accomplishment.