ARC-AGI without pretraining
20 comments
· March 4, 2025 · pona-a
tripplyons
I think that most human learning comes from years of sensory input. Why should we expect a machine to generalize well without any background?
aithrowawaycomm
Newborns (and certainly toddlers) seem to understand the underlying concepts for these things when it comes to visual/haptic object identification and "folk physics":
A short list of abilities that cannot be performed by CompressARC includes:
Assigning two colors to each other (see puzzle 0d3d703e)
Repeating an operation in series many times (see puzzle 0a938d79)
Counting/numbers (see puzzle ce9e57f2)
Translation, rotation, reflections, rescaling, image duplication (see puzzles 0e206a2e, 5ad4f10b, and 2bcee788)
Detecting topological properties such as connectivity (see puzzle 7b6016b9)
Note: I am not saying newborns can solve the corresponding ARC problems! The point is there is a lot of evidence that many of the concepts ARC-AGI is (allegedly) measuring are innate in humans, and maybe most animals; e.g. cockroaches can quickly identify connected/disconnected components when it comes to pathfinding. Again, not saying cockroaches can solve ARC :) OTOH even if orcas were smarter than humans, they would struggle with ARC - it would be way too baffling and obtuse if your culture doesn't have the concept of written standardized tests. (I have been solving state-mandated ARCish problems since elementary school.)
andoando
It does, but it also generalizes extremely well.
Krasnol
I'd guess it's because we don't want another human. We want something better. Therefore, the expectations for the learning process go way beyond what humans do. I guess some are expecting some magic word (a formula) that would act like a seed with unlimited potential.
So like humans after all but faster.
I guess it's just hard to write a book about the way you write that book.
ta8645
The issue is that general intelligence is useless without vast knowledge. The pretraining is the knowledge, not the intelligence.
dchichkov
With a long enough context, AGI is not useless without vast knowledge. You could always put a bootstrap sequence into the context (think Arecibo Message), followed by your prompt. A general enough reasoner with enough compute should be able to establish the context and reason about your prompt.
ta8645
Yes, but that just effectively recreates the pretraining. You're going to have to explain everything down to what an atom is, and essentially all human knowledge if you want to have any ability to consider abstract solutions that call on lessons from foreign domains.
There's a reason people with comparable intelligence operate at varying degrees of effectiveness, and it has to do with how knowledgeable they are.
conradev
Isn't knowledge of language necessary to decode prompts?
tripplyons
I'm not at all experienced in neuroscience, but I think that humans and other animals primarily gain intelligence by learning from their sensory input.
FergusArgyll
You don't think a lot is encoded in genes from before we're born?
pona-a
I don't think so. A lot of useful specialized problems are just patterns. Imagine your IDE could take five examples of matching strings and produce a regex you can count on working (a sketch of what that could look like follows below). It doesn't need to know the capital of Togo, metabolic pathways of the eukaryotic cell, or human psychology.
For that matter, if it had no pre-training, that means it can generalize to any new programming language, library, or task entirely. You could use it to analyze the grammar of a dying African language, write stories in the style of Hemingway, or diagnose cancer from patient data. In all of these, there are only so many samples to fit on.
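To make the regex example concrete, here is a toy sketch of synthesis-from-examples. The candidate list and the synthesize helper are illustrative assumptions, not any real IDE feature; a real tool would search a far larger hypothesis space:

    import re

    # Tiny hypothesis space; a real synthesizer would enumerate or
    # learn patterns rather than pick from a fixed list.
    CANDIDATES = [
        r"\d+",                # digits
        r"[a-z]+",             # lowercase word
        r"\d{4}-\d{2}-\d{2}",  # ISO-like date
        r"\w+@\w+\.\w+",       # naive email
    ]

    def synthesize(positives, negatives=()):
        # Return the first pattern consistent with every example.
        for pattern in CANDIDATES:
            rx = re.compile(pattern)
            if all(rx.fullmatch(s) for s in positives) and \
               not any(rx.fullmatch(s) for s in negatives):
                return pattern
        return None

    print(synthesize(["2024-01-05", "1999-12-31"]))  # \d{4}-\d{2}-\d{2}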
bloomingkales
> A lot of useful specialized problems are just patterns.
> It doesn't need to know the capital of Togo, metabolic pathways of the eukaryotic cell, or human psychology.
What if knowing those things distills down to a pattern that matches a pattern of your code and vice versa? There's a pattern in everything, so know everything, and be ready to pattern match.
ta8645
Of course, none of us have exhaustive knowledge. I don't know the capital of Togo.
But I do have enough knowledge to know what an IDE is and where it sits in the technological stack; I know what a string is, and all that it relies on, etc. There's a huge body of knowledge required to even begin approaching the problem. If you posed that challenge to an intelligent person from 2000 years ago, they would just stare at you blankly. It doesn't matter how intelligent they are; they have no context to understand anything about the task.
raducu
> The pretraining is the knowledge, not the intelligence.
I thought the knowledge was the training set, and the intelligence was the emergent side effect of reproducing that knowledge while making sure the reproduction is not rote memorisation?
ta8645
I'd say that it takes intelligence to encode knowledge, and the more knowledge you have, the more intelligently you can encode further knowledge, in a virtuous cycle. But once you have a data set of knowledge, there's nothing to emerge and no side effects; it just sits there doing nothing. The intelligence is in the algorithms that access that encoded knowledge to produce something else.
AIorNot
I was thinking about the Lex Fridman podcast with Marcus Hutter. Also, Joscha Bach defined intelligence as the ability to accurately model reality. Is lossless compression itself intelligence, or a best-fit model? Is there a difference? https://www.youtube.com/watch?v=E1AxVXt2Gv4
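One way to see the compression-as-modeling idea concretely: a sample is "explained" by whichever corpus compresses it most cheaply when appended. This toy uses zlib as a crude stand-in for a real model; the corpora and helper names are made up for illustration:

    import zlib

    def cost(data: bytes) -> int:
        return len(zlib.compress(data, 9))

    def marginal_cost(corpus: bytes, sample: bytes) -> int:
        # Extra compressed bytes the sample costs given the corpus.
        return cost(corpus + sample) - cost(corpus)

    english = b"the cat sat on the mat and the dog lay by the door " * 20
    code    = b"for (int i = 0; i < n; i++) { sum += a[i] * a[i]; }\n" * 20

    sample = b"the bird sat on the roof"
    # The better "model" of the sample should charge fewer extra
    # bytes; here the English corpus should usually win.
    print(marginal_cost(english, sample) < marginal_cost(code, sample))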
I feel like extensive pretraining goes against the spirit of generality.
If you can create a general machine that can take 3 examples and synthesize a program that predicts the 4th, you've just solved oracle synthesis. If you instead train a network on all human knowledge, including puzzle making, fine-tune it on 99% of the dataset, and give it a dozen attempts at the last 1%, you've just made an expensive compressor for the test-maker's psychology.
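For a sense of what "take 3 examples and synthesize a program that predicts the 4th" could mean mechanically, here is a toy enumerative synthesizer over a deliberately tiny made-up DSL; ARC-scale synthesis would need a vastly richer language and search:

    from itertools import product

    # Made-up primitive operations on integers.
    PRIMS = {
        "inc":    lambda x: x + 1,
        "dec":    lambda x: x - 1,
        "double": lambda x: x * 2,
        "square": lambda x: x * x,
    }

    def run(prog, x):
        for name in prog:
            x = PRIMS[name](x)
        return x

    def synthesize(examples, max_len=3):
        # Enumerate programs shortest-first; return the first one
        # consistent with every input/output pair.
        for length in range(1, max_len + 1):
            for prog in product(PRIMS, repeat=length):
                if all(run(prog, i) == o for i, o in examples):
                    return prog
        return None

    examples = [(1, 4), (2, 6), (3, 8)]  # f(x) = 2x + 2
    prog = synthesize(examples)
    print(prog, run(prog, 4))            # ('inc', 'double') 10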