The average chess players of Bletchley Park and AI research in Britain
July 1, 2025
marcodiego
I remember an interview with Kasparov. He said something, I don't remember exactly... it was something like "The skills chess develops are very important for... playing chess", as a way to say that being good at chess doesn't mean you're particularly smart or good in other areas too.
As someone who played chess competitively in my childhood and teens, I found chess helped me a lot with concentration, problem solving, and decision making. I also learned to win and to lose, and to respect other people, through competition.
As a teacher in my adulthood, I was very surprised to meet a highly rated player who was a very weak student, especially in logic.
I now agree deeply with Kasparov about the importance of the skills chess develops.
PaulRobinson
It's strange to remember that playing chess well was once seen as a great marker of AI; today we consider it much less so.
I thought the Turing Test would be a good barometer of AI, but in today's world of mountains of AI slop fooling more and more people - and, ironically, software that is better at solving CAPTCHAs than humans are - I'm not so sure.
Add into the mix reports of people developing psychological disorders from heavy exposure to LLMs, and I'm not sure they're good replacements for therapists (ELIZA, ah, what a thought). They seem - even with a lot of investment in agentic workflows, getting a lot of context into GraphRAG, or wiring up MCP - to be good at helping experts get a bit faster, not at replacing experts. And that's not specific to software development - it seems to be the case across all domains of expertise.
So what are we chasing now? What's the test for AGI?
It's definitely not playing games well, like we thought, or pretending to be human, or even being useful to a human. What is it, then?
iamflimflam1
> It's strange to remember that playing chess well was once seen as a great marker of AI; today we consider it much less so.
It was seen as so difficult that the recommendation was to abandon the research. From the Lighthill report:
> Projects in category B were held to be failures. One important project, that of "programming and building a robot that would mimic human ability in a combination of eye-hand co-ordination and common-sense problem solving", was considered entirely disappointing. Similarly, chess playing programs were no better than human amateurs. Due to the combinatorial explosion, the run-time of general algorithms quickly grew impractical, requiring detailed problem-specific heuristics.
> The report stated that it was expected that within the next 25 years, category A would simply become applied technologies engineering, C would integrate with psychology and neurobiology, while category B would be abandoned.
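To make the combinatorial explosion concrete, here is a minimal sketch (assuming chess's often-quoted average branching factor of roughly 35; the exact figure varies by position):

```python
# Exhaustive game-tree search visits roughly b**d nodes at depth d.
# Chess's average branching factor b is often quoted as about 35.

BRANCHING_FACTOR = 35

for depth in (2, 4, 6, 8, 10):
    nodes = BRANCHING_FACTOR ** depth
    print(f"depth {depth:2d}: ~{nodes:.1e} positions")

# Depth 10 is already ~2.8e15 positions, which is why general-purpose
# search needed problem-specific heuristics (pruning, evaluation
# functions) to become practical.
```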
rjsw
The linked post points out that it is a low-cost area of research and you don't need to explain the context to a reviewer.
GuB-42
> I thought the Turing Test would be a good barometer of AI
Depends on what you consider a "Turing Test".
Fooling unsuspecting humans is relatively easy; it has been done with quite simple software and some trickery. LLMs can do that too, of course.
A more convincing "Turing Test" would be (a rough code sketch follows this list):
- You have one interrogator and two players, one human and one computer
- The interrogator, after chatting with both players, has to determine which is which
- The interrogator is an expert in the field; he knows everything there is to know when it comes to finding the computer
- The human player is also an expert; he knows how to solve problems that are hard for computers, and he knows what to expect from the interrogator
- The interrogator and human player collaborate to find the computer
- The interrogator and human player are not allowed to have shared information that the computer lacks (and ideally they shouldn't know each other personally), but everything else is fair game
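A minimal sketch of that protocol as a test harness; the Player and Interrogator interfaces here are hypothetical, just to make the structure concrete:

```python
import random
from typing import Protocol

class Player(Protocol):
    def reply(self, question: str) -> str: ...

class Interrogator(Protocol):
    def next_question(self, transcript: list[tuple[str, str]]) -> str: ...
    def verdict(self, a: list[tuple[str, str]], b: list[tuple[str, str]]) -> str: ...

def run_trial(judge: Interrogator, human: Player, computer: Player,
              rounds: int = 20) -> bool:
    """One trial; returns True if the judge correctly identifies the computer."""
    # Seat the two players anonymously as "a" and "b", in random order.
    seats = {"a": human, "b": computer}
    if random.random() < 0.5:
        seats = {"a": computer, "b": human}
    transcripts: dict[str, list[tuple[str, str]]] = {"a": [], "b": []}
    for seat, player in seats.items():
        for _ in range(rounds):
            question = judge.next_question(transcripts[seat])
            transcripts[seat].append((question, player.reply(question)))
    guess = judge.verdict(transcripts["a"], transcripts["b"])  # "a" or "b"
    return seats[guess] is computer

# The computer passes if, over many trials, the judge's accuracy is
# statistically indistinguishable from the 50% expected by chance.
```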
ben_w
> They seem - even with a lot of investment in agentic workflows, getting a lot of context into GraphRAG, or wiring up MCP - to be good at helping experts get a bit faster, not at replacing experts. And that's not specific to software development - it seems to be the case across all domains of expertise.
For now, this is a good thing: given how broadly LLMs are displacing juniors, if this were a situation where doing the same thing but harder could replace experts, it would replace approximately all of them.
But: in limited domains, not the "G" of "AGI" but just specific places here and there, AI does beat human experts. Those domains are often sufficiently narrow that they don't even encompass the entire role — think "can analyse an X-ray for cancer, can't write up its findings" kind of specificity. Indeed, I can only think of two careers where even the broadest definition of AI (some kind of programmable system) has been able to essentially fully replace that occupation:
jeremyjh
I'm not sure there is more to it than continuous learning. If an LLM of top-tier strength could learn from its experiences even in the way a junior developer can, I'm not sure I could place an upper bound on how capable it would be at software engineering. But from what I understand, this will require a completely different architecture.
captainbland
I think the Turing test has turned out to be a fuzzier line than expected. People are, to various degrees, learning LLM tells even as the models improve, so what might have passed the Turing test in 2020 might not today. Similarly, it seems to be the case that conversations with LLMs often start better than they end, even today - so an LLM might pass a short Turing test but fail a very long one that runs to hundreds of rounds.
kranke155
We've clearly passed the Turing test, I think. I can't think of many ways I'd be able to reliably detect an LLM if it were coded to just act like a person talking to me on Discord.
pyman
I have a similar philosophical question:
My dog doesn't know what I do for a living, and he has no concept of how intelligent I am. So if we're limited by our own intelligence, how would we ever recognise or measure the intelligence of an AI that's more advanced than us?
If an AI surpasses us, not just in memory or calculation but in reasoning, self-reflection, and abstraction, how would we even know?
ben_w
Depends on how much smarter they are, and in which ways.
As a trivial example, a century ago "can do arithmetic" was a signifier of being smart, yet if the entire human population today were all as fast as the current world record holder and on the same team, we would be defeated by one Raspberry Pi.
Easy to measure, but also very limited sense of "smart".
A Pi can also run Stockfish, so in that also-limited sense of "smart" it still beats humans. And chess inspired the wider use of Elo ratings in AI arenas, which means we can usefully assign scores to different AIs that all beat the best humans.
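For reference, the Elo machinery is only a couple of formulas - a minimal sketch of the standard expected-score and rating-update rules:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def update_rating(r_a: float, r_b: float, score: float, k: float = 32.0) -> float:
    """New rating for A after one game (score: 1 = win, 0.5 = draw, 0 = loss)."""
    return r_a + k * (score - expected_score(r_a, r_b))

# A 400-point gap means the stronger side is expected to score ~91%,
# which is how arenas can rank engines that all beat every human:
print(round(expected_score(3200, 2800), 3))  # 0.909
```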
For now, it's possible to point to things humans are (collectively) able to do better than AI. I originally wrote "very, very easy" rather than "possible", but then remembered that whenever anyone actually tries to do so here on Hacker News, they're already out of date and there is an AI which can do that thing superhumanly well (either that, or they overstate what humans can do, e.g. claiming we can beat the halting problem). Actual research papers with experiments generally do better at listing AI failure modes, including when the research comes from an AI lab showing off their new AI.
barrenko
Whatever the hell observes the mind is not the same thing as it.
officehero
Wittgenstein's lion
dale_glass
We could test it. We know with certainty that computers play far better chess than we do.
How do we know? Play a game with the computer, and see who wins.
There's no reason why we can't apply the same logic elsewhere. Set up a testable scenario, see who wins.
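That kind of head-to-head test is easy to mechanise. A minimal sketch with the python-chess library, assuming a Stockfish binary is installed and on the PATH:

```python
import chess
import chess.engine

# Pit Stockfish (White) against a trivial stand-in opponent (Black) that
# always plays the first legal move; swap in real input to play yourself.

def stand_in_move(board: chess.Board) -> chess.Move:
    return next(iter(board.legal_moves))

board = chess.Board()
engine = chess.engine.SimpleEngine.popen_uci("stockfish")
try:
    while not board.is_game_over():
        if board.turn == chess.WHITE:
            result = engine.play(board, chess.engine.Limit(time=0.1))
            board.push(result.move)
        else:
            board.push(stand_in_move(board))
finally:
    engine.quit()

print(board.result())  # "1-0", "0-1", or "1/2-1/2": see who wins
```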
gadders
It can have a fight with Nagel's Bat.
dandellion
How about: the ability to independently implement ways to manipulate the local environment for its own benefit or self-preservation?
jltsiren
Tests can only show that something is not AGI. If you want to show that a system is AGI, you must wait for expert consensus. That means adding new tests and dropping old ones, as our understanding of intelligence improves. If something is truly AGI, people will eventually run out of plausible objections.
If you look at the stuff Turing was writing in the 1950s, it's fascinating, because he really saw the potential of what computation was going to be able to do. There was a paradigm shift in thinking about possibilities here that he grasped in the very early days.
https://www.cs.ox.ac.uk/activities/ieg/e-library/sources/t_a...
It would be amazing to go and fetch Turing with a time machine and bring him to our time. Show him an iPhone, his face on the UK £50 note, and Wikipedia's list of https://en.wikipedia.org/wiki/List_of_openly_LGBTQ_heads_of_...