AI cracks superbug problem in two days that took scientists years
20 comments
·February 20, 2025
nurumaik
>He gave "co-scientist" - a tool made by Google - a short prompt asking it about the core problem he had been investigating and it reached the same conclusion in 48 hours.
Could it be the case that asking the right question is the key? When you already know the solution, it's actually very easy to accidentally include hints in your phrasing of the question that make the task 10x easier.
patcon
A thought occurred to me, as someone involved in some projects trying to recalibrate incentives for science funding: good grad students are usually more intersectional across fields (compared to supervisors) and just more receptive to outsider ideas. They unofficially provide a lot of the same value that AI is about to provide established researchers.
I wonder how AI is going to mess with the calculus of employing grad students, and whether this will affect the pipeline of future senior researchers...
root_axis
> Critically, this hypothesis was unique to the research team and had not been published anywhere else. Nobody in the team had shared their findings
This seems like the most important detail, but it also seems impossible to verify. What are the chances that this AI spat out a totally unique hypothesis with absolutely no analogues in the training data, which also happens to be the pet hypothesis of this particular research team?
I'm open to being convinced, but I'm skeptical.
bArray
Google openly trains on your email; they used it specifically to train Smart Compose, but maybe other things too. He likely uses multiple Google products. Draft papers in Google Drive, perhaps?
These LLMs are essentially trying to produce material that sounds correct; perhaps the hypothesis was a relatively obvious answer given the right domain knowledge.
Additionally, he may not have been the first to ask the question. It's entirely possible that the AI chewed up and spat out some domain knowledge from a foreign research group outside of his wheelhouse. This kind of stuff happens all the time.
I personally have accidentally reinvented things without prior knowledge of them. Many years ago at university, I remember deriving a PID controller without being aware of what one was. I probably got enough clues from other people/media who were aware of them that bridging that final gap was made easier.
TrackerFF
Could some of the scientists have saved their data in the google cloud, say using google drive? And then some internal google crawler went through, and indexed those files?
I don't know what their policy says about that, or whether it's even something they do... at least not publicly.
miyuru
> "I wrote an email to Google to say, 'you have access to my computer, is that right?'", he added.
sounds extra fishy, since Google does not normally provide email support.
Jimmc414
"Scientists who are part of our Trusted Tester Program will have early access to AI co-scientist"
bArray
Exactly my thought, probably the least likely part of the whole thing. He emailed Google and they replied. Not only that, he asked a question they would really rather not answer.
card_zero
I wonder about a "clever Hans" effect, where they unwittingly suggest their discovery in their prompt. Also whether they got paid.
cwillu
> "It's not just that the top hypothesis they provide was the right one," he said. "It's that they provide another four, and all of them made sense. And for one of them, we never thought about it, and we're now working on that."
graeme
It may well be, but in this case it would be a two-way clever Hans, which is very promising.
mtrovo
Define "totally unique hypothesis" in this context. If the training data contains studies with paths like A -> B and C -> D -> E, and the AI independently generates a proof linking B -> C, effectively creating a path from A -> E, is that original enough? At some point, I think we're going to run out of definitions for what makes human intelligence unique.
> It also seems impossible to verify if this was actually the case.
If this is a thinking model, you could always debug the raw output of the model's internal reasoning when it was generating an answer. If the agent took 48 hours to respond and we had no idea what it was doing that whole time, that would be the real surprise to me, especially since Google is only releasing this in a closed beta for now.
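The A → E linking described above is just transitive reachability over a graph of published results: one new edge can connect two previously separate chains. A minimal sketch (the edge data here is hypothetical, purely for illustration):

```python
# If the literature contains A->B and C->D->E, adding a single new
# link B->C makes E reachable from A. Edges are (source, target) pairs.

def reachable(edges, start):
    """Return every node reachable from `start` via directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

published = [("A", "B"), ("C", "D"), ("D", "E")]
print(reachable(published, "A"))                 # {'B'}
print(reachable(published + [("B", "C")], "A"))  # {'B', 'C', 'D', 'E'}
```

Whether generating that single bridging edge counts as a "totally unique hypothesis" is exactly the definitional question being raised.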
programmertote
When I read "cracks superbug problem", I thought AI had solved how to kill superbugs. From reading the article, it seems the AI suggested a few hypotheses, one of which was similar to what the researchers had already thought of. So in a way it hasn't cracked the problem, although it helped form ONE of the hypotheses, which still needs to be tested in experiments(?)
Just want to make sure I'm understanding what's written in the article accurately.
kachapopopow
this might be a dupe, and the title is completely misleading: it (the AI) simply provided one of the hypotheses (which took two years to confirm) as the top result. A group of humans with expertise in the subject could come up with these in seconds.
didntknowyou
why did it take 48 hours? did he give the AI data to process, or was it just a prompt? did it spit back a conclusion, or a list of possible scenarios it had scraped? seems like a PR stunt.
Over3Chars
AI cracks super-profit problem. Fire staff and claim AI efficiency has made them redundant. Give executives bonuses for "efficiency".
monkeydust
Using this, launched yesterday:
> Today Google is launching an AI co-scientist, a new AI system built on Gemini 2.0 designed to aid scientists in creating novel hypotheses and research plans. Researchers can specify a research goal — for example, to better understand the spread of a disease-causing microbe — using natural language, and the AI co-scientist will propose testable hypotheses, along with a summary of relevant published literature and a possible experimental approach.
fatbird
So the AI didn't prove anything, it offered a hypothesis that wasn't in the published literature, which happened to match what they'd spent years trying to verify. I can see how that would look impressive, and he says that if he'd had this hypothesis to start with, it would have saved those years.
Without those years spent working the problem, would he have recognized that hypothesis as a valuable road to go down? And wouldn't the years of verifying it still remain?
null
It says it took them a decade, but they obviously published loads of intermediate results, as did anyone else working in the space.
I get that they asked it about a new result they hadn't published yet, but the idea that it did it in two days when it took them a decade -- even though it's been trained on everything published in that decade, including whatever intermediate results they published -- probably makes this claim just as absurd as it sounds.