Google DeepMind Releases AlphaGenome

dekhn

When I went to work at Google in 2008, I immediately advocated for spending significant resources on the biological sciences (this was well before DM started working on biology). I reasoned that Google had the data mangling and ML capabilities required to demonstrate world-leading results (and hopefully guide the way so other biologists could reproduce their techniques). We made some progress: we used Exacycle to demonstrate some exciting results in protein folding and design, and later launched Cloud Genomics to store and process large datasets for analytics.

I parted ways with Google a while ago (Sundar is a really uninspiring leader) and was never able to transfer into DeepMind, but I have to say that they are executing on my goals far better than I ever could have. It's nice to see ideas that I had germinating for decades finally playing out, and I hope these advances lead to great discoveries in biology.

It will take some time for the community to absorb this most recent work. I skimmed the paper and it's a monster; there's just so much going on.

bitpush

> It's nice to see ideas that I had germinating for decades finally playing out

I'm sure you're a smart person, and you probably had super novel ideas, but your reply comes across as super arrogant / pretentious. Most of us have ideas, even impressive ones (here's an example: let's use LLMs to solve world hunger & poverty, and loneliness & fix capitalism), but it'd be odd to go and say "Finally! My ideas are finally getting the attention".

CGMthrowaway

Yeah, it comes off as braggy, but it’s only natural to be proud of your foresight.

sampl3username

This reads like LLM-generated text.

nextos

I found it disappointing that they ignored one of the biggest problems in the field, i.e. distinguishing between causal and non-causal variants among highly correlated DNA loci. In genetics jargon, this is called fine mapping. Perhaps this is something for the next version, but it is really important for designing effective drugs that target key regulatory regions.

One interesting example of such a problem and why it is important to solve it was recently published in Nature and has led to interesting drug candidates for modulating macrophage function in autoimmunity: https://www.nature.com/articles/s41586-024-07501-1

rattlesnakedave

Does this get us closer? I'm pretty uninformed, but it seems that better functional predictions make it easier to pick out which variants actually matter versus the ones just along for the ride. Step 2 is probably integrating this with proper statistical fine mapping methods?

nextos

Yes, but it's not dramatically different from what is out there already.

There is a concerning gap between prediction and causality. In problems like this one, where lots of variables are highly correlated, prediction methods that only have an implicit notion of causality don't perform well.

Right now, SOTA seems to use huge population data to infer causality within each linkage block of interest in the genome. These types of methods are quite close to Pearl's notion of causal graphs.
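To make the fine-mapping idea concrete, here is a minimal sketch, not any specific published pipeline. It assumes exactly one causal variant per linkage block, scores each variant with Wakefield's approximate Bayes factor computed from GWAS summary statistics, and normalizes the scores into posterior inclusion probabilities (PIPs). The function names, the `prior_sd` value, and the toy effect sizes are all hypothetical. The optional `priors` argument is one assumed way that per-variant functional scores could be folded in as prior weights.

```python
import math

def wakefield_log_abf(beta, se, prior_sd=0.2):
    """Log approximate Bayes factor (association vs. null) for one variant.

    beta, se: GWAS effect estimate and its standard error.
    prior_sd: assumed prior standard deviation of true effects (hypothetical value).
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    shrink = w / (v + w)
    # ABF = sqrt(v / (v + w)) * exp(z^2 / 2 * w / (v + w))
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z * z * shrink

def posterior_inclusion_probs(betas, ses, priors=None, prior_sd=0.2):
    """PIPs under the assumption of exactly one causal variant in the block."""
    n = len(betas)
    if priors is None:
        priors = [1.0 / n] * n  # flat prior over variants
    log_w = [wakefield_log_abf(b, s, prior_sd) + math.log(p)
             for b, s, p in zip(betas, ses, priors)]
    m = max(log_w)  # subtract the max before exponentiating, for stability
    weights = [math.exp(x - m) for x in log_w]
    total = sum(weights)
    return [x / total for x in weights]

def credible_set(pips, coverage=0.95):
    """Smallest set of variant indices whose PIPs sum to at least `coverage`."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        chosen.append(i)
        mass += pips[i]
        if mass >= coverage:
            break
    return chosen

# Toy block: two tightly linked variants with similar signals, one weak variant.
betas = [0.10, 0.11, 0.02]
ses = [0.02, 0.02, 0.02]
pips = posterior_inclusion_probs(betas, ses)
print(pips, credible_set(pips))

# Hypothetical functional prior that favors the first variant.
pips_fn = posterior_inclusion_probs(betas, ses, priors=[0.90, 0.05, 0.05])
print(pips_fn)
```

The single-causal-variant assumption is the main simplification here. Tools such as SuSiE and FINEMAP relax it by modelling several causal variants per block together with the LD matrix.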

ejstronge

> SOTA seems to use huge population data to infer causality within each linkage block of interest in the genome.

This has existed for at least a decade, maybe two.

> There is a concerning gap between prediction and causality.

Which can be bridged with protein prediction (AlphaFold) and non-coding regulatory predictions (AlphaGenome), amongst all the other tools that exist.

What is it that does not exist that you "found it disappointing that they ignored"?

Scaevolus

Naturally, the (AI-generated?) hero image doesn't properly render the major and minor grooves. :-)

jeffbee

And yet still manages to be 4MB over the wire.
