I got fooled by AI-for-science hype–here's what it taught me
184 comments · May 20, 2025
plasticeagle
Does anybody else find it peculiar that the majority of these articles about AI say things like "of course I don't doubt that AI will lead to major discoveries", and then go on to explain how they aren't useful in any field whatsoever?
Where are the AI-driven breakthroughs? Or even the AI-driven incremental improvements? Do they exist anywhere? Or are we just using AI to remix existing general knowledge, while making no progress of any sort in any field using it?
ergsef
Speaking out against the hype is frowned upon. I'm sure even this very measured article about "I tried it and it didn't work for me" will draw negative attention from people who think AI is the Second Coming.
It's also very hard to prove a negative. If you predict "AI will never do anything of value" people will point to literally any result to prove you wrong. TFA does a good job debunking some recent hype, but the author cannot possibly wade through every hyperbolic paper in every field to demonstrate the claims are overblown.
currymj
AlphaFold is real.
To the extent you care about chess and Go as human activities, progress there is real.
there are some other scientific computing problems where AI or neural-network-based methods do appear to be at least part of the actual state-of-the-art (weather forecasting, certain single-molecule quantum chemistry simulations).
i would like the hype of the kind described in the article to be punctured, but this is hard to do if critics make strong absolute claims ("aren't useful in any field whatsoever") which are easily disproven. it hurts credibility.
literalAardvark
Even more relevant, AlphaEvolve is real.
Could easily be brick 1 of self-improvement and the start of the banana zone.
gthompson512
> "the start of the banana zone"
What does this mean? Is it some slang for exponential growth, or is it a reference to something like the "paperclip maximizer"?
dirtyhippiefree
Sounds like a routine Bill Hicks might have come up with if he was still with us.
He hated obfuscation.
daveguy
I've never seen an AI critic say AI isn't "useful in any field whatsoever". Especially one that is known as an expert in and a critic of the field. There may be names that aren't coming to mind because that stance would reduce their specific credibility. Do you have some in mind?
strogonoff
There is rarely a constructive discussion around the term “AI”. You can’t say anything useful about what it might lead to or how useful it might be, because it is purely a marketing term that does not have a specific meaning (nor do either of the words it abbreviates).
Interesting discussions tend to avoid “AI” in favour of specific terms such as “ML”, “LLM”, “GAN”, “stable diffusion”, “chatbot”, “image generation”. These terms refer to specific tech and applications of that tech, and allow one to argue about specific consequences for science or society (use of ML in biotech vs. proliferation of chatbots).
However, certain sub-industries prefer “AI” precisely because it’s so vague, offers seemingly unlimited potential (please give us more investment money/stonks go up), and creates a certain vibe of a conscious being, which is useful when pretending not to be working around IP laws and creating tools based on data obtained without relevant licensing agreements (cf. the countless “if humans have the freedom to read, then it’s unfair to restrict the uses of a software tool” fallacies, often perpetuated even by seemingly technically literate people, in pretty much every relevant forum thread).
rickdeckard
It's not even that certain sub-industries prefer "AI"; it's the umbrella term a company can use in marketing for virtually any automated process that produces a seemingly subjective result.
Case in point:
For a decade the implementation of cameras went through development, testing and tuning of Auto Exposure, Auto Focus and Auto White-Balance ("AAA") engines as well as image post-processing.
These engines ran on an Image Signal Processor (ISP) or sometimes on the camera sensor itself, and extensive work was done by engineering teams on building these models in order to optimize them to run with low latency on an ISP.
Suddenly AI came along and all of these features became "AI features". One company started promoting the process everyone had been doing all along as an "AI-assisted camera". So everyone had to introduce AI, without any disruptive change in the process.
Aldipower
I remember something similar when the term "cloud" came up. It is still someone else's server or datacenter with tooling.
fluidcruft
I agree it's completely meaningless. At this point I think marketing would label a toilet fill valve as "AI".
Closi
I think AI is a useful term which usually means a neural network architecture but without specifying the exact architecture.
I don't think "Machine Learning" means this as a term, as it can also refer to linear regression, non-linear optimisation, decision trees, Bayesian networks, etc.
That's not to say that AI isn't abused as a term - but I do think a more general term to describe the last five years of advances in using neural networks to solve problems is useful. Particularly as it's not obvious which model architectures would apply to which fields without more work (or even whether novel architectures will be required for frontier science applications).
GrantMoyer
The field of neural network research is known as Deep Learning.
daveguy
This is incorrect. Machine Learning is a term that refers to numerical as opposed to symbolic AI. ML is a subset of AI as is Symbolic / Logic / Rule based AI (think expert systems). These are all well established terms in the field. Neural Networks include deep learning and LLMs. Most AI has gone the way of ML lately because of the massive numerical processing capabilities available to those techniques.
AI is not remotely limited to Neural Networks.
roenxi
Also, the strong predictions about AI are using a vague term because the tech often doesn't exist yet. There isn't a chatbot right now that I feel confident can out-perform me at systems design but I'm pretty certain something that can is coming. Odds are also good that in 2-4 years there will be new hotness to replace LLMs that are much more functional (maybe MLLMs, maybe called something else). We can start to predict and respond to their potential even though they don't exist yet; it just takes a little extrapolating. But it doesn't have a name yet.
Which is to agree - obviously if people are talking about "AI" they don't want to talk about something that exists right this second. If they did it'd be better to use a precise word.
Closi
Totally agree.
Also the term 'LLM' is more about the mechanics of the thing than what the user gets. LLM is the technology, but some sort of automated artificial intelligence is what people are generally buying.
As an example, when people use ChatGPT and get an image back, most don't think "oh, so the LLM called out to a diffusion API?" - they just think "oh, ChatGPT can give me an image if I give it a prompt".
Although again, the term is entirely abused to the extent that washing machines can contain 'AI'. Although just because a term is abused doesn't necessarily mean it's not useful - everything had "Cloud" in it 10 years ago but that term was still useful enough to stick around.
Perhaps there is an issue that AI can mean lots of things, but I don't know yet of another term that encapsulates the last five years of advances in automated intelligence, and what that technology is likely to be moving forwards, which people will readily recognise. Perhaps we need a new word, but AI has stuck and there isn't a good alternative yet, so it is probably here to stay for a bit!
mnky9800n
This article is all about PINNs (physics-informed neural networks) being overblown. I think it’s a reasonable take. I’ve seen way too many people put all their eggs in the PINN basket when there are plenty of options out there. Those options just don’t include a ticket to the hype train.
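For anyone who hasn't met the term: a PINN trains a network to satisfy a differential equation by adding the equation's residual to the training loss, instead of (or on top of) fitting solution data. A minimal sketch of the idea, assuming PyTorch and using a toy ODE (du/dx = -u with u(0) = 1, exact solution e^-x) as a stand-in for the much harder PDEs the article is about:

    # Toy PINN sketch (assumes PyTorch): train u(x) so that du/dx = -u with u(0) = 1.
    # Real PINN papers target far harder PDEs; this only illustrates the
    # "put the equation residual in the loss" idea.
    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                        nn.Linear(32, 32), nn.Tanh(),
                        nn.Linear(32, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for step in range(5000):
        x = torch.rand(128, 1, requires_grad=True)          # collocation points in [0, 1]
        u = net(x)
        du_dx = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u),
                                    create_graph=True)[0]   # derivative via autograd
        pde_residual = du_dx + u                             # should be 0 if du/dx = -u
        bc_residual = net(torch.zeros(1, 1)) - 1.0           # boundary condition u(0) = 1
        loss = (pde_residual ** 2).mean() + (bc_residual ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

The appeal is that no labelled solution data is needed; the critique in the article, as discussed in this thread, is that on realistic problems this kind of approach tends to perform much worse than advertised compared to classical solvers.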
simianparrot
It’s why it keeps looking exactly like NFTs and crypto hype cycles to me: Yes, the technology has legitimate uses, but the promises of groundbreaking use cases that will change the world are obviously not materialising, and to anyone that understands the tech it can’t.
It’s people making money off hype until it dies, then moving on to the next scam-with-some-use.
Flamentono2
We already have breakthroughs: benchmark results that were unheard of before ML.
Language translation alone got so much better, as did voice synthesis and voice transcription.
All my meetings are now searchable, and I can ask 'AI' to summarize my meetings with reasonable accuracy, which was impossible before.
Alphafold made a breakthrough in protein folding.
Image and video generation can now do unbelievable things.
Real-time voice communication with a computer.
Our internal company search suddenly became useful.
I have zero use cases for NFTs and crypto. I have tons of use cases for ML.
parodysbird
> Alphafold made a breakthrough in protein folding.
Sort of. Alphafold is a prediction tool, or, alternatively framed, a hypothesis generation tool. Then you run an experiment to compare.
It doesn't represent a scientific theory, not in the sense that humans use the term. Its hypotheses don't have anywhere near the accuracy rate needed to stand in for the typical scientific testing paradigm. It's an incredibly powerful and efficient tool in certain contexts when used correctly in the discovery phase, but not in the understanding or confirmation phase.
It's also got the usual pitfalls with differentiable neural nets. E.g. you flip one amino acid and it doesn't really provide a proper measure of impact.
Ultimately, one major prediction breakthrough is not that crazy. If we compare it to, e.g., Random Forests and similar models, their impact on science has been infinitely greater.
Yoric
That is absolutely correct.
The problem is that the hype assumes that all of this is a baseline (or even below the baseline), while there are no signs that it can go much further in the near future – and in some cases, it's actually cutting-edge research. This leads to a pushback that may be disproportionate.
vanattab
Which AI program do you use for live video meeting translation?
exe34
You have to understand, real AI will never exist. AI is that which a machine can't do yet. Once it can do it, it's engineering.
uludag
I'm sure there's many people out there who could say that they hardly use AI but that crypto has made them lots of money.
At the end of the day, searching work documents and talking with computers is only desirable inasmuch as it is economically profitable. Crypto, at the end of the day, is responsible for a lot of people getting wealthy. Was a lot of this wealth obtained on sketchy grounds? Probably, but the same could be said of AI (for example, the recent sale of Windsurf for an obscene amount of money).
StopDisinfo910
I don’t remember when NFTs and cryptos helped me draft an email, wrote my meeting minutes for me, or allowed me to easily search information previously locked in various documents.
I think there is this weird take amongst some on HN where LLMs are either completely revolutionary and making breakthroughs, or utterly useless.
The truth is that they are useful already as a productivity tool.
squidbeak
I think imagination may be the reason for this. Enthusiasts have kept that first wave of amazement at what AI is able to do, and find it easier to anticipate where this could lead. The pessimists on the other hand weren't impressed with its capabilities in the first place - or were, and then became disillusioned by something it couldn't do for them. It's naturally easier to look ahead from the optimistic standpoint.
There's also the other category who are terrified about the consequences for their lives and jobs, and who are driven in a very human way to rubbish the tech to convince themselves it's doomed to failure.
The optimists are right of course. A nascent technology at this scale and with this kind of promise, whose development is spurring a race between nation states, isn't going to fizzle out or plateau, however much its current iterations may fall short of any particular person's expectations.
cornholio
For now, the reasoning abilities of the best and largest models are somewhat on par with those of a human crackpot with an internet connection, who misunderstands some wild fact or theory and starts speculating about dumb and ridiculous "discoveries". So the real-world application to scientific thought is low, because science does not lack imbeciles.
But of course, models always improve and they never grow tired (if enough VC money is available), and even an idiot can stumble upon low-hanging fruit overlooked by the brightest minds. This tireless ability to do systematic or brute-force reasoning about non-frontier subjects is bound to produce some useful results like those you mention.
The comparison with a pure financial swindle and speculative mania like NFTs is of course an exaggeration.
lazide
Having tried to use various tools - in those specific examples - I found them either pointless or actively harmful.
Writing emails - once I knew what I wanted to convey, the rest was so trivial as to not matter, and any LLM tooling just got in the way of actually expressing it as I ended up trying to tweak the junk it was producing.
Meeting minutes - I have yet to see one that didn’t miss something important while creating a lot of junk that no one ever read.
And while I’m sure someone somewhere has had luck with the document search/extract stuff, my experience has been that the hard part was understanding something, and then finding it in the doc or being reminded of it was easy. If someone didn’t understand something, the AI summary or search was useless because they didn’t know what they were seeing.
I’ve also seen a LOT of both junior and senior people end up in a haze because they couldn’t figure out what was going on - and the AI tooling just allowed them to produce more junk that didn’t make any sense, rather than engage their brain. Which causes more junk for everyone to get overwhelmed with.
IMO, a lot of the ‘productivity’ isn’t actually productivity; it’s just semi-coherent noise.
apwell23
> wrote my meeting minutes
Why is this such a poster child for LLMs? Everyone always leads with this.
How boring are these meetings, and do people actually review these notes? I have never seen anyone reading meeting minutes or even mentioning them.
Why is this use case even mentioned in LLM ads?
ktallett
The hype surrounding them is not about being a personal assistant, and tbh a lot of these use cases already have existing methods that work just fine. There are already ways to find key information in files, and speedy meeting minutes are really just a template away.
bgnn
Exactly this. What we expect from them is our speculation. In reality nobody knows the future and there's no way to know the future.
Voloskaya
> to anyone that understands the tech it can’t.
This is a ridiculous take that makes me think you might not « understand the tech » as much as you think you do.
Is AI useful today? That depends on the exact use case, but overall it seems pretty clear the hype is greater than the use currently. But sometimes I feel like everyone forgets that ChatGPT isn’t even 3 years old, 6 years ago we were stuck with GPT-2, whose most impressive feat was writing a nonsense poem about a unicorn, and AlphaGo is not even 10 years old.
If you can’t see the trend and just think that what we have today is the best we will ever achieve, thus the tech can’t do anything useful, you are getting blinded by contrarianism.
jstummbillig
I think people are mostly bad at value judgements, and AI is no exception.
What they naively wished the future was like: Flying cars. What they actually got (and is way more useful but a lot less flashy): Cheap solar energy.
aleph_minus_one
> What they naively wished the future was like: Flying cars.
This future is already there:
We have flying cars: they are called "helicopters" (see also https://xkcd.com/1623/).
helloplanets
I'd be interested in reading some more from the people you're referring to when talking about experts who understand the field. At least to the extent I've followed the discussion, even the top experts are all over the place when it comes to the future of AI.
As a counterpoint: Geoffrey Hinton. You could say he's gone off the deep end on a tangent, but I definitely don't think his incentive is to make money off of hype. Then there's Yann LeCun saying AI "could actually save humanity from extinction". [0]
If these guys are just out-of-touch talking heads, who are the new guard people should read up on?
[0]: https://www.theguardian.com/technology/2024/dec/27/godfather...
guardian5x
AI looks exactly like NFTs to you? I don't understand what you mean by that. AI already has tons more uses.
waldrews
One is a technical advance as important as anything in human history, realizing a dream most informed thinkers thought would remain science fiction long past our lifetimes, upending our understanding of intelligence, computation, language, knowledge, evolution, prediction, psychology... before we even mention practical applications.
The other is worse than nothing.
littlestymaar
> It’s why it keeps looking exactly like NFTs and crypto hype cycles to me: Yes, the technology has legitimate uses
AI has legitimate uses; cryptocurrency only has “regulation evasion”, and NFTs have literally no use at all, though.
But it's very true that the AI ecosystem is crowded with grifters who feed on baseless hype, and many of them actually come from cryptocurrencies.
montebicyclelo
> then go on to explain how they aren't useful in any field whatsoever
> Where are the AI-driven breakthroughs
> are we just using AI to remix existing general knowledge, while making no progress of any sort in any field using it?
The obvious example of a highly significant AI-driven breakthrough is Alphafold [1]. It has already had a large impact on biotech, helping with drug discovery, computational biology, protein engineering...
[1] https://blog.google/technology/ai/google-deepmind-isomorphic...
boxed
I'm personally waiting for the other shoe to drop here. I suspect that, since nature begins with an existing protein and modifies it slightly, AlphaFold is crazy overfitted to the training data. Furthermore, the enormous success of AlphaFold means that the number of people doing protein structure solving has likely crashed.
So not only are we using an overfitting model that probably can't handle truly novel proteins, we have stopped actually doing the research to notice when this happens. Pretty bad.
swyx
> Where are the AI-driven breakthroughs? Or even the AI-driven incremental improvements?
literally last week
https://deepmind.google/discover/blog/alphaevolve-a-gemini-p...
boxed
Yea, except that we see "breakthrough" stuff like this all the time, and it is almost always quickly found to be fraudulent in some way. How many times are we to be fooled before we catch on and stop believing press releases with massive selection bias?
dwroberts
But it only seems to be labs and companies that also have a vested interest in selling it as a product that are able to achieve these breakthroughs. Which is a little suspect, right?
swyx
too tinfoil hat. google is perfectly happy to spend billions dogfooding their own TPUs and not give the leading edge to the public.
biophysboy
An example of an "AI" incremental improvement would be Oxford Nanopore sequencing. They extrude DNA through a nanopore, measure the current, and decode the bases using recurrent neural networks.
They exist all over science, but they are just one method among many, and they do not really drive hypotheses or interpretations (even now).
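To make that decoding step concrete: it is essentially sequence labelling over the raw current trace. A minimal sketch of the idea, assuming PyTorch; this is not Oxford Nanopore's actual basecaller (the production models are far more elaborate), it only shows the shape of "current in, per-step base probabilities out", which would typically be decoded with something CTC-like:

    # Toy RNN basecaller sketch (assumes PyTorch): map a raw current trace to
    # per-timestep base probabilities. Illustrative only, not a real basecaller.
    import torch
    import torch.nn as nn

    class TinyBasecaller(nn.Module):
        def __init__(self, hidden=64):
            super().__init__()
            self.rnn = nn.GRU(input_size=1, hidden_size=hidden,
                              num_layers=2, batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden, 5)   # A, C, G, T + blank for CTC-style decoding

        def forward(self, current):                # current: (batch, samples, 1)
            h, _ = self.rnn(current)
            return self.head(h).log_softmax(-1)    # per-timestep log-probabilities

    model = TinyBasecaller()
    trace = torch.randn(1, 4000, 1)                # 4000 raw current samples (fake data)
    log_probs = model(trace)                       # in practice, decode with CTC beam search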
Wilsoniumite
Some newish maths has been discovered. It's up to you whether this is valid or impressive enough, but I think it's significant for things to come: https://youtu.be/sGCmu7YKgPA?si=EG9i0xGHhDu1Tb0O
snodnipper
Personally, I have been very pleased with the results despite the limitations.
Like many (I suspect), I have had several users provide comments that the AI processes I have defined have made meaningful impacts on their daily lives - often saving them double digit hours of effort per week. Progress.
nicoco
I am not an AI booster at all, but the fact that negative results are not published and that everyone is overselling their stuff in research papers is unfortunately not limited to AI. This is just a consequence of the way scientists are evaluated and of the scientific publishing industry, which basically suffers from the same shit as traditional media does (craving for an audience).
Anyway, winter is coming, innit?
asoneth
I published my first papers a little over fifteen years ago on practical applications for AI before switching domains. Recently I've been sucked back in.
I agree it's a problem across all of science, but AI seems to attract more than its fair share of researchers seeking fame and fortune. Exaggerated claims and cherry-picking data seem even more extreme in my limited experience, and even responsible researchers end up exaggerating a bit to try and compete.
moravak1984
Sure, it's not. But often on AI papers one sees remarks that actually mean: "...and if you throw in one zillion GPUs and make them run until the end of time you get {magic_benchmark}". Or "if you evaluate this very smart algo in our super-secret, real-life dataset that we claim is available on request, but we'd ghost you if you dare to ask, then you will see this chart that shows how smart we are".
Sure, it is often flag-planting, but when these papers come from big corps, you cannot "just ignore them and keep on" even when there are obvious flaws/issues.
It's a race over resources; as a (former) researcher at a low-budget university, we just cannot compete. We are coerced to believe whatever figure is passed on in the literature as a "benchmark", without the possibility of replication.
aleph_minus_one
> It's a race over resources; as a (former) researcher at a low-budget university, we just cannot compete. We are coerced to believe whatever figure is passed on in the literature as a "benchmark", without the possibility of replication.
The central purpose of university research has basically always been for researchers to work on hard, foundational topics that are more long-term, which industry is hardly willing to take on. On the other hand, these topics are very important; that is why the respective country is willing to finance this foundational research.
Thus, if you are at a university and your research topic becomes an arms race with industry, you are simply working either at the wrong place (university instead of industry) or on the wrong topic in the respective research area (look for much more long-term, experimental topics that, if you are right, might change the whole research area in, say, 15 years, instead of resource-intensive, minor improvements to existing models).
nicoco
I agree with that. Classically used "AI benchmarks" need to be questioned. In my field, these guys have dropped a bomb, and no one seems to care: https://hal.science/hal-04715638/document
baxtr
Can you give a brief summary, for an outsider to the field, of why this paper is a breakthrough?
KurSix
AI just happens to be the current hype magnet, so the cracks show more clearly
croes
But AI makes it easier to write convincing looking papers
Flamentono2
I'm not sure why people on HN (of all places) are so divided regarding the perception of AI/ML.
I have not seen anything like it before. We literally had no system or way of doing things like code generation from text input.
Just last week I asked for a script to do image segmentation with a basic UI, and Claude just generated that for me in under a minute.
I could list tons of examples which are groundbreaking. The whole image generation stack is completely new.
That blog article is fair enough, and there is hype around this topic for sure, but even just for researchers who need to write code for their research, AI can already make them a lot more efficient.
But I do believe that we have entered a new era: an era where we take data seriously again. A few years back, you said 'the internet doesn't forget'; then we realized that yes, the internet does start to forget. Google deleted pages and removed the cache feature, and it felt like we stopped caring about data because we didn't know what to do with it.
Then AI came along. And not only is data king again, but we are now in the midst of the reinforcement era: we now give feedback, and the systems incorporate that feedback into their training/learning.
And the AI/ML topic is getting worked on in every single aspect: hardware, algorithms, use cases, data, tools, protocols, etc. We are in the middle of incorporating it and building for and on it. This takes a little bit of time. Still, the pace of progress is exhausting.
We will only see in a few years whether there is a real ceiling. We do need more GPUs and bigger datacenters to do a lot more experiments on AI architectures and algorithms. We have a clear bottleneck: big companies train one big model for weeks and months.
whyowhy3484939
> Just last week I asked for a script to do image segmentation with a basic UI, and Claude just generated that for me in under a minute.
Thing is, we just see that it's copy-pasting Stack Overflow, but now in a fancy way, so this sounds like "I asked Google for a nearby restaurant and it found it in like 500ms, my C64 couldn't do that". It sounds impressive (and it is) because it sounds like "it learned about navigating in the real world and it can now solve everything related to that", but what it actually solved is "fancy lookup in a GIS database". It's useful, damn sure it is, but once the novelty wears off you start seeing it for what it is instead of what you imagine it is.
Edit: to drive the point home.
> Claude just generated that
What you think happened is that the AI is "thinking", building an ontology over which it reasoned, and coming to the logical conclusion that this script was the right output. What actually happened is that your input correlates to this output according to the trillion examples it saw. There is no ontology. There is no reasoning. There is nothing. Of course this is still impressive and useful as hell, but the novelty will wear off in time. The limitations are obvious by this point.
Flamentono2
I've been following LLMs and AI/ML for a few years now, and not just at a high level.
There is not a single system out there today which can do what Claude can do.
I still see it for what it is: a technology I can communicate with and use in natural language to get a very diverse set of tasks done, from writing/generating code, to SVGs, to emails, to translation, etc.
It's a paradigm shift for the whole world, literally.
We finally have a system which encodes not just basic things but high level concepts. And we humans often enough do something very similar.
And what limitations are obvious? Tell me? We have not reached any real ceiling yet. We are limited by GPU capacity or how many architectural experiments a researcher can run. We have plenty of work to do to clean up the data sets we use and have. We need to build more infrastructure, better software support, etc.
We have not even reached the phase where we all have local AI/ML chips built in.
We don't even know yet how a system will act when every one of us has access to very fast inference like you already get with Groq.
lossolo
> It's a paradigm shift for the whole world, literally.
That's hyperbolic. I use LLMs daily. They speed up tasks you'd normally use Google for and can extrapolate existing code into other languages. They boost productivity for professionals, but it's not like the invention of the steam engine or the discovery of electricity.
> And what limitations are obvious? Tell me? We have not reached any real ceiling yet.
Scaling parameters is the most obvious limitation of the current LLM architecture (transformers). That’s why what should have been called GPT-5 is instead named GPT-4.5: it isn’t significantly better than the previous model despite having far more parameters, much more cleaned-up training data, and further optimizations.
The low-hanging fruit has already been picked, and the most obvious optimizations have been implemented. As a result, almost all leading LLM companies are now operating at a similar level. There hasn’t been a real breakthrough in over two years. And the last huge architectural breakthrough was in 2017 (with the paper "Attention Is All You Need").
Scaling at this point yields only diminishing returns. So no, what you’re saying isn’t accurate, the ceiling is clearly visible now.
whyowhy3484939
> We finally have a system which encodes not just basic things but high level concepts
That's the thing I'm trying to convey: it's in fact not encoding anything you'll recognize and if it is, it's certainly not "concepts" as you understand them. Not saying it cannot correlate text that includes what you call "high level concepts" or do what you imagine to be useful work in that general direction. Again not making claims it's not useful, just saying that it becomes kind of meh once you factor in all costs and not just the hypothetical imaginary future productivity gains. AKA building literal nuclear reactors to do something that basically amounts to filling in React templates or whatever BS needs doing.
If it were reasoning, it could start with a small set of bootstrap data and infer/deduce the rest from experience. It cannot. We are not even close, as in there is not even a theory to get us there, forget about the engineering. It's not a subtle issue. We need to throw literally all the data we have at it to get it to acceptable levels. At some point you have to retrace some steps and think over some decisions, but I guess I'm a skeptic.
In short it's a correlation engine which, again, is very useful and will go ways to improve our lives somewhat - I hope - but I'm not holding my breath for anything more. A lot of correlation does not causation make. No reasoning can take place until you establish ontology, causality and the whole shebang.
skydhash
Yeah. It’s just fancier techniques than linear regression. Just like the latter takes a set of numbers and produces another set, LLMs take words and produce another set of words.
The actual techniques are the breakthrough. The results are fun to play with and may be useful on some occasions, but we don’t have to put them on a pedestal.
callc
> “I'm not sure why people on HN (of all places) are so divided regarding the perception of AI/ML.”
Everyone is a rational actor from their individual perspective. The people hyping AI, and the people dismissing the hype both have good reasons.
There is justification to see this new tech as groundbreaking. There is justification to be wary about massive theft of data and dismissiveness of privacy.
First, acknowledge and respect that there are so many opinions on any issue. Take yourself out of the equation for a minute. Understand the other side. Really understand it.
Take a long walk in other people’s shoes.
sanderjd
HN is always divided on "how much of the currently hyped technology is real vs just hype".
I've seen this over and over again and been on different sides of the question on different technologies at different times.
To me, this is same as it ever was!
aleph_minus_one
I basically agree, but want to point out two major differences from other "hype-y" topics of the past that, in my opinion, make the AI discussions on HN a little bit more controversial than older hype discussions:
1. The whole investment volume (and thus the hopes and expectations) in AI is much larger than in other hype topics.
2. Sam Altman, the CEO of OpenAI, was president of Y Combinator, the company behind Hacker News, from 2014 to 2019.
KurSix
But on the flip side, the "AI will revolutionize science" narrative feels way ahead of what the evidence supports
Retr0id
Google never gave a good reason for why they stopped making their cache public, but my theory is that it was because people were scraping it to train their LLMs.
Barrin92
> but even just for researchers who need to write code for their research, AI can already make them a lot more efficient.
Scientists don't need to be efficient, they need to be correct. Software bugs were already a huge cause of scientific error and responsible for a lack of reproducibility; see for example cases like this (https://www.vice.com/en/article/a-code-glitch-may-have-cause...)
Programming in research environments is done with notoriously questionable variation in quality, as is the case for industry to be fair, but in research minor errors can ruin the results of entire studies. People are fed up and come to much harsher judgements on AI because in an environment like a lab you cannot write software with the attitude of an impressionist painter or the AI equivalent; you need to actually know what you're typing.
AI can make you more efficient if you don't care if you're right, which is maybe cool if you're generating images for your summer beach volleyball event, but it's a disastrous idea if you're writing code in a scientific environment.
Flamentono2
I do expect a researcher to verify the way the code interacts with the data set.
Still, a lot of researchers can benefit from coding tools that make their daily work a lot faster.
And plenty of strategies exist to safeguard this: tool use, for example, and unit tests, etc.
raesene9
Interesting article. There is always a risk that a new hot technique will get more attention than it ultimately warrants.
For me the key quote in the article is
"Most scientists aren’t trying to mislead anyone, but because they face strong incentives to present favorable results, there’s still a risk that you’ll be misled."
Understanding people's incentives is often very useful when you're looking at what they're saying.
ktallett
There are those who have realised they can make a lot of cash from it and also get funding by using the term AI. But at the end of the day, what software doesn't have some machine learning built in? It's nothing new, nor are the current implementations particularly extraordinary or accurate.
rhubarbtree
I think this is mostly just a repeat of the problems of academia - no longer truth-seeking, instead focused on citations and careerism. AI is just another topic where that is happening.
geremiiah
I don't want to generalize because I do not know how widespread this pattern is, but my job has me hopping between a few HPC centers around Germany, and a pattern I notice is that a lot of these places are chock-full of reject physicists, and a lot of the AI funding that gets distributed gets gobbled up by these people, the consequence of which is a lot of these ML4Science projects. I personally think it is a bit of a shame, because HPC centers are not there to only serve physicists, and especially with AI funding we in Germany should be doing more AI-core research.
ktallett
HPC centers are usually run in collaboration with universities for specific science research. Using up their resources is hopping on the bandwagon to damage another industry, an industry (AI) which is neither new nor anywhere close to being anything more than a personal assistant at the moment. Not even a great one at that.
shusaku
> a pattern I notice is that a lot of these places are chock-full of reject physicists
Utter nonsense; these are some of the smartest people in the world who do incredibly valuable science.
barrenko
Seriously don't understand what "no longer" does here.
omneity
The article initially appears to suggest that all AI in science (or at least the author’s field) is hype. But their gripe seems to be specific to an overhyped architecture called the PINN, as they mention at the end that they ended up using other DL models to successfully compute PDEs faster than traditional numerical methods.
geremiiah
It's more widespread than PINNs. PINNs have been widely known to be rubbish for a long time. But the general failure of using ML for physics problems is much broader than that.
Where ML generally shines is either when you have a relatively large amount of experimental data for a fairly narrow domain. This is the case for machine-learned interatomic potentials (MLIPs), which have been a thing since the '90s. It is also potentially the case for weather modelling (but I do not want to comment on that). Or when you have absolutely insane amounts of data and you train a really huge model. This is what we refer to as AI. This is basically why Alphafold is successful, and Alphafold still fails to produce good results when you query it on inputs that are far from any data points in its training data.
But most ML-for-physics problems tend to be somewhere in between: lacking experimental data and working with too little simulation data because it is so expensive to produce, while training models that are not large enough, because inference would be too slow anyway if they were too big. And then expecting these models to learn a very wide range of physics.
And then everyone jumps in on the hype train, because it is so easy to give it a shot. And everyone gets the same dud results. But then they publish anyway. And if the lab/PI is famous enough, or if they formulate the problem in a way that is unique and looks sciency or mathy, they might even get their paper into a good journal/conference and get lots of citations. But in the end, they still only end up with the same results as everyone else: the model replicates the training data to some extent, and somebody else should work on the generalizability problem.
hyttioaoa
He published a whole paper providing a systematic analysis of a wide range of models. There's a whole section on that. So it's not specific to PINN.
nottorp
Replace PINN with any "AI" solution for anything and you'll still find it overhyped.
The only realistic evaluations of "AI" so far are those that admit it's only useful for experts to skip some boring work. And triple check the output after.
i_c_b
I'm probably saying something obvious here, but it seems like there's this pre-existing binary going on ("AI will drive amazing advances and change everything!" "You are wrong and a utopian / grifter!") that takes up a lot of oxygen, and it really distracts from the broader question of "given the current state of AI and its current trajectory, how can it be fruitfully used to advance research, and what's the best way to harness it?"
This is the sort of thing I mean, I guess, by way of close parallel in a pre-AI context. For a while now, I've been doing a lot of private math research. Whether or not I've wasted my time, one thing I've found utterly invaluable has been the OEIS.org website, where you can just enter sequence of numbers and then search for it to see what contexts it shows up in. It's basically a search engine for numerical sequences. And the reason it has been invaluable is that I will often encounter some sequence of integers, I'll be exploring it, and then when I search for it on OEIS, I'll discover that that sequence shows up in much different mathematical contexts. And that will give me an opening to 1) learn some new things and recontextualize what I'm already exploring and 2) give me raw material to ask new questions. Likewise, Wolfram Mathematica has been a godsend. And it's for similar reasons - if I encounter some strange or tricky or complicated integral or infinite sum, it is frequently handy to just toss it into Mathematica, apply some combination of parameter constraints and Expands and FullSimplify's, and see if whatever it is I'm exploring connects, surprisingly, to some unexpected closed form or special function. And, once again, 1) I've learned a ton this way and gotten survey exposure to other fields of math I know much less well, and 2) it's been really helpful in iteratively helping me ask new, pointed questions. Neither OEIS nor Mathematica can just take my hard problems and solve them for me. A lot of this process has been about me identifying and evolving what sorts of problems I even find compelling in the first place. But these resources have been invaluable in helping me broaden what questions I can productively ask, and it's through something more like a high powered, extremely broad, extremely fast search. There's a way that my engagement with these tools has made me a lot smarter and a lot broader-minded, and it's changed the kinds of questions I can productively ask. To make a shaky analogy, books represent a deeply important frozen search of different fields of knowledge, and these tools represent a different style of search, reorganizing knowledge around whatever my current questions are - and acting in a very complementary fashion to books, too, as a way to direct me to books and articles once I have enough context.
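(As an aside, that OEIS lookup can also be scripted. Here's a minimal sketch, assuming OEIS's public JSON search endpoint at https://oeis.org/search and that each result carries "number" and "name" fields; treat those details as assumptions to verify against the OEIS documentation before relying on them.)

    # Sketch of scripting the OEIS "search engine for numerical sequences" workflow.
    import requests

    sequence = [1, 1, 2, 5, 14, 42]   # example: looks like the Catalan numbers
    resp = requests.get("https://oeis.org/search",
                        params={"q": ",".join(map(str, sequence)), "fmt": "json"},
                        timeout=10)
    resp.raise_for_status()

    for entry in (resp.json().get("results") or [])[:5]:
        # Print the A-number and name of each match, e.g. "A000108  Catalan numbers ..."
        print(f"A{entry['number']:06d}  {entry['name']}")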
Although I haven't spent nearly as much time with it, what I've just described about these other tools certainly is similar to what I've found with AI so far, only AI promises to deliver even more so. As a tool for focused search and reorganization of survey knowledge about an astonishingly broad range of knowledge, it's incredible. I guess I'm trying to name a "broad" rather than "deep" stance here, concerning the obvious benefits I'm finding with AI in the context of certain kinds of research. Or maybe I'm pushing on what I've seen called, over in the land of chess and chess AI, a centaur model - a human still driving, but deeply integrating the AI at all steps of that process.
I've spent a lot of my career as a programmer and game designer working closely with research professors in R1 university settings (in both education and computer science), and I've particularly worked in contexts that required researchers to engage in interdisciplinary work. And they're all smart people (of course), but the silofication of various academic disciplines and specialties is obviously real and pragmatically unavoidable, and it clearly casts a long shadow on what kind of research gets done. No one can know everything, and no one can really even know too much of anything out of their own specialties within their own disciplines - there's simply too much to know. There are a lot of contexts where "deep" is emphasized over "broad" for good reasons. But I think the potential for researchers to cheaply and quickly and silently ask questions outside of their own specializations, to get fast survey level understandings of domains outside of their own expertise, is potentially a huge deal for the kinds of questions they can productively ask.
But, insofar as any of this is true, it's a very different way of harnessing of AI than just taking AI and trying to see if it will produce new solutions to existing, hard, well-defined problems. But who knows, maybe I'm wrong in all of this.
sublimefire
Great analysis and spot-on examples. Another issue with AI-related research is that a lot of papers are new and not that many get published in “proper” places, yet they are quoted left, right, and center; just look at Google Scholar. It is hard to reproduce the results and check the validity of some statements, not to mention that research done 4 years ago used one set of models, and now another set of models with different training data is used in tests. It is hard to establish what really affects the results and whether the conclusions are applicable to some specific property of the outdated model or whether they even generalise.
skydhash
I’m not a scientist or a researcher, but anything based on statistics and data interpretation is immediately subject to my skepticism.
pawanjswal
Appreciate the honesty. AI isn’t magic, and it’s refreshing to see someone actually say it out loud.
wrren
AI companies are hugely motivated to show beyond-human levels of intelligence in their models, even if it means fudging the numbers. If they manage to capture the news cycle for a bit, it's a boost to confidence in their products and maybe their share price if they're public. The articles showing that these advances are largely junk aren't backed by corporate marketing budgets or the desires of the investor class like the original announcements were.
eviks
> I found that AI methods performed much worse than advertised.
Lesson learned: don't trust ads
> Most scientists aren’t trying to mislead anyone
More learning ahead, the exciting part of being a scientist!
This is the second article in a week where someone is writing about how "AI" has failed them in their field (here it's physics; the other article was radiology), and in both articles they are using now-ancient, mid-2010s deep learning NNs.
I don't know if it's intentional, but the word "AI" means different things almost every year now. It's worse than papers getting released claiming "LLMs are unable to do basic math" and then you see they used GPT-3 for the study.