A New AI Winter Is Coming

108 comments

December 1, 2025

stanfordkid

This article uses the computational complexity hammer way too hard and discounts huge progress in every field of AI outside the hot trend of transformers and LLMs. Nobody is saying the future of AI is autoregressive, and the article pretty much ignores any of the research that has been posted here around diffusion-based text generation or how it can be combined with autoregressive methods… it discounts multi-modal models entirely. He also pretty much discounts everything that’s happened with AlphaFold, AlphaGo, reinforcement learning, etc.

The argument that computational complexity has something to do with this could have merit, but the article gives no real indication as to why. Is the brain NP-complete? Maybe, maybe not. I could see many arguments for why modern research will fail to create AGI, but just hand-waving “reality is NP-hard” is not enough.

The fact is: something fundamental has changed that enables a computer to pretty effectively understand natural language. That’s a discovery on the scale of the internet or Google search and shouldn’t be discounted… and usage proves it. Within 2 years there is a platform with billions of users. On top of that, huge fields of new research are making leaps and bounds with novel methods utilizing AI for chemistry, computational geometry, biology, etc.

It’s a paradigm shift.

andy99

I agree with everything you wrote; the technology is unbelievable, and 6 years ago, maybe even 3, it would have been considered magic.

A steel man argument for why winter might be coming is all the dumb stuff companies are pushing AI for. On one hand (and I believe this) we argue it’s the most consequential technology in generations. On the other, everybody is using it for nonsense like helping you write an email that makes you sound like an empty suit, or providing a summary you didn’t ask for.

There’s still a ton of product work to cross whatever that valley is called between concept and product, and if that doesn’t happen, money is going to start disappearing. The valuation isn’t justified by the dumb stuff we do with it; it needs PMF.

holri

> The argument that computational complexity has something to do with this could have merit but the article certainly doesn’t give indication as to why.

OP says it is because predicting the next token can be correct or not, but the output always looks plausible, because plausibility is what the model calculates. Therefore it is dangerous and cannot be fixed, because that is how the technology works in essence.

dangus

I just want to point out a random anecdote.

Literally yesterday ChatGPT hallucinated an entire feature of a mod for a video game I am playing including making up a fake console command.

It just straight up doesn’t exist, it just seemed like a relatively plausible thing to exist.

This is still happening. It never stopped happening. I don’t even see a real slowdown in how often it happens.

It sometimes feels like the only thing saving LLMs is when they’re forced to tap into a better system, like running a search engine query.

Lerc

To take a different perspective on the same event.

The model expected a feature to exist because it fitted with the overall structure of the interface.

This in itself can be a valuable form of feedback. I currently don't know of anyone doing it, but testing interfaces by getting LLMs to use them could be an excellent resource. If the AI runs into trouble, it might be worth checking your designs to see if you have any inconsistencies, redundancies or other confusion-causing issues.

One would assume that a consistent user interface would be easier for both AI and humans. Fixing the issues would improve it for both.

That failure could be leveraged into an automated process that identified areas to improve.
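
As a rough illustration of what that automated process might look like, here's a minimal sketch; the command names, the example transcript, and the idea of comparing against a documented command set are all hypothetical placeholders, and the actual LLM call is assumed to happen elsewhere:

```python
# Toy sketch: compare the commands an LLM tried to use against the commands
# that actually exist. Everything the model invented is a candidate UX finding,
# since the interface's structure led it to expect that feature.
# The documented command set and the example transcript below are made up.

documented_commands = {"spawn_item", "set_weather", "list_mods", "reload_mod"}

def review_model_expectations(proposed_commands):
    """Split the model's attempted commands into real ones and invented ones."""
    real = sorted(proposed_commands & documented_commands)
    invented = sorted(proposed_commands - documented_commands)
    return {"exists": real, "invented_but_plausible": invented}

# Suppose the model, asked to "remove every spawned item", tried these commands:
attempted = {"list_mods", "clear_items", "spawn_item"}

print(review_model_expectations(attempted))
# {'exists': ['list_mods', 'spawn_item'], 'invented_but_plausible': ['clear_items']}
```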

cess11

There is no difference between "hallucination" and "soberness", it's just a database you can't trust.

The response to your query might not be what you needed, similar to interacting with an RDBMS and mistyping a table name and getting data from another table or misremembering which tables exist and getting an error. We would not call such faults "hallucinations", and shouldn't when the database is a pile of eldritch vectors either. If we persist in doing so we'll teach other people to develop dangerous and absurd expectations.

bitwize

Skill issue. Proper prompt engineering reduces the frequency of hallucinations.

cess11

GOFAI was also a paradigm shift, regardless of that winter. For example, banks started automating assessments of creditworthiness.

What we didn't get was what had been expected, namely things like expert systems that were actual experts, so called 'general intelligence' and war waged through 'blackboard systems'.

We've had voice controlled electronics for a long time. On the other hand, machine vision applications have improved massively in certain niches, and also allowed for new forms of intense tyranny and surveillance where errors are actually considered a feature rather than a bug since they erode civil liberties and human rights but are still broadly accepted because 'computer says'.

While you could likely argue "leaps and bounds with novel methods utilizing AI for chemistry, computational geometry, biology etc." by downplaying the first part or clarifying that it is mainly an expectation, I think most people are going to, for the foreseeable future, keep seeing "AI" as more or less synonymous with synthetic infantile chatbot personalities that substitute for human contact.

gishh

> something fundamental has changed that enables a computer to pretty effectively understand natural language.

You understand how the tech works right? It's statistics and tokens. The computer understands nothing. Creating "understanding" would be a breakthrough.

Edit: I wasn't trying to be a jerk. I sincerely wasn't. I don't "understand" how LLMs "understand" anything. I'd be super pumped to learn that bit. I don't have an agenda.

frotaur

It astonishes me how people can make categorical judgements on things as hard to define as 'understanding'.

I would ask: apart from observable and testable performance, what else can you say about understanding?

It is a fact that LLMs are getting better at many tasks. From their performance, they seem to have an understanding of say python.

The mechanistic way this understanding arises is different than humans.

How, then, can you say it is 'not real' without invoking the hard problem of consciousness, at which point we've hit a completely open question?

LatencyKills

As someone who was an engineer on the original Copilot team, yes, I understand how the tech works.

You don’t know how your own mind “understands” something. No one on the planet can even describe how human understanding works.

Yes, LLMs are vast statistical engines but that doesn’t mean something interesting isn’t going on.

At this point I’d argue that humans “hallucinate” and/or provide wrong answers far more often than SOTA LLMs.

I expect to see responses like yours on Reddit, not HN.

gishh

> I expect to see responses like yours on Reddit, not HN.

I suppose that says something about both of us.

ilikeatari

We could use a little more kindness in discussion. I think the commenter has a very solid understanding of how computers work. The “understanding” question is somewhat complex, and I do agree with you that we are not there yet. I do think the paradigm shift, though, is more about the fact that we can now interact with the computer in a new way.

pawelduda

The end effect certainly gives off an "understanding" vibe, even if the method of achieving it is different. The commenter obviously didn't mean the way the human brain understands.

Lambdanaut

You understand how the brain works right? It's probability distributions mapped to sodium ion channels. The human understands nothing.

0xdeadbeefbabe

I've heard that this human brain is rigged to find what it wants to find.

aydyn

That's how the brain works, not how the mind works. We understand the hardware, not the software.

neom

Birds and planes operate using somewhat different mechanics, but they do both achieve flight.

HPsquared

Birds and planes are very similar other than the propulsion and landing gear, and construction materials. Maybe bird vs helicopter, or bird vs rocket.

goncharom

Every time I see comments like these I think about this research from anthropic: https://www.anthropic.com/research/mapping-mind-language-mod...

LLMs activate similar neurons for similar concepts not only across languages, but also across input types. I’d like to know if you’d consider that as a good representation of “understanding” and if not, how would you define it?

gishh

If I could understand what the brain scans actually meant, I would consider it a good representation. I don't think we know yet what they mean. I saw a headline the other day about a person with "low brain activity", and said person was in complete denial about it; I would be too.

Uehreka

“You understand how the brain works right? It’s neurons and electrical charges. The brain understands nothing.”

I’m always struck by how confidently people assert stuff like this, as if the fact that we can easily comprehend the low-level structure somehow invalidates the reality of the higher-level structures. As if we know concretely that the human mind is something other than emergent complexity arising from simpler mechanics.

I’m not necessarily saying these machines are “thinking”. I wish I could say for sure that they’re not, but that would be dishonest: I feel like they aren’t thinking, but I have no evidence to back that up, and I haven’t seen non-self-referential evidence from anyone else.

aoeusnth1

Token economics are very sound - it's the training which is expensive.

Tokens/week have gone up 23x year-over-year according to https://openrouter.ai/rankings. This is probably around $500M-1B in sales per year.
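
For a sense of where an estimate like that comes from, here is a back-of-envelope sketch; the weekly token volume and blended per-token price below are placeholder assumptions for illustration only, not OpenRouter's actual figures:

```python
# Back-of-envelope sketch of how a tokens-per-week figure turns into annual
# revenue. Both inputs are assumptions chosen only to land in the parent's
# quoted range; real per-token prices vary widely across models and providers.

tokens_per_week = 5e12        # assumed platform-wide weekly token volume
price_per_million = 3.00      # assumed blended $ per million tokens

weekly_revenue = (tokens_per_week / 1e6) * price_per_million
annual_revenue = weekly_revenue * 52
print(f"~${annual_revenue / 1e9:.2f}B per year")   # ~$0.78B with these assumptions
```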

The real question is where the trajectory of this rocket ship is going. Will per-token pricing be a race to the bottom against budget Chinese model providers? Will we see another 20x year-over-year growth over the next 3 years, or will it level out sooner?

cmiles8

LLMs are an amazing advancement. The tech side of things is very impressive. Credit where credit is due.

Where the current wave all falls apart is on the financials. None of that makes any sense and there’s no obvious path forward.

Folks say handwavy things like “oh they’ll just sell ads” but even a cursory analysis shows that math doesn’t add up relative to the sums of money being invested at the moment.

Tech wise I’m bullish. Business wise, AI is setting up to be a big disaster. Those that aimlessly chased the hype are heading for a world of financial pain.

swalsh

Hard disagree. I'm in the process of deploying several AI solutions in healthcare. We have a process a nurse usually spends about an hour on, which costs $40-$70 depending on whether they are offshore and a few other factors. Our AI can match it for a few dollars, often less. A nurse still reviews the output, but it takes way less time. The economics of those tokens are great. We have another solution that just finds money: $10-$30 in tokens can find hundreds of thousands of dollars. The tech isn't perfect (that's why we still have a human in the loop), but it's more than good enough to do useful work, and the use cases are valuable.

Adapse

Are the companies providing these AI services actually profitable? My impression is that AI prices are grossly suppressed and might explode soon.

zzbzq

I think they were referring to the costs of training and hosting the models. You're counting the cost of what you're buying, but the people selling it to you are in the red.

cmiles8

Correct

beachtaxidriver

It's true, but do you really trust the AI-generated + nurse-reviewed output more than the purely nurse-generated one?

In my experience, management types use the fact that AI generated + Nurse Review is faster to push a higher quota of forms generated per hour.

Eventually, from fatigue or boredom, the human in the loop just ends up being a rubber stamper. Would you trust this with your own or your children's life?

The human in the loop becomes a lot less useful when they're pressured to process a certain quota against an AI that's basically a stochastic "most probable next token" generator, aka a professional bullshitter, literally trained to generate plausible outputs with no responsibility for accurate outputs.

famouswaffles

> Folks say handwavy things like “oh they’ll just sell ads” but even a cursory analysis shows that math doesn’t add up relative to the sums of money being invested at the moment.

OK, so I think there are two things here that people get mixed up on.

First, inference of the current state of the art is cheap now. There are no two ways about it. Statements from Google and Altman, as well as the prices third parties charge for tokens of top-tier open-source models, paint a pretty good picture. Ads would be enough to make OpenAI a profitable company selling current SOTA LLMs to consumers.

Here's the other thing that mixes things up. Right now, OpenAI is not just trying to be 'a profitable company'. They're not just trying to stay where they are and build a regular business off it. They are trying to build and serve 'AGI', or as they define it, 'highly autonomous systems that outperform humans at most economically valuable work'. They believe that building and serving this machine to hundreds of millions would require costs orders of magnitude greater.

That purpose is where all the 'insane' levels of money are moving. They don't need hundreds of billions of dollars in data centers to stay afloat or be profitable.

If they manage to build this machine, then those costs don't matter, and if things are not working out midway, they can just drop the quest. They will still have an insanely useful product that is already used by hundreds of millions every week, as well as the margins and unit economics to actually make money off of it.

cmiles8

If OpenAI was the only company doing this that argument might sort of make sense.

The problem is they have real competition now and that market now looks like an expensive race to an undifferentiated bottom.

If someone truly invents AGI and it’s not easily copied by others then I agree it’s a whole new ballgame.

The reality is that, years into this, we seem to be hitting a limit to what LLMs can do, with only marginal improvements in each release. On that path this gets ugly fast.

famouswaffles

As far as consumer LLMs go, they don't really have competition. Well, they do, but it's more Google vs Bing than Android vs Apple. Second place is a very, very distant second, and almost all growth and usage is still being funneled to OpenAI. Even if it's 'easily copied', getting there first could prove extremely valuable.

gdulli

> Folks say handwavy things like “oh they’ll just sell ads” but even a cursory analysis shows that math doesn’t add up relative to the sums of money being invested at the moment.

We should factor in that messaging that's seamless and undisclosed in conversational LLM output will be a lot more valuable than what we think of as advertising today.

xnx

Don't confuse OpenAI financials with Google financials. OpenAI could fold and Google would be fine.

Nevermark

Google would actually be in grave danger… of drowning themselves in champagne.

tarr11

> This has convinced many non-programmers that they can program, but the results are consistently disastrous, because it still requires genuine expertise to spot the hallucinations.

I've been programming for 30+ years and am now a people manager. Claude Code has enabled me to code again, and I'm several times more productive than I ever was as an IC in the 2000s and 2010s. I suspect this person hasn't really tried the most recent generation; it is quite impressive and works very well if you do know what you are doing.

stingraycharles

If you’ve been programming for 30+ years, you definitely don’t fall under the category of “non-programmers”.

You have decades upon decades of experience on how to approach software development and solve problems. You know the right questions to ask.

The actual non-programmers I see on Reddit are having discussions about topics such as “I don’t believe that technical debt is a real thing” and “how can I go back in time if Claude Code destroyed my code”.

agubelu

Isn't that what the author means?

"it still requires genuine expertise to spot the hallucinations"

"works very well if you do know what you are doing"

hombre_fatal

But it can work well even if you don't know what you are doing (or don't look at the impl).

For example, build a TUI or GUI with Claude Code while only giving it feedback on the UX/QA side. I've done it many times despite 20 years of software experience. -- Some stuff just doesn't justify me spending my time credentializing in the impl.

Hallucinations that lead to code that doesn't work just get fixed. Most code I write isn't like "now write an accurate technical essay about hamsters", where hallucinations can sneak through unless I scrutinize it; rather, the code would just fail to work and trigger the LLM's feedback loop to fix it when it tries to run/lint/compile/typecheck it.
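
A minimal sketch of that feedback loop, assuming a hypothetical ask_model_to_fix helper wired to whatever LLM provider you use (the rest is standard library):

```python
# Sketch of the run/fail/feed-back loop described above: write out the
# generated code, run it, and if it fails, hand the error text back to the
# model for another attempt. ask_model_to_fix is a hypothetical placeholder.

import subprocess
import sys
import tempfile

def ask_model_to_fix(source: str, error: str) -> str:
    """Placeholder: send the failing source plus stderr to your LLM of
    choice and return its revised source."""
    raise NotImplementedError

def run_until_it_works(source: str, max_tries: int = 5) -> bool:
    for _ in range(max_tries):
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(source)
            path = f.name
        result = subprocess.run([sys.executable, path], capture_output=True, text=True)
        if result.returncode == 0:
            return True                                    # ran cleanly, stop
        source = ask_model_to_fix(source, result.stderr)   # feed the failure back
    return False
```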

But the idea that you can only build with LLMs if you have a software engineer copilot isn't true and inches further away from true every month, so it kinda sounds like a convenient lie we tell ourselves as engineers (and understandably so: it's scary).

pzo

The author's headline starts with "LLMs are a failure"; it's hard to take the author seriously with such hyperbole, even if the second part of the headline ("A new AI winter is coming") might be right.

seaucre

I have a journalist friend with 0 coding experience who has used ChatGPT to help them build tools to scrape data for their work. They run the code, report the errors, repeat, until something usable results. An agent would do an even better job. Current LLMs are pretty good at spotting their own hallucinations if they're given the ability to execute code.

The author seems to have a bias. The truth is that we _do not know_ what is going to happen. It's still too early to judge the economic impact of current technology - companies need time to understand how to use this technology. And, research is still making progress. Scaling of the current paradigms (e.g. reasoning RL) could make the technology more useful/reliable. The enormous amount of investment could yield further breakthroughs. Or.. not! Given the uncertainty, one should be both appropriately invested and diversified.

chomp

For toy and low-effort coding it works fantastically. I can smash out changes and PRs remarkably quickly, and they’re mostly correct. However, certain problem domains and tough problems cause it to spin its wheels worse than a junior programmer, especially if some of the back-and-forth troubleshooting goes longer than one context compaction. Then it can forget the context of what it’s tried in the past and go back to square one (it may know that it tried something, but it won’t know the exact details).

asah

That was true six months ago - the latest versions are much better at memory and adherence, and my senior engineer friends are adopting LLMs quickly for all sorts of advanced development.

roadside_picnic

The problem with this type of comment is that I know multiple people who would say the exact same thing (I actually double-checked to make sure I wasn't responding to someone higher up at my company), but everyone working with what they've produced is constantly fighting against a sea of garbage code, while also not wanting to be the first to call out that the emperor (or Director of Engineering/VP/CTO, in many cases) has no clothes.

This isn't just a critique of anecdotes: I've noticed that LLMs are specifically good at convincing people of an "overly optimistic" (sometimes bordering on delusional) understanding of the quality of work they are producing.

weare138

> ...and works very well if you do know what you are doing

That's the issue. AI coding agents are only as good as the dev behind the prompt. It works for you because you have an actual background in software engineering of which coding is just one part of the process. AI coding agents can't save the inexperienced from themselves. It just helps amateurs shoot themselves in the foot faster while convincing them they're a marksman.

Lionga

It seems to work well if you DON'T really know what you are doing, because you cannot spot the issues.

If you know what you are doing, it works kind of mid. You see how anything more than a prototype will create lots of issues in the long run.

Dunning-Kruger effect in action.

Barathkanna

Interesting take. His argument is basically that LLMs have hit their architectural ceiling and the industry is running on hype and unsustainable economics. I’m not fully convinced, but the points about rising costs and diminishing returns are worth paying attention to. The gap between what these models can actually do and what they’re marketed as might become a real problem if progress slows.

DrewADesign

I think the unsustainably cheap consumer-facing AI products are the spoonful of sugar getting us to swallow a technology that will almost entirely be used to make agents that justify mass layoffs.

buellerbueller

And surveil and rat on us, sometimes incorrectly.

ryanjshaw

The existence of an AI hype train doesn’t mean there isn’t a productive AI no-hype train.

Context: I have been writing software for 30 years. I taught myself assembly language and hacked games/apps as a kid, and have been a professional developer for 20 years. I’m not a noob.

I’m currently building a real-time research and alerting side project using a little army of assistant AI developers. Given a choice, I would never go back to how I developed software before this. That isn’t my mind poisoned by hype and marketing.

jansan

I think we are not even close to using the full potential of current LLMs. Even if the capabilities of LLMs were not to improve, we will still see better performance on the software and hardware side. It is no longer a question of "if", but of "when", there will be a Babelfish-like device available. And this is only one obvious application; I am 100% sure that people are still finding useful new applications of AI every day.

However, there is a real risk that AI stocks will crash and pull the entire market down, just like it happened in 2000 with the dotcom bubble. But did we see an internet or dotcom winter after 2000? No, everybody kept using the Internet, Windows, Amazon, Ebay, Facebook and all the other "useless crap". Only the stock market froze over for a few years and previously overhyped companies had a hard time, but given the exaggeration before 2000 this was not really a surprise.

What will happen is that the hype train will stop or slow down, and people will no longer get thousands, millions, billions, or trillions in funding just because they slap "AI" onto their otherwise worthless project. Whoever is currently working on such a project should enjoy the time while it lasts - and rest assured that it will not last forever.

Nevermark

I am simply stunned at the negativity.

Yes, there is hype.

But if you actually filter it out, instead of (over)reacting to it in either direction, progress has been phenomenal, and the fact that there is visible progress in many areas, including LLMs, on the order of months demonstrates there are no walls.

Visible progress doesn’t mean astounding progress. But any tech that is improving year to year is moving at a good speed.

Huge apparent leaps in recent years seem to have spoiled some people. Or perhaps desensitized them. Or perhaps, created frustration that big leaps don’t happen every week.

I can’t fathom anyone not using models for 1000 things. But we all operate differently, and have different kinds of lives, work and problems. So I take claims that individuals are not getting much from models at face value.

But the fact that some people are not finding value isn't an argument that the value, and the increasing value, the rest of us are getting isn't real.

wellthisisgreat

Yeah, after reading the intro lines I noped out of this garbage blog (?). Maybe it's their SEO strategy to write flat-out untrue, incendiary stuff like "the technology is essentially a failure" - if that's the case, I don't have time for this.

Maybe they actually think this way, then, I certainly have time for this.

What do I have time for? Virtual bonding with fellow HN ppl :)

aroman

When the hype is infinite (technological singularity and utopia), any reality will be a let down.

But there is so much real economic value being created - not speculation, but actual business processes - billions of dollars - it’s hard to seriously defend the claim that LLMs are “failures” in any practical sense.

Doesn’t mean we aren’t headed for a winter of sobering reality… but it doesn’t invalidate the disruption either.

emp17344

Other than inflated tech stocks making money off the promise of AI, what real economic impact has it actually had? I recall plenty of articles claiming that companies are having trouble actually manifesting the promised ROI.

n4r9

> not speculation, but actual business processes

Is there really a clear-cut distinction between the two in today's VC and acquisition based economy?

keybored

Hype Infinity is a form of apologia that I haven’t seen before.

api

This is why I hate hype.

"We just cured cancer! All cancer! With a simple pill!"

"But you promised it would rejuvenate everyone to the metabolism of a 20 year old and make us biologically immortal!"

New headline: "After spending billions, project to achieve immortality has little to show..."

ZeroConcerns

Well, the original "AI winter" was caused by defense contracts running out without anything to show for it -- turns out, the generals of the time could only be fooled by Eliza clones for so long...

The current AI hype is fueled by public markets, and as they found out during the pandemic, the first one to blink and acknowledge the elephant in the room loses, bigly.

So, even in the face of a devastating demonstration of "AI" ineffectiveness (which I personally haven't seen, despite things being, well, entirely underwhelming), we may very well be stuck in this cycle for a while yet...

raincole

While both are unlikely, if I had to choose one I would bet on AGI over an AI winter in the next five years.

AI just keeps getting better and better. People thought it couldn't solve math problems without a human formalizing them first. Then it did. People thought it couldn't generate legible text. Then it did.

All while people swore it had reached a "plateau," "architecture ceiling," "inherent limit," or whatever synonym of the goalpost.

siva7

Is "A New AI Winter Is Coming" the new "The Year of Linux Desktop"?

bdangubic

> it's a horrible liability to have to maintain a codebase that nobody on the team actually authored.

exactly - just like 99.9873% of all codebases currently running in production worldwide :)

andreyk

This blog post is full of bizarre statements and the author seems almost entirely ignorant of the history or present of AI. I think it's fair to argue there may be an AI bubble that will burst, but this blog post is plainly wrong in many ways.

Here's a few clarifications:

"AI winter was over, and a new era was beginning. I should explain for anyone who hasn't heard that term before, that way back in the day, when early AI research was seemingly yielding significant results, there was much hope, as there is now, but ultimately the technology stagnated. "

The term AI winter typically refers to a period of reduced funding for AI research/development, not the technology stagnating (the technology failing to deliver on expectations was the cause of the AI winter, not the definition of AI winter).

"[When GPT3 came out, pre-ChatGPT] People were saying that this meant that the AI winter was over, and a new era was beginning."

People tend to agree there were two AI winters already, the first having to do with symbolic AI disappointments and a general lack of progress (the 70s), and the latter related to expert systems (late 80s). That AI winter has long been over. The deep learning revolution started in ~2012, and by 2020 (GPT-3) huge amounts of talent and money had already been going into AI for years. This trend just accelerated with ChatGPT.

"[After symbolic AI] So then came transformers. Seemingly capable of true AI, or, at least, scaling to being good enough to be called true AI, with astonishing capabilities ... This sounds bizarre and probably impossible, but the huge research breakthrough was figuring out that, by starting with essentially random coefficients (weights and biases) in the linear algebra, and during training back-propagating errors, these weights and biases could eventually converge on something that worked."

Transformers came about in 2017. The first wave of excitement about neural nets and backpropagation goes all the way back to the late 80s/early 90s, and AI fields (computer vision, NLP, and to a lesser extent robotics) were already heavily ML-based by the 2000s, just not neural-net-based (this changed in roughly 2012).

"All transformers have a fundamental limitation, which can not be eliminated by scaling to larger models, more training data or better fine-tuning. It is fundamental to the way that they operate. On each turn of the handle, transformers emit one new token (a token is analogous to a word, but in practice may represest word parts or even complete commonly used small phrases – this is why chatbots don't know how to spell!). In practice, the transformer actually generates a number for every possible output token, with the highest number being chosen in order to determine the token. This token is then fed back, so that the model generates the next token in the sequence. The problem with this approach is that the model will always generate a token, regardless of whether the context has anything to do with its training data. Putting it another way, the model generates tokens on the basis of what 'looks most plausible' as a next token. If this is a bad choice, and gets fed back, the next token will be generated to match that bad choice ... This is the root of the hallucination problem in transformers, and is unsolveable because hallucinating is all that transformers can do."

The 'highest number' token is not necessarily chosen, this depends on the decoding algorithm. That aside, 'the next token will be generated to match that bad choice' makes it sound like once you generate one 'wrong' token the rest of the output is also wrong. A token is a few characters, and need not 'poison' the rest of the output.
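
To make the decoding point concrete, here is a toy sketch with made-up logits (not any particular model's numbers): greedy decoding picks the argmax, but sampling from the softmax distribution, often with a temperature, is just as common.

```python
# Toy sketch of decoding: "pick the highest number" (greedy) is only one option.
# Sampling from the softmax distribution, optionally sharpened or flattened by
# a temperature, is a standard alternative. Logits below are made up.

import math, random

def softmax(values, temperature=1.0):
    exps = [math.exp(v / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

logits = {"Paris": 5.1, "Lyon": 3.7, "banana": -2.0}   # made-up next-token scores

greedy_choice = max(logits, key=logits.get)             # always "Paris"
probs = softmax(list(logits.values()), temperature=0.8)
sampled_choice = random.choices(list(logits), weights=probs)[0]  # usually, not always, "Paris"

print(greedy_choice, sampled_choice)
```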

That aside, there are plenty of ways to 'recover' from starting down the wrong route. A key aspect of why reasoning in LLMs works well is that it typically incorporates backtracking - going back earlier in the reasoning to verify details or whatnot. You can do uncertainty estimation in the decoding algorithm, use a secondary model, plenty of things (here is a detailed survey, one of several that is easy to find: https://arxiv.org/pdf/2311.05232).
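
One of the simplest decoding-time signals, sketched below with toy numbers: flag the steps where the model put little probability on the token it actually emitted, rather than silently committing to a guess.

```python
# Toy sketch of decoding-time uncertainty estimation: flag generation steps
# where the emitted token had low probability. Flagged steps are candidates
# for backtracking, a second-model check, or declining to answer.

def flag_uncertain_steps(step_probs, threshold=0.5):
    """step_probs[i] = probability the model assigned to the token it emitted
    at step i. Returns the indices of low-confidence steps."""
    return [i for i, p in enumerate(step_probs) if p < threshold]

# Made-up per-token probabilities for a short answer:
print(flag_uncertain_steps([0.92, 0.88, 0.41, 0.97, 0.34]))   # -> [2, 4]
```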

"The technology won't disappear – existing models, particularly in the open source domain, will still be available, and will still be used, but expect a few 'killer app' use cases to remain, with the rest falling away."

A quick google search shows ChatGPT currently has 800 million weekly active users who are using it for all sorts of things. AI-assisted programming is certainly here to stay, and there are plenty of other industries in which AI will be part of the workflow (helping do research, take notes, summarize, build presentations, etc.)

I think discussion is good, but it's disappointing to see stuff with this level of accuracy on the front page of HN.