
AI has a cargo cult problem

125 comments · October 17, 2025

tra3

If I'm tired of one thing related to AI/llm/chatbots it's the claims that it's not useful. It 100% is. We have to separate the massive financial machinations from the actual tech.

Reading this article though, I'm questioning my decision to avoid hosting open-source LLMs. Supposedly the performance of Qwen3-Coder is comparable to the likes of Sonnet 4. If I invest in a homelab that can host something like Qwen3, I'll recoup my costs in about 20 months without having to rely on Anthropic.

mynameisash

I don't think I've ever seen anyone say they're not useful. Rather, they don't appear to live up to the hype, and they're sure as hell not a panacea.

I'm pretty bearish on LLMs. I also think they're over-hyped and that the current frenzy will end badly (globally, economically speaking). That said, sure, they're useful. Doesn't mean they're worth it.

Agingcoder

To some extent it’s not that they don’t live up to the hype - rather that the gains are hard to measure.

LLMs have spared me hours of research on exotic topics that are actually useful for my day job. However, that's the whole problem - I don't know how much.

If they had a real price (accounting for OpenAI's losses, for example), say ChatGPT at $50/month for everyone, OpenAI profitable, and people actually paying for it, I think things might self-adjust and we'd have some idea.

Right now, we live in some kind of parallel world.

mattlutze

> exotic topics [...] I don't know how much

We also don't know, in situations like this, whether all of or how much of the research is true. As has been regularly and publicly demonstrated [0][1][2], the most capable of these systems still make very fundamental mistakes, misaligned to their goals.

The LLMs really, really want to be our friend, and production models do exhibit tendencies to intentionally mislead when it's advantageous [3], even if it's against their alignment goals.

0: https://www.afr.com/companies/professional-services/oversigh...
1: https://www.nbcnews.com/world/australia/australian-lawyer-so...
2: https://calmatters.org/economy/technology/2025/09/chatgpt-la...
3: https://arxiv.org/pdf/2509.18058

alganet

> I don’t know how much.

If you're not willing to measure how it helps you, then it's probably not worth it.

I would go even further: if the effort of measuring is not feasible, then it's probably not worth it.

That is more targeted at companies than you specifically, but it also works as an individual reflection.

In the individual reflection, it works like this: you should think "how can I prove to myself that I'm not being bamboozled?". Once you acquire that proof, it should be easy to share it with others. If it's not, it's probably not a good proof (like an anecdote).

I've said this before, and I'll say it again: record yourself using LLMs. Then watch the recording. Is it that good? Notice that I am removing myself from the equation here; I will not judge how good it is, you're going to do that yourself.

thatjoeoverthr

The thing with the hype is it's always the same hype. "If you can just 3D print another 3D printer ..." "Apps are dead, everything will be AJAX" etc. I no longer believe the hype itself warrants attention or pushback. Let the hype boys raise money. No need to protect naive VCs.

bigfishrunning

But if the hype boys manage to capture big portions of the market (Microsoft, Amazon, etc...) it starts affecting pensions and retirement accounts. The next few years are gonna be rough because of this hype.

watwut

> Let the hype boys raise money. No need to protect naive VCs.

I genuinely 100% believe that the hype boys' ability to raise money is harming the economy and all of us. Whatever the structural reason for its existence, it would be best to end it.

QuantumGood

Those who claim they're not useful usually tie it to something like "never trust them because of hallucinations", or backtrack when called out ("yes, I should have added details"), or argue that the problems outweigh the usefulness, hence "not useful", etc. But online, people do make this statement.

James_K

Something not being useful is distinct from it having no uses. It could well be the case that the use of AI creates more damage than it does good. Many people have found it a useful tool to create the appearance of work where none is happening.

tra3

Fair enough, I may have conflated "there's an AI bubble" with "AIs aren't useful".

My employer pays for Claude Pro access, and if they stopped paying tomorrow I'd consider paying for it myself. Although it's much more likely I'd start self-hosting instead.

So that's what it's worth to me, say $2500 USD in hardware over the next 3 years.
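
Back-of-the-envelope, a minimal sketch of that break-even (the ~$125/month hosted-LLM spend is an assumed figure, chosen because it makes the "recoup in about 20 months" arithmetic above line up):

    # Hypothetical break-even: one-time homelab hardware vs. a recurring
    # hosted-LLM subscription. The monthly spend is an assumption.
    hardware_cost = 2500      # USD, one-time homelab build
    monthly_hosted = 125      # USD/month, assumed subscription/API spend
    break_even_months = hardware_cost / monthly_hosted
    print(f"Break-even after {break_even_months:.0f} months")  # -> 20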

I'd love to hear what your take on this is.

brailsafe

$2500 is a relatively small investment for any sort of useful tool over 3 years, but that seems very low to me for this specific self-hosting endeavor

peteforde

My dude, there is a small but weirdly dedicated group of people on this site that are hellbent on demanding "proof" that the wins we've personally gained from using LLMs in an intelligent way are real. It's actually been kind of exhausting, leading me to not weigh in on many threads.

hatthew

Because there's a lot of evidence that people tend to overestimate/overstate how useful LLMs are.

Everyone says "I wrote this thing using AI" but most of the time reading the prompt would be just as useful as reading the final product.

Everyone says "I wrote this large codebase using AI" but most of the time the code is unmaintainable and probably could have been implemented with much less code by a real human, and also the final software isn't actually ready for prod yet.

Everyone says "I find AI coding very useful" and neglects to mention that they are making small adhoc scripts, or they're in a domain that's mostly boilerplate anyways (e.g. some parts of web dev).

The one killer application of LLMs seems to be text summarization. Everything else that I have seen is either a niche domain that doesn't apply to the vast majority of people, a final product that is slop and shouldn't have been made in the first place, or minor gains that are worthwhile but nowhere near as groundbreaking as people claim.

To be clear, I think LLMs are useful, and I personally use them regularly. But I've gained at most 5% productivity from them (likely much less). For me, it's exhausting to keep on trying to realize these gains everyone is talking about, while every time I dig into someone claiming to get massive gains I find that the actual impact is highly questionable.

llm_nerd

>I don't think I've ever seen anyone say they're not useful.

https://news.ycombinator.com/item?id=45577203

There are thousands and thousands of comments just like this on this site. I would dare say tens of thousands. They regularly appear in any AI-related discussion.

I've been involved in many threads on here where devs with Very Important Work announce that none of the AI tools are useful for them or for anyone with Real Problems, and at best they work for copy/paste junior devs who don't know what they're doing and are doing trivial work. This is right after they declare that anyone that isn't building a giant monolithic PHP app just like them are trend-chasers who are "cargo culting, like some tribe or something".

>I also think they're over-hyped and that the current frenzy will end badly (global economically speaking)

In a world where Tesla is a trillion-dollar company based upon vapourware, and the president of the largest economy (for now) is launching shitcoins and taking bribes through crypto, and every Western country saw a massive real-estate ramp-up from unmetered mass migration, and Bitcoin is a $2T "currency" that has literally zero real-world use beyond betting on itself, and sites like Polymarket exist for insiders to scam foolish rube outsiders out of their money, and... Dude, the AI bubble doesn't even remotely measure up.

didibus

> it's the claims that it's not useful

I think the reason is because it depends what impact metrics you want to measure. "Usefulness" is in the eye of the beholder. You have to decide what metric you consider "useful".

If it's company profit for example, maybe the data shows it's not yet useful and not having impact on profit.

If it's the level of concentration needed by engineers to code, then you can probably see that metric improving, as less mental effort is needed to accomplish the same thing. If that's the impact you care about, you can consider it "useful".

Etc.

Octoth0rpe

> It 100% is [useful]

It's worth disambiguating between "worth $50b of investment" useful versus "worth $1t of investment" useful

pseudosavant

For perspective, there are 10 companies with a market cap over $1T. Is the value of LLMs greater than Tesla? Absolutely.

The problem of course is that plenty of that $1T in investment will go to stupid investments. The people whose investments pan out will be the next generation of Zuckerbergs. The rest will be remembered like MySpace or Webvan.

pseudosavant

I'll add that MSFT, AAPL, GOOGL, AMZN, and META generated >$450B in net income in the last 4 quarters. It can't be overstated how much profit they can burn on AI without losing money.

rz2k

To be fair, while the incremental value of each additional year that Tesla remains in existence may not be so great, it did finally change the conventional wisdom about the viability of electric vehicles, which will continue to have substantial impact.

Furthermore, the price of the most recently sold share times the number of shares outstanding does not represent the total R&D or spending required to make Teslas.

mattlutze

Especially when, as it is currently in vogue to observe, the difference between $50b and $1t is roughly $1t.

marcosdumay

Up to 2 significant figures...

criemen

> Supposedly the performance of Qwen3-Coder is comparable to the likes of Sonnet 4. If I invest in a homelab that can host something like Qwen3, I'll recoup my costs in about 20 months without having to rely on Anthropic.

You can always try it via OpenRouter without investing in the home setup first. That lets you evaluate whether it hits your quality bar, and it's much cheaper. It is less fun than self-hosting, though.
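
For example, a minimal sketch against OpenRouter's OpenAI-compatible endpoint (the model slug is an assumption; check their catalog for the current Qwen coder variant):

    # Sketch: trying an open-weight coder model via OpenRouter before
    # buying homelab hardware. The model slug below is an assumption.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_API_KEY",  # placeholder, set your own
    )
    resp = client.chat.completions.create(
        model="qwen/qwen3-coder",  # assumed slug; verify in the model list
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
    )
    print(resp.choices[0].message.content)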

silversmith

The issue is that the field is still moving too fast - in 20 months, you might break even on costs, but the LLMs you are able to run might be 20 months behind "state of the art". As long as providers keep selling cheap inference, I'm holding out.

ants_everywhere

I agree, but also don't underestimate the value of developing a competency in self-hosting a model.

Dan Luu has a relevant post on this that tracks with my experience https://danluu.com/in-house/

tra3

That's where I am at too. Also it's not clear what's going to happen with hardware prices. I think there's a huge demand for hardware right now, but it should fall off at some point hopefully.

wmf

The gap between local models and SOTA is around 6 months and it's either steady or dropping. (Obviously this depends on your benchmark and preferences.)

criddell

Seriously? So I can run the best models from 2024 at home now?

For example, what would I need to run OpenAI's o1 model from 2024 at home? Are there good guides for setting this up?

rz2k

Fortunately the models are increasing in efficiency about as fast as they are increasing in performance, so your homelab surprisingly doesn’t become out of date as fast as you might expect. However, I expect there will also be very capable machines like 1TB or 2TB Mac Studio M5 or M6 Ultras within a year or two.

mrbungie

It's hell useful, I use Cursor several times a week (and I'm not even working as a dev full time rn), and ChatGPT is my daily driver.

Yet, it's weird to me that we're 3 years into this "revolution" and I can't get a decent slideshow from an LLM without having to practically build a framework for doing so.

jacobr1

It is a focus, data, and benchmarking problem. If someone comes up with good benchmarks, which means having a good dataset, and gets some publicity around them, they can attract the frontier labs' attention to focus training and optimization effort on making the models better for that benchmark. This is how most of the capabilities we have today became useful. Maybe there is some emergent initial detection of utility, but the refinement comes from labs beating each other on the benchmarks.

So we need a slideshow benchmark, and I think we'd see rapid improvement. LLMs are actually OK at building HTML decks: not great, but OK. Enough so that if there were some good objective criteria to tune toward, I think the last-mile kinks (formats, object/text overlaps) would get worked out. The raw content is mainly a function of the core intelligence of the model, so that wouldn't be impacted: if you can get it to build a good bullet-point markdown of your presentation today, it would be just as good as a slide deck, though maybe not as visually compelling as you'd like. Also, this might need to be an agentic benchmark to allow for both text and image creation and other considerations like data sourcing, which is why everyone doing this ends up building their own mini framework.

A ton of the reinforcement-type training work is really just aligning the vague commands a user would give to the same capability the model would produce with a much more fleshed-out prompt.
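
To make that concrete, here is a toy sketch of the kind of objective check such a slideshow benchmark could score; everything in it is invented for illustration, not an existing benchmark:

    # Toy objective criteria for a generated HTML slide deck: it parses
    # and contains a minimum number of slides. A real benchmark would also
    # need layout checks (e.g. text/object overlap), which is the hard part.
    from html.parser import HTMLParser

    class SlideCounter(HTMLParser):
        def __init__(self):
            super().__init__()
            self.slides = 0

        def handle_starttag(self, tag, attrs):
            # Count <section class="slide"> elements as slides.
            if tag == "section" and ("class", "slide") in attrs:
                self.slides += 1

    def score_deck(html: str) -> dict:
        parser = SlideCounter()
        parser.feed(html)
        return {"slides": parser.slides, "pass": parser.slides >= 3}

    deck = '<section class="slide"><h1>Intro</h1></section>'
    print(score_deck(deck))  # -> {'slides': 1, 'pass': False}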

mrdependable

They are useful, but I find it is only slightly more convenient than a Google search. Losing something like GPS on my phone would be a much bigger disruption to my life.

arjie

I used Qwen3-480B-Coder with Cerebras and it was not very good for my use case. You can run these models online first to see if they will work for you. I recommend you try that first.

jjangkke

I disagree with labeling AI a cargo cult. Crypto fits the description, but the definition of a cargo cult has to imply some sort of ultimate end in which its followers' expectations are drastically reduced.

What AI feels like is the early days of the internet. We saw the dot-com bubble, but we ultimately live in the internet age. There is no doubt that the post-bubble world will be very much AI oriented.

This is very different from crypto, which isn't by any measure a technological leap, but rather a crowd frenzy aimed at self-enrichment via Ponzi mechanisms.

moomin

I want to ask ChatGPT to point to a behaviour described in the article that resembles cargo-culting with AI, but I don’t want to waste my future overlord’s time.

johnohara

Not sure "Cargo Cult" is an apt description. Feynman's description of Cargo Cult Science was predicated on the behavior of islanders building structures in expectation it would summon the planes, cargo, personnel, etc. that used the island during WWII.

Without that previous experience, they would not have built anything.

There is no previous AI experience behind today's pursuit of the AI grail. In other words, no planes with cargo driving an expectation of success. Instead, the AI pursuit is based upon the probability of success, which is aptly defined as risk.

A correct analog would be the islanders building a boat and taking the risk of sailing off to far away shores in an attempt to procure the cargo they need.

wmf

Arguably AI is already "successful" in terms of funding and press coverage and that's what many people are chasing.

smogcutter

This is a good point as a tangent. “Cargo Cult” is a meaningful phrase for ritualizing a process without understanding it.

Debasing the phrase makes it less useful and informative.

It’s a cargo cult usage of “cargo cult”!

sails

I’m amazed they published it with such a poorly applied analogy.

blamestross

Yeah, "cargo cult" is abused as a term. Those islanders were smarter than what is happening here.

We use it dismissively, but "cargo cult" behaviour is entirely reasonable. You know an effect is possible, and you observe novel things correlating with it. You try them to test the causality. It looks silly when you know the lesson already, but it was intelligent and reasonable behaviour the entire way.

The current situation is bubble denial, not cargo culting. Blaming cargo culting is a mechanism of bubble denial here.

o11c

It's called "science experiments", except when it produces the null result.

nextworddev

Yes, this rally seems overextended. But investor sentiment - if anything - has already swung to very negative, which isn't ideal if you want it to crash.

Bubbles don't pop without indiscriminate euphoria (Private markets are a different story, but VCs are fked anyways). If anything, the prices have reflected less than 20% of Capex projections, so the market clearly thinks OpenAI / Stargate / FAANG's capex plans are BS.

p.s. if everyone thinks it's a bubble, it generally rallies even more..

vonneumannstan

>If anything, the prices have reflected less than 20% of Capex projections, so the market clearly thinks OpenAI / Stargate / FAANG's capex plans are BS.

I'd say if anything the market is massively underestimating the scale of their capex plans. These things are using as much electricity as small cities. They are well past breaking ground, the buildings are going up as we speak.

https://www.datacenterdynamics.com/en/news/openai-and-oracle...

https://x.com/sama/status/1947640330318156074/photo/1

There are dozens of these planned.

nextworddev

I think we said the same thing.

andoando

Oh jesus. I think AI is useful, but I figure 95%+ of it is used on complete nonsense.

lazide

Same thing happened with the dot-com boom and bust, except with fiber (later called dark fiber) and datacenters.

A lot of people lost a lot of money. Post bankruptcy, it also fueled the later tech booms, as now there was a ton of dark fiber waiting to be used at rock bottom prices, and underutilized datacenters and hardware. Google was a major beneficiary.

nextworddev

I think you're misquoting some of the poorly written vendor-financing articles / LinkedIn posts.

The market hasn't priced in Sam Altman's capex projections, so it's probably akin to 1998 or 1999.

jerf

Cargo cult as a metaphor doesn't work here. That's for when the cargo culters don't understand what is going on, and attempt to imitate the actions without understanding or accuracy. AI investors understand what is going on and understand that this may be a bubble and they may lose their investment. We may disagree with them about the probabilities of such an outcome, perhaps even quite substantially, but that's not the same thing as thinking that if I just write some number-looking-squiggles on a piece of paper and slide it under the door of a building that looks like it has computers on it I will have a pool and a lambo when I get home. That's what "cargo cult" investing would look like.

The AI investors know what they are doing, by which I mean: if this is every bit the bubble some of us think it is, and it pops as viciously as it possibly can, and these investors lose everything from top to bottom, if they tried to say "I didn't know that could happen!" I simply wouldn't believe them and neither would anyone else. Of course they know it's possible. They may not believe it is likely, but they are 100% operating from a position of knowledge and understanding and taking actions that have a completely reasonable through-line to successfully achieving their goals.

Indeed, I'm sure some people have sufficiently cashed out of their positions or diversified them such that they have already completely succeeded; worries about the bubble are worries about a sector and a broad range of people, but some individuals can and will come out of this successfully even if it completely detonates in the future. If nothing else, the people simply drawing salaries against the bubble, even completely normal non-inflated ones, can be called net winners.

alphazard

Ironically, the author of TFA is playing the part of the cargo cult. They don't actually understand the cargo cult metaphor, but since it is a popular metaphor, they reference it in naive imitation hoping that people engage with their content.

NotBillBellaC

and we did! So it works?

leptons

The original "cargo culters" had nothing to lose, so your comment falls apart pretty quickly.

jerf

My comment about how "cargo culting" is not an appropriate metaphor "falls apart" because you named another way in which the metaphor is not appropriate?

This is some bold new definition of "falls apart" with which I am not familiar.

jasonthorsness

Everyone has imperfect information; this isn't a cargo cult situation where it's massively asymmetric. This is more like when you see everyone else running: it's generally a good idea to start running too. But when that heuristic fails, it fails in a pretty spectacular way.

gdulli

Maybe it's human nature that has a cargo cult problem and AI is just the current flypaper?

micromacrofoot

Tech has a cargo cult problem

blackoil

Tech has a winner-takes-all problem. All those billions are chasing trillions of valuation. Many will fail, but some will be ruling (metaphorically) the world.

fancyfredbot

Right now the AI business does not look like a winner-takes-all situation at all. Everyone is losing money and nobody has any lock-in. Free open-source models are only 6-12 months behind the frontier labs. Seems like a tricky business to me, with no clear route to metaphorical world domination.

Making stuff for AI companies looks like better business to me!

ctoth

The only cargo cult behavior I see here is Tett's own journalism! She casually drops that same debunked "95% of companies see no AI revenue gains" figure[0] without tracing it to source, performing the ritual of citation while missing the actual mechanism that makes evidence valuable.

[0] https://aiascendant.com/p/why-95-of-ai-commentary-fails

burnt-resistor

The problem is the self-reinforcing valuation entanglements that give NVIDIA a market cap of 4.42 teradollars... until Meta, Goog, Apple, and Microsoft develop their own custom NPUs, or the fragile bubble bursts in some other way.

There's value here, but probably not as much as the market thinks... yet.