
Are OpenAI and Anthropic Losing Money on Inference?

JCM9

These articles (of which there are many) all make the same basic accounting mistake. You have to include all the costs associated with the model, not just inference.

This article is like saying an apartment complex isn’t “losing money” because the monthly rents cover operating costs but ignoring the cost of the building. Most real estate developments go bust because the developers can’t pay the mortgage payment, not because they’re negative on operating costs.

If this article were correct, these companies wouldn't need to raise money. If you have healthy positive cash flow you have much better mechanisms available to fund capital investment than selling shares, e.g. issuing a bond against that healthy cash flow.

The fact remains that when all costs are considered these companies are losing money, and as long as the lifespan of a model is limited it's going to stay ugly. To use the apartment building analogy, it's like having to knock down and rebuild the building every 6 months to stay relevant. That's simply not a viable business model.
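
A back-of-the-envelope sketch of that argument, with purely hypothetical numbers (none of these figures come from the article or the thread):

    # Hypothetical figures only: a positive operating (inference) margin can
    # still mean overall losses once the training "mortgage" is amortized
    # over a short model lifespan.
    monthly_revenue = 100_000_000        # $/month from inference, assumed
    monthly_inference_cost = 60_000_000  # $/month, assumed
    training_cost = 500_000_000          # $ per frontier model, assumed
    model_lifespan_months = 6            # "rebuild the building every 6 months"

    operating_profit = monthly_revenue - monthly_inference_cost   # +$40M/month
    amortized_training = training_cost / model_lifespan_months    # ~$83M/month
    print(f"All-in: ${operating_profit - amortized_training:,.0f}/month")  # ~-$43M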

martinald

(Author here). Yes, I am aware of that and did mention it. However, what I wanted to push back on in this article was the claim that Claude Code is completely unsustainable, and therefore a flash in the pan, and that devs aren't at risk (I know you are not saying this).

The models as is are still hugely useful, even if no further training was done.

Aurornis

> The models as is are still hugely useful, even if no further training was done.

Exactly. The parent comment has an incorrect understanding of what “unit economics” means.

The cost of training is not a factor in the marginal cost of each inference or each new customer.

It’s unfortunate this comment thread is the highest upvoted right now when it’s based on a basic misunderstanding of unit economics.

gruez

I think the point isn't to argue that AI companies are money printers, or even that they're fairly valued; it's that at least the unit economics work out. Contrast this with something like MoviePass, which was actually losing money on each subscriber. Sure, a company that requires huge capital investments that might never be paid back isn't great either, but at least it's better than MoviePass.

JCM9

Unit economics needs to include the cost of manufacturing the thing being sold, not just the direct cost of selling it.

Unit economics is mostly a manufacturing concept, and the only reason it looks OK here is that the cost of building the thing isn't really factored into the cost of the thing.

That would be like saying the unit economics of selling software is good because the only cost is some bandwidth and credit card processing fees. You need to include the cost of making the software to have a realistic picture of the health of a company, just like you need to include the cost of making the models.

voxic11

That isn't what unit economics is. The purpose of unit economics is to answer: "How much money do I make (or lose) if I add one more customer or transaction?" Since adding an additional user/transaction doesn't increase the cost of training the models, you would not include the cost of training the models in a unit economics analysis. The entire point of unit economics is that it excludes such "fixed costs".
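
A minimal sketch of that distinction in code (all figures invented for illustration):

    # Unit economics asks only what one more request changes.
    revenue_per_request = 0.010          # $, assumed
    inference_cost_per_request = 0.004   # $ marginal GPU time/power, assumed
    training_cost = 500_000_000          # $ fixed: unchanged by one more request

    unit_margin = revenue_per_request - inference_cost_per_request  # $0.006
    # Fixed costs matter for overall breakeven, not for the unit margin:
    breakeven_requests = training_cost / unit_margin                # ~83 billion
    print(f"Unit margin ${unit_margin:.3f}, breakeven at {breakeven_requests:,.0f} requests")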

Aurornis

The cost of “manufacturing” an AI response is the inference cost, which this article covers.

> That would be like saying the unit economics of selling software is good because the only cost is some bandwidth and credit card processing fees. You need to include the cost of making the software

Unit economics is about the incremental value and costs of each additional customer.

You do not amortize the cost of software into the unit economics calculations. You only include the incremental costs of additional customers.

> just like you need to include the cost of making the models.

The cost of making the models is important overall, but it’s not included in the unit economics or when calculating the cost of inference.

barrkel

There is no marginal cost for training, just like there's no marginal cost for software. This is why you don't generally use unit economics for analyzing software company breakeven.

ascorbic

You can amortise the training cost across billions of inference requests though. It's the marginal cost for inference that's most interesting here.
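
As a rough illustration of what amortisation does to the per-request picture (both numbers assumed):

    # An assumed training bill spread over an assumed lifetime request volume.
    training_cost = 100_000_000        # $, assumed
    lifetime_requests = 5_000_000_000  # requests served before the model retires, assumed
    print(f"${training_cost / lifetime_requests:.3f} of training cost per request")  # $0.020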

martinald

But what about running DeepSeek R1 (or insert another open-weights model here)? There is no training cost for that.

crote

Their assumption is that training is a fixed cost: you'll spend the same amount on training for 5 users as you will with 500 million users.

Spending hundreds of millions of dollars on training when you are two guys in a garage is quite significant, but the same amount is absolutely trivial if you are planet-scale.

The big question is: how will training cost develop? Best-case scenario is a one-and-done run. But we're now seeing an arms race between the various AI providers: worst-case scenario, can the market survive an exponential increase in training costs for sublinear improvements?

Aurornis

> You have to include all the costs associated with the model, not just inference.

The title of the article directly says “on inference”. It’s not a mistake to exclude training costs. This is about incremental costs of inference.

artursapek

Hacker News commenters just can't help but critique things even when they're missing the point

Aurornis

The parent commenter’s responses are all based on a wrong understanding of what “unit economics” means.

You don’t include fixed costs in the unit economics. Unit economics is about incremental costs.

kgwgk

Your comment may apply to the original commenter “missing” the point of TFA and to the person replying “missing” the point of that comment. And to my comment “missing” the point of yours - which may have also “missed” the point.

ForHackernews

> If you have healthy positive cash flow you have much better mechanisms available to fund capital investment than selling shares, e.g. issuing a bond against that healthy cash flow.

Is that actually true in 2025? Presumably you have to make coupon payments on a bond(?), but shares are free. Companies like Meta have shown you can issue shares that don't come with voting rights and people will buy them, and meme stocks like GME have demonstrated the effectiveness of churning out as many shares as the market will bear.

politelemon

What will be the knock-on effect on us consumers?

chasd00

Self-hosting LLMs isn't completely out of the realm of feasibility. Hardware cost may be 2-3x that of a hardcore gaming rig, but it would be neat to see open-source, self-hosted coding helpers. When Linux hit the scene it put UNIX(ish) power in the hands of anyone, with no license fee required. Surely somewhere someone is doing the same with LLM-assisted coding.
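
For what it's worth, the tooling for this already exists. A minimal sketch using the llama-cpp-python bindings; the model file here is a placeholder, and any GGUF-quantized open-weights model would do:

    # pip install llama-cpp-python
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/qwen2.5-coder-32b-q4_k_m.gguf",  # hypothetical local file
        n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows
        n_ctx=8192,       # context window; larger needs more memory
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a function that parses a CSV header."}],
        max_tokens=512,
    )
    print(out["choices"][0]["message"]["content"])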

JCM9

Costs will go up to levels where people will no longer find this stuff as useful/interesting. It's all fun and games until the subsidies end.

See the recent reaction to AWS's pricing on Kiro, where folks had a big WTF moment after, it appears, AWS tried to charge realistic prices based on what this stuff actually costs.

philipallstar

The article is answering a specific question, and has excluded this on purpose. If you have a sunk training cost you still want to know if you can at least operate profitably.

kelp6063

API prices are going up and rate limits are getting more aggressive (see what's going on with Cursor and Claude Code).

osti

This kinda tracks with the latest estimate of the power usage of LLM inference published by Google: https://news.ycombinator.com/item?id=44972808. If inference isn't as power-hungry as people thought, they must be able to make good money from those subscriptions.

jsnell

I don't believe the asymmetry between prefill and decode is that large. If it were, it would make no sense for most of the providers to have separate pricing for prefill with cache hits vs. without.

Given the analysis is based on R1, Deepseek's actual in-production numbers seem highly relevant: https://github.com/deepseek-ai/open-infra-index/blob/main/20...

(But yes, they claim 80% margins on the compute in that article.)

> When established players emphasize massive costs and technical complexity, it discourages competition and investment in alternatives

But it's not the established players emphasizing the costs! They're typically saying that inference is profitable. Instead the false claims about high costs and unprofitability are part of the anti-AI crowd's standard talking points.

martinald

Yes. I was really surprised at this myself (author here). If you have some better numbers I'm all ears. Even on my lowly 9070XT I get 20x the tok/s input vs output, and I'm not doing batching or anything locally.

I think the cache-hit vs miss pricing makes sense at >100k tokens, where you start getting compute-bound.
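
A rough sketch of why that throughput gap shows up in pricing; the GPU-hour price is assumed, and the 20x ratio just echoes the comment above:

    # Per-token cost scales inversely with throughput at a fixed GPU-hour price.
    gpu_cost_per_hour = 2.00       # $, assumed
    decode_tok_per_sec = 100       # output throughput, assumed
    prefill_tok_per_sec = 2_000    # 20x faster on input, per the comment

    def cost_per_million_tokens(tok_per_sec):
        return gpu_cost_per_hour / (tok_per_sec * 3600) * 1_000_000

    print(f"Input:  ${cost_per_million_tokens(prefill_tok_per_sec):.2f}/Mtok")  # ~$0.28
    print(f"Output: ${cost_per_million_tokens(decode_tok_per_sec):.2f}/Mtok")   # ~$5.56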

jonathan-adly

Basically, the same math as modern automated manufacturing: a super expensive and complex build-out, then a money printer once it's running and optimized.

I know there is lots of bearish sentiment here. Lots of people correctly point out that this is not the same math as FAANG products - then they make the jump that it must therefore be bad.

But my guess is these companies end up with margins better than Tesla (a modern manufacturer), but less than the 80%-90% of "pure" software. Somewhere in the middle, which is still pretty good.

Also, once the Nvidia monopoly gets broken, the initial build-out becomes a lot cheaper as well.

hugedickfounder

The difference is you can train on outputs, DeepSeek-style. There are no gates in this field; profit margins will go to 0.

pityJuke

From https://www.theverge.com/command-line-newsletter/759897/sam-..., Sam Altman said:

> “If we didn’t pay for training, we’d be a very profitable company.”

gjsman-1000

If inference is that cheap, why is not even one company profitable yet?

ascorbic

Because they're spending it all on training the next model.

techpineapple

So, if this is true, OpenAI needs much better conversion rates, because they have ~15 million paying users compared to 800 million weekly active users:

https://nerdynav.com/chatgpt-statistics/
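
For scale, those two figures imply a paid conversion rate of roughly:

    paying_users = 15_000_000
    weekly_active_users = 800_000_000
    print(f"{paying_users / weekly_active_users:.1%} paid conversion")  # ~1.9%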

martinald

Yeah but they can probably monetize them with ads.

UltraSane

LLM-generated ads.

bgwalter

I'm not so sure. Inserting ads into chatbot output is like inserting ads into email. People are more reluctant to tolerate that than web or YouTube ads (which are hated already).

If they insert stealth ads, then after the third sponsored bad restaurant suggestion people will stop using that feature, too.

martinald

Mmm, let's see. I think in-LLM ads probably have the most intent (and therefore the most value) of any ads. They are like search PPC ads on steroids, as you have even more context on what the user is actually looking for.

Hell, they could even just add affiliate tracking to links (and not change any of the ranking based on it) and probably make enough money to cover a lot of the inference for free users.

player1234

Input inference, i.e. reading, is cheap; output, i.e. actually generating, is not. For something called generative AI, that sounds pretty fucking unprofitable.

The cheap use case from this article is not a trillion-dollar industry, and it is absolutely not the use case the AI companies hype as the future, the one that is coming for your job.