
AI founders will learn the bitter lesson


129 comments · January 12, 2025

CharlieDigital

There's only one core problem in AI worth solving for most startups building AI powered software: context.

No matter how good the AI gets, it can't answer questions about what it doesn't know. It can't perform a process for which it doesn't know the steps or the rules.

No LLM is going to know enough about some new drug in a pharma's pipeline, for example, because it doesn't know about the internal resources spread across multiple systems in an enterprise. (And if you've ever done a systems integration in any sufficiently large enterprise, you know that this is a "people problem" and usually not a technical problem).

I think the startups that succeed will understand that it all comes down to classic ETL: identify the source data, understand how to navigate systems integration, pre-process and organize the knowledge, train or fine-tune a model or have the right retrieval model to provide the context.

There's fundamentally no other way. AI is not magic; it can't know about trial ID 1354.006 except through what it was trained on and what it can search for. Even coding assistants like Cursor are really solving a problem of ETL/context, and always will be. The code generation is the smaller part; getting it right requires providing the appropriate context.
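
To make the ETL point concrete, here is a minimal sketch of that pipeline in Python. TF-IDF stands in for whatever retrieval model you'd actually use, and the document contents and helper names (load_internal_docs, build_prompt) are made up for illustration:

    # Minimal sketch of the ETL-to-context pipeline: ingest internal
    # documents, rank them against a question, assemble the prompt.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def load_internal_docs() -> list[str]:
        # In practice this is the hard part: systems integration across
        # wikis, ticketing systems, trial databases, and file shares.
        return [
            "Trial ID 1354.006: phase II study of compound XYZ; enrollment closed.",
            "Compound XYZ: mechanism-of-action notes from the discovery team.",
        ]

    def retrieve_context(query: str, docs: list[str], k: int = 2) -> list[str]:
        # Pre-process and organize the knowledge so the model can be grounded in it.
        vectorizer = TfidfVectorizer()
        matrix = vectorizer.fit_transform(docs + [query])
        scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
        ranked = sorted(zip(scores, docs), reverse=True)
        return [doc for _, doc in ranked[:k]]

    def build_prompt(query: str, docs: list[str]) -> str:
        context = "\n".join(retrieve_context(query, docs))
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    print(build_prompt("What is the status of trial 1354.006?", load_internal_docs()))

Swapping TF-IDF for embeddings or fine-tuning changes the retrieval step, not the shape of the pipeline: the context still has to be sourced, cleaned, and organized first.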

lolinder

This is why I strongly suspect that AI will not play out the way the Web did (upstarts unseat giants) and will instead play out like smartphones (giants entrench and balloon).

If all that matters is what you can put into context, then AI really isn't a product in most cases. The people selling models are actually just selling compute, so that space will be owned by the big clouds. The people selling applications are actually just packaging data, so that space will be owned by the people who already have big data in their segment: the big players in each industry. All competitors at this point know how important data is, and they're not going to sell it to a startup when they could package it up themselves. And most companies will prefer to just use features provided by the B2B companies they already trust, not trust a brand new company with all the same data.

I fully expect that almost all of the AI wins will take the form of features embedded in existing products that already have the data (like GitHub with Copilot), not brand new startups who have to try to convince companies to give them all their data for the first time.

master_crab

Yup. And it’s already playing out that way. Anthropic, OpenAI, Gemini: technically none of them are upstarts. All have hyperscalers backing and subsidizing their model training (AWS, Azure, and GCP, respectively). It’s difficult to discern where the segmentation between compute and models is here.

alephnerd

> It’s difficult to discern where the segmentation between compute and models is here.

Startups can outcompete the foundational model companies by concentrating on creating a very domain-specific model, and by providing the support and services that come out of having expertise in that specific domain.

This is why OpenAI chose to invest in cybersecurity startups with Menlo Ventures in 2022 instead of building its own dedicated cybersecurity vertical: a partnership-driven model nets the most profit with the least resources expended.

This is the same reason why hyperscalers like Microsoft, Amazon, and Google themselves have ownership stakes in the foundational model companies like Anthropic, OpenAI, etc.

Foundational models are a good first start, but they are not 100% perfect in a number of fields and use cases. In my experience, tooling built with these models is often used to cut headcount by 30-50% on the team using it to solve a specific problem. And this is why domain-specific startups still thrive: sales, support, services, etc. will still need to be tailored for buyers.

energy123

This problem will be eaten by OpenAI et al. the same way the careful prompting strategies used in 2022/2023 were eaten. In a few years we will have context lengths of 10M+ or online fine tuning, combined with agents that can proactively call APIs and navigate your desktop environment.

Providing all context will be little more than copying and pasting everything, or just letting the agent do its thing.

Super careful or complicated setups to filter and manage context probably won't be needed.

OutOfHere

Attention over context requires VRAM that scales quadratically with context length. That is why OpenAI hasn't even supported a 200k context length for its 4o model yet.

Is there a trick that bypasses this scaling constraint while strictly preserving attention quality? I suspect that most such tricks lose performance once you're deep into the context.
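
Rough arithmetic behind the quadratic claim, as a sketch; the layer and head counts below are illustrative, not any specific model's:

    # Naively materializing attention scores costs seq_len^2 entries
    # per head per layer (summed across all layers/heads here as an
    # upper-bound illustration).
    def attention_scores_gib(seq_len: int, layers: int = 32, heads: int = 32,
                             bytes_per_elem: int = 2) -> float:
        return seq_len ** 2 * layers * heads * bytes_per_elem / 2 ** 30

    for n in (8_192, 200_000, 10_000_000):
        print(f"{n:>10,} tokens -> {attention_scores_gib(n):,.0f} GiB")

FlashAttention-style kernels compute exact attention without ever storing the full score matrix (compute stays quadratic), while sliding-window, sparse, and linear-attention variants cut cost by approximating full attention, which is exactly where the quality concern above comes in.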

jbverschoor

And how does that differ from any person without that information?

digitcatphd

I agree with you at this time, but there are a couple things I think will change this:

1. Agentic search can allow the model to identify what context is needed and retrieve it (internally, or externally through APIs or search); a sketch of this loop follows below.

2. I received an offer from OpenAI to give me free credits if I shared my API data with it. In other words, it is paying for industry-specific data, probably to fine-tune niche models.

There could be some exceptions for UI/UX in specific verticals, but the value of these fine-tuned, sector-specific instances will erode over time. They will likely remain a niche, since enterprises want maximum configuration, while more out-of-the-box solutions are oriented around SMEs.
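
A sketch of the agentic-search loop from point 1. Here `llm` and `search_index` are hypothetical stand-ins for a model client and a retrieval backend; note the loop still presupposes an index over your data, so the context/ETL problem moves rather than disappears:

    import json

    def agent_answer(question: str, llm, search_index, max_steps: int = 5) -> str:
        # llm(messages) is assumed to return a dict like
        # {"action": "search", "query": ...} or {"action": "answer", "answer": ...}
        messages = [{"role": "user", "content": question}]
        for _ in range(max_steps):
            reply = llm(messages)
            if reply["action"] == "search":
                # The model decides what context it needs and fetches it.
                hits = search_index.query(reply["query"], top_k=3)
                messages.append({"role": "tool", "content": json.dumps(hits)})
            else:
                return reply["answer"]
        return "Gave up: could not gather enough context."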

est31

It comes down to moats. Does OpenAI have a moat? It's leading the pack, but the competitors always seem to be catching up to it. We don't yet see network effects like those of social networks; that might change if OpenAI introduces household robots for everyone, builds a leading market share in that segment, and the rich data from those household bots becomes training data that can't be replicated with a smaller robot fleet.

And AI is too fundamental a technology for a "loss leader, biggest wallet wins" strategy, as used by the likes of Uber, to work.

API access can be restricted. A big part of why Twitter got authwalled was so that AI models can't train on it. Stack Overflow added a no-AI-models clause to its free data-dump releases (supposedly CC licensed); they want to be paid if you use their data for AI models.

CharlieDigital

> Agentic search

All you've proposed is moving the context problem somewhere else. You still need to build the search index. It's still a problem of building and providing context.

dartos

To your first point, the LLM still can’t know what it doesn’t know.

Just like you can’t google for a movie if you don’t know the genre, any scenes, or any actors in it, an AI can’t build its own context if it didn’t have good enough context already.

IMO that’s the point most agent frameworks miss. Piling on more LLM calls doesn’t fix the fundamental limitations.

TL;DR an LLM can’t magically make good context for itself.

I think you’re spot on with your second point. The big differentiators for big AI models will be data that’s not easy to google for and/or proprietary data.

Lucky they got all their data before people started caring.

mritchie712

I think you're downplaying how well Cursor is doing "code generation" relative to other products.

Cursor can do at least the following "actions":

* code generation

* file creation / deletion

* run terminal commands

* answer questions about a code base

I totally agree with you on ETL (it's a huge part of our product https://www.definite.app/), but the actions an agent takes are just as tricky to get right.

Before I give Cursor a complex task, I often doubt it's going to be able to pull it off, and I'm constantly impressed by how deep it can go to complete it.

uxhacker

So isn’t Cursor just a tool for Claude or ChatGPT to use? Another example would be a flight-booking engine. So why can’t an AI just talk directly to an IDE? This is hard, as the process has changed due to the human needing to be in the middle.

So isn’t AI useless without the tools to manipulate?

ErikBjare

Is it really that different to Claude with tools via MCP, or my own terminal-based gptme? (https://github.com/ErikBjare/gptme)

TeMPOraL

I thought it's basically a subset of Aider[0] bolted onto a VS Code fork, and I remain confused as to why we're talking about it so much now when we didn't talk about Aider before. Some kind of startup-friendly bias? I for one would prefer OSS to succeed in this space.

--

[0] - https://aider.chat/

stereobit

It’s not even just the lack of access to the data; so much of the hidden information used to make decisions is not documented at all. It’s intuition, learned from doing something in a specific context for a long time, and only a fraction of that context is accessible.

HPsquared

This is where Microsoft has the advantage, all those Teams calls can provide context.

iandanforth

Context is important, but it takes about two weeks to build a context-collection bot and integrate it into Slack. The hard part is not technical (AIs can rapidly build a company-specific and continually updated knowledge base); it's political. Getting a drug company to let you tap Slack, email, docs, etc. is dauntingly difficult.
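
For the technical half of that claim, a minimal sketch of the Slack side using the slack_sdk client; the `index_document` hook is a hypothetical stand-in for whatever knowledge base you feed:

    import os
    from slack_sdk import WebClient

    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])

    def collect_channel_context(index_document) -> None:
        # Pull recent messages from every public channel the bot can see
        # and hand them to the indexing step (pagination omitted).
        for channel in client.conversations_list(types="public_channel")["channels"]:
            history = client.conversations_history(channel=channel["id"], limit=200)
            for msg in history["messages"]:
                if msg.get("text"):
                    index_document(channel=channel["name"], text=msg["text"])

The political half, getting this bot approved, scoped, and trusted, has no code sketch.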

lolinder

Difficult to impossible. Their vendors are already working on AI features, so why would they risk adding a new vendor when a vendor they've already approved will have substantially the same capabilities soon?

abrichr

> No matter how good the AI gets, it can't answer about what it doesn't know. It can't perform a process for which it doesn't know the steps or the rules

This is exactly why I created https://github.com/OpenAdaptAI/OpenAdapt: so that users can demonstrate their desktop workflows to AI models step by step (without worrying about their data being used by a corporation).

jonnycat

I think this argument only makes sense if you believe that AGI and/or unbounded AI agents are "right around the corner". For sure, we will progress in that direction, but when and if we truly get there–who knows?

If you believe, as I do, that these things are a lot further off than some people assume, I think there's plenty of time to build a successful business solving domain-specific workflows in the meantime, and eventually adapting the product as more general technology becomes available.

Let's say 25 years ago you had the idea to build a product that can now be solved more generally with LLMs–let's say a really effective spam filter. Even knowing what you know now, would it have been right at the time to say, "Nah, don't build that business, it will eventually be solved with some new technology?"

jillesvangurp

I don't think it's that binary. We've had a lot of progress over the last 25 years; much of it in the last two. AGI is not a well defined thing that people easily agree on. So, determining whether we have it or not is actually not that simple.

Mostly people either get bogged down in deep philosophical debates or simply start listing things that AI can and cannot do (and why they believe that is the case). Some of those things are codified in benchmarks. And of course, items are being removed from the list of things AIs can't do on a regular basis, at an accelerating rate. That acceleration is the problem. People don't deal well with adapting to exponentially changing trends.

At some arbitrary point, when that list reaches a certain length, we may or may not have AGI; it really depends on your point of view. But of course, most people score poorly on the same benchmarks we use for testing AIs. There are some specific groups of tasks where humans still do better. But there are also a lot of AI researchers working on those things.

antonvs

Agreed. There's a difference between developing new AI, and developing applications of existing AI. The OP seems to blur this distinction a bit.

The original "Bitter Lesson" article referenced in the OP is about developing new AI. In that domain, its point makes sense. But for the reasons you describe, it hardly applies at all to applications of AI. I suppose it might apply to some, but they're exceptions.

ilaksh

You think it will be 25 years before we have a drop in replacement for most office jobs?

I think it will be less than 5 years.

You seem to be assuming that the rapid progress in AI will suddenly stop.

I think if you look at the history of compute, that is ridiculous. Making the models bigger, or making them do more work, is making them smarter.

Even if there is no progress in scaling memristors or any exotic new paradigm, high-speed memory organized to localize data in frequently used neural circuits, plus photonic interconnects, surely offer multiple orders of magnitude of scaling gains over the next several years.

sealeck

I think you're suffering from some survivorship bias here. There are a lot of technologies that don't work out.

ilaksh

Computation isn't one of them so far. Do you believe this is the end of computing efficiency improvements?

lolinder

> You seem to be assuming that the rapid progress in AI will suddenly stop.

And you seem to assume that it will just continue for 5 years. We've already seen the plateau start: OpenAI has tacitly acknowledged that they don't know how to make a next-generation model, and they have been working on stepwise iteration for almost two years now.

Why should we project the rapid growth of 2021–2023 five years into the future? It seems far more reasonable to project the growth of 2023–2025, which has been fast but not earth-shattering, and then also factor in the second derivative we've seen in that time and assume that progress will actually continue to slow from here.

harvodex

At this point, the lack of progress since April 2023 is really what is shocking.

I just looked on midjourney reddit to make sure I wasn't missing some new great model.

Instead, what I notice is small variations on themes I had already seen a thousand times a year ago. Midjourney is so limited in what it can actually produce.

I am really worried that all this is much closer to a parlor trick than AGI: "a simple trick or demonstration that is used especially to entertain or amuse guests".

It all feels more and more like that to me than any kind of progress towards general intelligence.

pgwhalen

> OpenAI has tacitly acknowledged that they don't know how to make a next generation model

Can you provide a source for this? I'm not super plugged into the space.

noch

> You seem to be assuming that the rapid progress in AI will suddenly stop.

> I think if you look at the history of compute, that is ridiculous. Making the models bigger or work more is making them smarter.

It's better to talk about actual numbers to characterise progress and measure scaling:

" By scaling I usually mean the specific empirical curve from the 2020 OAI paper. To stay on this curve requires large increases in training data of equivalent quality to what was used to derive the scaling relationships. "[^2]

"I predicted last summer: 70% chance we fall off the LLM scaling curve because of data limits, in the next step beyond GPT4.

[…]

I would say the most plausible reason is because in order to get, say, another 10x in training data, people have started to resort either to synthetic data, so training data that's actually made up by models, or to lower quality data."[^0]

“There were extraordinary returns over the last three or four years as the Scaling Laws were getting going,” Dr. Hassabis said. “But we are no longer getting the same progress.”[^1]

---

[^0]: https://x.com/hsu_steve/status/1868027803868045529

[^1]: https://x.com/hsu_steve/status/1869922066788692328

[^2]: https://x.com/hsu_steve/status/1869031399010832688

ilaksh

o1 proved that synthetic data and inference-time compute are a new ramp. There will be more challenges and more innovations. There is a lot of room left in hardware, software, model training, and model architecture.

GardenLetter27

We already have AGI in some ways though. Like I can use Claude for both generating code and helping with some maths problems and physics derivations.

It isn't a specific model for any of those problems, but a "general" intelligence.

Of course, it's not perfect, and it's obviously not sentient or conscious, etc. - but maybe general intelligence doesn't require or imply that at all?

SecretDreams

For me, general intelligence from a computer will be achieved when it knows when it's wrong. You may say that humans also struggle with this, and I'd agree - but I think there's a difference between general intelligence and consciousness, as you said.

raincole

> AGI in some ways

In other words, just AI, not AGI.

timabdulla

I think one thing ignored here is the value of UX.

If a general AI model is a "drop-in remote worker", then UX matters not at all, of course. I would interact with such a system in the same way I would one of my colleagues and I would also give a high level of trust to such a system.

If the system still requires human supervision or works to augment a human worker's work (rather than replace it), then a specific tailored user interface can be very valuable, even if the product is mostly just a wrapper of an off-the-shelf model.

After all, many SaaS products could be built on top of a general CRM or ERP, yet we often find a vertical-focused UX has a lot to offer. You can see this in the AI space with a product like Julius.

The article seems to assume that most of the value brought by AI startups right now is adding domain-specific reliability, but I think there's plenty of room to build great experiences atop general models that will bring enduring value.

If and when we reach AGI (the drop-in remote worker referenced in the article), then I personally don't see how the vast majority of companies - software and others - are relevant at all. That just seems like a different discussion, not one of business strategy.

bsenftner

The value of UX is being ignored, as the magical thinking has these AIs being fully autonomous, which will not work. The phrase "the devil's in the details" needs to be imprinted on everyone's screens, because the details of a "drop-in remote worker" are several Grand Canyons yet to be realized. This civilization is vastly more complex than you, dear reader, realize, and the majority of that complexity is not written down.

ilaksh

I guess part of the point is that the value of the UX will quickly start to decrease as more tasks, or parts of tasks, can be done without close supervision. And that depends on the capabilities of the models, which continue to improve.

I suggest that before we satisfy _everyone_'s definition of AGI, more and more people may decide we are there as their own jobs are automated.

The UX at that point, maybe in 5 or 10 or X years, might be a 3d avatar that pops up in your room via mixed reality glasses, talks to you, and then just fires off instructions to a small army of agents on your behalf.

Nvidia actually demoed something a little bit like that a few days ago, except it lives on your computer screen and probably can't manage a lot of complex tasks on its own. Yet.

Or maybe at some point it doesn't need sub agents and can just accomplish all of the tasks on its own. Based on the bitter lesson, specialized agents are probably going to have a limited lifetime as well.

But I think it's worth having the AGI discussion as part of this because it will be incremental.

Personally, I feel we must be pretty close to AGI, because Claude can do a lot of my programming for me. I still have to make important suggestions, and routinely step in for obvious things, but it is much better than me at filling in all the details and has much broader knowledge.

And the models do keep getting more robust, so I seriously doubt that humans will be better programmers overall for much longer.

hitchstory

A drop-in remote worker will still need their work checked, and their access to the systems they need secured, in case they are a bad actor.

bko

It's a little depressing how many highly valued startups are basically just wrappers around LLMs they don't own. I'd be curious to see what percentage of YC's latest batch is just this.

> 70% of Y Combinator’s Winter 2024 batch are AI startups. This is compared to ~57% of YC Summer 2023 companies and ~32% from the Winter batch one year ago (YC W23).

The thinking is: the models will get better, which will improve our product. But in reality, as the article states, the generalized models get better, so your value-add diminishes because there's no need to fine-tune.

On the other hand, crypto funds made a killing off of "me too" blockchain technology before it got hammered again. So who knows about the 2-5 year term, but in 10 years there almost certainly won't be billion-dollar companies that are just wrappers around LLMs.

https://x.com/natashamalpani/status/1772609994610835505?mx=2

NameError

I think the core problem at hand for people trying to use AI in user-facing production systems is "how can we build a reliable system on top of an unreliable (but capable) model?". I don't think that's the same problem that AI researchers are facing, so I'm not sure it's sound to use "bitter lesson" reasoning to dismiss the need for software engineering outright and replace it with "wait for better models".

The article sits on an assumption that if we just wait long enough, the unreliability of deep learning approaches to AI will just fade away and we'll have a full-on "drop-in remote worker". Is that a sound assumption?
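
One common engineering answer to that reliability question, sketched under the assumption that the model is asked for JSON: validate every response and retry with the error fed back, so the system is reliable even when individual calls are not. `call_model` is a hypothetical stand-in for any LLM API:

    import json

    def reliable_extract(call_model, prompt: str, required_keys: set,
                         max_attempts: int = 3) -> dict:
        last_error = ""
        for _ in range(max_attempts):
            feedback = f"\nYour previous reply failed: {last_error}" if last_error else ""
            raw = call_model(prompt + feedback)
            try:
                data = json.loads(raw)
                if isinstance(data, dict) and required_keys <= data.keys():
                    return data  # validated: safe for downstream code
                last_error = f"wrong shape or missing keys: {required_keys}"
            except json.JSONDecodeError as exc:
                last_error = f"invalid JSON: {exc}"
        raise RuntimeError(f"model never produced valid output: {last_error}")

The bet in the article is that scaffolding like this becomes unnecessary; the bet of most current products is that it is where the engineering value lives today.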

9dev

Well. We had been working on a search engine for industry suppliers since before the whole AI hype started (we even applied to YC once), and hit a brick wall at some point where it got too hard to improve search-result quality algorithmically. To understand what that means: we gathered lots of data points from different sources, tried to reconcile them into unified records, then found the best match for a given sourcing case based on that. But in a lot of cases, the data wasn't accurate enough to identify what a supplier was actually manufacturing, and the sourcing case itself wasn't properly defined, because users found it too hard to come up with good keywords for their search.

Then, LLMs entered the stage. Suddenly, we became able to both derive vastly better output from the data we got, and also offer our users easier ways to describe what they were looking for, find good keywords automatically, and actually deliver helpful results!

This was only possible because AI augments our product well and really provides a benefit in that niche, something that would just not have been possible otherwise. If you plan on founding a company around AI, the best advice I can give you is to choose a problem that similarly benefits from AI but also exists without it.
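
The "find good keywords automatically" step might look something like this sketch; `llm_complete` is a hypothetical completion function, and the prompt is illustrative rather than the actual one:

    import json

    def expand_sourcing_query(llm_complete, user_text: str) -> list[str]:
        # Turn a vague buyer description into structured search keywords.
        prompt = (
            "A buyer describes what they need from an industrial supplier:\n"
            f"{user_text}\n"
            "Return a JSON array of 5-10 precise search keywords "
            "(manufacturing processes, materials, certifications)."
        )
        keywords = json.loads(llm_complete(prompt))
        return [kw.strip().lower() for kw in keywords if isinstance(kw, str)]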

openrisk

> the data wasn’t accurate enough to identify what a supplier was actually manufacturing

how did the LLM help with that challenge?

HeatrayEnjoyer

A guess: Their ability to infer reality from incomplete and indirect information.

leviliebvin

Controversial opinion: I don't believe in the bitter lesson. I just think the current DNN+SGD approaches are not that good at learning deep, general, expressive patterns. With less inductive bias, the model memorizes a lot of scenarios and is able to emulate whatever real-world scenario you are trying to make it learn, but it fails to simulate that scenario well. So it's misleading to say that it's generally better to have less inductive bias; that is only true if your model architecture and optimization approach are a bit crap.

My second controversial point, regarding AI research and startups: doing research sucks. It's risky business; you are not guaranteed success, and if you make it, your competitors will be hot on your tail and you will have to keep improving all the time. I personally would rather leave the model building to someone else and focus on building products with the available models. There are exceptions, like fine-tuning for your specific product or training bespoke models for very specific tasks at hand.

resiros

The author discusses the problem from the point of view of engineering, not business. When you look at it from a business perspective, there is a big advantage to not waiting: use whatever exists right now to solve the business problem, so that you can get traction, get funding, grab market share, and build a team. When a better model comes along the next day, you can rewrite your code, and you will be in a much better position to leverage whatever new capabilities the new models provide; you know your users, you have the funds, you built the right UX...

The best strategy, in this view, is to jump on a problem as soon as there is an opportunity to solve it and generate lots of business value within the next 6 months. The trick is finding the subproblem that is worth a lot right now and could not be solved 6 months ago. A couple of AI sales startups "succeeded" quite well doing that (e.g. 11x), and now they are in a good position to build from there (whether they will succeed in building a unicorn is another question; it just looks like they are in a good position now).

DebtDeflation

>Eventually, you’ll just need to connect a model to a computer to solve most problems - no complex engineering required.

The word "eventually" is doing a lot of work here. Yes, it's true in the abstract, but over what time horizon? We have to build products to solve today's problems with today's technology, not wait for the generalized model that can do everything but may be decades away.

amelius

True, but it suggests that if you are the founder of a niche AI company, you should take money out of it instead of investing everything back into the company, because eventually the generalist AI will destroy your business and you will be left with nothing.

timabdulla

Based on the company the author founded, I assume he believes this technology is just years away.

I think with a lot of AI folk in San Francisco, this is a tacit assumption when having these sorts of conversations.

prmph

Anyone who thinks this is just years away is utterly ignoring human history, human nature, and our relationship with technology. My own view is that it will never be achieved, and that's not even just about the tech.

Let's imagine for a moment that this is even achieved. Then, there is still complex engineering required in the world: to maintain and continually improve the AI engines and their interfaces. Unless you want to say that, past some point, the AI will be self-improving without any human input whatsoever. Unless the AI can read our minds, I'm not sure it can continue to serve human interests without human input.

But never mind, we will never get there. At this very moment, tech is capable of so much more, yet most sites I visit have bad UI, are bloated (downloading and executing massive amounts of JS), riddled with annoying ads that serve no real purpose to society, and riddled with bugs. Even as an engineer, I really struggle to find any good no-code tools to create anything truly sophisticated without digging into hard-core code. Heck, they are now talking about adding more HTTP methods to HTML forms.

tinco

This might be true on a very long timescale, but that's not really relevant for VCs. Literally every single VC I've talked to has raised the question of whether our moat is anything more than just having better prompts; it's usually the first question. If a VC really invested in a company whose moat got evaporated by o1, that's on the VC. Everyone saw technology like o1 coming from a mile away.

For the slightly more complex stuff, sure, at some point some general AI will probably be able to do it. But with two big caveats, the first being: when? And the second: for how much?

In theory, every deep and wide enough neural network should be trainable to do object detection in images, yet no one does that. Architectures specifically designed to process images, like CNNs, reign supreme. Likewise for LLM architectures.

At some point your specialization might become obsolete, but that point might be a decade or more from now. Until then, specializations will have large economic and performance advantages making the advancements in AI today available to the industry of tomorrow.

I think it's the role of the VC to determine not whether there's an AI breakthrough behind a startup's technology, but whether there's a market disruption, and whether that disruption can be leveraged to establish a dominant company, similar to how Google leveraged a small and easily replicable algorithmic advantage into becoming one of the most valuable companies on earth.

thegeomaster

On your object detection point, Gemini 2.0 Flash has bounding box detection: https://ai.google.dev/gemini-api/docs/models/gemini-v2#bound....

I haven't found it to work particularly well for some more domain-specific things I tried, but it was surprisingly good for an LLM.

iandanforth

I disagree with the author based on timelines and VC behaviour. There is sufficient time to create a product and raise massive capital before the next massive breakthrough hands the value back to OpenAI/Google/Anthropic/MS. Secondly, the execution of a solution in a vertical is sufficiently differentiating that even if the underlying problem could be solved by a next-gen model, there's very little reason to believe it will be. Big cos don't have the interest to attack niches while there are billion-user markets to go after. So don't build "photo share with AI"; build "plant fungus photo share for farmers".

jvanderbot

Yeah, I can't make this article work as a day-to-day advice piece. In the time it takes for computational resources to produce a generational change in AI, you might find a worthy result for your PhD, business, drug design, or killer application.

It can be simultaneously true that a tech is doomed to obsolescence from over specialization, and that it does incredibly useful things in the mean time.

doctorpangloss

More computation cannot improve the quality or domain of your data. Maybe the bitter lesson's lesson is: lobby bitterly for copyright laws that favor what you are doing, and for weakened antitrust, to give yourself the insurmountable moat of exclusive data in a walled-garden media network.

guax

A human does not need billions of driving hours to learn how to drive competently. The issue with the current method is not the quality of the data but the methodology. More computation might unlock newer approaches that do better with less, and lower-quality, data.

namaria

I think there's a more fundamental problem at play here: what seems to work in "AI", search, is made better by throwing more data into more compute. You then store the results in a model that amounts to pre-computed solutions waiting for a problem. Interacting with the model is then asking questions and getting answers that hopefully fit your needs.

So what we're doing, on the whole, seems to be a lot of encoding and decoding, hoping that the data used in training can be adequately mapped to the realities of the problem domain. That would mean the model you end up with is somehow a valid representation of some form of knowledge about the problem domain. Trouble is, more text won't yield an ever-higher-resolution representation of the problem domain. After some point, you start to introduce noise.

Zr01

A human is not a blank slate. There's millennia of evolutionary history that goes into making a brain adapted and capable of learning from its environment.

qeternity

A human is a mostly blank slate... but it's a really sophisticated slate that, as you say, has taken many millions of years of development.

graycat

> A human does not need billions of driving hours to learn how to drive competently.

But humans DO need ~16 years of growth and development "to learn how to drive competently", and by then they will also know how to ride a bicycle, mow grass, build shelves, cook pizza, use a smartphone, ...! There's a lesson in that somewhere ....

guax

You don't need the 16; you can get a much younger person to drive, too. That only supports the point that data quantity/quality is not the problem.