AI agents are starting to eat SaaS
67 comments
· December 14, 2025 · andy_ppp
_pdp_
This is only true if you assume that you are producing the same amount of code as today. AI will ultimately produce more code, though, which will require more maintenance. Your internal team will need to scale up due to the amount of code they need to maintain. Your security team will have more work to do as well, because they will need to review more code, which will require scaling that team too. Your infrastructure costs will start adding up, and if you have any DevOps, they will need scaling too.
Sooner or later the CTO will be dictating which projects can be vibe coded and which ones make sense to buy.
SaaS benefits from network effects - your internal tools don't. So overall SaaS is cheaper.
The reality is that software license costs are a tiny fraction of total business costs. Most of it is salaries. The situation you are describing is the kind of death spiral many companies will get into, and that will be their downfall, not their salvation.
technotony
Interesting application. Can you share more about your stack and how you are approaching that build?
Oarch
Earlier this year I thought that rare proprietary knowledge and IP were a safe haven from AI, since LLMs can only scrape public data.
Then it dawned on me how many companies are deeply integrating Copilot into their everyday workflows. It's the perfect Trojan Horse.
findjashua
providers' ToS explicitly states whether or not any data provided is used for training purposes. the usual that i've seen is that while they retain the right to use the data on free tiers, it's almost never the case for paid tiers
sotrusting
Right, so totally cool to ignore the law but our TOS is a binding contract.
protocolture
Where are they ignoring the law?
mc32
Yes, they can be sued for breach of contract. And it’s not a regular ToS but a signed MSA and other legally binding documents.
bdangubic
it is amazing in almost 2026 there is anyone believing this… amazing
GCUMstlyHarmls
I wonder how much wiggle room there is to collect now (to provide service, context history, etc.), then later anonymise (somehow, to some level) and then train on it?
Also I wonder if the ToS covers "queries & interaction" vs "uploaded data" - I could imagine some tricky language in there that says we won't use your Word document, but we may at some time use the queries you put against it, not as raw corpus but as a second layer examining what tools/workflows to expand/exploit.
matt-p
Even if they were doing this (I highly doubt it), so much would be lost to distillation that I'm not convinced much would actually get in, apart from perhaps internal codenames or whatever, which would be obvious.
Aurornis
Using an LLM on data does not ingest that data into the training corpus. LLMs don’t “learn” from the information they operate on, contrary to what a lot of people assume.
None of the mainstream paid services ingest operating data into their training sets. You will find a lot of conspiracy theories claiming that companies are saying one thing but secretly stealing your data, of course.
Retric
Companies are already shifting from not using customer data to giving customers an option to opt out, e.g.:
“How can I control whether my data is used for model training?
If you are logged into Copilot with a Microsoft Account or other third-party authentication, you can control whether your conversations are used for training the generative AI models used in Copilot. Opting out will exclude your past, present, and future conversations from being used for training these AI models, unless you choose to opt back in. If you opt out, that change will be reflected throughout our systems within 30 days.” https://support.microsoft.com/en-us/topic/privacy-faq-for-mi...
At this point, suggesting it has never happened and never will is wildly optimistic.
lwhi
Information about the way we interact with the data (RLHF) can be used to refine agent behaviour.
While this isn't used specifically for LLM training, it can involve aggregating insights from customer behaviour.
TheRoque
Just read the ToS of the LLM products please
doctorpangloss
This is so naive. The ToS permits paraphrasing of user conversations, by not excluding it, and then training on THAT. You’d never be able to definitively connect paraphrased data to yours, especially if they only train on paraphrased data that covers frequent, as opposed to rare, topics.
nerdponx
If they weren't, then why would enterprise level subscriptions include specific terms stating that they don't train on user provided data? There's no reason to believe that they don't, and if they don't now then there's no reason to believe that they won't later whenever it suits them.
agumonkey
maybe prompts are enough to infer the rest ?
AuthAuth
They are not directly ingesting the data into their training sets, but in most cases they are collecting it and will be using it to train future models.
leptons
> LLMs don’t “learn” from the information they operate on, contrary to what a lot of people assume.
Nothing is really preventing this though. AI companies have already proven they will ignore copyright and any other legal nuisance so they can train models.
lioeters
They're already using synthetic data generated by LLMs to further train LLMs. Of course they will not hesitate to feed in "anonymized" data generated by user interactions. Who's going to stop them? Or even prove that it's happening? These companies have already been allowed to violate copyright and privacy on a historic global scale.
Archelaos
How should they distinguish between real and fake data? It would be far too easy to pollute their models with nonsense.
tick_tock_tick
I mean, is it really ignoring copyright when copyright doesn't limit them in any way on training?
gaigalas
What kind of rare proprietary knowledge?
ares623
Maybe someday we'll see job postings for maintaining these in-house SaaS tools. And someday someday, we'll see these in-house SaaS tools being consolidated into their own separate product. Wait what.
Imustaskforhelp
Hey, let's hope maybe people will open source the product too :D
redwood
Jamin Ball had a better take on Clouded Judgement https://cloudedjudgement.substack.com/p/clouded-judgement-12... "Long Live Systems of Record"
bigtones
Yeah I think that is a much more accurate take on the same subject.
lwhi
This was a good read.
arealaccount
The "where this doesn’t work" section is chef's kiss
- anything that requires very high uptime
- very high volume systems and data lakes
- software with significant network effects
- companies that have proprietary datasets
- regulation and compliance is still very important
weitendorf
This is why I started working on an open source, generic protobuf sqlite ORM + CRUD server (with search/filtering) + type/service registry + grpc mesh, recently: https://github.com/accretional/collector Note: collector's docs are mostly from LLMs, partially because it's more of a framework for tool-calling LLMs than humans
Then this project lets you generate static sites from svelte components (matches protobuf structures) and markdown (documentation) and global template variables: https://github.com/accretional/statue
A lot of the SaaS ecosystem actually has rather simple domain logic and oftentimes doesn't even model data very well, or at least not in a way that matches their clients/users mental models or application logic. A lot of the value is in integrations, or the data/scaling, or the marketing and developer experience, or some kind of expertise in actually properly providing a simple interface to a complex solution.
So why not just create a compact universal representation of that? Because it's not so big a leap to go beyond eating SaaS to eating integrations, migration costs/bad moats, and the marketing/documentation/wrapper.
blazespin
There is a significant risk of uncertainty in all of this, the most damaging aspect really. If AI improves, and it is threatening to, then growth in SaaS may decline to a point where investing in it needs to be reconsidered.
The problem is, nobody knows how much and how fast AI will improve or how much it will cost if it does.
That uncertainty alone is very problematic and I think is being underestimated in terms of its impact on everything it can potentially touch.
For now though, I've seen a wall form in benchmarks like swe-rebench and swebench pro. Greenfield is expanding, but maintenance is still a problem.
I think AI needs to get much better at maintenance before serious companies can choose build over buy for anything but the most trivial apps.
_pdp_
I often give the following analogy, which I think is a good proxy for what is going on.
Spreadsheets! They are everywhere. In fact, they are so abundant these days that many are spawned for a quick job and immediately discarded. The cost of having these spreadsheets is practically zero, so in many cases one may find themselves with hundreds if not thousands of them sitting around with no indication of ever being deleted. Spreadsheets are also personal, and annoying especially when forced upon you (since you did not make them yourself). Spreadsheets are also programming for non-programmers.
These new vibe-coded tools are essentially the new spreadsheets. They are useful... for 5 minutes. They are also easily forgettable. They are also personal (for the person who made them) and hated (by everyone else). I have no doubt in my mind that organisations will start using more and more of these new types of software to automate repetitive tasks, improve existing processes and so on, but ultimately, apart from perhaps just a few, none will replace existing, purpose-built systems.
Ultimately you can make your own pretty dashboard that nobody else will see or use, because when the cost of production is so low your users will want to create their own version, thinking they could do better.
After all, how hard is it to prompt harder than the previous person?
Also, do you really think that SaaS companies are not deploying AI themselves? It is practically an arms race: the non-expert plus some AI vs 10 specialist developers plus their AIs doing this all day long.
Who is going to have the upper-hand?
gijoeyguerra
I’m always skeptical when I see (or say for that matter) phrases that start with “just”.
hyperpape
Note that the author does not mention a single specific SaaS subscription he’s cancelled or seen a team cancel.
The only named product was Retool.
hurturue
Relatedly, Microsoft's CEO said that soon the biggest client of Microsoft is going to be agents, not humans.
agumonkey
agent economy .. that's a fun thought
NikolaNovak
I get and agree with a lot of the skepticism (and I get where the ad-hominem attacks come from :). I have AI shoved down my throat at work and at home 24x7, and most of it is not for my benefit.
At the same time - do any of us think a small sassy SaaS like Bingo card creator could take off now? :-)
https://training.kalzumeus.com/newsletters/archive/selling_s...
I’m currently working on an in-house ERP and inventory system for a specific kind of business. With very few people, you can now get something completely bespoke to your business instead of paying loads of money for some off-the-shelf solution to your software needs. I think AI enables the age of boutique software that works fantastically for businesses; agencies will need to dramatically reduce their prices to compete with in-house teams.
I’m pretty certain AI quadruples my output at least and facilitates fixing, improving and upgrading poor quality inherited software much better than in the past. Why pay for SaaS when you can build something “good enough” in a week or two? You also get exactly what you want rather than some £300k per year CRM that will double or treble in price and never quite be what you wanted.