
Perverse incentives of vibe coding

131 comments · May 14, 2025

brooke2k

I don't understand the productivity that people get out of these AI tools. I've tried it and I just can't get anything remotely worthwhile unless it's something very simple or something completely new being built from the ground up.

Like sure, I can ask claude to give me the barebones of a web service that does some simple task. Or a webpage with some information on it.

But any time I've tried to get AI services to help with bugfixing/feature development on a large, complex, potentially multi-language codebase, it's useless.

And those tasks are the ones that actually take up the majority of my time. On the occasion that I'm spinning a new thing up quickly, I don't really need an AI to do it for me -- I mean, that's the easy part!

Is there something I'm missing? Am I just not using it right? I keep seeing people talk about how addictive it is, how the productivity boost is insane, how all their code is now written by AI and then audited, and I just don't see how that's possible outside of really simple rote programming.

tptacek

The first and most important question to ask here is: are you using a coding agent? A lot of times, people who aren't getting much out of LLM-assisted coding are just asking Claude or GPT for code snippets, and pasting and building them themselves (or, equivalently, they're using LLM-augmented autocomplete in their editor).

Almost everybody doing serious work with LLMs is using an agent, which means that the LLM is authoring files, linting them, compiling them, and iterating when it spots problems.

There's more to using LLMs well than this, but this is the high-order bit.
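
To make "agent" concrete, here is a rough sketch of that author-lint-iterate loop in Python. Everything in it is a stand-in (ask_model is whatever model API you call, ruff is just one possible checker); real agents do a lot more, but this is the shape of the loop:

    import subprocess

    def ask_model(prompt: str) -> str:
        """Hypothetical stand-in for a call to whatever LLM the agent uses."""
        raise NotImplementedError

    def agent_step(task: str, path: str, max_iterations: int = 5) -> None:
        """Author a file, lint it, and feed failures back until the checks pass."""
        prompt = task
        for _ in range(max_iterations):
            code = ask_model(prompt)
            with open(path, "w") as f:
                f.write(code)
            # Lint/compile step; "ruff" is only one example of a checker.
            result = subprocess.run(["ruff", "check", path],
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return  # checks pass, stop iterating
            # Hand the errors back to the model and try again.
            prompt = (f"{task}\n\nYour previous attempt failed these checks:\n"
                      f"{result.stdout}\nPlease fix it.")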

lukan

Yesterday I gave Cursor a try and made my first (intentionally very lazy) attempt at vibe coding (a simple three.js project). It accepted the task and did things, failed, did things, failed, did things ... failed for good.

I guess I could tweak the magic incantations here and there until it works, and I guess that's the way it is done. But I wasn't hooked.

I do get value out of LLMs for isolated, broken-down subtasks, where asking an LLM is quicker than googling.

For me, AI will probably become really useful once it can scan and integrate my own complex codebase, so that it gives me solutions that work there instead of hallucinating API endpoints or jumping between incompatible library versions (my main issue).

hx8

Probably 80% of the time I spend coding, I'm inside a code file I haven't read in the last month. If I need to spend more than 30 seconds reading a section of code before I understand it, I'll ask AI to explain it to me. Usually it does a good job of explaining code at a level of complexity that would take me 1-15 minutes to understand, but a poor job of answering more complex questions or understanding more complex code.

It's a moderately useful tool for me. I suspect the people who get the most use out of it are those who would take more than an hour to read code I would take 10 minutes to read. Which is to say, the least experienced people get the most value.

Starlevel004

> Is there something I'm missing? Am I just not using it right?

The talk about it makes more sense when you remember most developers are primarily writing CRUD webapps or adware, which is essentially a solved problem already.

tptacek

I'm not doing either of those things with it.

slurpyb

You are not alone! I strongly agree and I feel like I am losing my mind reading some of the comments people have about these services.

jiggawatts

I’ve had good experiences using it, but with the caveat that only Gemini Pro 2.5 has been at all useful, and only for “spot” tasks.

I typically use it to whip up a CLI tool or script to do something that would have been too fiddly otherwise.

While sitting in a Teams meeting I got it to use the Roslyn compiler SDK in a CLI tool that stripped a very repetitive pattern from a code base. Some OCD person had repeated the same nonsense many thousands of times. The tool cleaned up the mess in seconds.

colechristensen

Some people do really repetitive or really boilerplate things, others do not.

Also, you have to learn how to talk to it and how to ask it things.

erulabs

These perverse incentives run at the heart of almost all developer Software-as-a-Service tooling. Using someone else's hosted model incentivizes increasing token usage, but there's nothing special about AI here.

Consider database-as-a-service companies: they're not incentivized to optimize CPU usage, because they charge per CPU. They're not incentivized to improve disk compression, because they charge for disk usage. There are several DB vendors who explicitly disable disk compression and happily charge for storage capacity.

When you run the software yourself, or the model yourself, the incentives are aligned: use less power, use less memory, use less disk, and so on.

jiggawatts

My favourite example of this is the recent trend towards “wide events” replacing logs and metrics… spearheaded and popularised by companies that charge by the gigabyte ingested.

tptacek

Companies that ingest logs generally rip their customers' faces off with their pricing. At least OTel spans can be tail-sampled.

andy99

I wish more had been written about the first assertion, that using an LLM to code is like gambling and that you're always hoping that just one more prompt will get you what you want.

It really captures how little control one has over the process, while simultaneously having the illusion of control.

I don't really believe that code is being made verbose to make more profit. There's probably some element of model providers not prioritizing concise code, but if conciseness while maintaining "quality" were possible, it would give one model enough of an edge over the others that I suspect providers would do it.

techpineapple

Something I caught about Andrej Karpathy's original tweet was that he said to "give in to the vibes", and I wonder if he meant that about outcomes too.

andy99

I still think the original tweet was tongue-in-cheek and not really meant to be a serious description of how to do things.

lubujackson

I feel like "vibe coding" as a "no look" sort of way to produce anything is bad and will probably remain bad for some time.

However... "vibe architecting" is likely going to be the way forward. I have had success with generating/tuning an architecture plan with AI, having it create stub files/functions, then filling them out individually. I can get pretty much the whole way without typing code, but it does require a fair bit more architectural thinking than usual and a good bit of reading code (then telling the AI to "do better").
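
For illustration, the stubs it generates look roughly like this (file and function names are invented, not from a real project), and each one then gets filled in by its own small, reviewable prompt:

    # report_builder.py -- skeleton generated from the architecture plan (names invented)

    def load_events(path: str) -> list[dict]:
        """Read raw events from disk. Filled in by its own focused prompt."""
        raise NotImplementedError

    def aggregate_by_user(events: list[dict]) -> dict[str, int]:
        """Roll events up per user. Filled in after load_events is reviewed."""
        raise NotImplementedError

    def render_report(totals: dict[str, int]) -> str:
        """Format the aggregates for output. Last stub to fill."""
        raise NotImplementedError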

I think of it like the analogy of blind men describing an elephant when they can only feel a single part. AI is decent at high-level architecture and decent at low-level production, but you need a human to understand the big picture and how the pieces fit (and which ones are missing).

nowittyusername

What you are talking about is the "proper" way of vibe coding. Most of the issues with vibe coding stem from users misunderstanding the capabilities of the technology they are using. They are overestimating the capabilities of current systems and are essentially asking for magic to happen. They don't give proper guidance, context, or anything of value for the coding IDE to work with. They are relying on a mindset from the 2030s to work with systems from 2025. We ain't there yet, folks: give as much guidance and context as you can and you will have a better time.

exiguus

I understand your point. The vibe approach is IMO only effective when you adopt a software engineering mindset. Here's how it works (at least for me with Copilot agent mode):

1. Develop a Minimum Viable Product (MVP) or prototype that functions.

2. Write tests, either before or after the initial development.

3. Implement coding guidelines, style guides, linter etc. Do code reviews.

4. Continuously adjust, add features, refactor, review, and expand your test suite. Iterate, and let the AI run tests and linters on each change (a sketch of such a check step follows this list).
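
A minimal sketch of that per-change check step, with the tool names as assumptions (a Go project would run go vet and go test instead):

    # check.py -- run after every AI-generated change; tool choices are assumptions
    import subprocess
    import sys

    def run(cmd: list[str]) -> bool:
        print("$", " ".join(cmd))
        return subprocess.run(cmd).returncode == 0

    if __name__ == "__main__":
        ok = run(["ruff", "check", "."]) and run(["pytest", "-q"])
        sys.exit(0 if ok else 1)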

While this process may seem lengthy, it ensures reliability and efficiency. Experienced engineers might find it as quick as working solo, but the structured approach guarantees success. It feels like pairing with an inexperienced developer.

Also, this process may run you into rate limits with Copilot and might not work with your current codebase due to a lack of tests and the absence of applied coding style guides.

Additionally, it takes time. For example, for a simple to mid-level tool/feature in Go, it might take about 1 hour to develop the MVP or prototype, but another 6 to 10 hours to refine it to a quality that you might want to show to other engineers.

xianshou

Amusingly, about 90% of my rat's-nest problems with Sonnet 3.7 are solved by simply appending a few words to the end of the prompt:

"write minimum code required"

It's not even that sensitive to the wording - "be terse" or "make minimal changes" amount to the same thing - but the resulting code will often be at least 50% shorter than the unguided version.
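
For what it's worth, the same trick through the API looks roughly like this, assuming the Anthropic Python SDK; the model ID, the task, and the exact suffix wording are placeholders:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    task = "Add a --dry-run flag to cli.py"  # hypothetical task
    suffix = "Write the minimum code required. Make minimal changes."

    message = client.messages.create(
        model="claude-3-7-sonnet-latest",  # placeholder: use whichever Sonnet build you have
        max_tokens=2048,
        messages=[{"role": "user", "content": f"{task}\n\n{suffix}"}],
    )
    print(message.content[0].text)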

panstromek

Well, the article mentions that this reduces accuracy. Do you hit that problem often then?

chaboud

1. Yes. I've spent several late nights nudging Cline and Claude (and other systems) to the right answers. And being able to use AWS Bedrock to do this has been great (note: I work at Amazon).

2. I've had good fortunes keeping the agents to constrained areas, working on functions, or objects, with clearly defined (by me) boundaries. If the measure of a junior engineer is that you correct them once a day, an engineer once a week, a senior once a month, a principal once a quarter... Treat these agents like hyper-energetic interns. Nudge frequently.

3. Standard org management coding practices apply. Force the agents to show work, plan, unit test, investigate.

And, basically, I've described that we're becoming Software Development Managers with teams of on-demand low-quality interns. That's an incredibly powerful tool, but don't expect hyper-elegant and compact code from them. Keep that for the senior engineering staff (humans) for now.

(Note: The AlphaEvolve announcement makes me wonder if I'm going to have hyper-energetic applied science interns next...)

YossarianFrPrez

There are two sets of perverse incentives at play. The main one the author focuses on is that LLM companies are incentivized to produce verbose answers, so that when you task an LLM with extending an already verbose project, the tokens used, and therefore the cost, go up.
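
Back-of-the-envelope, the compounding is easy to see: verbose answers cost more on the turn they're produced, and they also pad the context every later prompt has to re-read. The prices and token counts here are made-up placeholders, not anyone's real rates:

    # Illustrative only: prices and sizes are placeholder assumptions, not real rates.
    IN_PRICE = 3.00 / 1_000_000    # dollars per input token (placeholder)
    OUT_PRICE = 15.00 / 1_000_000  # dollars per output token (placeholder)

    def session_cost(codebase_tokens: int, reply_tokens: int, turns: int) -> float:
        cost, context = 0.0, codebase_tokens
        for _ in range(turns):
            cost += context * IN_PRICE + reply_tokens * OUT_PRICE
            context += reply_tokens  # each reply is re-read as input on every later turn
        return cost

    print(session_cost(50_000, 1_000, 20))  # terse replies
    print(session_cost(50_000, 3_000, 20))  # 3x more verbose replies, same task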

The second one is more intra/interpersonal: under pressure to produce, it's very easy to rely on LLMs to get one 80% of the way there and polish the remaining 20%. I'm in a new domain that requires learning a new language. So something I've started doing is asking ChatGPT to come up with exercises / coding etudes / homework for me based on past interactions.

rcarmo

As it happens, I wrote about the need for planning and organizing work (for greenfield or understanding existing projects) only yesterday: https://taoofmac.com/space/blog/2025/05/13/2230

vanschelven

> Its “almost there” quality — the feeling we’re just one prompt away from the perfect solution — is what makes it so addicting. Vibe coding operates on the principle of variable-ratio reinforcement, a powerful form of operant conditioning where rewards come unpredictably. Unlike fixed rewards, this intermittent success pattern (“the code works! it’s brilliant! it just broke! wtf!”), triggers stronger dopamine responses in our brain’s reward pathways, similar to gambling behaviors.

Though I'm not a "vibe coder" myself I very much recognize this as part of the "appeal" of GenAI tools more generally. Trying to get Image Generators to do what I want has a very "gambling-like" quality to it.

Suppafly

>Trying to get Image Generators to do what I want has a very "gambling-like" quality to it.

Especially when you try to get them to generate something they explicitly tell you they won't, like nudity. It feels akin to hacking.

dingnuts

It's not like gambling, it is gambling. You exchange dollars for chips (tokens -- some casinos even call the chips tokens) and insert them into the machine in exchange for the chance of a prize.

If it doesn't work the first time you pull the lever, it might the second time, and it might not. Either way, the house wins.

It should be regulated as gambling, because it is. There's no metaphor, the only difference from a slot machine is that AI will never output cash directly, only the possibility of an output that could make money. So if you're lucky with your first gamble, it'll give you a second one to try.

Gambling all the way down.

NathanKP

This only makes sense if you have an all or nothing concept of the value of output from AI.

Every prompt and answer is contributing value toward your progress toward the final solution, even if that value is just narrowing the latent space of potential outputs by keeping track of failed paths in the context window, so that it can avoid that path in a future answer after you provide followup feedback.

The vast majority of slot machine pulls produce no value to the player. Every single prompt into an LLM tool produces some form of value. I have never once had an entirely wasted prompt unless you count the AI service literally crashing and returning a "Service Unavailable" type error.

One of the stupidest takes about AI is that a partial hallucination or a single bug destroys the value of the tool. If a response is 90% of the way there and I have to fix the 10% of it that doesn't meet my expectations, then I still got 90% value from that answer.

NegativeLatency

> Every prompt and answer is contributing value toward your progress toward the final solution

This has not been my experience, maybe sometimes, but certainly not always.

As an example: asking chatgpt/gemini about how to accomplish some sql data transformation set me back in finding the right answer because the answer it did give me was so plausible but also super duper not correct in the end. Would've been better off not using it in that case.

Brings to mind "You can't build a ladder to the moon"

secabeen

> One of the stupidest takes about AI is that a partial hallucination or a single bug destroys the value of the tool. If a response is 90% of the way there and I have to fix the 10% of it that doesn't meet my expectations, then I still got 90% value from that answer.

That assumes that the value of a solution is linear with the amount completed. If the Pareto Principle holds (80% of effects come from 20% of causes), then not getting that critical 10+% likely has an outsized effect on the value of the solution. If I have to do the 20% of the work that's hard and important after taking what the LLM did for the remainder, I haven't gained as much because I still have to build the state machine in my head to understand the problem-space well enough to do that coding.

PaulDavisThe1st

This assumes you can easily and reliably identify the 10% you need to fix.

rapind

By this logic:

- I buy stock that doesn't perform how I expected.

- I hire someone to produce art.

- I pay a lawyer to represent me in court.

- I pay a registration fee to play a sport expecting to win.

- I buy a gift for someone expecting friendship.

Are all gambles.

You aren't paying for the result (the win); you are paying for the service that may produce the desired result, and in some cases one of many possibly desirable results.

rjbwork

>I buy stock that doesn't perform how I expected.

Hence the adage "sir, this is a casino"

nkrisc

None of those are games of chance, except the first.

princealiiiii

> It should be regulated as gambling, because it is.

That's wild. Anything with non-deterministic output will have this.

kagevf

> "Anything with non-deterministic output will have this.

Anything with non-deterministic output that charges money ...

Edit: added words to clarify what I meant.

martin-t

That's incorrect, gambling is about waiting.

Brain scans have revealed that waiting for a potential win stimulates the same areas as the win itself. That's the "appeal" of gambling. Your brain literally feels like it's winning while waiting because it _might_ win.

GuinansEyebrows

Maybe more accurately: anything with non-deterministic output that you pay per use for, instead of paying by outcome.

csallen

Books are not like gambling, they are gambling. You exchange dollars for chips (money — some libraries even give you digital credits for "tokens") and spend them on a book in exchange for the chance of getting something good out of it.

If you don't get something good the first time you buy a book, you might with the next book, or you might not. Either way, the house wins.

It should be regulated as gambling, because it is. There's no metaphor — the only difference from a slot machine is that books will never output cash directly, only the possibility of an insight or idea that could make money. So if you're lucky with your first gamble, you'll want to try another.

Gambling all the way down.

squeaky-clean

So how exactly does that work for the $25/mo flat fee that I pay OpenAI for ChatGPT? They want me to keep getting the wrong output and burning money on their backend without any additional payment from me?

dwringer

Something of an aside, but this is sort of equivalent to asking "how does that work for the $50 the casino gave me to gamble with for free?" I once made 50 dollars exactly that way, by taking the casino's free tokens and putting them all on black in a single roulette spin. People like that are not the ones companies like that make money off of.

kimixa

For the amount of money OpenAI burns that $25/mo is functionally the same as zero - they're still in the "first one is free" phase.

Though you could say the same thing about pretty much any VC funded sector in the "Growth" phase. And I probably will.

abletonlive

Yikes. The reactionary reach for more regulation from a certain group is just so tiresome. This is the real mind virus that I wish would be contained in Europe.

I almost can't believe this idea is being seriously considered by anybody. By that logic buying any CPU is gambling because it's not deterministic how far you can overclock it.

Just so you know, not every LLM use case requires paying for tokens. You can even run a local LLM and use Cline with it for all your coding needs. Pull that slot machine lever as many times as you like without spending a dollar.

slurpyb

Do you understand what electricity is?

mystified5016

I run genAI models on my own hardware for free. How does that fit into your argument?

codr7

The fact that you can get your drugs for free doesn't exactly make you less of an addict.

yewW0tm8

Same with anything though? Startups, marriages, kids.

All those laid off coders gambled on a career that didn’t pan out.

Want more certainty in life, gonna have to get political.

And even then there is no guarantee the future gives a crap. Society may well collapse in 30 years, or 100…

This is all just role play to satisfy the prior generations' story-driven illusions.

samtp

I've pretty clearly seen the critical thinking ability of coworkers who depend on AI too much sharply decline over the past year. Instead of taking 30 seconds to break down the problem and work through assumptions, they immediately copy/paste into an LLM and spit back what it tells them.

This has led to their abilities stalling while their output seemingly goes up. But when you look at the quality of their output, and their ability to get projects over the last 10% or to make adjustments to an already completed project without breaking things, it's pretty horrendous.

Etheryte

My observations align with this pretty closely. I have a number of colleagues who I wager are largely using LLMs, judging both by changes in coding style and by how much they suddenly add comments, and I can't help but feel there's been a noticeable drop in the quality of the output. Issues that should clearly have no business making it to code review are now regularly left for others to catch; it often feels like they don't even look at their own diffs. What to make of it, I'm not entirely sure. I do think there are ways LLMs can help us work in better ways, but they can also lead to considerably worse outcomes.

jimbokun

Just replace your colleagues with the LLMs they are using. You will reduce costs with no decrease in the quality of work.

jobs_throwaway

As someone who vibe codes at times (and is a professional programmer), I'm curious how y'all go about resisting this? Just avoid LLMs entirely and do everything by hand? Very rigorously go over any LLM-generated code before committing?

It certainly is hard, when I'm, say, writing unit tests, to avoid the temptation to throw it into Cursor and prompt until it works.

samtp

I resist it by realizing that while LLMs are good at things like decoding obtuse error messages, having them write too much of your code leads to a project becoming almost impossible to maintain or add to. And there are many cases where you spend more time trying to correct errors from the LLM than if you were to slow down and inspect the code yourself.

breckenedge

Set a budget. Get rate limited. Let the experience remind you how much time you're actually wasting letting the model write good-looking but buggy code, versus just writing code responsibly.

andy99

I think lack of critical thinking is the root cause, not a symptom. I think pretty much everyone uses LLMs these days, but you can tell who sees the output and considers it "done" vs who uses LLM output as an input to their own process.

mystified5016

I mean, I can tell that I'm having this problem and my critical thinking skills are otherwise typically quite sharp.

At work I've inherited a Kotlin project and I've never touched Kotlin or android before, though I'm an experienced programmer in other domains. ChatGPT has been guiding me through what needs to be done. The problem I'm having is that it's just too damn easy to follow its advice without checking. I might save a few minutes over reading the docs myself, but I don't get the context the docs would have given me.

I'm a 'Real Programmer' and I can tell that the code is logically sound and self-consistent. The code works and it's usually rewritten so much as to be distinctly my code and style. But still it's largely magical. If I'm doing things the less-correct way, I wouldn't really know because this whole process has led me to some pretty lazy thinking.

On the other hand, I very much do not care about this project. I'm very sure that it will be used just a few times and never see the light of day again. I don't expect to ever do android development again after this, either. I think lazy thinking and farming the involved thinking out to ChatGPT is acceptable here, but it's clear how easily this could become a very bad habit.

I am making a modest effort to understand what I'm doing. I'm also completely rewriting or ignoring the code the AI gives me; it's more of an API reference and example. I can definitely see how a less-seasoned programmer might get suckered into blindly accepting AI code and iterating prompts until the code works. It's pretty scary to think about how the coming generations of programmers are going to experience and conceptualize programming.