
AI helps ship faster but it produces 1.7× more bugs

tyleo

I have a theory that vibe coding existed before AI.

I’ve worked with plenty of developers who are happy to slam null checks everywhere to solve NREs (NullReferenceExceptions) with no thought to why the object is null, whether it should even be null here, etc. There’s just a vibe that the null check works and solves the problem at hand.
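
A hypothetical sketch of the difference (Python, made-up names):

    # Hypothetical sketch; email() is a stand-in for a real mailer.
    def email(address, body):
        print(f"to {address}: {body}")

    # The vibe fix: the exception goes away, but nobody asked why user is None,
    # and the receipt silently never goes out.
    def send_receipt_vibe(user, order_id):
        if user is not None:
            email(user["address"], f"Receipt for order {order_id}")

    # The intentional fix: a missing user here is an upstream bug, so say so loudly.
    def send_receipt(user, order_id):
        if user is None:
            raise ValueError(f"order {order_id} reached billing with no user")
        email(user["address"], f"Receipt for order {order_id}")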

I actually think a few folks like this can be valuable around the edges of software but whole systems built like this are a nightmare to work on. IMO AI vibe coding is an accelerant on this style of not knowing why something works but seeing what you want on the screen.

jmkni

Blindly copying and pasting from StackOverflow until it kinda sorta works is basically vibe coding

AI just automates that

giantg2

Yeah, but you had to integrate it until it at least compiled, which kind of made people think about what they were pasting.

I had a peer who suddenly started completing more stories for a month or two when our output was largely equal before. They got promoted over me. I reviewed one of their PRs... what a mess. They were supposed to implement caching. Their first attempt created the cache but never stored anything in it. Their next attempt stored the data in the cache but never read from it, always retrieving from the API. They deleted that PR to hide their incompetence and opened a new one that was finally right. They were just blindly using AI to crank out their stories.

That team had something like 40% of capacity being spent on tech debt, rework, and bug fixes. The leadership wanted speed above all else. They even tried to fire me because they thought I was slow, even though I was doing as much or more work than my peers.

palmotea

> I actually think a few folks like this can be valuable around the edges of software but whole systems built like this are a nightmare to work on. IMO AI vibe coding is an accelerant on this style of not knowing why something works but seeing what you want on the screen.

I would correct that: it's not an accelerant of "seeing what you want on the screen," it's an accelerant of "seeing something on the screen."

[Hey guys, that's a non-LLM "it's not X, it's Y"!]

Things like habitual, unthoughtful null-checks are a recipe for subtle data errors that are extremely hard to fix because they only get noticed far away (in time and space) from the actual root cause.
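
A hypothetical sketch of how that plays out: the guard "handles" the bad record by dropping it, and the damage only surfaces later, in a number nobody can trace back.

    # Habitual null check: skip the bad record and move on.
    def monthly_total(payments):
        total = 0.0
        for p in payments:
            if p["amount"] is None:  # why is it None? nobody asks
                continue
            total += p["amount"]
        return total

    payments = [{"amount": 10.0}, {"amount": None}, {"amount": 5.0}]
    print(monthly_total(payments))  # 15.0: quietly wrong, noticed weeks later in a report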

zipy124

I agree, but I'd draw a different comparison: vibe coding has accelerated the type of developer who relied on Stack Overflow to solve all their problems, the kind of dev who doesn't try to solve problems themselves. It has just accelerated that way of working, while being less reliable than before.

whynotmaybe

"on error resume next" has been the first line of many vba scripts for years

eterm

I caught Claude trying to sneak the equivalent into a CI script yesterday, as I was wrangling how to run Framework and dotnet tests next to each other without slowing down the Framework tests horrendously.

It tried to change the CI build script to proceed to the next step on failure.

It's a bold approach, I'll give it that.

jerf

One of my frustrations with AI, and one of the reasons I've settled into a tab-complete based usage of it for a lot of things, is precisely that, in the language I'm using, it puts out a lot of things I consider errors, based on the "middle-of-the-road" code style it has picked up from all the code it has ingested. For instance, I use a policy of "if you don't create invalid data, you won't have to deal with invalid data" [1], but I have to fight the AI on that all the time, because it is a routine mistake programmers make and the AI makes the same mistake repeatedly. I have to fight the AI to properly create types [2], because it just wants to slam everything out as base strings and integers, and to inline all manipulations on the spot (repeatedly, if necessary) rather than define methods... at all, let alone correctly use methods to maintain invariants. (I've seen it make methods on some occasions. I've never seen it correctly define invariants with methods.)

Using tab complete gives me the chance to generate a few lines of a solution, then stop it, correct the architectural mistakes it is making, and then move on.

To AI's credit, once corrected, it is reasonably good at using the correct approach. I would like to be able to prompt the tab completion better, and the IDEs could stand to feed the tab completion more information from the LSP about available methods and their arguments and such, but that's a transient feature issue rather than a fundamental problem. This is also a reason I fight the AI on this matter rather than just sitting back: in the end, AI benefits from well-organized code too. They are not infinite, they never will be, and while code optimized for AI and code optimized for humans will probably never be quite the same, they are correlated enough that it's still worth fighting the AI's tendency to spew out code that spends code quality without investing in it.

[1]: Which is less trivial than it sounds and violated by programmers on a routine basis: https://jerf.org/iri/post/2025/fp_lessons_half_constructed_o...

[2]: https://jerf.org/iri/post/2025/fp_lessons_types_as_assertion...
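
A minimal Python sketch of the idea (a made-up example, not taken from the linked posts): validate once, at construction, and invalid data can never exist downstream.

    from dataclasses import dataclass

    # "Types as assertions": the constructor enforces the invariant,
    # so the rest of the program never has to re-check.
    @dataclass(frozen=True)
    class Email:
        value: str

        def __post_init__(self):
            if "@" not in self.value:
                raise ValueError(f"not an email address: {self.value!r}")

    def notify(recipient: Email):
        # No re-validation, no inline string fiddling: an Email that exists is valid.
        print(f"sending to {recipient.value}")

    notify(Email("dev@example.com"))  # fine
    notify(Email("not-an-address"))   # fails at the boundary, not deep in the mailer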

tyleo

This is very, very close to my approach. I love Copilot intellisense at GitHub’s entry tier because I can accept/reject at the line level.

I barely ever use AI code gen at the file level.

Other uses I’ve found are:

1. It’s a great replacement for search in many cases.

2. I have used it to fully generate bash functions and regexes. I think it’s useful here because the languages are dense and esoteric, so most of my time is spent remembering syntax. I don’t have it generate pipelines of scripts, though.
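
The sort of thing I mean (a hypothetical example): trivial to specify, annoying to remember the syntax for.

    import re

    # Match ISO-style dates: a syntactic check only, no calendar validation.
    iso_date = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

    print(bool(iso_date.match("2024-02-29")))  # True
    print(bool(iso_date.match("2024-13-01")))  # False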

eurekin

"ship fast, break things"

cmiles8

There are certainly some valid criticisms of vibe coding. That said, it’s not like the quality of most code was amazing before AI came along. In fact, most code is generally pretty terrible and takes far too long for teams to ship.

Many folks would say that if shipping faster allows for faster iteration on an idea, then the silly errors are worth it. I’ve certainly seen a sharp increase in execs calling BS on dev teams saying they need months to develop some basic thing.

coliveira

When a team says a "trivial" feature takes months to ship, it is not because of the complexity of the algorithm. It's because of the infrastructure and coordination work required for the feature to work properly. It is almost always a failure of the technical infrastructure previously created in the company. An AI will solve the trivial aspects of the problem, not the real problem.

tyleo

I think you need a balance. I’ve seen products fall apart due to high error rate.

I like to think of intentionalists—people who want to understand systems—and vibe coders—people who just want things to work on screen expediently.

I think success requires a balance of both. The current problem I see with AI is that it accelerates the vibe part more than the intentionalist part and throws the system out of balance.

cmiles8

Don’t disagree… I think it’s just applying a lot more pressure on dev teams to do things faster though. Devs tend to be expensive and expectations on productivity have increased dramatically.

Nobody wants teams to ship crap, but also folks are increasingly questioning why a bit of final polishing takes so long.

jmathai

More important than code quality is a joint understanding of the business problem and the technical solution for it. Today, that understanding is spread across multiple parties (eng, pm, etc).

Code quality can be poor as long as someone understands the tradeoffs for why it's poor.

WhyOhWhyQ

And you think people who don't understand the software telling people who do they're doing it wrong is an outright positive?

bogzz

Oh wow, an LLM-based company with an article that claims AI is oddly not as bad when it comes to generating gobbledegook as everyday empirical evidence would suggest.

jjmarr

Coderabbit is an LLM code review company, so their incentives are the opposite: AI is terrible, and you need more AI to review it.

fwiw, I agree. LLM-powered code review is a lifesaver. I don't use Coderabbit but all of my PRs go through Copilot before another human looks at it. It's almost always right.

bpicolo

Their incentives are perfectly aligned - you’re making more bugs, surely you need some AI code review to help prevent that.

It’s literally right at the end of their recommendations list in the article

jjmarr

The original comment said:

> an article that claims AI is oddly not as bad when it comes to generating gobbledegook

Ironically, Coderabbit wants you to believe AI is worse at generating gobbledegook.

bodge5000

As has already been said, we've been here before. I could ship significantly faster if I ignored all error handling and edge cases and just assumed the data would always flow exactly the way I expect. Of course, that is almost never the case, so I'd end up with more bugs.

I'd like to say that AI just takes this to an extreme, but I'm not even sure about that: if I just gave up on caring about anything, I think it could produce more code and more bugs than I could in the same amount of time, but not significantly more.

0x3f

At best this would be 1.7x more _discovered_ bugs. The average PR (IMO) is hardly checked. AI could have 10x as many real issues on PRs, but we're just bad at reviewing PRs.

yomismoaqui

Agentic AI coding is a tool; you can use it wrong.

For an example of how to use it successfully, check out the following post:

https://friendlybit.com/python/writing-justhtml-with-coding-...


nerdjon

This is something I have been very curious about for some time now. We know the quality of the code is not very high and that it has a high likelihood of bugs.

But assuming there are no bugs and the code ships: has there been any study on resource usage creeping up, and the impact of that on a whole system? In the tests I have done trying to build things with AI, it always seems like there is zero attention to efficiency unless you notice the problem and point the AI in the right direction.
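
A hypothetical example of the kind of waste I mean: both versions are "correct", but only one survives real data sizes, and nothing complains unless you notice.

    # O(len(a) * len(b)): a full list scan per element.
    def common_ids_slow(a, b):
        return [x for x in a if x in b]

    # O(len(a) + len(b)): build the set once, then constant-time lookups.
    def common_ids_fast(a, b):
        b_set = set(b)
        return [x for x in a if x in b_set]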

I have been curious about the impact this will have on general computing as more low quality code makes it into applications we use every day.

lherron

They buried the lede. The last half of the article, with ways to ground your dev environment to reduce the most common issues, should be its own article. (However, implementing the proper techniques somewhat obviates the need for CodeRabbit, so I guess it’s understandable.)

windex

I think devs have now split into two camps: the kvetchers and the shippers. It's a new tool; it's fresh. Things will work themselves out over the next couple of years/months(?). The kvetching helps keep AI research focused on the problem, which is good. Meanwhile, continue to ship.

strangescript

Do they count code readability, formatting, and variable naming as "errors" in the overall count? That seems dubious given where we are headed.

No one cares what a compiler or js minifier names its variables in its output.

Yes, if you don't believe we will ever get there, then this is a totally valid complaint. You are also wrong about the future.

oblio

The "future" is a really long time.

I'll take the other side of your bet for the next 10 years but I won't take it for the next 30 years.

In that spirit, I want my fusion reactor and my flying car.

strangescript

If your outlook is 10 years, then sure, it's valid. I am not sure how you come to that conclusion logically, though. At the beginning of the year we had zero coding agents. Now we have dozens, some basically free (of various degrees of quality, sure).

The last 2-3 months of releases have been an unprecedented whirlwind. Code writing will be solved by the end of 2026. Architecture, maybe not, but formatting issues aren't architecture.

bopbopbop7

Code writing was solved in 1997 when Dreamweaver was released.

oblio

It's the same with every technology; there's a reason we have sigmoids.

In 1960 they were planning nuclear-powered cars and nuclear mortars.

phartenfeller

Definitely. But AI can also generate unit tests.

You have to be careful to tell the LLM exactly what to test for, and to manually check the whole suite of tests. But overall it makes me feel way more confident over increasing amounts of generated code. This of course decreases the productivity gains, but it is necessary in my opinion.

And linters help.

stuaxo

It doesn't generate good tests by default though.

I worked on a team where we had someone come in and help us improve our tests a lot.

The default LLM-generated tests are a bit like the ones I wrote before that experience.

dnautics

this is solvable by prompting and giving good examples?

SketchySeaBeast

I've been using Claude Sonnet 4.5 lately and I've noticed a tendency for it to create tests that prove themselves: rather than calling the function we're hoping to test, it re-implements the code inside the test and then tests that. Claude is helpful, and it usually works very well if you have well-defined inputs and outputs, and I much prefer it to writing tests manually, but you have to be very careful.
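
A hypothetical sketch of the pattern: the "expected" value is derived with the same logic as the code under test, so the assertion can never fail, even when the spec is wrong.

    # Code under test (made-up example).
    def slugify(title):
        return title.strip().lower().replace(" ", "-")

    # The self-proving test: expected is computed by re-implementing slugify,
    # so this passes no matter what slugify is supposed to do.
    def test_slugify_self_proving():
        title = "  Hello World "
        expected = title.strip().lower().replace(" ", "-")
        assert slugify(title) == expected

    # What you actually want: a hard-coded expectation taken from the spec.
    def test_slugify():
        assert slugify("  Hello World ") == "hello-world"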