My AI had fixed the code before I saw it
10 comments · August 18, 2025
dinvlad
Saw this too, but it seems a little too good to be true, tbh. I don't doubt it helped them, but it's just so hard to make these agents work reliably. And the more "rules" one accumulates, the less reliably they follow all of the rules and details. Heck, Claude can't even follow a single rule of not inserting comments everywhere. Plus, without strict human oversight, code often ends up in a mess.
Perhaps I'm overlooking something, though; it would be great to see more concrete details of their implementation rather than this high-level inspirational description.
Torn
Yeah, as someone who uses Claude Code daily, this feels like hype and marketing.
CLAUDE.md and working memory only go so far; it never religiously follows them and doesn't truly 'learn' from past collaboration. In fact, the more you put into a CLAUDE.md, the more its adherence seems to degrade.
throwawaymaths
I don't believe it. If I don't explicitly tell Claude to admonish itself in CLAUDE.md, it will quickly revert to making the mistakes it was making an hour ago, before compaction.
Sometimes it even misses things in CLAUDE.md.
neom
The CEO of Every was on Lenny's podcast recently talking about this "compound engineering" idea they have at their startup: https://youtu.be/crMrVozp_h8?si=O8Ahy_e2cBXuKXPq&t=2489
(Full disclosure, they use and talk about the agent I'm building)
827a
Why is it always that the companies that seem the "furthest ahead" in adopting AI into their engineering workflows are also the ones building the simplest, most boring products? Also, why is it always the CEOs on these podcasts talking about the AI software engineering workflows? Why don't they bring on the actual engineers?
kibibu
> On the next iteration, it’s able to identify a frustrated user nine times out of 10. Good enough to ship.
It's able to identify an AI-simulated frustrated user nine times out of 10.
fudged71
Paywall.
null
> I launched GitHub expecting to dive into my usual routine—flag poorly named variables, trim excessive tests, and suggest simpler ways to handle errors.
If these are routine, in what kind of state is the repository? All of those can and should easily have been done properly at write/merge time in any half-decent codebase. Sure, sometimes one case slips by, but if these are routine fixes, there is something deeply wrong with the process.
> I can't write a function anymore without thinking about whether I'm teaching the system or just solving today's problem.
This isn’t a positive thing.
> When you're done reading this, you'll have the same affliction.
No, not at all. What you have described is a deeply broken system which will lead to worse developers and even worse software. I hope your method propagates as little as possible.
> But AI outputs aren't deterministic—a prompt that works once might fail the next time.
> So I have Claude run the test 10 times. When it only identifies frustration in four out of 10 passes, Claude analyzes why it failed the other six times.
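Concretely, the loop being described amounts to something like the rough Python sketch below (my own hypothetical illustration; the function names, the example transcript, and the 90% threshold are assumptions, not Every's actual code):

    import random

    def classify_frustration(transcript: str) -> bool:
        # Stand-in for a non-deterministic LLM call that labels a transcript as "frustrated".
        # Simulated here as being right ~90% of the time, like the article's "nine out of 10".
        return random.random() < 0.9

    def pass_rate(transcript: str, expected: bool = True, runs: int = 10) -> float:
        # Run the same check `runs` times and report how often it matched the expected label.
        hits = sum(classify_frustration(transcript) == expected for _ in range(runs))
        return hits / runs

    rate = pass_rate("user: this is the third time the export has failed, fix it")
    print(f"identified frustration in {rate:.0%} of runs")  # "ship" only if rate >= 0.9, per the article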
All of this is wasteful and insane. You don't learn anything or understand your own system; you just throw random crap at it a bunch of times and then pick whatever sticks. No wonder it's part of your routine to have to fix basic things. It's bonkers that this is lauded as an improvement over doing things right.