
AI: Where in the Loop Should Humans Go?

tinco

Coding agents are increasingly moving to have the human after the loop instead of somewhere in it. That's a good thing, because being in the middle of something else's thought process basically turns you into a tool without any real agency. Tools like Windsurf have that effect when they ask for permission a lot; turbo mode is really liberating in that way.

We've got an OSS autonomous agent at Bosun called Kwaak that works in parallel to the developer (it spawns workspaces in Docker containers), basically pulling the human all the way outside the loop. Right now it gets many basic things right, but it's flaky enough that you still have to read its merge requests carefully. I wonder whether at some point agents will become so good that they only rarely make mistakes, making the review of their merge requests ever more tedious as their mistakes become harder to spot and farther apart.

abalashov

Brilliant article, and a welcome dose of much-needed sanity because it asks the right kinds of epistemic questions.

eikenberry

Nice article, though doesn't it get this wrong...

> By comparison, people in a system start from a broad situation and narrow definitions down and add constraints to make problem-solving tractable.

It's been a while, but back when I was studying this, most people were specific-to-general thinkers, i.e. the opposite of this: people start with specific cases/examples and generalize as they learn more of them, narrow-to-broad. Has the research on this changed?

tsunego

Great reminder that AI is basically overpriced copilot: impressive enough to sell, flawed enough to need babysitting, and guaranteed to make sure humans still get blamed when things inevitably go wrong.

nbzso

With LLMs reaching the limits of their growth, the new trendy train is AI agents. This will fail for a lot of reasons. Sane people use tools wisely, and skepticism is not a Luddite label. I personally choose to avoid adopting a technology until it is proven reliable and safe. You can vibe code all you want. Just stay away from my repos. :)

abalashov

But bro, the next model is gonna do 10X PhD reasoning, its SWE bench stats are fire bro, it's like, agentic. I really feel the AGI in this one bro.