
Parallel AI agents are a game changer

85 comments · September 2, 2025

ambicapter

Obviously I'm an AI-tools skeptic, but this is hilarious:

> 1. Prepare issues with sufficient context

> Start by ensuring each GitHub issue contains enough context for agents to understand what needs to be built and how it integrates with the system. This might include details about feature behavior, file locations, database structure, or specific requirements such as displaying certain fields or handling edge cases.

> You can’t do half-hearted prompts and fix as you go, because those fixes come an hour later.

> Skills That Become More Important
> Full-stack understanding
> Problem decomposition
> Good writting skills
> QA and Code Review skills

This is just software engineering?!?

edit: On the other hand, maybe I can convince people in my org to get better at software engineering by telling them it's for the AI to work better.

DetroitThrow

>This is just software engineering?!?

Absolutely. The existence of vibe coding does not mean production code is going to be engineered without the same principles we've always used - even if we're using AI to generate a lot more code than before.

Any crowd suggesting that this was not the case has lost the plot, imo.

Aeolun

People find it a lot more palatable when the AI requires all this information than when software engineers do, though. If I ask for clear requirements I’m asked to just figure it out. But if the AI implements nonsense without clear requirements, that's the fault of the specs.

tjr

I am amazed at how suddenly people are on board with writing clear design documentation now that it means AI can generate the code rather than humans.

I wonder how much better humans would be at generating code given the same abundance of clearly-written design documentation?

lazide

Well, that’s because the software engineers are irritating when they push back and say ‘no’ or ‘wtf’.

When the AI does it, it’s being polite and stuff. /s, kinda.

sfn42

I consider it my job to figure out the requirements. The fact that they aren't specified in detail allows me to do what I think is best rather than being bound by often arbitrary specifications.

I judge which decisions I make and which ones I bring up to my team/PO/whatever. Most of the time I just do what I think is best, sometimes I'll do something and then bring it up later like "I did it this way but if that doesn't work I can change it", typically for things that will be easy to change later. Some things I ask about before I do them because they won't be easy to change later.

I'll often take technical liberties with frontend designs; for example, I'll use an HTML select rather than reinventing the drop-down just to be able to put rounded corners on the options. I'll style scrollbars using the very limited options CSS provides rather than reinvent scrollbars just to follow the design exactly. Most of the time nobody cares, and we can always go back later and do these types of things if we really want a certain aesthetic.

I have never had the impression that my questions bother people, rather the opposite. I've had multiple designers say they appreciate the way I interact with them, I respect their work and their designs but I ask them if something looks like an oversight or I'm not exactly sure what their intention is. POs and such are always happy to answer simple questions, I make it easy for them: here's a decision we need to make, I want you to make it. Maybe I have a suggestion for what I would prefer and some reasons why I prefer that solution.

I don't expect them to think of everything and answer all my potential questions in advance, that's just unnecessary and difficult work.

skhameneh

> On the other hand, maybe I can convince people in my org to get better at software engineering by telling them it's for the AI to work better.

Really good engineering practices are fundamental to get the most out of AI tooling. Convoluted documentation, code fragmentation, etc all pollute context when working with AI tools.

In my experience, just having one outdated word (especially if it's a library name) anywhere in code or documentation can create major ongoing headaches.

The worst part of it is trying to avoid negative assertions. When the AI tooling keeps trying to do "the wrong thing", it's sometimes a challenge to rephrase the instructions as a positive assertion about "the right thing".

mensetmanusman

I tend to agree with English being the new programming language. Those with English communication struggles will struggle to code this way.

shiroyasha

Yes. AI assisted software engineering is still software engineering. I don't see that part changing anytime soon.

pvtmert

> This is just software engineering?!?

Indeed, yes. Although most places have been shipping software in a "software development" and/or "programming" fashion for many years.

Many, many places certainly do not do the engineering part, even though the resulting product is software.

wrs

Yeah, it’s funny, we may finally have a way to get developers to write documentation for other developers, it’s just that the other developers aren’t human!

rukuu001

Yes, the ability to clearly and unambiguously communicate what's required works on both humans and machines.

electroglyph

lmao, "good writting skills" =)


ScotterC

I lol'ed too but then thought - at least he actually wrote this!

shiroyasha

Heh, damn. Made a typo at the worst spot

mmaunder

The author is lying. My team and I are heavy users of Claude Code and other agents and it ain’t like this. You need to manage an AI coding agent carefully and course correct frequently. There are cases for parallel agents but they are tasks like parallel document fetches and summarization, and other tasks that don’t require supervision.

The idea of having multiple parallel agents merge pull requests or resolve issues in parallel is still just an idea.

Please don’t post or upvote attention seeking crap like this. It gives a very exciting and promising technology a bad name.

hu3

Your comment is disproportionately rude. Just because your team can't leverage multiple coding agents doesn't mean no one else can.

And even if OP also can't, this is a good place to discuss possible problems and solutions for parallel development using coding agents.

Please refrain from gatekeeping.

mmaunder

“With this approach, I can manage to have 10–20 pull requests open at once, each handled by a dedicated agent.”

A quote from the post. No, I think my post is calibrated quite well considering what OP's post does to our industry.

hu3

Having 20 PRs open at once doesn't necessarily mean managing 20 agents simultaneously.

It can mean, for example, that 2 agents worked for some time on a list of 20 TODO features and produced 20 PRs to be reviewed. They could even have worked overnight.

You're seemingly judging from the least generous interpretation, which is not constructive and is also against HN guidelines fyi.

bn-l

No. Unfortunately there’s now a problem of people blatantly lying about the abilities of LLMs in order to get attention. And it’s extremely effective.

I say this as someone who uses them every day for programming and is excited about both the present and the future possibilities. The blatant lying just needs to stop, though, and it needs to be called out.

osn9363739

Can this guy, or someone else, post a full day's (4-8 hours, or whatever is spent in the weeds) stream of work to YouTube or something? I just want to watch the process to see what I'm missing. Or if there's anyone who already does that, can they recommend it to me? I would appreciate it.

slig

https://youtu.be/xAKVi_jvvg4

Two hours of Web Dev Cody.

pton_xd

Got through about 45 min at 2x speed / some skipping ahead, out of pure fascination. Man, that's something else. It's like bug-driven development. Get the LLM to churn out a huge chunk of text, then skim the code for about 10 seconds and say it looks good. Then spend a while testing and hitting one error after the next until it finally seems to work. Repeat.

ath3nd

Wow, I didn't expect dystopias could be so boring.

If somebody like that, producing code of that low quality, worked with me, I could see myself spilling coffee or acid on them or their laptop.

dehugger

Assaulting underperforming devs is one way to ensure code quality, I guess.

shiroyasha

Web Dev Cody is great. I recommend him.

I (the author) sometimes stream my work here as well: https://www.youtube.com/@operatelybackstage.

_345

Are you saying that because you're also skeptical? I haven't had the best time switching to agent coding. I mean, for throwaway work it's fine, but it's kind of boring and Aider still messes up from time to time.

osn9363739

I probably lean on the sceptical side of the spectrum. I'm not against giving it a go if I can get value out of it, but I'm not having the wonderful experience that these people are having.

- The asynchronous nature of it slows me down, and it feels like the opposite of what this bloke is saying about getting into a flow.
- I miss things because I'm not thinking it all the way through.
- The issues with errors or hallucinations.
- It does not feel faster (I might blow through a couple of things really fast, but the issues created elsewhere sometimes eat up all that saved time).
- The quality of work is all over the shop. Bigger projects just fall apart after a while.

I also wonder if the way I think is hindering me. I don't like natural language. I struggle to communicate at the best of times. All my emails are dot points. If someone asks me for a diagram I write it in PlantUML or with a Python library. I work in DevOps and love declarative manifests and templates.

adriand

As an initial step, try having the agentic AI improve your prompt for you. I have a "prompt improvement prompt template", which is a standardized document (customized for each project I'm working on) that has a bunch of boilerplate instructions in it, along with a section where I paste in my first-draft prompt. I then feed this document (boilerplate + crappy prompt) into the AI and it creates a way better prompt for me. Then I edit that to ensure it's correct, and that becomes the prompt I use.

pvtmert

Is it just me, or does the post sound like (show!) they haven't tried the mentioned approach in real life?

Because in real life, one agent tries to fix a build issue with `rm -rf node_modules` while the other is already running a server (i.e. an npm server), and they conflict with each other nearly all the time! (Even if it's not a destructive action, the second npm server will most likely fail due to port-allocation conflicts!)

Meanwhile, what I found helpful is this:

1. Clone the same repo two or three times.
2. In each terminal or whatever, `cd` into it.
3. Create a branch, run your ~commands~ prompts (each is its own session with its own repo).
4. Commit/push, then merge/rebase (also resolve conflicts if needed; use the LLM again if you want).
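A rough sketch of that flow (the repo URL, directory names, and branch names here are placeholders, not from the post):

```
# Hedged sketch of the multi-clone workflow above; repo URL, directory names,
# and branch names are made up for illustration.
for i in 1 2 3; do
  git clone git@example.com:org/repo.git "repo-agent-$i"
done

# In each terminal: cd into one clone, create a branch, and run that agent's
# session scoped to this copy of the repo.
cd repo-agent-1
git checkout -b agent-1/feature-x
# ... run your prompts / agent session here ...
git add -A && git commit -m "agent-1: feature-x"
git push -u origin agent-1/feature-x

# Afterwards, merge or rebase each branch, resolving conflicts as needed
# (with LLM help if you want).
```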

Any other way multiple agents work in harmony in a single repo (filesystem/directory) at the same time is a pipe-dream with the current status of the MCP and agents.

Let alone being aware of each other (agents), they don't even have a proper locking mechanism. As soon as you make a new change, most of the editing (search/replace) functionality in the MCPs fails miserably. Then they re-read the entire file, just creating context-rot or over-filling with already-existing stuff. Soon you run out of tokens (or just pay extra for no reason).

> edit: comments mentioned that each agent runs in a VM isolated from the others. Kinda makes sense, but still, there will be massive merge conflicts unless each agent works on a completely different service/code-base (i.e. frontend vs. backend, or a couple of micro-services).

abound

The post is about using GitHub's integrated Copilot tooling, where each issue gets its own instance presumably running in a sandbox. This sidesteps the issues you're talking about here.

shiroyasha

I don't claim to have lots of experience with this, I've only been doing it for a couple of weeks, but I do feel that some of your comments are disingenuous.

---

> Any other way multiple agents work in harmony in a single repo (filesystem/directory) at the same time is a pipe-dream with the current status of the MCP and agents.

Every agent runs in a separate VM on GitHub.

> Let alone being aware of each other (agents), they don't even have a proper locking mechanism.

Never claimed this. Feels like a strawman argument.

GZGavinZhao

git workspaces?

lacasito25

u mean worktrees

GZGavinZhao

Ah yes, I've been using jj too much lol

furyofantares

The sweet spot for me is 2 agents on different projects. Surprisingly the context switch is easy. It's harder when doing 2 tasks on the same project.

merlincorey

> on different projects

This seems like an important caveat the author of the article failed to mention when they described this:

> you can have several agents running simultaneously - one building a user interface, another writing API endpoints, and a third creating database schemas.

If these are all in the same project, then there has to be some required ordering, or you get a frontend written against a backend that doesn't have the endpoints the frontend uses, and a backend that uses a different database schema than the separately generated one.

kasey_junk

This is just project management. Teams of software devs have been doing this for decades. And it’s easier with agents because there is no harm in letting one sit idle.

furyofantares

On the same project you can use worktrees or otherwise separate clones of the repo; that part is not that bad. My comment was just about my own context switch.
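For what it's worth, a minimal worktree sketch (the directory and branch names are made up):

```
# Minimal sketch: give each agent its own working copy of the same repo via
# git worktrees. Directory and branch names are placeholders.
cd ~/src/myproject
git worktree add -b agent-a/task-1 ../myproject-agent-a
git worktree add -b agent-b/task-2 ../myproject-agent-b

# Run one agent session per directory; each works on its own branch without
# touching the other's files.

# Clean up a worktree once its branch is merged.
git worktree remove ../myproject-agent-a
```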

rcarr

A technique I have found that works well is to have it working on one feature and then to have another session planning the next. Whilst it's busy generating some code, I open up another instance, tell it the next task, and instruct it to create a Gherkin feature file with an implementation plan. I then go back and forth between reviewing the code for the current feature and the plan for the next one.

modarts

It still amuses me how literally people took Karpathy's famous tweet about vibe coding: https://x.com/karpathy/status/1886192184808149383

If people were to actually read beyond the first sentence, it would become clear very quickly that this was meant to be tongue in cheek.

pvtmert

Because most people have a context window of 10 tokens, they do not read further than the first sentence (or two).

stavros

I don't think it's tongue-in-cheek at all. It refers to a specific type of LLM coding, where you literally don't care about how bad the code is and just code stuff and hope it works. That's how I use the term, and that's why I use it rarely.

krapp

People took it seriously because that's exactly how a lot of LLM users think and exactly what they want 'coding' to be. Honestly I'm not even certain it is satire.

adriand

I'm starting to think that the rather slow nature of Claude Code is a feature. In fact if they suddenly sped things up by 10X, I would want an option to slow it back down. Sometimes I am fine with it working unsupervised while I empty the dishwasher or take a shower, but a lot of the time I watch it work. Not only does this help me stop it from going down rabbit holes / chewing through all of my Opus usage cap, but I have a much better understanding of what it's built, in the same way I might if I was pair-programming with someone and they were driving.

The idea of having multiple instances working in parallel sounds like a nightmare to me. I find this technology works best when it is guided, ideally in real time.

muratsu

I find Codex and Claude Code to have different strengths/weaknesses and wanted to be able to use them from a single interface. Currently hacking on https://devfleet.ai to make agent management easier for myself.

Briefly mentioned in the article, but async agents really thrive on small, scoped issues. Imagine hooking them up to your feedback tool (e.g. Canny) and automatically having a PR as you review the customer feedback. Now, this would likely not work for large asks, but for smaller asks you can just accept the PR and ship it really fast!

conradkay

Cool project! Do you think a lot of Codex's strengths are just from using GPT-5 as the model?

muratsu

The Codex model is trained differently from the normal models. It has extra training on how to use the CLI, and I find it to be better at project-scope tasks (e.g. running tests, migrations, etc.), whereas in my experience Claude is the better coding model.

manveerc

When I read the title, I thought you were referring to https://parallel.ai, which also is a game changer in my opinion :)

PS: I have no affiliation with Parallel the company

epolanski

Look, I like AI coding but we're already way past the need for parallelism.

LLMs write so much code in such a short time that the bottleneck is already the human having to review, correct, rewrite.

Parallel agents working on different parts of the application just compound this problem; it's impossible to catch up.

The only far-fetched use case I can see is swarming hundreds of solutions against a properly designed test case and spec documents and having an agent select the best solutions.

Still, I'm quite convinced humans would be the bottleneck.

SatvikBeri

It really depends on the project. For example, there's a lot of thorny devops debugging where I can just let Claude spin for 30 minutes and it'll solve the problem (or fail) with a relatively short final answer.

The sweet spot for me tends to be running one of these slower projects on a worktree in the background, and one more active coding project.

epolanski

Yeah sure, I mean, there will always be problems you can swarm.

OutOfHere

Exactly. With other models that are not Claude, the code generation for an issue takes a minute at most, whereas writing the detailed specification for it as a human takes me days or longer. Parallel code generation is as relevant to me as having a fast car stuck in traffic at a red light.

tptacek

So:

(1) I feel like most people call these async agents, though maybe "parallel" is the term that will stick.

(2) Async is great for reasons other than concurrent execution.

(3) Concurrent execution is tricky, at least for tightly defined projects, because the PRs will step on each other, and (maybe this is just me) I would rather rewrite an entire project than try to pick through a complicated merge conflict.

CuriouslyC

Nah, I saw this problem a while ago and already spec'd out the solution. First, agents need to be doing atomic commits; second, you can just have a massive merge queue with bisection. If you're using Bazel you can handle CI gating on thousands of PRs with very little overhead, and when a merge batch fails you find the bad patch set in O(log n) time and dispatch it to an agent for reconciliation. I even built a prototype; it works great in benchmarks, but I don't have a need for it over merge trains in GitLab yet.
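Not the prototype itself, just a rough sketch of the bisection step, assuming exactly one bad patch per batch and a `./ci.sh` script as the gate (both assumptions, not from the comment above):

```
# Hedged sketch: bisect a failed merge batch to find the offending PR branch
# in O(log n) CI runs. Assumes origin/main is green, exactly one bad branch,
# and a ./ci.sh gate script (all placeholders).
bisect() {
  if [ "$#" -eq 1 ]; then
    echo "offending branch: $1"
    return
  fi
  local branches=("$@")
  local mid=$(( $# / 2 ))
  local first=("${branches[@]:0:$mid}")
  local second=("${branches[@]:$mid}")

  git checkout -B batch-test origin/main
  if git merge --no-edit "${first[@]}" && ./ci.sh; then
    bisect "${second[@]}"   # first half merges and passes: bad patch is in the second half
  else
    git merge --abort 2>/dev/null || true
    bisect "${first[@]}"    # merge or gate failed: bad patch is in the first half
  fi
}

# Usage: bisect pr-101 pr-107 pr-112 pr-118
```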

shiroyasha

I agree, async does feel like a better description. I wish I had used that term for the title.

ravila4

My experience with parallel agents is that the bottleneck is not how fast we can produce code but the speed at which we can review it and context switch. Realistically, I don’t think most people have the mental capacity to supervise more than one simultaneous task of any real complexity.