
Developing with Kiro: Amazon's New Agentic IDE

agrippanux

From the article:

> What I found interesting is how it forced me to think differently about the development process itself. Instead of jumping straight into code, I found myself spending more time articulating what I actually wanted to build and high level software architectural choices.

This is what I already do with Claude Code. Case in point, I spent 2.5 hours yesterday planning a new feature - first working with an agent to build out the plan, then 4 cycles of having that agent spit out a prompt for another agent to critique the plan and integrate the feedback.

In the end, once I got a clean bill of health on the plan from the “crusty-senior-architect” agent, I had Claude build it - took 12 minutes.

Two passes of the senior-architect and crusty-senior-architect debating how good the code quality was / fixing a few minor issues and the exercise was complete. The new feature worked flawlessly. It took a shade over 3 hours to implement what would have taken me 2 days by myself.

I have been doing this workflow a while, but Claude Code released Agents yesterday (/agents) and I highly recommend them. You can define an agent on the basis of another agent, so crusty-architect is a clone of my senior-architect but it’s never happy unless code was super simple, maintainable, and uses well established patterns. The debates between the two remind me of sitting in conf rooms hashing an issue out with a good team.
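For anyone curious what an agent definition looks like: a subagent is roughly a Markdown file with YAML frontmatter placed under `.claude/agents/` in your project (or `~/.claude/agents/` for user-level agents), and the `/agents` command scaffolds one for you. A hypothetical sketch of a "crusty" variant (the frontmatter fields follow Claude Code's subagent convention; the prompt text is illustrative, not the commenter's actual agent):

```markdown
---
name: crusty-senior-architect
description: Skeptical review of plans and diffs. Use after any non-trivial plan or implementation.
tools: Read, Grep, Glob
---

You are a crusty senior architect. You are never satisfied unless the code is
as simple as possible, maintainable, and built on well-established patterns.
Critique the plan or diff you are given, list concrete problems, and refuse
to sign off until every one is addressed.
```

Cloning an agent is then just copying the file and editing the prompt body, which is how a base "senior-architect" can spawn a stricter sibling.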

pjm331

I love how I am learning about a new claude code feature in a comment on HN - nowhere to be found on their release notes https://docs.anthropic.com/en/release-notes/claude-code

Thanks for the tip!

I've been attempting to do this kind of thing manually w/ mcp - took a look at "claude swarm" https://github.com/parruda/claude-swarm - but in the short time I spent on it I wasn't having much success - admittedly I probably went a little too far into the "build an entire org chart of agents" territory

[EDIT]: looks like I should be paying attention to the changelog on the gh repo instead of the release notes

https://github.com/anthropics/claude-code/blob/main/CHANGELO...

[EDIT 2]: so far this seems to suffer from the same problem I had in my own attempts which is that I need to specifically tell it to use an agent when I would really like it to just figure that out on its own

like if I created an agent called "code-reviewer" and then I say - "review this code" ... use the agent!

shostack

Roo Code has had Orchestrator mode doing this for a while with your models of choice. And you can tweak the modes or add new ones.

What I have noticed is the forcing function of needing to think through the technical and business considerations of one's work up front, which can be tedious if you are the type that likes to jump in and hack at it.

For many types of coding needs, that is likely the smarter and ultimately more efficient approach. Measure twice, cut once.

What I have not yet figured out is how to reduce the friction in the UX of that process to make it more enjoyable. Perhaps sprinkling in some dopamine triggering gamification to answering questions.

gooosle

You planned and wrote a feature yesterday that would have taken you two whole days by yourself? And you already got it reviewed and deployed it and know that 'it works flawlessly'?

....

That reminds me of when my manager (a very smart, very AI-bullish ex-IC) told us about how he used AI to implement a feature over the weekend and all it took him was 20 mins. It sounds absolutely magical to me and I make a note to use AI more. I then go to review the PR, and of course there are multiple bugs and unintended side-effects in the code. Oh and there are like 8 commits spread over a 60 hour window... I manually spin up a PR which accomplishes the same thing properly... takes me 30mins.

fcarraldo

This sounds like a positive outcome? A manager built a proof-of-concept of a feature that clearly laid out and fulfilled the basic requirements, and an engineer took 30 mins to rewrite it properly once it had been specified.

How long does it typically take to spec something out? I'd say more than 20 mins, and typical artifacts to define requirements are much lossier than actual code - even if that code is buggy and sloppy.

gooosle

Not at all.

What was claimed was that a complete feature was built in record time with AI. What was actually built was a useless and buggy piece of junk that wasted reviewer time and was ultimately thrown out, and it took far longer than claimed.

There were no useful insights or speed up coming out of this code. I implemented the feature from scratch in 30 mins - because it was actually quite easy to do manually (<100 loc).

mym1990

This seems more of a process problem than a tooling problem. Without specs on what the feature was, I would be inclined to say your manager had a lapse in his "smartness", there was a lot of miscommunication about what was happening, or you are being overly critical of something that "wasted 30 minutes of your time". Additionally, this sounds like a crapshoot work environment... there seems to be resentment over the manager using AI to build a feature that had bugs/didn't work... whereas ideally you two would sit down, talk it out, and see how it could be handled better next time?

gooosle

Not at all, there is no resentment - that's your imagination. There is nothing about what I described that indicates that it's a bad work environment - I quite like it.

You're bringing up various completely unrelated factors seemingly as a way of avoiding the obvious point of the anecdotal story - that AI for coding just isn't that great (yet).

vitorbaptistaa

Would you mind sharing the prompts you use for your subagents? It sounds very interesting!

ofrzeta

How exactly do you "create an agent" with the personalities you are talking about?

therealbilliam

/agents

ofrzeta

The parent commenter had agents with personalities before the release of the agents feature in Claude Code, that's why I was asking.

hansmayer

> This long (hopefully substantial) blog post was written collaboratively, with Kiro handling the bulk of the content generation while I focused on steering the direction.

So Kiro wrote whatever Kiro "decided" (better said, guessed) to write about and did most of the "content generation" - a weird but fitting term for a machine writing a fake human blog. And the human kind of "directed" it, but we don't really know for sure. Language is our primary interface; shouldn't an author be able to express their thoughts without using a machine?

I'd be happier if the author shared their actual experience of writing the software with this tool.

ako

Why does it matter, as long as the output is of high quality? E.g., a Spielberg-directed movie indicates a level of quality, even if Spielberg didn't do everything himself.

jmogly

The words-to-thoughts ratio is way too high, it reads like an elementary school book report, it's way too long for how dry it is, and I could go on, but these are just some of my initial thoughts while reading the article. Also, knowing it is mostly written with AI, how do I know if details are real or made up? There's a reason you are reading my comment: it expressed thoughts or an image that you found captivating. Being able to write well is a privileged skill that improves your communication, your ability to express ideas, your humor; the things that make you an interesting person. You should not be outsourcing your voice to AI. Also, Spielberg wasn't writing an article - he was directing a movie.

ako

So we can debate the quality of this particular article, but in general if an author closely instructs the ai what to write, edits it, and uses the ai to express his thoughts faster and better, I have no problem with the use of ai as a tool to write better.

You're making assumptions about the quality of the article just because it's written with the help of ai; I don't think that's justified in general.

hansmayer

It matters because it's not about a vague notion of a "level of quality". It's about reading a personal experience written by an actual person. It's about not insulting the intelligence of one's readers by throwing up a wall of LLM-text and signing it oneself; it's about not being intellectually and morally dishonest by only kinda mentioning it, halfway through the text.

The comparison with Spielberg is almost there, but not quite, as the director does whatever it is directors do - not outsourcing it to some "Kiro". The right comparison would be if the AI created the next sequel to E.T., Gremlins, or whatever it was that Spielberg became famous for. Who cares? I want new and genuine insights that only another human can create. Not the results of a machine minimising statistical regression error in order to produce the 100th similar piece of "content" that I have already seen the first 99 times.

I have a feeling that none of the ghouls pushing AI-generated "content" have ever truly enjoyed the arts, whether 'popular' or 'noble'. It's about learning something about yourself and the world, trying to grasp the author's struggles during the creation. Not about mindless consumption. That's why it matters.

StableAlkyne

> It's about reading about a personal experience written by an actual person

This seems to be the dividing line in the AI writing debate.

If one cares about the interpersonal connection formed with the author, generally they seem to strongly dislike machine-generated content.

If one cares about the content in isolation, then generally the perceived quality is more important than authorship. "The author is dead" and all that.

Both points are valid IMO. It's okay to dislike things, and it's okay to enjoy things.

> I want new and genuine insights that only another human can create.

This is a good illustration of what I mean: you personally value the connection with the author, and you can't get a human connection when there was never a human to begin with.

If you take a look at the others in the thread who had a positive view of the work, they generally focused on the content instead.

sokoloff

> I want new and genuine insights that only another human can create.

I’m not sure that’s true in an iron-clad, first-principles way. I think that many of the insights created by humans are combinations/integrations of existing concepts. That category of insight does not seem to require carbon-based reasoning.

I don’t claim that it can be achieved by statistical text generation, but I doubt the typical blog author is creating something that forever will be human-only.

iamsaitam

It's not high quality output when an opinion based article isn't written by its author. It's fiction.

hansmayer

Thumbs up to this concise comment, but I'd not even honour the blog author by calling it fiction. Probably more of a hallucination.

LeafItAlone

Where is the line?

Is it not high quality output when a ghost writer writes someone’s work? Can I use a thesaurus? What about a translator?

As long as the person who is putting their name on the article has provided the ideas and ensures that the output matches their intent, I see no reason to draw arbitrary lines.

block_dagger

It’s objectively not fiction, whatever it is.

g8oz

If they can't be bothered to write it, I can't be bothered to read it.

f1shy

I’m pretty sure Mr Spielberg DOES something. If he does absolutely nothing, I really doubt the phrase “directed movie indicates a level of quality” can have any level of truth in a general case.

ako

Ai is just another tool, it doesn’t do anything by itself, in the hands of a person with a great story it can be a great tool to help write it. In the hands of a bad writer it will probably just create bad content. Ai is a tool controlled by a person.

Do you really care if Spielberg’s team manually edits the movie or uses an AI powered video editing tool? In the end Spielberg is responsible for the end quality of the movie.

rs186

Well, you just answered the question yourself, by inadvertently using a good example.

You see, a movie is fictional work, but a blog article most likely isn't (or shouldn't). In this case, I am reading the article because I want to know an objective, fair assessment of Kiro from a human, not random texts generated from an LLM.

wzdd

The output is not of high quality. It is extremely verbose for what it is trying to say, and I found myself skimming it while dealing with

1. The constant whiplash of paragraphs which describe an amazing feature followed by paragraphs which walk it back ("The shift is subtle but significant / But I want to be clear", "Kiro would implement it correctly / That said, it wasn't completely hands-off", "The onboarding assistance was genuinely helpful / However, this is also where I encountered", "It's particularly strong at understanding / However, there are times when");

2. Bland analogies that detract from, rather than enhance, understanding ("It's the difference between being a hands-on manager who needs to check every detail versus setting clear expectations and trusting the process.", "It's like having a very knowledgeable colleague who..."); and

3. Literal content-free filler ("Here's where it got interesting", "Don't let perfect be the enemy of good", "Most importantly / More importantly"), etc. etc.

Kiro is a new agentic IDE which puts much more of a focus on detailed, upfront specification than competitors like Cursor. That's great. Just write about that.

heisgone

It's more like if Spielberg produced a documentary about himself.

danr4

Ctrl + F -> "Claude Code" -> No Results -> Close tab

Can't really get value out of reading this if you don't compare it to the leading coding agent.

brunooliv

I think a big reason why Claude Code is winning is because it’s such a thin wrapper over a very strong base model which is why people are afraid of comparing it directly. All these IDE integrations and GUIs and super complex system prompts etc are only bloating all these other solutions with extra complexity, so comparing something inherently different becomes also harder.

redhale

Agree. I stopped reading after the blurb below because it tells me this person has not actually even used Copilot or Cursor to a serious degree. This is an AI-written sentence that seems fine, but is actually complete nonsense.

> Each tool has carved out its own niche in the development workflow: Copilot excels at enhancing your typing speed with intelligent code completion, Cursor at debugging and helping you implement discrete tasks well, and recently pushing more into agentic territory.

Cursor's autocomplete blows Copilot's out of the water. And both Copilot and Cursor have pretty capable agents. Plus, Claude Code isn't even mentioned here.

This blog post is a Kiro advertisement, not a serious comparative analysis.

christophilus

Outside of its excellent capabilities, the thing I most love about Claude Code is that I can run it in my containers. I don’t want Cursor or other bloated, dependency-ridden, high-profile security targets on my desktop.

ofrzeta

Kiro uses Claude Sonnet 4.0 if that matters.

Cu3PO42

I wanted a tiny helper tool to display my global keyboard shortcuts for me on macOS. I gave Kiro a short spec and some TypeScript describing the schema of the input data.

It wrote around 5000 LOC including tests and they... worked. It didn't look as nice as I would have liked, but I wasn't able to break it. However, 5000 lines was way too much code for such a simple task, the solution was over-engineered along every possible axis. I was able to (manually) get it down to ~800LOC without losing any important functionality.

coev

Kiro is from Amazon, so Conway's law in action?

thr0w

> I was able to (manually) get it down to ~800LOC without losing any important functionality.

This is funny. Why would you a) care how many LOC it generated and b) bother injecting tedious, manual process into something otherwise fully automated?

Cthulhu_

What year is it? Back in 2000 or before, the same arguments were made about webpages made in Dreamweaver and Frontpage. Shortly after there was a big push towards making the web faster and more efficient, which included stepping away from web page builders and building tools that optimized and minified all aspects of a webpage.

HighGoldstein

Then, we ended up bundling 50MB of minified frameworks in every page load. Maybe the next part of the cycle will be for optimizing this aspect of LLMs, maybe even to be able to fit more meaningful code in the context windows of these very LLMs.

Cu3PO42

I care about the complexity because I want/need to maintain the code down the line. I find it much easier to maintain shorter, simpler code than long, complex code.

Also because it was an experiment. I wanted to see how it would do and how reasonable the code it wrote was.

x187463

I'm sure somebody is going to point out that it was written by AI and is a toy, therefore it can be maintained by AI. I share your desire to have human maintainable code, but I imagine one of the goals of AI written code is to allow the AI to manage it, end-to-end.

sqrtc

I tried out Kiro last week on quite a gnarly date time parsing problem. I had started with a two hundred-ish word prompt and a few bits of code examples for context to describe the problem. Similar to OP it forced me to stop and think more clearly about the problem I was trying to solve and in the end left my jaw on the floor as I saw it work through the task list.

The only early bit of feedback I had was that my tasks were also writing a lot of tests, and if the feedback loop for getting test results were neater this would be insanely powerful. Something like a sandboxed terminal; I am less keen on a YOLO mode and had to keep authorising the terminal to run.

jon-wood

This sort of comment always fascinates me. Having a machine do the last leaps for you is a time saver I guess, but I often wonder whether the real thing people are discovering again is that sitting down and really thinking about the problem you're trying to solve before writing some code results in better solutions when you get to it.

8note

it's not that i haven't spent time thinking about it - i still do my thinking first, mostly on paper.

the LLM however asks me clarifying questions that i wouldn't have thought of myself. the thinking goes a step or two deeper than it did before, if the LLM comes up with good questions

kubb

They wouldn't have thought more deeply about it if the model didn't "tell" them to.

gbrindisi

meta: if you use AI to write articles, don’t have them written so that I’m forced to use AI to summarize them

conartist6

Reading the HN comments instead of the article is the best summarizing hack

djeastm

I kind of hate the implications of it, but if HN (or someone else) wanted to add value, they could show one-line sentiment analyses of the comments in the HN articles so you can decide what's what without even clicking.

conartist6

The reason reading comments is so useful is because it's not one summary but a variety of different, unique reactions (and reactions to reactions).

The model I want to train is ME, so a one sentence sentiment analysis offers 0 value to me while a lot of distinct human perspectives are a gold mine.

It's kinda like the difference between being able to look at a single picture of a landscape and being able to walk around it.

Cthulhu_

Ironically there was a tool just the other day that would read HN articles and summarize them.

amarcheschi

AI decompressors doing their best again. Seriously, though, the article feels too long imho.

iamsaitam

Better, it should be compulsory for these to lead with a summarized version

lvl155

I am just gonna say it. This is not something Kiro came up with. People were already using this workflow. Perhaps they should've added more features instead of spending time making promo videos of themselves. I fail to see any added value here, especially considering it's Amazon. Sonnet 4 is effectively unlimited for many Max users, so giving that up to work through their list of bugs is a non-starter.

evertedsphere

at this point i think we ought to start having a tag in the submission title for when the submission is (primarily) llm-generated, like we do for video/pdf/nsfw

dalmo3

Too late. YouTube music channels already began explicitly tagging videos as "Human made music".

999900000999

Kiro’s main advantage is Amazon is paying for my LLM usage instead of me.

For the most part it's unlimited right now. VS Code's Copilot Agent mode is basically the same thing (tell it to write a list of tasks), but I have to pay for it.

I’m much happier with both of these options, both are much cheaper than Claude Code.

IMO the real race is to get LLM cost down. Someone smarter than me is going to figure out how to run a top LLM model for next to nothing.

This person will be a billionaire. Nvidia and AMD are probably already working on it. I want DeepSeek running on a $100 computer that uses a nominal amount of power.

brokegrammer

My thoughts exactly. Inference should be dirt cheap for LLMs to truly become powerful.

It's similar to how computing used to be restricted to mega corps 100 years ago, but today, a smartphone has more computing power than any old age mainframe. Today we need Elon Musk to buy 5 million GPUs to train a model. Tomorrow, we should be able to train a top of the line model using a budget RTX card.

999900000999

Tbh, if the model is small enough you can train locally.

I don't need my code assistant to be an expert on Greek myths. The future is probably highly specialized mini llms. I might train a model to code my way.

I'm not smart enough to figure this out, but the solution can't be to just brute-force training with more GPUs.

There is another answer.

glietu

Spec-based development is a game changer for people with a non-coding background working on side projects. I'm using this to fork a design flow I use for analog/RFIC design, and I can finally stitch together open-source CAD tools to fit that flow, tools I'd otherwise use in isolation.

iamkonstantin

Apart from the UI being more point and click, what does Kiro (or Cursor) add that for example Claude Code's CLI doesn't do?

> With Kiro, I spend more time upfront articulating what I want to build, but then I can step back and let it execute

This sounds like exactly the kind of exercise one does to /init a project with Claude, define tasks/spec etc.

adamgordonbell

So with Kiro, you iterate on a spec and a task list as the key activity?

It sounds like a very PM type approach to coding.

Does that mean it fits PM types more than IC dev types?

mdaniel

I believe that's why some software engineers dread this new future: it's no longer software engineering but rather project management and code review, pretty much all day.

I personally find code review more exhausting than code writing, and that goes 50x for when I'm reviewing code from an intern because they are trying to cheat me all the time. And I have always hated PM stuff, on both sides of that relationship