Claude Code: Best practices for agentic coding

jasonjmcghee

Surprised that "controlling cost" isn't a section in this post. Here's my attempt.

---

If you get the hang of controlling costs, it's much cheaper. If you're exhausting the context window, I would not be surprised if you're seeing high cost.

Be aware of the "cache".

Tell it to read specific files (and only those!). If you don't, it'll read unnecessary files, repeatedly read sections of files, or even search through files.

Avoid letting it search - even halt it. Find / rg can produce thousands of tokens of output depending on the search.

Never edit files manually during a session (that'll bust cache). THIS INCLUDES LINT.

The cache also goes away after 5-15 minutes or so (not sure) - so avoid leaving sessions open and coming back later.

Never use /compact (that'll bust cache; if you need to, you're going back and forth too much or using too many files at once).
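
To make the cache advice above concrete, here is a rough sketch of how cache misses change the input-token bill for a session. It is a toy model, not how Claude Code is actually billed: the base price and the write/read multipliers are assumed values chosen for illustration, `buildTurns` and `sessionInputCost` are hypothetical helper names, and output tokens are ignored entirely.

```javascript
// Toy model of input-token cost across a session with prompt caching.
// Every number here is an assumption for illustration, not real pricing.
const PRICE_PER_MTOK = 3.0;    // assumed base input price, USD per million tokens
const CACHE_WRITE_MULT = 1.25; // assumed multiplier for tokens written to the cache
const CACHE_READ_MULT = 0.1;   // assumed multiplier for tokens read from the cache

// Build a session: the first turn loads `startTokens`, each later turn adds
// `perTurn` tokens. Turn indexes listed in `bustedAt` miss the cache
// (for example after a manual edit or an expired cache).
function buildTurns(count, startTokens, perTurn, bustedAt = []) {
  const turns = [];
  let prefix = 0;
  for (let i = 0; i < count; i++) {
    const fresh = i === 0 ? startTokens : perTurn;
    const cacheHit = i > 0 && !bustedAt.includes(i);
    turns.push({ prefixTokens: prefix, newTokens: fresh, cacheHit });
    prefix += fresh;
  }
  return turns;
}

// On a hit the old prefix is billed at the read rate; on a miss it is
// re-written at the write rate. New tokens are always written.
function sessionInputCost(turns) {
  let weightedTokens = 0;
  for (const t of turns) {
    const prefixMult = t.cacheHit ? CACHE_READ_MULT : CACHE_WRITE_MULT;
    weightedTokens += t.prefixTokens * prefixMult + t.newTokens * CACHE_WRITE_MULT;
  }
  return (weightedTokens * PRICE_PER_MTOK) / 1e6;
}

const warm = sessionInputCost(buildTurns(10, 40000, 2000));
const busted = sessionInputCost(buildTurns(10, 40000, 2000, [3, 6, 9]));
console.log(`all hits: $${warm.toFixed(2)}, three cache busts: $${busted.toFixed(2)}`);
```

Under these assumptions, every bust re-bills the whole accumulated context at the write rate, which is why a few misses in a long session can dominate the total.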

Don't let files get too big (it's good hygiene too) to keep the context window sizes smaller.

Have a clear goal in mind and keep sessions to as few messages as possible.

Write / generate markdown files with needed documentation using claude.ai, and save those as files in the repo and tell it to read that file as part of a question. I'm at about ~$0.5-0.75 for most "tasks" I give it. I'm not a super heavy user, but it definitely helps me (it's like having a super focused smart intern that makes dumb mistakes).

If I need to feed it a ton of docs etc. for some task, it'll be more in the few $, rather than < $1. But I really only do this to try some prototype with a library claude doesn't know about (or is outdated). For hobby stuff, it adds up - totally.

For a company, massively worth it. Insanely cheap productivity boost (if developers are responsible / don't get lazy / don't misuse it).

sagarpatil

If I have to be so cautious while using a tool, I might as well write the code myself lol. I’ve used Claude Code extensively and it is one of the best AI IDEs. It just gets things done. The only downside is the cost. I was averaging $35-$40/day. At this cost, I’d rather just use Cursor/Windsurf.

BeetleB

Oh wow. Reading your comment guarantees I'll never use Claude Code.

I use Aider. It's awesome. You explicitly specify the files. You don't have to do work to limit context.

jjallen

Not having to specify files is a humongous feature for me. Having to remember which file code is in is half the work once you pass a certain codebase size.

m3kw9

That sometimes works, sometimes doesn’t, and takes 10x the time. Same with codex. I would have both and switch between them depending on which you feel will get it right.

LeafItAlone

Aider is a great tool. I do love it. But I find I have to do more with it to get the same output as Claude Code (no matter what LLM I used with Aider). Sure it may end up being cheaper per run, but not when my time is factored in. The flip side is I find Aider much easier to limit.

Game_Ender

What are those extra things you have to do more of? I only have experience with Aider so I am curious what I am missing here.

simonw

With Claude Code you can at least type "/cost" at any point to see how much it's spent, and it will show you when you end a session (with Ctrl+C) too.

The output of /cost looks like this:

  > /cost 
    ⎿  Total cost: $0.1331
       Total duration (API): 1m 13.1s
       Total duration (wall): 1m 21.3s

boredtofears

Yeah, I tried CC out and quickly noticed it was spending $5+ for simple LLM capable tasks. I rarely break $1-2 a session using aider. Aider feels like more of a precision tool. I like having the ability to manually specify.

I do find Claude Code to be really good at exploration though - like checking out a repository I'm unfamiliar with and then asking questions about it.

Jerry2

>I use Aider. It's awesome.

What do you use for the model? Claude? Gemini? o3?

m3kw9

Gemini 2.5 pro is my choice

kiratp

The productivity boost can be so massive that this amount of fiddling to control costs is counterproductive.

Developers tend to seriously underestimate the opportunity cost of their own time.

Hint - it’s many multiples of your total compensation broken down to 40 hour work weeks.

Aurornis

The cost of the task scales with how long it takes, plus or minus.

Substitute “cost” with “time” in the above post and all of the same tips are still valuable.

I don’t do much agentic LLM coding but the speed (or lack thereof) was one of my least favorite parts. Using any tricks that narrow scope, prevent reprocessing files over and over again, or searching through the codebase are all helpful even if you don’t care about the dollar amount.

pizza

Hard agree. Whether it's 50 cents or 10 dollars per session, I'm using it to get work done for the sake of quickly completing work that aims to unblock many orders of magnitude more value. But in so far as cheaper correct sessions correlate with sessions where the problem solving was more efficient anyhow, they're fairly solid tips.

afiodorov

I agree but optimisation often reveals implementation details helping to understand limits of current tech more. It might not be worth the time but part of engineering is optimisation and another part is deep understanding of tech. It is sometimes worth optimising anyway if you want to take the engineering discipline to the next level within yourself.

I myself didn’t think about not running linters however it makes obvious sense now and gives me the insight about how Claude Code works allowing me to use this insight in related engineering work.

pclmulqdq

It's interesting that this is a problem for people because I have never spent more than about $0.50 on a task with Claude Code. I have pretty good code hygiene and I tell Claude what to do with clear instructions and guidelines, and Claude does it. I will usually go through a few revisions and then just change anything myself if I find it not quite working. It's exactly like having an eager intern.

jjmarr

I don't think about controlling cost because I price my time at US$40/h and virtually all models are cheaper than that (with the exception of o1 or Gemini 2.5 pro).

If I spend $2 instead of $0.50 on a session but I had to spend 6 minutes thinking about context, I haven't gained any money.
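
The arithmetic behind that trade-off can be written out as a small sketch. The four input figures come from the comment above; the function name `contextTradeoff` and its field names are made up for this example.

```javascript
// Compare money saved on the API bill with the value of the time spent
// managing context. All money values are in USD.
function contextTradeoff({ hourlyRate, minutesSpent, costWithoutCare, costWithCare }) {
  const timeCost = (hourlyRate * minutesSpent) / 60;       // value of the developer's time
  const apiSavings = costWithoutCare - costWithCare;        // reduction in session cost
  const breakEvenMinutes = (apiSavings * 60) / hourlyRate;  // minutes at which both are equal
  return { timeCost, apiSavings, net: apiSavings - timeCost, breakEvenMinutes };
}

const result = contextTradeoff({
  hourlyRate: 40,       // from the comment
  minutesSpent: 6,      // from the comment
  costWithoutCare: 2.0, // from the comment
  costWithCare: 0.5,    // from the comment
});
console.log(result);
// timeCost: 4, apiSavings: 1.5, net: -2.5, breakEvenMinutes: 2.25
```

With these numbers the six minutes cost $4 of time to save $1.50, so the careful session comes out $2.50 behind; it only pays off if the extra thinking takes less than about two and a quarter minutes.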

jasonjmcghee

If you do it a bit, it just becomes habit / no extra time or cognitive load.

Correlation or causation aside, the same people I see complain about cost, complain about quality.

It might indicate more tightly controlled sessions may also produce better results.

Or maybe it's just people that tend to complain about one thing, complain about another.

owebmaster

Important to remind people this is only true if you have a profitable product, otherwise you’re spending money you haven’t earned.

jjmarr

If what I'm doing doesn't have a positive expected value, the correct move isn't to use inferior dev tooling to save money, it's to stop working on it entirely.

jasonjmcghee

If your expectation is to produce the same amount of output, you could argue when paying for AI tools, you're choosing to spend money to gain free time.

4 hours coding project X or 3 hours and a short hike with your partner / friends etc

irthomasthomas

I assume they use a conversation, so if you compress the prompt immediately you should only break cache once, and still hit cache on subsequent prompts?

So instead of Write Hit Hit Hit

It's Write Write Hit Hit Hit
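
A toy simulation of that reasoning, assuming a cache that scores a hit whenever a previously sent conversation is an exact prefix of the new one. This is only a sketch of the idea: real prompt caches work on token prefixes, have expiry times, and may behave differently, and the names `isPrefix` and `simulate` are invented here.

```javascript
// True when every message in `prefix` matches the start of `full`.
function isPrefix(prefix, full) {
  return prefix.length <= full.length && prefix.every((msg, i) => msg === full[i]);
}

// Each request is the full list of messages sent. A request is a "Hit" if an
// earlier request is a prefix of it, otherwise a "Write". Either way the
// request is then remembered for later turns.
function simulate(requests) {
  const cached = [];
  return requests.map((messages) => {
    const hit = cached.some((earlier) => isPrefix(earlier, messages));
    cached.push(messages);
    return hit ? "Hit" : "Write";
  });
}

// Normal session: every request extends the previous one.
const uncompacted = simulate([
  ["q1"],
  ["q1", "a1", "q2"],
  ["q1", "a1", "q2", "a2", "q3"],
  ["q1", "a1", "q2", "a2", "q3", "a3", "q4"],
]);

// Compacting right after the first turn replaces the history with a summary,
// so the next request shares no prefix with anything cached.
const compacted = simulate([
  ["q1"],
  ["summary", "q2"],
  ["summary", "q2", "a2", "q3"],
  ["summary", "q2", "a2", "q3", "a3", "q4"],
  ["summary", "q2", "a2", "q3", "a3", "q4", "a4", "q5"],
]);

console.log(uncompacted.join(" ")); // Write Hit Hit Hit
console.log(compacted.join(" "));   // Write Write Hit Hit Hit
```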

gundmc

> Never edit files manually during a session (that'll bust cache). THIS INCLUDES LINT

Yesterday I gave up and disabled my format-on-save config within VSCode. It was burning way too many tokens with unnecessary file reads after failed diffs. The LLMs still have a decent number of failed diffs, but it helps a lot.

chewz

My attempt is - Do not use Claude Code at all, it is a terrible tool. It is bad at almost everything, starting with making simple edits to files.

And most of all Claude Code is overeager to start messing with your code and run up unnecessary $$ instead of making a sensible plan.

This isn't a problem with Claude Sonnet - it is a fundamental problem with Claude Code.

winrid

I pretty much one shot a scraper from an old Joomla site with 200+ articles to a new WP site, including all users and assets, and converting all the PDFs to articles. It cost me like $3 in tokens.

hu3

I guess the question then is: can't VScode Copilot do the same for a fixed $20/month? It even has access to all SOTA models like Claude 3.7, Gemini 2.5 Pro and GPT o3

troupo

was it a wget call feeding into html2pdf?

simonw

The "ultrathink" thing is pretty funny:

> We recommend using the word "think" to trigger extended thinking mode, which gives Claude additional computation time to evaluate alternatives more thoroughly. These specific phrases are mapped directly to increasing levels of thinking budget in the system: "think" < "think hard" < "think harder" < "ultrathink." Each level allocates progressively more thinking budget for Claude to use.

I had a poke around and it's not a feature of the Claude model, it's specific to Claude Code. There's a "megathink" option too - it uses code that looks like this:

  let B = W.message.content.toLowerCase();
  if (
    B.includes("think harder") ||
    B.includes("think intensely") ||
    B.includes("think longer") ||
    B.includes("think really hard") ||
    B.includes("think super hard") ||
    B.includes("think very hard") ||
    B.includes("ultrathink")
  )
    return (
      l1("tengu_thinking", { tokenCount: 31999, messageId: Z, provider: G }),
      31999
    );
  if (
    B.includes("think about it") ||
    B.includes("think a lot") ||
    B.includes("think deeply") ||
    B.includes("think hard") ||
    B.includes("think more") ||
    B.includes("megathink")
  )
    return (
      l1("tengu_thinking", { tokenCount: 1e4, messageId: Z, provider: G }), 1e4
    );
Notes on how I found that here: https://simonwillison.net/2025/Apr/19/claude-code-best-pract...
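
For readability, the same phrase-to-budget mapping can be restated as a small table-driven function. The phrases and token counts are copied from the minified snippet above; the names `THINKING_TIERS` and `thinkingBudget` are invented here, the telemetry call is left out, and returning `null` for other messages is an assumption, since the snippet does not show what happens in that case.

```javascript
// Tiers are checked in order, highest budget first. Order matters because
// "think harder" also contains the lower-tier phrase "think hard".
const THINKING_TIERS = [
  {
    budget: 31999,
    phrases: [
      "think harder", "think intensely", "think longer", "think really hard",
      "think super hard", "think very hard", "ultrathink",
    ],
  },
  {
    budget: 10000,
    phrases: [
      "think about it", "think a lot", "think deeply",
      "think hard", "think more", "megathink",
    ],
  },
];

// Returns the thinking-token budget for a message, or null when no listed
// phrase appears (assumed fallback; not shown in the original snippet).
function thinkingBudget(message) {
  const text = message.toLowerCase();
  for (const tier of THINKING_TIERS) {
    if (tier.phrases.some((phrase) => text.includes(phrase))) {
      return tier.budget;
    }
  }
  return null;
}

console.log(thinkingBudget("Please ULTRATHINK about this bug")); // 31999
console.log(thinkingBudget("think hard about the schema"));      // 10000
```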

westoncb

That's awesome, and almost certainly an Unreal Tournament reference (when you chain enough kills in short time it moves through a progression that includes "megakill" and "ultrakill").

orojackson

Not gonna lie: the "ultrathink" keyword that Sonnet 3.7 with thinking tokens watches for gives me "doubleplusgood" vibes in a hilarious but horrifying way.

4b11b4

At this point should we get our first knob/slider on a language model... THINK

..as if we're operating this machine as an analog synth

soulofmischief

There are already many such adjustable parameters such as temperature and top_k

mr-karan

What I don’t like about Claude Code is why can’t they give command line flags for this stuff? It’s better documented and people don’t have to discover this the hard way.

Similarly, I do miss an --add command line flag to manually specify the context (files) during the session. Right now I pretty much end up copy pasting the relative paths from VSCode and supplying them to Claude. Aider has much better semantics for such stuff.

anotherpaulg

In aider, instead of “ultrathink” you would say:

  /thinking-tokens 32k
Or, shorthand:

  /think 32k

pyfon

Weird code to have in a modern AI system!

Also 14 string scans seems a little inefficient!

Aurornis

14 checks through a string is entirely negligible relative to the amount of compute happening. Like a drop of water in the ocean.

zoogeny

So I have been using Cursor a lot more in a vibe code way lately and I have been coming across what a lot of people report: sometimes the model will rewrite perfectly working code that I didn't ask it to touch and break it.

In most cases, it is because I am asking the model to do too much at once. Which is fine, I am learning the right level of abstraction/instruction where the model is effective consistently.

But when I read these best practices, I can't help but think of the cost. The multiple CLAUDE.md files, the files of context, the urls to documentation, the planning steps, the tests. And then the iteration on the code until it passes the test, then fixing up linter errors, then running an adversarial model as a code review, then generating the PR.

It makes me want to find a way to work at Anthropic so I can learn to do all of that without spending $100 per PR. Each of the steps in that last paragraph is an expensive API call for us ISVs and each requires experimentation to get the right level of abstraction/instruction.

I want to advocate to Anthropic for a scholarship program for devs (I'd volunteer, lol) where they give credits to Claude in exchange for public usage. This would be structured similar to creator programs for image/audio/video gen-ai companies (e.g. runway, kling, midjourney) where they bring on heavy users that also post to social media (e.g. X, TikTok, Twitch) and they get heavily discounted (or even free) usage in exchange for promoting the product.

istjohn

Why do you think it's supposed to be cheap? Developers are expensive. Claude doesn't have to be cheap to make software development quicker and cheaper. It just has to be cheaper than you.

There are ways to use LLMs cheaply, but it will always be expensive to get the most out of them. In fact, the top end will only get more and more costly as the lengths of tasks AIs can successfully complete grows.

zoogeny

I am not implying in any sense a value judgement on cost. I'm stating my emotions at the realization of the cost and how that affects my ability to use the available tools in my own education.

It would be no different than me saying "it sucks university is so expensive, I wish I could afford to go to an expensive college but I don't have a scholarship" and someone then answers: why should it be cheap.

So, allow me the space to express my feelings and propose alternatives, of which scholarships are one example and creative programs are another. Another one I didn't mention would be the same route as universities force now: I could take out a loan. And I could consider it an investment loan with the idea it will pay back either in employment prospects or through the development of an application that earns me money. Other alternatives would be finding employment at a company willing to invest that $100/day through me, the limit of that alternative being working at an actual foundational model company for presumably unlimited usage.

And of course, I could focus my personal education on squeezing the most value for the least cost. But I believe the balance point between slightly useful and completely transformative usages levels is probably at a higher cost level than I can reasonably afford as an independent.

qudat

> It just has to be cheaper than you.

Not when you need an SWE in order for it to work successfully.

farzd

general public, ceo, vc consensus is that - if it can understand english, anyone can do it. crazy

Wowfunhappy

> So I have been using Cursor a lot more in a vibe code way lately and I have been coming across what a lot of people report: sometimes the model will rewrite perfectly working code that I didn't ask it to touch and break it.

I don't find this particularly problematic because I can quickly see the unnecessary changes in git and revert them.

Like, I guess it would be nice if I didn't have to do that, but compared to the value I'm getting it's not a big deal.

zoogeny

I agree with this in the general sense but of course I would like to minimize the thrash.

I have become obsessive about doing git commits in the way I used to obsess over Ctrl-S before the days of source control. As soon as I get to a point I am happy, I get the LLM to do a check-point check in so I can minimize the cost of doing a full directory revert.

But from a time and cost perspective, I could be doing much better. I've internalized the idea that when the LLM goes off the rails it was my fault. I should have prompted it better. So I am now considering: how do I get better faster? And the answer is I do it as much as I can to learn.

I don't just want to whine about the process. I want to use that frustration to help me improve, while avoiding going bankrupt.

k__

That's why I like Aider.

You can protect your files in a non-AI way: by simply not giving write access to Aider.

Also, apparently Aider is a bit more economic with tokens than other tools.

zoogeny

I haven't used Aider yet, but I see it show up on HN frequently recently (the last couple of days specifically).

I am hesitant because I am paying for Cursor now and I get a lot of model usage included within that monthly cost. I'm cheap, perhaps to a fault even when I could afford it, and I hate the idea of spending twice when spending once is usually enough. So while Aider is potentially cheaper than Claude Code, it is still more than what I am already paying.

I would appreciate any comments from people who have made the switch from Cursor to Aider. Are you paying more/less? If you are paying more, do you feel the added value is worth the additional cost? If you are paying less, do you feel you are getting less, the same or even more?

Game_Ender

With Aider you pay API fees only. You can get simple tasks done for a few dollars. I suggest budgeting $20 or so dollars and giving it a go.

alchemist1e9

As an Aider user who has never tried Cursor, I’d also be interested in hearing from any Aider users who are using Cursor and how it compares.

flashgordon

So I feel like a grandpa reading this. I gave Claude code a solid shot. Had some wins but costs started blowing up. I switched to Gemini AI where I only upload files I want it to work on and make sure to refactor often so modularity remains fairly high. It's an amazing experience. If this is any measure - I've been averaging about 5-6 "small features" per 10k tokens. And I totally suck at fe coding!! The other interesting aspect of doing it this way is being able to break up problems and concerns. For example in this case I only worked on fe without any backend and fleshed it out before starting on a backend.

xpe

by fe the poster means FE (front-end)

flashgordon

Sorry yes. I should have clarified that.

sbszllr

The issue with many of these tips is that they require you to use claude code (or codex cli, doesn't matter) to spend way more time in it, feed it more info, generate more outputs --> pay more money to the LLM provider.

I find LLM-based tools helpful, and use them quite regularly but not 20 bucks+, let alone 100+ per month that claude code would require to be used effectively.

ramoz

Interesting, I have $100 days with Claude Code. Beyond effective.

dist-epoch

> let alone 100+ per month that claude code would require

I find this argument very bizarre. $100 is pay for 1-2 hours of developer time. Doesn't it save at least that much time in a whole month?

nrvn

what happened to the "$5 is just a cup o' coffee" argument? Are we heading towards the everything-for-$100 land?

On a serious note, there is no clear evidence that any of the LLM-based code assistants will contribute to saving developer time. Depends on the phase of the project you are in and on a multitude of factors.

rsyring

I'm a skeptical adopter of new tech. But I cut my teeth on LLMs a couple years ago when I was dropped into a project using an older framework I wasn't familiar with. Even back then, LLMs helped me a ton to get familiar with the project and use best practices when I wasn't sure what those were.

And that was just copy & paste into ChatGPT.

I don't know about assistants or project integration. But, in my experience, LLMs are a great tool to have and worth learning how to use well, for you. And I think that's the key part. Some people like heavily integrated IDEs, some people prefer a more minimal approach with VS Code or Vim.

I think LLMs are going to be similar. Some people are going to want full integration and some are just going to want minimal interface, context, and edits. It's going to be up to the dev to figure out what works best for him or her.

fnordpiglet

While I agree, I find the early phases to be the least productive use of my time as it’s often a lot of boilerplate and decisions that require thought but turn out to matter very little. Paying $100 to bootstrap to midlife on a new idea seems absurdly cheap given my hourly.

panny

Just a few days ago Cursor saved a lot of developer time by encouraging all the customers to quit using a product.

https://news.ycombinator.com/item?id=43683012

Developer time "saved" indeed ;-)

owebmaster

No, it doesn't. If you are still looking for product market fit, it is just cost.

After 2 years of GPT4 release, we can safely say that LLMs don't make finding PMF that much easier nor improve general quality/UX of products, as we still see a general enshittification trend.

If this spending was really game-changing, ChatGPT frontend/apps wouldn't be so bad after so long.

mikeg8

Finding product market fit is a human directional issue, and LLMs absolutely can help speed up iteration time here. I’ve built two RoR MVPs for small hobby projects spending ~$75 in Claude code to make something in a day that would have previously taken me a month plus. Again, absolutely bizarre that people can’t see the value here, even as these tools are still working through their kinks.

mrbombastic

Enshittification is the result of shitty incentives in the market not because coding is hard

joshstrange

The most interesting part of this article for me was:

> Have multiple checkouts of your repo

I don’t know why this never occurred to me, probably because it feels wrong to have multiple checkouts, but it makes sense so that you can keep each AI instance running at full speed. While LLMs are fast, waiting for an instance of Aider or Claude Code to finish something is one of the annoying parts.

Also, I had never heard of git worktrees, that’s pretty interesting as well and seems like a good way to accomplish effectively having multiple checkouts.
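
As a rough sketch of what that setup could look like, the helper below only builds the `git worktree add` command lines for a set of parallel tasks; it does not run them. `git worktree add -b <branch> <path>` is a standard git invocation, but the helper name `worktreeCommands`, the sibling-directory layout, and the branch names are all hypothetical choices for this example.

```javascript
// Build one `git worktree add` command per task. Each command creates a new
// branch named after the task and checks it out in a sibling directory, so
// several agent sessions can work without touching each other's files.
function worktreeCommands(repoName, tasks) {
  return tasks.map((task) => `git worktree add -b ${task} ../${repoName}-${task}`);
}

const commands = worktreeCommands("myapp", ["fix-login", "add-export"]);
for (const command of commands) {
  console.log(command);
}
// git worktree add -b fix-login ../myapp-fix-login
// git worktree add -b add-export ../myapp-add-export
```

Run from inside the main checkout, each printed command would give one agent its own working directory; `git worktree list` shows them and `git worktree remove <path>` cleans one up afterwards.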

m0rde

I've never used Claude Code or other CLI-based agents. I use Cursor a lot to pair program, letting the AI do the majority of the work but actively guiding.

How do you keep tabs on multiple agents doing multiple things in a codebase? Is the end deliverable there a bunch of MRs to review later? Or is it a more YOLO approach of trusting the agents to write the code and deploy with no human in the loop?

oxidant

Multiple terminal sessions. Well written prompts and CLAUDE.md files.

I like to start by describing the problem and having it do research into what it should do, writing to a markdown file, then get it to implement the changes. You can keep tabs on a few different tasks at a time and you don't need to approve Yolo mode for writes, which keeps the cost down and stops the model going wild.

rfoo

In the same way how you manage a group of brilliant interns.

mh-

Really? My LLMs seem entirely uninterested in free snacks and unlimited vacation.

fallinditch

I'm wondering how many of the techniques described in this blog post can be used in an IDE like Windsurf or Cursor with Claude Sonnet?

My 2 cents on value for money and effectiveness of Claude vs Gemini for coding:

I've been using Windsurf, VS Code and the new Firebase Studio. The Windsurf subscription allowance for $15 per month seems adequate for reasonable every day use. I find Claude Sonnet 3.7 performs better for me than Gemini 2.5 pro experimental.

I still like VS Code and its way of doing things, you can do a lot with the standard free plan.

With Firebase Studio, my take is that it should be good for building and deploying simple things that don't require much developer handholding.

remoquete

What's the Gemini equivalent of Claude Code and OpenAI's Codex? I've found projects like reugn/gemini-cli, but Gemini Code Assist seems limited to VS Code?

jasir

There's Aider, Plandex and Goose, all of which let you choose various providers and models. Aider also has a well known benchmark[0] that you can check out to help select models.

- Aider - https://aider.chat/ | https://github.com/Aider-AI/aider

- Plandex - https://plandex.ai/ | https://github.com/plandex-ai/plandex

- Goose - https://block.github.io/goose/ | https://github.com/block/goose

[0] https://aider.chat/docs/leaderboards/

boredtofears

I've only used aider (which I like quite a bit more than cursor) but I'm curious how it compares to plandex and goose.

danenania

Hi, creator of Plandex here. In case it's helpful, I posted a comment listing some of the main differences with aider here: https://news.ycombinator.com/item?id=43728977

peterldowns

I would also like to know — I think people are using Cursor/Windsurf/Roo(Cline) for IDEs that let you pick the model, but I don't know of a CLI agentic editor that lets you use arbitrary models.

manojlds

peterldowns

Thanks! Any others, or any thoughts you can share on it?

jhawk28

Junie from Jetbrains was recently released. Not sure what LLM it uses.

nojs

Claude

0x696C6961

I mostly work in neovim, but I'll open cursor to write boilerplate code. I'd love to use something cli based like Claude Code or Codex, but neither of them implement semantic indexing (vector embeddings) the way Cursor does. It should be possible to implement an MCP server which does this, but I haven't found a good one.
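
To illustrate the general shape of index-based retrieval, here is a deliberately tiny sketch. It is not how Cursor works: a real semantic index uses learned embedding vectors, while this toy substitutes plain word counts so it can run without a model (which also means it only matches shared words, not meaning). The file paths, descriptions, and the names `embed`, `cosine`, and `search` are all hypothetical.

```javascript
// Stand-in for an embedding: a map from lowercase word to its count.
function embed(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between two sparse word-count vectors.
function cosine(a, b) {
  let dot = 0;
  for (const [word, n] of a) dot += n * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, n) => sum + n * n, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Rank indexed files by similarity to the query and return the top k.
function search(index, query, k = 1) {
  const q = embed(query);
  return index
    .map((entry) => ({ path: entry.path, score: cosine(q, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// The index is built once and reused across queries, which is where the
// amortized cost advantage of an index comes from.
const index = [
  { path: "auth/login.js", text: "validate user password and create login session" },
  { path: "billing/invoice.js", text: "compute invoice totals and tax for a customer" },
].map((file) => ({ path: file.path, vector: embed(file.text) }));

const top = search(index, "where is the user password checked");
console.log(top[0].path); // auth/login.js
```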

isaksamsten

I use a small plugin I’ve written myself to interact with Claude, Gemini 2.5 pro or GPT. I’ve not really seen the need for semantic searching yet. Instead I’ve given the LLM access to LSP symbol search, grep and the ability to add files to the conversation. It’s been working well for my use cases but I’ve never tried Cursor so I can’t comment on how it compares. I’m sure it’s not as smooth though. I’ve tried some of the more common Neovim plugins and for me it works better, but the preference here is very personal. If you want to try it out it’s here: https://github.com/isaksamsten/sia.nvim

xpe

Good point. I largely work in Zed -- looks like it had semantic search for a while but is working on a redesign https://github.com/zed-industries/zed/issues/9564

sqs

Tool-calling agents with search tools do very well at information retrieval tasks in codebases. They are slower and more expensive than good RAG (if you amortize the RAG index over many operations), but they're incredibly versatile and excel in many cases where RAG would fall down. Why do you think you need semantic indexing?

0x696C6961

> Why do you think you need semantic indexing?

Unfortunately I can only give an anecdotal answer here, but I get better results from Cursor than the alternatives. The semantic index is the main difference, so I assume that's what's giving it the edge.

sqs

Is it a very large codebase? Anything else distinctive about it? Are you often asking high-level/conceptual questions? Those are the questions that would help me understand why you might be seeing better results with RAG.

panny

>Use Claude to interact with git

Are they saying Claude needs to do the git interaction in order to work and/or will generate better code if it does?

sagarpatil

It doesn’t need to. It's optional.

panny

I don't see how this is a best practice then. It seems like they are saying "Spend money on something easy to do, but can be catastrophic if the AI screws it up."

bugglebeetle

Claude Code works fairly well, but Anthropic has lost the plot on the state of market competition. OpenAI tried to buy Cursor and now Windsurf because they know they need to win market share, Gemini 2.5 pro is better at coding than their Sonnet models, has huge context and runs on their TPU stack, but somehow Anthropic is expecting people to pay $200 in API costs per functional PR costs to vibe code. Ok.

owebmaster

> but somehow Anthropic is expecting people to pay $200 in API costs per functional PR costs to vibe code. Ok.

Reading the thread, somehow people are paying. It is mindblowing how in place of getting cheaper, development just got more expensive for businesses.

tylersmith

$200 per PR is significantly cheaper development than businesses are paying.

xpe

In terms of short-term outlay, perhaps. But don't forget to factor in the long-term benefits of having a human team involved.

andrewstuart

I’m too scared of the cost to use this.