Hacking Your Own AI Coding Assistant with Claude Pro and MCP

ezyang

There are a few coding MCPs out there. I have also written one (codemcp), and the pitch for mine is that it DOESN'T provide a bash tool by default and it checkpoints every filesystem edit in Git, so it's all about feeling comfortable letting the agent run to completion and then only inspecting the final result. The oldest one in the space, I think, is wcgw.

vessenes

MCP documentation is thin on the ground, but I like what I've seen. I think it provides a nice separation of concerns for using LLMs as Agents.

The article mentions a Python MCP library; there seems to be a pretty vibrant golang MCP server as well at https://github.com/mark3labs/mcp-go. No personal experience, but I'd usually rather have a binary than a Python environment install when I can get it.

The other thing the author mentions is that Claude can author MCP integrations and merge them into your server -- sounds good to me!

chaosprint

I have cancelled my Cursor subscription, but more for functional reasons.

I'm confused that the author says he has privacy concerns about Cursor, since the thing I liked most about Cursor is that it asked me up front whether I wanted privacy mode.

If anything, I'm more suspicious that the sites with free quotas (not only Claude, but also Grok, ChatGPT, etc.) use chat data for training.

If you are really worried about this, it's now possible to deploy DeepSeek R1 locally.

simonw

Claude won't train on your inputs, even for free trial accounts. That's clearly spelled out in their terms: https://www.anthropic.com/legal/consumer-terms

> We will not train our models on any Materials that are not publicly available, except in two circumstances:

> 1. If you provide Feedback to us (through the Services or otherwise) regarding any Materials, we may use that Feedback in accordance with Section 5 (Feedback).

> 2. If your Materials are flagged for trust and safety review, we may use or analyze those Materials to improve our ability to detect and enforce Acceptable Use Policy violations, including training models for use by our trust and safety team, consistent with Anthropic’s safety mission.

You have to read everyone's terms carefully though; it's frustratingly hard work to keep up with these policies.

Anthropic do have a very interesting way of analyzing usage of their tools without exposing any data to their researchers: https://simonwillison.net/2024/Dec/12/clio/

bionhoward

Do they even need inputs, when they could train on outputs by inferring the inputs? I swear I’ve seen papers about this [1]. Not to mention training reward models to grade the output… they could absolutely get around the “we don’t train on your inputs” promise.

Also, doesn’t Claude Code count every input as “feedback?”

Plus, they forbid us from training on our own chat logs. That’s a form of vendor lock-in. Way better to use local LLMs.

At some point there ought to be a paradigm shift toward service providers with more reasonable legal terms (IMHO) but too often we pretend nobody reads them or cares

If they’re claiming not to learn from inputs, and we’re subject to prohibition about learning from outputs, what the heck is the point?

[1] https://arxiv.org/abs/2405.15012

soulofmischief

The free LLM service is the new free VPN.

lyime

It seems fair for them to train on how people use their tools, so they can make the tools better for people who use them?

DavidPP

On my end, I use https://github.com/skydeckai/mcp-server-aidd, which has

- File manipulation
- Directory manipulation
- tree-sitter integration

and more.

I also installed Tavily Search, sequential thinking, and Playwright.

I still use Cursor for development, and I use Claude Desktop for higher-level documentation, testing, etc.

For example, I'll check out a new repo that is lacking in documentation. I'll get the app running, then explain to Claude where the code lives, how to access the real app, and how I want the features documented.

Then Claude will happily scan the codebase, take screenshots of the running app, etc., all by himself, and then create a report (through the artifact system) with visualizations, graphs, etc.

_joel

I built an entire site, meeting the complete spec (with some esoteric things, it's a non-profit), in one day yesterday with Roocode and Claude 3.7 Sonnet. It cost about $30 in total (appreciate I could do it cheaper, but I was using the Anthropic API directly). It's amazing; it would have taken me a couple of weeks to get everything right. Getting it to do tests as part of the process, and some subtle engineering to keep it from getting stuck in loops and to make it ignore files, are a must. It's also fun using it in Architect mode and feeding it Perplexity AI's deep research output as a series of prompts for ingestion into Roocode.

ezyang

If you'd done it with MCP it would only have cost you $20 and you would still have had the rest of the month to use your Claude Pro sub :P

_joel

This is true! :)

rahimnathwani

For anyone trying this at home, if you want a starting point, you could use OP's, or you could use the official Filesystem MCP server: https://github.com/modelcontextprotocol/servers/tree/main/sr... There are folks using just this filesystem server for coding: https://x.com/Zirkman/status/1897003182049649000

I'm not sure I would prefer it to using Cursor. I was using Claude Desktop client to edit a bunch of (non-code) text files a couple of days ago, and a couple of times it crashed, and I had to restart it. When this happened, it lost some conversation history, although of course the files it had already edited were fine.

There are other approaches to having Claude edit files, but they're less suited for use with MCP (and hence they don't save cost like the OP does).

A) Aider gives very specific instructions about how Claude should describe patch operations: https://github.com/Aider-AI/aider/blob/dd4d2420df51dc29c2aed...

This works, but I don't know if anyone has tried using this technique in an MCP server. It seems like the approach OP and the Filesystem MCP take might be superior for small edits, but Aider's approach might be better for multiple edits in a single request-response. It should be possible to do this with an MCP server.
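To make the idea concrete, here is a rough sketch of how a tool could apply Aider-style SEARCH/REPLACE blocks from a single model reply. It is a simplified illustration, not Aider's actual parser, and `apply_edit_blocks` is a hypothetical name; it assumes each block uses the `<<<<<<< SEARCH` / `=======` / `>>>>>>> REPLACE` markers and that the SEARCH text matches the file exactly once.

```python
import re

# One edit block: SEARCH text, a divider line, then the REPLACE text.
EDIT_BLOCK = re.compile(
    r"<<<<<<< SEARCH\n(.*?)=======\n(.*?)>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_edit_blocks(source: str, reply: str) -> str:
    """Apply every SEARCH/REPLACE block found in `reply` to `source`, in order.

    Hypothetical helper for illustration. It refuses ambiguous or missing
    matches instead of guessing which occurrence the model meant.
    """
    for search, replace in EDIT_BLOCK.findall(reply):
        if source.count(search) != 1:
            raise ValueError(f"SEARCH text must match exactly once: {search!r}")
        source = source.replace(search, replace, 1)
    return source
```

Because one reply can carry several blocks, several edits land in a single request-response, which is the trade-off described above. An MCP tool built this way would take the reply text as its argument and return either the patched file or the error.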

B) Anthropic provides a built-in text editor tool (tool types text_editor_20250124 and text_editor_20241022) that its models are trained to use via a specific text editing protocol: https://docs.anthropic.com/en/docs/build-with-claude/tool-us...

You could build a coding assistant like Aider on top of these, but obviously you're then tied to this implementation and it's harder to switch out models. And it wouldn't work with Claude Desktop.

ezyang

I don't really recommend using filesystem MCP directly: it won't checkpoint changes, so it's easy to end up in a state where you can't recover an older working version of the code. Use an actual coding-oriented MCP.
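For anyone wondering what "checkpointing" buys you, here is a minimal sketch of the idea: commit after every file write so any earlier version stays recoverable. This is an illustration only, not codemcp's actual implementation; the helper names, the placeholder commit identity, and the assumption that `git` is on your PATH and the directory is already a Git repo are all mine.

```python
import subprocess
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    result = subprocess.run(
        # Placeholder identity so commits work even without global git config.
        ["git", "-c", "user.name=agent", "-c", "user.email=agent@example.invalid", *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def checkpointed_write(repo: Path, rel_path: str, content: str, message: str) -> str:
    """Write a file, then commit it so this exact state can be restored later."""
    target = repo / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    git(repo, "add", "--", rel_path)
    # --allow-empty keeps the call from failing when the content did not change.
    git(repo, "commit", "--quiet", "--allow-empty", "-m", message)
    return git(repo, "rev-parse", "HEAD")


def restore(repo: Path, rel_path: str, commit: str) -> None:
    """Bring one file back to the version stored in `commit`."""
    git(repo, "checkout", commit, "--", rel_path)
```

With a plain filesystem tool, a bad edit overwrites the only copy. With something like the above, each agent edit returns a commit hash you can hand to `restore`.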

rahimnathwani

I'm going to try out yours (https://github.com/ezyang/codemcp). I already pay for Cursor, but I'm curious.

EDIT: I really like the way that each change generates a commit, and that all commits in a single session are squashed into one commit, whilst preserving the hash of each individual change.

cnj

I predict that Anthropic will clamp down on the heavy agentic use of Claude Pro; the cost gap between Claude Code and this approach is HUGE...

Anyway, I'm doing the same. One extra tip is to use the "project" feature of Claude Desktop to give your coding assistant some context; use it similarly to .cursorrules.

jasonjmcghee

Why would they? There's already rate limits in place. If you use it too much, you have to stop for a while. Whether you're doing text extraction from large documents, having it write your memoir, or using a diy Claude Code, what's the difference?

ezyang

There's also some fundamental limitations to the Desktop MCP experience that are probably never getting fixed; Claude Code can spin off subagents and play around with the context. I assume that Claude Desktop's form factor is basically going to stay the way it is until the end of time lol.

roger_

I’m interested in using MCP for pair programming.

What tool can I use to point it to a directory and give Claude access to the code? I don’t want to have to write my own server.

ezyang

You can try (self promo) https://github.com/ezyang/codemcp. https://github.com/rusiaaman/wcgw is also quite popular, although it allows unrestricted shell access (that's why it's named wcgw lol).

roger_

Thanks! Does codemcp support having the server on another machine? Maybe communicate over ssh?

ezyang

Not built in, you'll have to use something like https://github.com/sparfenyuk/mcp-proxy

rahimnathwani

The official filesystem MCP Server: https://github.com/modelcontextprotocol/servers/tree/main/sr...

(It's not specifically made for coding.)
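If you do want to try it, a rough sketch of the setup: Claude Desktop reads MCP servers from the `mcpServers` key in its `claude_desktop_config.json`, and the filesystem server takes the allowed directory as an argument. The snippet below just prints an entry you could merge into that file; the function name and the `~/code/my-project` path are placeholders, and it assumes `npx` is installed. Keep in mind this server does not checkpoint edits.

```python
import json
from pathlib import Path


def filesystem_server_config(project_dir: str) -> dict:
    """Build a Claude Desktop config entry exposing one directory (hypothetical helper)."""
    allowed_dir = str(Path(project_dir).expanduser())
    return {
        "mcpServers": {
            "filesystem": {
                "command": "npx",
                # The directory argument is the only tree the server may touch.
                "args": ["-y", "@modelcontextprotocol/server-filesystem", allowed_dir],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path: point this at the repo you want Claude to see.
    print(json.dumps(filesystem_server_config("~/code/my-project"), indent=2))
```

Restart Claude Desktop after editing the config so it picks up the new server.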

delegate

Soon more people will realize that they might not need software written by others at all.

You chat with an AI and have a working app in minutes.

I've been building about 1 app per week lately and they're not trivial apps. They have UI, backend, database, audio, etc.

One fact checks audio in real time, another does semantic search for song lyrics.

A real-time translator, which translates the text to Spanish as I'm typing. I needed it to better communicate with a Tinder date. It has a nice JavaFX UI. It didn't occur to me to look for a translator app, since those come with restrictions/registration/ads/etc.

This is possible today. Give it a bit more time and it will be able to clone any existing app.

The question 'what are we going to do then?' visits me more often lately.

Elbouyave

Does anyone know of a good resource for learning MCP on Windows? I can't really find solid, comprehensive information, lectures, or videos on this topic.

ezyang

Most of the problem is installing the MCP servers, which is more annoying on Windows. https://github.com/ezyang/codemcp#getting-started has instructions that I've personally tested for installation on Windows, which might help you out some.

atxtechbro

Sounds interesting, but I assume it is only available if you have access to Claude Desktop which is not available on Linux if I understand correctly.

knowaveragejoe

It should work on any client that supports MCP, of which there are many:

https://modelcontextprotocol.io/clients

ezyang

But only Claude Desktop gets flat $20 pricing from Claude Pro lol

nemofoo

The cost savings here are pretty wild! I wonder if Anthropic will push back

davely

They are for sure going to lock this down.

I’ve gone through $25 USD in API credits in a single afternoon with Claude Code (I love it, but that thing is thirsty for dollar bills).

I’ve been reluctant to try this sort of thing out because it’s fairly trivial for them to detect this and potentially come down with the ban hammer. I’d rather not risk it.

Anyway, I’ve found myself switching back to Aider as of late because it is much more conservative in how it uses its token budget.

ezyang

IMO, the big problem with Aider is that it's not agentic. This is good because it means costs are down, but most of the edit-test-fix loop magic in coding agents comes from the agent loop.

ToJans

I've been using Claude with the MCP servers daily, and get put on pause a few times a day due to my heavy usage.

However, I do hope they do not plan to use the pricing that they are using for Claude max, as a single prompt usually generates about 50 tool calls for my use case. (In max this would cost me $5.05). I'll easily burn $50 to $100 per hour, and I haven't even added all the tools I'd like to use yet...

If it gets expensive, I'll probably only use it for architectural work, and use my own AI LLM for more tactical tasks.

This will be slower and less powerful, but we already have an AI server for image analysis, so it makes sense to use it.

lyime

You are not getting more than $25 of value?

cruffle_duffle

Maybe, maybe not. Once something stops being free the human brain starts setting expectations (at least mine does). Sure, that last conversation “only” cost $0.08, but since I’m now paying, it had better damn well work and not fuck up my code. And even though it was only $0.08, I’ll be cranky that I have to undo its changes because it went down a rabbit hole, and now I need to reprompt it or ask it what the hell it thinks it’s doing… costing yet another $0.08.

Sure it’s a few pennies but it does add up. I’m sure there is some research or term / explanation for this phenomenon.