Hacking Your Own AI Coding Assistant with Claude Pro and MCP
49 comments
March 19, 2025 · rahimnathwani
ezyang
I don't really recommend using the filesystem MCP directly: it won't checkpoint changes, so it's easy to end up in a state where you can't recover an older working version of the code. Use an actual coding-oriented MCP.
rahimnathwani
I'm going to try out yours (https://github.com/ezyang/codemcp). I already pay for Cursor, but I'm curious.
EDIT: I really like the way that each change generates a commit, and that all commits in a single session are squashed into one commit, whilst preserving the hash of each individual change.
EDIT2: I also like that you only have one function (each command being a 'subtool'), which means Claude Desktop asks for permission only once per session.
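A minimal sketch of that single-entry-point idea, for anyone curious. This is not codemcp's actual code: the names `code_tool`, `read_file` and `write_file` are hypothetical, and the "project" is just an in-memory dict. The point is only that one exposed function routes every command by a `subtool` argument, so a client that asks for permission per tool has a single tool to approve.

```python
from typing import Callable, Dict

# In-memory stand-in for a project directory (assumption, for illustration only).
FILES: Dict[str, str] = {}


def read_file(path: str) -> str:
    """Hypothetical subtool: return the stored content of `path`."""
    return FILES[path]


def write_file(path: str, content: str) -> str:
    """Hypothetical subtool: store `content` under `path`."""
    FILES[path] = content
    return f"wrote {len(content)} chars to {path}"


# Registry mapping subtool names to their handlers.
SUBTOOLS: Dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "write_file": write_file,
}


def code_tool(subtool: str, **kwargs) -> str:
    """The single exposed tool; every command is routed by its `subtool` name."""
    handler = SUBTOOLS.get(subtool)
    if handler is None:
        raise ValueError(f"unknown subtool: {subtool!r}")
    return handler(**kwargs)
```

Adding a new command then means adding one entry to `SUBTOOLS`, without exposing another tool to the client.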
dang
Related ongoing thread:
Show HN: Codemcp – Claude Code for Claude Pro subscribers – ditch API bills - https://news.ycombinator.com/item?id=43356016 - March 2025 (3 comments)
ezyang
And if you don't like clicking, even once, check out my other project, Refined Claude :)
ezyang
There are a few coding MCPs out there. I have also written one (codemcp), and the pitch for mine is that it DOESN'T provide a bash tool by default and checkpoints every filesystem edit in Git, so it's all about feeling comfortable letting the agent run to completion and then only inspecting the final result. The oldest one in the space, I think, is wcgw.
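To make the checkpointing idea concrete, here is a rough sketch under my own assumptions, not codemcp's implementation. The function `checkpointed_write` and the injectable `run` parameter are hypothetical. Each write is followed right away by a `git add` and `git commit` of that one file, so every edit has its own commit to roll back to.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes git arguments and the repo directory; injectable so the
# logic can be exercised without a real git binary.
Runner = Callable[[Sequence[str], Path], None]


def run_git(args: Sequence[str], repo: Path) -> None:
    """Default runner: invoke the real git CLI inside `repo`."""
    subprocess.run(["git", *args], cwd=repo, check=True)


def checkpointed_write(repo: Path, rel_path: str, content: str,
                       message: str, run: Runner = run_git) -> None:
    """Write one file, then immediately commit it so the edit can be undone."""
    target = repo / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    # Stage and commit only the file that was just written.
    run(["add", "--", rel_path], repo)
    run(["commit", "-m", message, "--", rel_path], repo)
```

With the default runner this assumes `repo` is already an initialized Git repository with a configured user name and email.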
vessenes
MCP documentation is thin on the ground, but I like what I've seen. I think it provides a nice separation of concerns for using LLMs as Agents.
The article mentions a Python MCP library; there also seems to be a pretty vibrant Golang MCP server project at https://github.com/mark3labs/mcp-go. No personal experience, but I'd usually rather have a binary than a Python environment install when I can get it.
The other thing the author mentions is that Claude can author MCP integrations and merge them into your server -- sounds good to me!
chaosprint
I have cancelled my Cursor subscription, but that was more for functional reasons.
I am confused that the author says he has privacy concerns about Cursor, because the thing I found most reassuring about Cursor is that it asked me at the very beginning whether I wanted privacy mode.
On the contrary, I am very skeptical about whether these websites with free quotas, not only claude, but also grok, chatgpt, etc., will use chat data for training.
If you are really worried about this, it is not impossible to deploy deepseek r1 locally now.
simonw
Claude won't train on your inputs, even for free trial accounts. That's clearly spelled out in their terms: https://www.anthropic.com/legal/consumer-terms
> We will not train our models on any Materials that are not publicly available, except in two circumstances:
> 1. If you provide Feedback to us (through the Services or otherwise) regarding any Materials, we may use that Feedback in accordance with Section 5 (Feedback).
> 2. If your Materials are flagged for trust and safety review, we may use or analyze those Materials to improve our ability to detect and enforce Acceptable Use Policy violations, including training models for use by our trust and safety team, consistent with Anthropic’s safety mission.
You have to read the terms carefully for everyone though, it's frustratingly hard work to keep up with these policies.
Anthropic do have a very interesting way of analyzing usage of their tools without exposing any data to their researchers: https://simonwillison.net/2024/Dec/12/clio/
bionhoward
Do they even need inputs to train on outputs by inferring inputs? Swear I’ve seen papers about this. Not to mention training reward models to grade the output… they could absolutely get around the “we don’t train on your inputs”
Also, doesn’t Claude Code count every input as “feedback?”
Plus, they forbid us from training on our own chat logs. That’s a form of vendor lock in. Way better to use local LLMs
At some point there ought to be a paradigm shift toward service providers with more reasonable legal terms (IMHO) but too often we pretend nobody reads them or cares
If they’re claiming not to learn from inputs, and we’re subject to prohibition about learning from outputs, what the heck is the point?
soulofmischief
The free LLM service is the new free VPN.
lyime
It seems fair for them to train on how people use their tools, so they can make the tools better for people who use them?
anderslaub
I haven't tried working with Cursor for real yet, but I can vouch that Claude with the filesystem and fetch MCP servers enabled, plus optionally a few others relevant to whatever you are doing, works really well.
I still haven't found a setup that can fully comprehend a large "enterprise" codebase though. The scope has to be narrowed in to get something useful.
But for almost everything else, and especially for things where I previously would have stalled for lack of time after hitting a learning curve, it is a game changer. I can do things in hours that would have taken months before, or rather, Claude can.
I tried to describe a somewhat similar experience that I had last week here. The 4k char limit made me cut most of it away though.
DavidPP
On my end, I use https://github.com/skydeckai/mcp-server-aidd, which has
- File manipulation
- Directory manipulation
- tree-sitter integration
and more.
I also installed Tavily Search, sequential thinking, and Playwright.
I still use Cursor for development, and I use Claude Desktop for higher-level documentation, testing, etc.
For example, I'll check out a new repo that is lacking in documentation. I'll get the app running, then explain to Claude where the code lives, how to access the real app, and how I want the features documented.
Then Claude will happily scan the codebase, take screenshots of the running app, etc., all by himself, and then create a report (through the artifact system) with visualizations, graphs, etc.
cnj
I predict that Anthropic will clamp down on heavy agentic use of Claude Pro; the cost gap between Claude Code and this approach is HUGE...
Anyway, I'm doing the same. One extra tip is to use the "project" feature of Claude Desktop to give your coding assistant some context; use it similarly to .cursorrules.
jasonjmcghee
Why would they? There's already rate limits in place. If you use it too much, you have to stop for a while. Whether you're doing text extraction from large documents, having it write your memoir, or using a diy Claude Code, what's the difference?
ezyang
There's also some fundamental limitations to the Desktop MCP experience that are probably never getting fixed; Claude Code can spin off subagents and play around with the context, I assume that Claude Desktop's form factor is basically going to stay the way it is until the end of time lol.
_joel
I built an entire site, making it meet the complete spec (with some esoteric things; it's a non-profit), in one day yesterday with Roocode and Claude 3.7 Sonnet. It cost about $30 in total (I appreciate I could do it cheaper, but I was using the Anthropic API directly). It's amazing; it would have taken me a couple of weeks to get everything right. Getting it to do tests as part of the process, and some subtle engineering to get it not to get stuck in loops and to ignore files, are a must. It's also fun using it in Architect mode and feeding it Perplexity AI's deep research output as a series of prompts for ingestion into Roocode.
ezyang
If you'd done it with MCP it would only have cost you $20 and you would still have had the rest of the month to use your Claude Pro sub :P
_joel
This is true! :)
chaosprint
If you use bolt.new or v0's free tier to make a design and perfect it in cursor, the speed will be much faster than roo, and there will probably be a lot of fast credits left. cost $20
_joel
cheers, I'll give it a go
roger_
I’m interested in using MCP for pair programming.
What tool can I use to point it to a directory and give Claude access to the code? I don’t want to have to write my own server.
ezyang
You can try (self promo) https://github.com/ezyang/codemcp . https://github.com/rusiaaman/wcgw is also quite popular, although they allow unrestricted shell access (that's why it's named wcgw lol).
roger_
Thanks! Does codemcp support having the server on another machine? Maybe communicate over ssh?
ezyang
Not built in, you'll have to use something like https://github.com/sparfenyuk/mcp-proxy
rahimnathwani
The official filesystem MCP Server: https://github.com/modelcontextprotocol/servers/tree/main/sr...
(It's not specifically made for coding.)
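For pointing it at a directory, the usual route is an entry in Claude Desktop's `claude_desktop_config.json`. Below is a small sketch that generates such an entry. The `mcpServers` / `command` / `args` layout and the `@modelcontextprotocol/server-filesystem` package name are as I remember them from the server's README, so check there before relying on it. The project path is a made-up placeholder.

```python
import json
from typing import Dict, List


def filesystem_server_config(allowed_dirs: List[str]) -> Dict:
    """Build the `mcpServers` entry that launches the filesystem server via npx.

    Every directory in `allowed_dirs` is passed as a trailing argument; the
    server is expected to restrict access to those directories.
    """
    return {
        "mcpServers": {
            "filesystem": {
                "command": "npx",
                "args": [
                    "-y",
                    "@modelcontextprotocol/server-filesystem",
                    *allowed_dirs,
                ],
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical project path; replace with a directory you want Claude to see.
    config = filesystem_server_config(["/Users/me/projects/myapp"])
    print(json.dumps(config, indent=2))
```

The printed JSON is what you would merge into the config file; Claude Desktop needs a restart to pick it up.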
jasonjmcghee
I think that all of these tools need LSP support, like the author mentions at the very end, but they also need DAP support.
I still almost never see anyone using breakpoints / expression evaluation with LLMs. Feels critical to me.
Really excited about the evolution of these tools. I think LSP + DAP will be huge.
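For a sense of what a tool would have to emit, here is a sketch of two Debug Adapter Protocol requests: setting breakpoints and evaluating an expression. The framing (a `Content-Length` header followed by a JSON body) and the `setBreakpoints` / `evaluate` argument shapes follow the DAP specification as I understand it. The helper names are my own, and actually connecting to a debug adapter is left out.

```python
import json
from typing import List


def dap_request(seq: int, command: str, arguments: dict) -> bytes:
    """Frame one DAP request: Content-Length header, blank line, JSON body."""
    body = json.dumps({
        "seq": seq,
        "type": "request",
        "command": command,
        "arguments": arguments,
    }).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def set_breakpoints(seq: int, path: str, lines: List[int]) -> bytes:
    """Request breakpoints at the given line numbers of one source file."""
    return dap_request(seq, "setBreakpoints", {
        "source": {"path": path},
        "breakpoints": [{"line": n} for n in lines],
    })


def evaluate(seq: int, expression: str, frame_id: int) -> bytes:
    """Request evaluation of an expression in the context of a stack frame."""
    return dap_request(seq, "evaluate", {
        "expression": expression,
        "frameId": frame_id,
        "context": "repl",
    })
```

An MCP server could expose these as tools and forward the framed bytes to whatever debug adapter the project uses.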
cruffle_duffle
Coding is only half the interesting part. Get that MCP to start looking through your logs and figuring out some issue. Then have it look at your code, your db schema and your platform config in terraform and all of a sudden things get pretty wild.
delegate
Soon more people will realize that they might not need software written by others at all.
You chat with an AI and have a working app in minutes.
I've been building about 1 app per week lately and they're not trivial apps. They have UI, backend, database, audio, etc.
One fact checks audio in real time, another does semantic search for song lyrics.
A real-time translator, which translates the text to Spanish as I'm typing. I needed it to better communicate with a Tinder date. It has a nice javafx UI. It didn't occur to me to look for a translator app, since those come with restrictions/registration/ads/etc.
This is possible today. Give it a bit more time and it will be able to clone any existing app.
The question 'what are we going to do then?' visits me more often lately.
owebmaster
> The question 'what are we going to do then?' visits me more often lately.
We will develop a new layer of profitable apps. Social apps will not stop being a thing, but it will be easy to create engaging ones.
For anyone trying this at home, if you want a starting point, you could use OP's, or you could use the official Filesystem MCP server: https://github.com/modelcontextprotocol/servers/tree/main/sr... There are folks using just this filesystem server for coding: https://x.com/Zirkman/status/1897003182049649000
I'm not sure I would prefer it to using Cursor. I was using Claude Desktop client to edit a bunch of (non-code) text files a couple of days ago, and a couple of times it crashed, and I had to restart it. When this happened, it lost some conversation history, although of course the files it had already edited were fine.
There are other approaches to having Claude edit files, but they're less suited for use with MCP (and hence they don't save cost like the OP does).
A) Aider gives very specific instructions about how Claude should describe patch operations: https://github.com/Aider-AI/aider/blob/dd4d2420df51dc29c2aed...
This works, but I don't know if anyone has tried using this technique in an MCP server. It seems like the way OP and the Filesystem MCP do it might be superior for small edits, but Aider's approach might be better for multiple edits in a single request-response. It should be possible to do this with an MCP server.
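As a sketch of what the receiving side might look like, the function below applies Aider-style SEARCH/REPLACE blocks to a string. It is simplified: Aider's real format also carries a filename line and code fences, which are left out here. The name `apply_edit_blocks` and the rule that each SEARCH text must match exactly once are my own choices, not Aider's.

```python
import re

# One edit block: SEARCH text, a divider line, then REPLACE text.
BLOCK_RE = re.compile(
    r"<<<<<<< SEARCH\n(.*?)=======\n(.*?)>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_edit_blocks(source: str, reply: str) -> str:
    """Apply every SEARCH/REPLACE block found in `reply` to `source`, in order."""
    for search, replace in BLOCK_RE.findall(reply):
        # Refuse ambiguous or missing matches instead of guessing.
        if source.count(search) != 1:
            raise ValueError(f"SEARCH text must match exactly once: {search!r}")
        source = source.replace(search, replace, 1)
    return source
```

Because one model reply can hold several blocks, a single tool call could carry multiple edits, which is the advantage over one call per edit.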
B) Anthropic provides a built-in text editor tool (tool types text_editor_20250124 and text_editor_20241022) whose specific text editing protocol its models are trained to use: https://docs.anthropic.com/en/docs/build-with-claude/tool-us...
You could build a coding assistant like Aider on top of these, but obviously you're then tied to this implementation and it's harder to switch out models. And it wouldn't work with Claude Desktop.
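If you went this route, your side of the protocol is a handler that executes the commands Claude sends. A minimal sketch follows, working on an in-memory dict rather than real files. The command names (`view`, `create`, `str_replace`) and input fields (`path`, `file_text`, `old_str`, `new_str`) are as I recall them from Anthropic's docs, so treat them as assumptions and verify there. The function name and result strings are made up.

```python
from typing import Dict


def handle_text_editor_call(files: Dict[str, str], tool_input: dict) -> str:
    """Execute one tool call against an in-memory file map; return a result string."""
    command = tool_input["command"]
    path = tool_input["path"]
    if command == "view":
        return files[path]
    if command == "create":
        files[path] = tool_input["file_text"]
        return f"created {path}"
    if command == "str_replace":
        old, new = tool_input["old_str"], tool_input["new_str"]
        matches = files[path].count(old)
        # Only edit when the target text is unambiguous.
        if matches != 1:
            return f"error: old_str matched {matches} times, expected exactly 1"
        files[path] = files[path].replace(old, new, 1)
        return f"edited {path}"
    return f"error: unsupported command {command}"
```

The returned string is what you would send back to the model as the tool result, including the error cases so it can retry.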