Everything wrong with MCP

231 comments

·April 13, 2025

dend

Coordinator of the authorization RFC linked in this post[1].

The protocol is in very, very early stages and there are a lot of things that still need to be figured out. That being said, I can commend Anthropic on being very open to listening to the community and acting on the feedback. The authorization spec RFC, for example, is a coordinated effort between security experts at Microsoft (my employer), Arcade, Hellō, Auth0/Okta, Stytch, Descope, and quite a few others. The folks at Anthropic set the foundation and welcomed others to help build on it. It will mature and get better.

[1]: https://github.com/modelcontextprotocol/modelcontextprotocol...

magicalhippo

A nice, comprehensive yet accessible blog post about it can be found here[1], got submitted earlier[2] but didn't gain traction.

[1]: https://aaronparecki.com/2025/04/03/15/oauth-for-model-conte...

[2]: https://news.ycombinator.com/item?id=43620496

dend

Great news - Aaron has been a core reviewer and contributor to the aforementioned RFC.

magicalhippo

Yeah figured he had to be involved and saw his name on the pull request after I posted.

Really enjoyed the article he wrote, just wanted to promote it some more. I learned of several things that will be useful to me beyond MCP.

martypitt

Impressive to see this level of cross-org coordination on something that appears to be maturing at pace (compared to other consortium-style specs/protocol I've seen attempted)

Congrats to everyone.

sshh12

Awesome! Thanks for your work on this.

dend

Can't take any credit - it's a massive effort across many folks much smarter than me.

Y_Y

This reminds me of something Adam Smith said in The Wealth of Nations:

"People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public, or in some contrivance to raise prices."

Ymmv, but I cannot image that this "innovation" will result in a better outcome for the general public.

EigenLord

The author makes good general points but seems to be overloading MCP's responsibilities imo. My understanding of MCP is that it just provides a ready-made "doorway" for LLMs to enter and interact with externally managed resources. It's a bridge or gateway. So is it really MCP's fault that it:

>makes it easier to accidentally expose sensitive data.

So does the "forward" button on emails. Maybe be more careful about how your system handles sensitive data. How about:

>MCP allows for more powerful prompt injections.

This just touches on wider topic of only working with trusted service providers that developers should abide by generally. As for:

>MCP has no concept or controls for costs.

Rate limit and monitor your own usage. You should anyway. It's not the road's job to make you follow the speed limit.

Finally, many of the other issues seem to be more about coming to terms with delegating to AI agents generally. In any case it's the developer's responsibility to manage all these problems within the boundaries they control. No API should have that many responsibilities.

TeMPOraL

Yeah. That's another in a long line of MCP articles and blogposts that's been coming up over the past few weeks, that can be summarized as "breaking news: this knife is sharp and can cut someone if you swing it at people, it can cut you if you hold it the wrong way, and is not a toy suitable for small children".

Well, yes. A knife cuts things, it's literally its only job. It will cut whatever you swing it at, including people and things you didn't intend to - that's the nature of a general-purpose cutting tool, as opposed to e.g. safety razor or plastic scissors for small children, which are much safer, but can only cut few very specific things.

Now, I get it, young developers don't know that knives and remote access to code execution on a local system are both sharp tools and need to be kept out of reach of small children. But it's one thing to remind people that the tool needs to be handled with care; it's another to blame it on the tool design.

Prompt injection is a consequence of the nature of LLMs, you can't eliminate it without degrading capabilities of the model. No, "in-band signaling" isn't the problem - "control vs. data" separation is not a thing in nature, it's designed into systems, and what makes LLMs useful and general is that they don't have it. Much like people, by the way. Remote MCPs as a Service are a bad idea, but that's not the fault of the protocol - it's the problem of giving power to third parties you don't trust. And so on.

There is technical and process security to be added, but that's mostly around MCP, not in it.

Joker_vD

Well. To repurpose you knife analogy, they (we?) duct-taped a knife on an erratic, PRNG-controlled roomba and now discover that people are getting their Achilles tendons sliced. Technically, it's all functioning exactly as intended, but: this knife was designed specifically to be attached to such roombas, and apparently nobody stopped to think whether it was such a great idea.

And admonishments of "don't use it when people are around, but if you do, it's those people's fault when they get cut: they should've be more careful and probably wore some protective foot-gear" while technically accurate, miss the bigger problem. That is, that somebody decided to strap a sharp knife to a roomba and then let it whiz around in the space full of people.

Mind you, we have actual woodcutting table saws with built-in safety measures: they instantly stop when they detect contact with human skin. So you absolutely can have safe knives. They just cost more, and I understand that most people value (other) people's health and lives quite cheaply indeed, and so don't bother buying/designing/or even considering such frivolities.

dharmab

This is a total tangent, but we can't have 100% safe knives because one of the uses for a knife is to cut meat. (Sawstop the company famously uses hot dogs to simulate human fingers in their demos.)

skybrian

The problem with the “knife is sharp” argument is that it’s too generic. It can be deployed against most safety improvements. The modern world is built on driving accident rates down to near-zero. That’s why we have specialized tools like safety razors. Figuring out what to do to reduce accident rates is what postmortems are for - we don’t just blame human error, we try to fix things systematically.

As usual, the question is what counts as a reasonable safety improvement, and to do that we would need to go into the details.

I’m wondering what you think of the CaMeL proposal?

https://simonwillison.net/2025/Apr/11/camel/#atom-everything

noodletheworld

Some of the other issues are less important than others, but even if you accept “you have to take responsibility for yourself”, let me quote the article:

> As mentioned in my multi-agent systems post, LLM-reliability often negatively correlates with the amount of instructional context it’s provided. This is in stark contrast to most users, who (maybe deceived by AI hype marketing) believe that the answer to most of their problems will be solved by providing more data and integrations. I expect that as the servers get bigger (i.e. more tools) and users integrate more of them, an assistants performance will degrade all while increasing the cost of every single request. Applications may force the user to pick some subset of the total set of integrated tools to get around this.

I will rephrase it in stronger terms.

MCP does not scale.

It cannot scale beyond a certain threshold.

It is Impossible to add an unlimited number of tools to your agents context without negatively impacting the capability of your agent.

This is a fundamental limitation with the entire concept of MCP and needs addressing far more than auth problems, imo.

You will see posts like “MCP used to be good but now…” as people experience the effects of having many MCP servers enabled.

They interfere with each other.

This is fundamentally and utterly different from installing a package in any normal package system, where not interfering is a fundamental property of package management in general.

Thats the problem with MCP.

As an idea it is different to what people trivially expect from it.

weird-eye-issue

I think this can largely be solved with good UI. For example, if an MCP or tool gets executed that you didn't want to get executed, the UI should provide an easy way to turn it off or to edit the description of that tool to make it more clear when it should be used and should not be used by the agent.

Also, in my experience, there is a huge bump in performance and real-world usage abilities as the context grows. So I definitely don't agree about a negative correlation there, however, in some use cases and with the wrong contexts it certainly can be true.

zoogeny

I don't think that could be sufficient to solve the problem.

I'm using Gemini with AI Studio and the size of a 1 million token context window is becoming apparent to me. I have a large conversation, multiple paragraphs of text on each side of the conversation, with only 100k tokens or so. Just scrolling through that conversation is a chore where it becomes easier just to ask the LLM what we were talking about earlier rather than try to find it myself.

So if I have several tools, each of them adding 10k+ context to a query, and all of them reasonable tool requests - I still can't verify that it isn't something "you [I] didn't want to get executed" since that is a vague description of the failure states of tools. I'm not going to read the equivalent of a novel for each and every request.

I say this mostly because I think some level of inspectability would be useful for these larger requests. It just becomes impractical at larger and larger context sizes.

robertlagrant

> For example, if an MCP or tool gets executed that you didn't want to get executed, the UI should provide an easy way to turn it off or to edit the description of that tool to make it more clear when it should be used and should not be used by the agent.

Might this become more simply implemented as multiple individual calls, possibly even to different AI services, chained together with regular application software?

TeMPOraL

Simple: if the choice is getting overwhelming to the LLM, then... divide and conquer - add a tool for choosing tools! Can be as simple as another LLM call, with prompt (ugh, "agent") tasked strictly with selecting a subset of available tools that seem most useful for the task at hand, and returning that to "parent"/"main" "agent".

You kept adding more tools and now the tool-master "agent" is overwhelmed by the amount of choice? Simple! Add more "agents" to organize the tools into categories; you can do that up front and stuff the categorization into a database and now it's a rag. Er, RAG module to select tools.

There are so many ways to do it. Using cheaper models for selection to reduce costs, dynamic classification, prioritizing tools already successfully applied in previous chat rounds (and more "agents" to evaluate if a tool application was successful)...

Point being: just keep adding extra layers of indirection, and you'll be fine.

soulofmischief

The problem is that even just having the tools in the context can greatly change the output of the model. So there can be utility in the agent seeing contextually relevant tools (RAG as you mentioned, etc. is better than nothing) and a negative utility in hiding all of them behind a "get_tools" request.

empath75

"Sequential thinking" is one that I tried recently because so many people recommend it, and I have never, ever, seen the chatbot actually do anything but write to it. It never follows up any of it's chains of thoughts or refers to it's notes.

DavidPP

In which client and with which LLM are you using it?

I use it in Claude Desktop for the right use case, it's much better than thinking mode.

But, I admit, I haven't tried it in Cursor or with other LLMs yet.

kiitos

> It is Impossible to add an unlimited number of tools to your agents context without negatively impacting the capability of your agent.

Huh?

MCP servers aren't just for agents, they're for any/all _clients_ that can speak MCP. And capabilities provided by a given MCP server are on-demand, they only incur a cost to the client, and only impact the user context, if/when they're invoked.

noodletheworld

> they only incur a cost to the client, and only impact the user context, if/when they're invoked.

Look it up. Look up the cross server injection examples.

I guarantee you this is not true.

An MCP server is at it's heart some 'thing' that provides a set of 'tools' that an LLM can invoke.

This is done by adding a 'tool definition'.

A 'tool definition' is content that goes into the LLM prompt.

That's how it works. How do you imagine an LLM can decide to use a tool? It's only possible if the tool definition is in the prompt.

The API may hide this, but I guarantee you this is how it works.

Putting an arbitrary amount of 3rd party content into your prompts has a direct tangible impact on LLM performance (and cost). The more MCP servers you enable the more you pollute your prompt with tool definitions, and, I assure you, the worse the results are as a result.

Just like pouring any large amount of unrelated crap into your system prompt does.

At a small scale, it's ok; but as you scale up, the LLM performance goes down.

Here's some background reading for you:

https://github.com/invariantlabs-ai/mcp-injection-experiment...

https://docs.anthropic.com/en/docs/build-with-claude/tool-us...

Spivak

I think the author's point is that the architecture of MCP is fundamentally extremely high trust between not only your agent software and the integrations, but the (n choose 2) relationships between all of them. We're doing the LLM equivalent of loading code directly into our address space and executing it. This isn't a bad thing, dlopen is incredibly powerful with this power, but the problem being solved with MCP just isn't that level of trust.

The real level of trust is on the order OAuth flows where the data provider has a gun sighted on every integration. Unless something about this protocol and it's implementations change I expect every MCP server to start doing side-channel verification like getting an email "hey your LLM is asking to do thing, click the link to approve." Where in this future it severely inhibits the usefulness of agents in the same vein as Apple's "click the notification to run this automation."

zoogeny

Sure, at first, until the users demand a "always allow this ..." kind of prompt and we are back in the same place.

A lot of these issues seem trivial when we consider having a dozen agents running on tens of thousands of tokens of context. You can envision UIs that take these security concerns into account. I think a lot of the UI solutions will break down if we have hundreds of agents each injecting 10k+ tokens into a 1m+ context. The problems we are solving for today won't hold as LLMs continue to increase in size and complexity.

ZiiS

> Rate limit and monitor your own usage. You should anyway. It's not the road's job to make you follow the speed limit.

A better metaphor is the car, not the road. It is legally required to accurately tell you your speed and require deliberate control to increase it.

Even if you stick to a road; whoever made the road is required to research and clearly post speed limits.

jacobr1

Exactly. It is pretty common for APIs to actually signal this too. Headers to show usage limits or rates. Good error codes (429) with actual documentation on backoff timeframes. If you use instrument your service to respect read and respect the signals it gets, everything moves smoother. Backing stuff like that back into the MCP spec or at least having common conventions that are applied on top will be very useful. Similarly for things like tracking data taint, auth, tracing, etc ... Having a good ecosystem makes everything play together much nicer.

TeMPOraL

Also extending the metaphor, you can make a road that controls where you go and makes sure you don't stray from it (whether by accident or on purpose): it's called rail, and its safety guarantees come with reduced versatility.

Don't blame roads for not being rail, when you came in a car because you need the flexibility that the train can't give you.

fsndz

why would anyone accept to expose sensitive data so easily with MCP ? also MCP does not make AI agents more reliable, it just gives them access to more tools, which can decrease reliability in some cases:https://medium.com/thoughts-on-machine-learning/mcp-is-mostl...

Eisenstein

People accept lots of risk in order to do things. LLMs offer so much potential that people want to use so they will try, and it is but through experience that we can learn to mitigate any downsides.

sshh12

Totally agree, hopefully it's clear closer to the end that I don't actually expect MCP to solve and be responsible for a lot of this. More so MCP creates a lot of surface area for these issues that app developers and users should be aware of.

peterlada

Love the trollishness/carelessness of your post. Exactly as you put it: "it is not the road's job to limit your speed".

Like a bad urban planner building a 6 lane city road with the 25mph limit and standing there wondering why everyone is doing 65mph in that particular stretch. Maybe sending out the police with speed traps and imposing a bunch of fines to "fix" the issue, or put some rouge on that pig, why not.

Someone

> Rate limit and monitor your own usage. You should anyway. It's not the road's job to make you follow the speed limit.

In some sense, urban planners do design roads to make you follow the speed limit. https://en.wikipedia.org/wiki/Traffic_calming:

“Traffic calming uses physical design and other measures to improve safety for motorists, car drivers, pedestrians and cyclists. It has become a tool to combat speeding and other unsafe behaviours of drivers”

pgt

MCP is just a transport + wire format with request/response lifecycle and most importantly: tool-level authorization.

The essay misses the biggest problem with MCP:

  1. it does not enable AI agents to functionally compose tools.

  2. MCP should not exist in the first place.

LLMs already know how to talk to every API that documents itself with OpenAPI specs, but the missing piece is authorization. Why not just let the AI make HTTP requests but apply authorization to endpoints? And indeed, people are wrapping existing APIs with thin MCP tools.

Personally, the most annoying part of MCP is the lack of support for streaming tool call results. Tool calls have a single request/response pair, which means long-running tool calls can't emit data as it becomes available – the client has to repeat a tool call multiple times to paginate. IMO, MCP could have used gRPC which is designed for streaming. Need an onComplete trigger.

I'm the author of Modex[^1], a Clojure MCP library, which is used by Datomic MCP[^2].

[^1]: Modex: Clojure MCP Library – https://github.com/theronic/modex

[^2]: Datomic MCP: Datomic MCP Server – https://github.com/theronic/datomic-mcp/

pgt

Previous thoughts on MCP, which I won't rehash here:

- "MCP is a schema, not a protocol" – https://x.com/PetrusTheron/status/1897908595720688111

- "PDDL is way more interesting than MCP" – https://x.com/PetrusTheron/status/1897911660448252049

- "The more I learn about MCP, the less I like it" https://x.com/PetrusTheron/status/1900795806678233141

- "Upon further reflection, MCP should not exist" https://x.com/PetrusTheron/status/1897760788116652065

- "in every new language, framework or paradigm, there is a guaranteed way to become famous in that community" – https://x.com/PetrusTheron/status/1897147862716457175

I don't know if it's taboo to link to twitter, but I ain't gonna copypasta all that.

mdaniel

I hadn't heard of PDDL before but to save others the click on x.com, this is the Xeet:

> This PDDL planning example is much more interesting than what MCP purports to be: https://en.wikipedia.org/wiki/Planning_Domain_Definition_Lan...

> Imagine a standard planning language for model interconnect that enables collaborative goal pursuit between models.

> Maybe I'll make one.

kiitos

MCP is literally defined as a protocol.

It doesn't have anything to say about the transport layer, and certainly doesn't mandate stdio as a transport.

> The main feature of MCP is auth

MCP has no auth features/capabilities.

I think you're tilting at windmills here.

pgt

I regret to inform you that you are the victim of quality control, Sir @kiitos:

1. MCP specifies two transport layers: stdio/stdout + HTTP w/SSE [^1]

2. MCP specifies JSON-RPC as the wire format [^2].

In my opinion, this is a schema on top of a pre-existing RPC protocol, not a new protocol.

I implemented the stdio transport, the JSON-RPC wire format & Tools support of the spec in Modex[^3].

- [^1]: https://modelcontextprotocol.io/docs/concepts/transports

- [^2]: https://modelcontextprotocol.io/specification/2025-03-26

- [^3]: https://github.com/theronic/modex

pgt

Re: Auth, you are correct that MCP does not specify auth (aside from env vars for e.g. API keys which is host-specific – another gripe of mine).

However, practically the host (e.g. Claude Desktop by Anthropic) asks for permission before calling specific MCP tools.

It is not part of the MCP spec, but it's part of most host implementations of MCP and one of the big practical reasons for MCP's existence is to avoid giving models carte blanche HTTP access.

IMO this should be part of the MCP spec, e.g. "you can call this GET /weather endpoint any time, but to make payments via this POST /transactions request, ask for permission once or always."

Aside: just because someone "defines <X> as something" does not make it true.

cruffle_duffle

There are plenty of things out there that don’t use OpenAPI. In fact most things aren’t.

Even if the universe was all OpenAPI, you’d still need a lower level protocol to define exactly how the LLM reaches out of the box and makes the OpenAPI call in the first place. That is what MCP does. It’s the protocol for calling tools.

It’s not perfect but it’s a start.

pgt

AI can read docs, Swagger, OpenAI and READMEs, so MCP adds nothing here. All you need is an HTTP client with authorization for endpoints.

E.g. in Datomic MCP[^1], I simply tell the model that the tool calls datomic.api/q, and it writes correct Datomic Datalog queries while encoding arguments as EDN strings without any additional READMEs about how EDN works, because AI knows EDN.

And AI knows HTTP requests, it just needs an HTTP client, i.e. we don't need MCP.

So IMO, MCP is an Embrace, Extend (Extinguish?) strategy by Anthropic. The arguments that "foundational model providers don't want to deal with integration at HTTP-level" are uncompelling to me.

All you need is an HTTP client + SSE support + endpoint authz in the client + reasonable timeouts. The API docs will do the rest.

Raw TCP/UDP sockets more dangerous, but people will expose those over MCP anyway.

[^1]: https://github.com/theronic/datomic-mcp/blob/main/src/modex/...

taeric

I mean... you aren't wrong that OpenAPI doesn't have universal coverage. This is true. Neither did WSDL and similar things before it.

I'm not entirely clear on why it make sense to jump in with a brand new thing, though? Why not start with OpenAPI?

cruffle_duffle

Because OpenAPI doesn’t solve the problem that MCP does. How does the LLM and its host make tool calls?

OpenAPI doesn’t solve that at all.

resters

> 1. it does not enable AI agents to functionally compose tools.

Is there something about the OpenAI tool calling spec that prevents this?

pgt

I haven't looked at the OpenAI tool calling spec, but the lack of return types in MCP, as reported by Erik Meijers, makes composition hard.

Additionally, the lack of typed encodings makes I/O unavoidable because the model has to interpret the schema of returned text values first to make sense of it before passing it as input to other tools. Makes it impossible to pre-compile transformations while you wait on tool results.

IMO endgame for MCP is to delete MCP and give AI access to a REPL with eval authorized at function-level.

This is why, in the age of AI, I am long dynamic languages like Clojure.

keithwhor

I mean you don’t need gRPC. You can just treat all tool calls as SSEs themselves and you have streaming. HTTP is pretty robust.

pgt

HTTP Server-Sent Events (SSE) does not natively support batched streaming with explicit completion notifications in its core specification.

serbuvlad

This article reads less like a criticism of MCP, the internal technical details of which I don't know that much about, and they make the subject of but a part of the srticle, but a general criticism of the general aspect of "protocol to allow LLM to run actions on services"

A large problem in this article stems from the fact that the LLM may take actions I do not want it to take. But there are clearly 2 types of actions the LLM can take: those I want it to take on it's own, and those I want it to take only after prompting me.

There may come a time when I want the LLM to run a business for me, but that time is not yet upon us. For now I do not even want to send an e-mail generated by AI without vetting it first.

But the author rejects the solution of simply prompting the user because "it’s easy to see why a user might fall into a pattern of auto-confirmation (or ‘YOLO-mode’) when most of their tools are harmless".

Sure, and people spend more on cards than they do with cash and more on credit cards than they do on debit cards.

But this is a psychological problem, not a technological one!

jwpapi

I have read 30 MCP articles now and I still don’t understand why we not just use API?

serverlessmania

MCP allows LLM clients you don’t control—like Claude, ChatGPT, Cursor, or VSCode—to interact with your API. Without it, you’d need to build your own custom client using the LLM API, which is far more expensive than just using existing clients like ChatGPT or Claude with a $20 subscription and teaching them how to use your tools.

I built an MCP server that connects to my FM hardware synthesizer via USB and handles sound design for me: https://github.com/zerubeus/elektron-mcp.

jonfromsf

But couldn't you just tell the LLM client your API key and the url of the API documentation? Then it could interact with the API itself, no?

serverlessmania

Not all clients support that—currently limited to ChatGPT custom GPT actions. It’s not a standard. Fortunately, Anthropic, Google, and OpenAI have agreed to adopt MCP as a shared protocol to enable models to use tools. This protocol mainly exists to simplify things for those building LLM-powered clients like Claude, ChatGPT, Cursor, etc. If you want an LLM (through API calls) to interact with an your API, you can’t just hand it an API key and expect it to work—you need to build an Agent for that.

jacobr1

In some sense that is actually what MCP is. A way to document APIs and describe how to call them, along with some standardized tooling to expose that documentation and make the calls. MCP hit a sweet spot of just enough abstraction to wrap APIs without complicating things. Of course, since they didn't add a bunch of extra stuff ... that leads allowing users to footgun themselves per the article.

mhast

You could do that. But then you need to explain to the LLM how to do the work every time you want to use that tool.

And you also run into the risk that the LLM will randomly fail to use the tool "correctly" every time you want to invoke it. (Either because you forgot to add some information or because the API is a bit non-standard.)

All of this extra explaining and duplication is also going to waste tokens in the context and cost you extra money and time since you need to start over every time.

MCP just wraps all of this into a bundle to make it more efficient for the LLM to use. (It also makes it easier to share these tools with other people.)

Or if you prefer it. Consider that the first time you use a new API you can give these instructions to the LLM and have it use your API. Then you tell it "make me an MCP implementation of this" and then you can reuse it easily in the future.

zoogeny

Yes, but then you have to add that yourself to every prompt. It would be nice to tell your LLM provider just once "here is a tool you can use" along with a description of the API documentation so that you could use it in a bunch of different chat's without having to remind it every single time. That way, when you want to use the tool you can just ask for the tool without having to provide that detail again and again.

Also, it would be kind of cool if you could tell a desktop LLM client how it could connect to a program running on your machine. It is a similar kind of thing to want to do, but you have to do a different kind of processes exec depending on what OS you are running on. But maybe you just want it to ultimately run a Python script or something like that.

MCP addresses those two problems.

romanovcode

Yes but it's not as revolutionary as MCP. You don't get it...

12ian34

elektron user here. wow thank you :)

null

[deleted]

siva7

ChatGPT still doesn't support MCP. It really fell behind Google or Anthropic in the last months in most categories. Gemini pro blows o1 pro away.

nzach

> why we not just use API

Did you meant to write "a HTTP API"?

I asked myself this question before playing with it a bit. And now I have a slightly better understanding, I think the main reason was created as a way to give access of your local resources (files, envvars, network access...) to your LLM. So it was designed to be something you run locally and the LLM has access.

But there is nothing preventing you making an HTTP call from a MCP server. In fact, we already have some proxy servers for this exact use-case[0][1].

[0] - https://github.com/sparfenyuk/mcp-proxy

[1] - https://github.com/adamwattis/mcp-proxy-server

throw310822

I'm not sure I get it too. I get the idea of a standard api to connect one or more external resources providers to an llm (each exposing tools + state). Then I need one single standard client-side connector to allow the llm to talk to those external resources- basically something to take care of the network calls or other forms of i/o in my local (llm-side) environment. Is that it?

lsaferite

Sounds mostly correct. The standard LLM tool call 'shape' matches the MCP tool call 'shape' very closely. It's really just a simple standard to support connecting a tool to an agent (and by extension an LLM).

There are other aspects, like Resources, Prompts, Roots, and Sampling. These are all relevant to that LLM<->Agent<->Tools/Data integration.

As with all things AI right now, this is a solution to a current problem in a fast moving problem space.

yawnxyz

I have an API, but I built an MCP around my API that makes it easier for something like Claude to use — normally something that's quite tough to do (giving special tools to Claude).

mehdibl

Because you need mainly a bridge between the Function calling schema defined that you expose to the AI model so you can leverage them. The model need a gateway as API can't be used directly.

MCP core power is the TOOLS and tools need to translate to function calls and that's mainly what MCP do under the hood. Your tool can be an API, but you need this translation layer function call ==> Tool and MCP sits in the middle

https://platform.openai.com/docs/guides/function-calling

jasondigitized

I played around with MCP this weekend and I agree. I just want to get a users X and then send X to my endpoint so I can do something with it. I don't need any higher level abstraction than that.

null

[deleted]

null

[deleted]

aoeusnth1

If you are a tool provider, you need a standard protocol for the AI agent frontends to be able to connect to your tool.

geysersam

I think the commenter is asking "why can't that standard protocol be http and open api?"

stevenhuang

MCP is a meta-api and it basically is that, but with further qualifications that the endpoints and how they work themselves are part of the spec so LLMs can work with them better.

kristoff200512

I think it's fine if you only need a standalone API or know exactly which APIs to call. But when users ask questions or you're unsure which APIs to use, MCP can solve this issue—and it can process requests based on your previous messages.

mlenhard

One of the biggest issues I see, briefly discussed here, is how one MCP server tool's output can affect other tools later in the same message thread. To prevent this, there really needs to be sandboxing between tools. Invariant labs did this with tool descriptions [1], but I also achieved the same via MCP resource attachments[2]. It's a pretty major flaw exacerbated by the type of privilege and systems people are giving MCP servers access to.

This isn't necessarily the fault of the spec itself, but how most clients have implemented it allows for some pretty major prompt injections.

[1] https://invariantlabs.ai/blog/mcp-security-notification-tool... [2] https://www.bernardiq.com/blog/resource-poisoning/

cyanydeez

Isn't this basically a lot of hand waving that ends up being isomorphic to SQL injection?

Thats what we're talking about? A bunch of systems cobbled together where one could SQL inject at any point and there's basically zero observability?

seanhunter

Yes, and the people involved in all this stuff have also reinvented SQL injection in a different way in the prompt interface, since it's impossible[1] for the model to tell what parts of the prompt are trustworthy and what parts are tainted by user input, no matter what delimeters etc you try to use. This is because what the model sees is just a bunch of token numbers. You'd need to change how the encoding and decoding steps work and change how models are trained to introduce something akin to the placeholders that solve the sql injection problem.

Therefore it's possible to prompt inject and tool inject. So you could for example prompt inject to get a model to call your tool which then does an injection to get the user to run some untrustworthy code of your own devising.

[1] See the excellent series by Simon Willison on this https://simonwillison.net/series/prompt-injection/

mlenhard

Yeah, you aren't far off with SQL injection comparison. That being said it's not really a fault of the MCP spec, more so with current client implementations of it.

jeswin

> MCP servers can run (malicious code) locally.

I wrote an MCP Server (called Codebox[1]) which starts a Docker container with your project code mounted. It works quite well, and I've been using it with LibreChat and vscode. In my experience, Agents save 2x the time (over using an LLM traditionally) and is less typing, but at roughly 3x the cost.

The idea is to make the entire Unix toolset available to the LLM (such as ls, find), along with project specific tooling (such as typescript, linters, treesitter). Basically you can load whatever you want into the container, and let the LLM work on your project inside it. This can be done with a VM as well.

I've found this workflow (agentic, driven through a Chat based interface) to be more effective compared to something like Cursor. Will do a Show HN some time next week.

[1]: https://github.com/codespin-ai/codebox-js

jillesvangurp

Any interpreter can run malicious code. Mostly the guidance is: don't run malicious code if you don't want it to run. The problem isn't the interpreter/tool but the entity that's using it. Because that's the thing that you should be (mis)-trusting.

The issue is two fold:

- models aren't quite trustworthy yet.

- people put a lot of trust in them anyway.

This friction always exist with security. It's not a technical problem that can or should be solved on the MCP side.

Part of the solution is indeed going to come from containerization. Give MCP agents access to what they need but not more. And part of it is going to come from some common sense and the tool UX providing better transparency into what is happening. Some of the better examples I've seen of Agentic tools work like you outline.

I don't worry too much about the cost. This stuff is getting useful enough that paying a chunk of what normally would go into somebody's salary actually isn't that bad of a deal. And of course cost will come down. My main worry is actually speed. I seem to spend a lot of time waiting for these tools to do their thing. I'd love this stuff to be a bit zippier.

jeswin

> Give MCP agents access to what they need but not more.

My view is that you should give them (Agents) a computer, with a complete but minimal Linux installation - as a VM or Containerized. This has given me better results, because now it can say fetch information from the internet, or do whatever it wants (but still in the sandbox). Of course, depending on what you're working on, you might decide that internet access is a bad idea, or that it should just see the working copy, or allow only certain websites.

jacobr1

If you give it access to the internet ... it can basically do anything, exfil all your code, receive malicious instructions. The blast radius (presuming it doesn't get out of your sandbox) is limited to loss of whatever your put in (source code) and theft of resources (running a coinminer, host phishing attacks, etc ...). As you say, you can limit things to trusted websites which helps .. but even then, if you trust, say github, anyone can host malicious instructions. The risk tradeoffs (likelihood of of hitting malicious instruction, vs productivity benefit) might nevertheless be worth it ... not to much targetted maliciousness in wild yet. And just a bit more gaurdrailing and logging can go a long way.

ycombinatrix

>now it can say fetch information from the internet...(but still in the sandbox)

If it is talking to the internet, it is most definitely not sandboxed.

peterlada

Let me give you some contrast here:

- employees are not necessarily trustworthy

- employers place a lot of trust in them anyway

mdaniel

This argument comes up a lot, similar to the "humans lie, too" line of reasoning

The difference in your cited case is that employees are a class of legal person which is subject to the laws of the jurisdiction in which they work, along with any legal contracts they signed as a condition of their employment. So, that's a shitload of words to say "there are consequences" which isn't true of a bunch of matrix multiplications that happen to speak English and know how to invoke RPCs

sunpazed

Let’s remind ourselves that MCP was announced to the world in November 2024, only 4 short months ago. The RFC is actively being worked on and evolving.

sealeck

It's April 2025

marcellus23

Yes, and it's been about 4 and a half months since Nov 25, 2024.

klntsky

MCP is a dead end for chatbots. Building valuable workflows requires more than tool calling, most importantly, understanding the context of a conversation to adjust the tools dynamically.

simonw

What does that have to do with MCP? Those sound like things you would build on a separate layer from MCP.

klntsky

MCP tools are completely static. If you have to expose some APIs bypassing MCP, then you don't need MCP in the first place, because you don't have the tools abstracted anymore.

I believe it is possible to build nuanced workflows with well abstracted / reusable / pluggable tools, it's just not as simple as implementing a static discovery / call dispatch layer.

cmsparks

MCP isn't static. It explicitly includes support for dynamically modifying tools, resources, etc via it's client notifications[0]. Sure, context is usually opaque to the server itself (unless you use the sampling feature[1]), but there's nothing preventing MCP clients/hosts from adjusting or filtering tools on their own.

[0] https://modelcontextprotocol.io/specification/2025-03-26/ser...

[1] https://modelcontextprotocol.io/specification/2025-03-26/cli...

chipgap98

Static in what sense?

aledalgrande

> understanding the context of a conversation to adjust the tools dynamically

This just sounds like a small fine tuned model that knows hundreds of MCP tools and chooses the right ones for the current conversation.

aoeusnth1

Can you give an example where you would need to adjust tools dynamically based on context? Is that for all tools or just for some?

For example, why does a “Google search” tool need to change from context to context?

lyu07282

It's great for general and open ended conversation systems where there is no predefined flow or process to follow. Where you have general capabilities you add to the agent (web search, code execution, web drivers, etc.). But a lot of agentic architecture patterns aren't like that, there you want to closely model the flow, guard rails and constraints of the agent. You can't just take a bunch of MCP services and plug them together to solve every business problem. I think it's a bit unfair to MCP because it doesn't seem to even attempt to solve these problems in the first place. It's not replacing things like LangChain.

Everyone should welcome MCP as an open community-driven standard, because the alternative are fractured, proprietary and vendor-locked protocols. Even if right now MCP is a pretty bad standard over time it's going to improve. I take a bad standard that can evolve with time, over no standard at all.

jacobr1

Right ... the best way to think about MCP right now flipping the switch from every app/agent developer building their own client wrappers around 3rd party APIs to either the API providers building them or community maintained wrappers. But all that gives it is a large toolbox. Like NPM or PyPi or any Apt or any our package maintainer. You still need systems to orchestrate them, secure them, define how to use them, etc ... The next-gen of langchains will be all the much better for having a big toolchest ... but it doesn't negate the need for us to innovate and figure out what that will look like

sshh12

I could see some cases were the tools are user data specific. You upload some csv and now there are some tools customized for slicing and manipulating that data.

It's totally possible to build tools in way that everything is static but might be less intuitive for some use cases.

torginus

Is it just me, or is this thing really, really stupid?

I mean the whole AI personal assistant shebang from all possible angles.

Imagine, for example if booking.com built an MCP server allowing you to book a hotel room, query all offers in an area in a given time, quickly, effortlessly, with a rate limit of 100 requests/caller/second, full featured, no hiding or limiting data.

That would essentially be asking them to just offer you their internal databases, remove their ability to show you ads, remove the possibility to sell advertisers better search rankings, etc.

It would be essentially asking them to keel over and die, and voluntarily surrender all their moat.

But imagine for a second they did do that. You get the API, all the info is there.

Why do you need AI then?

Let's say you want to plan a trip to Thailand with your family. You could use the fancy AI to do it for you, or you could build a stupid frontend with minimal natural language understanding.

It would be essentially a smart search box, where you could type in 'book trip to Thailand for 4 people, 1 week, from July 5th', and then it would parse your query, call out to MCP, and display the listings directly to you, where you could book with a click.

The AI value add here is minimal, even non-existent.

This applies to every service under the sun, you're essentially creating a second Internet just for AIs, without all the BS advertising, fluff, clout chasing and time wasting. I, as a human am dying to get access to that internet.

Edit: I'm quite sure this AI MCP future is going to be enshittified in some way.

sgt101

This is what torpedoed the first AI Assistant push in the late 1990's and early 2000s (see electric elves as an example). Basically we thought that personal travel assistants and open trading platforms would be cool things, but then discovered that content aggregators a) had a business model and b) could offer vertical integration and bulk buys, & discounts to goods providers and consumers, while also policing the platform. So we got Expedia and Ebay.

There is a more fundamental problem as well. Multi-agent systems require the programmer of the agent to influence the state of the agent in order that the agent can act to influence the states of other agents in the system. This is a very hard programming task.

viraptor

You're arguing against your own straw man. "Imagine, for example if booking.com built an MCP server allowing you to book a hotel room (…) no hiding or limiting data." They're not doing that and they're not interested in that. Yes, they would need a public API for that and people could use it directly. But that's not happening.

MCP makes sense where the API already exists and makes the biggest difference if the API call is just a part of the process, not the only and final action.

Even in the booking example, you could push much more of your context into the process and integrate other tools. Rank the results by parking availability or distance to the nearest public/street parking, taking your car's height into account, looking through reviews for phrases you care about (soft bed, child care, ...) and many others things. So now we've got already 4+ different tools that need to work together, improving the results.

Xelynega

I think you missed the point.

To "rank the results by parking availability" you need the results. Currently these are behind paid API keys or frontends with ads.

Why would booking.com allow you to download their entire set of results multiple times through an API for free, when they charge people for that?

anon7000

So what? You’re basically claiming that it’ll fail because some companies won’t want to provide too much value for free.

But that’s such a small part of the equation here. If GitHub has an MCP server, you’re still paying them to host your code (potentially), and you get the benefit of agents being able to access GitHub in your development workflow (say, to look for similar issues or start work on things).

Yes, not every company will shove their data into AI agents. But can you take various tools and plug them together using agents to power up your workflows? That’s what these projects are thinking about. And there are vast numbers of tools which would happily integrate into this process.

tomjen3

I don't buy it.

If they had an API Booking would not likely return their data to you, they would almost certainly have an API that you would search and which would then return the same result you get on their website. Probably with some nice JSON or XML formatting.

Booking makes a small amount of ads, but they are paid by the hotels that you book with. And yes, today they already have to compete with people who go there see a hotel listing and go find the actual hotel off-site. That would not really change if they create an MCP.

It might make it marginally more easy to do, especially automatically. But I suspect the real benefits of booking.com is: A) that you are perceived to get some form of discount and B) you get stamps toward the free stay. And of course the third part which is you trust Booking more than some random hotel.

I actually think it would be a good idea for Booking to have an API. What is the alternative?

I can right now run a Deep search for great hotels in Tokyo - that will probably go through substantially all hotels in Tokyo. Go to the hotel's website and find the information, then search through and find exactly what I want.

Booking.com might prefer I go go to their website, but I am sure they would prefer above all that you book through them.

In fact I think the idea of advertisement is given above impact here, possibly because its a popular way for the places that employ the kind of people who post here to make money, but substantially all businesses that are not web-based and that do not sell web-based services for free don't make their money through ads (at least not directly). For all those places ads are an expense and they would much prefer your AI search their (comparably cheap to serve) websites.

Basically, the only website owners who should object to you going to their website through an AI agent are those who are in the publishing industry and who primarily make money through ads. That is a small number of all possible businesses.

criley2

It doesn't make sense for a middleman like booking.com to let you completely bypass everything they offer.

However it certainly might make sense for an individual hotel to let you bypass the booking.com middleman (a middleman that the hotel dislikes already).

Scenario 1: You logon to booking.com, deal with a beg to join a subscription service (?), block hundreds of ads and trackers, just to search searching through page after page of slop trying to find a matching hotel. You find it, go to the hotels actual webpage and book there, saving a little bit of money.

Scenario 2: You ask your favorite Deep Research AI (maybe they've come up with Diligent Assistant mode) to scan for Thai hotels meeting your specific criteria (similar to the search filters you entered on booking.com) and your AI reaches out to Hotel Discovery MCP servers run by hotels, picks a few matches, and returns them to you with a suggestion. You review the results and select one. The AI agent points out some helpful deals and rewards programs that might apply. Your AI completes the booking.

The value that AI gave you is you no longer did the searching, dealt with the middleman, viewed the ads, got begged to join a subscription service, etc.

However to the hotel, they already don't really like booking.com middleman. They already strongly prefer you book directly with them and give you extra benefits for doing so. From the hotel's perspective, the AI middleman is cheaper to them than booking.com and still preserves the direct business relationship.

bob1029

It's not just you.

However, I would say that I've grown to accept that most people prefer these more constrained models of thinking. Constraints can help free the mind up in other ways. If you do not perceive MCP as constraining, then you should definitely use it. Wait until you can feel the pain of its complexity and become familiar with it. This will be an excellent learning experience.

Also consider the downstream opportunities that this will generate. Why not plan a few steps ahead and start thinking about a consultancy for resolving AI microservices clusterfucks.

dboreham

Yes, the metaproblem is: it has to make money. It turns out that "doing genuinely useful things for end users" almost never makes money. I found this out long ago when I had the experience that booking air travel for optimized cost/convenience was a total pain. I figured software can solve this, so built a ticket search engine that supported the query semantics a human typically wants. Dumb idea because you can't get the data over which to search except from airlines, and airlines know that making it convenient to find low convenient fares makes them less money. So they won't give you the data. In fact the entire problem "finding cheap convenient air fare" is actually there in support of someone else's (the airlines) business model.

FirmwareBurner

"What is Hooli? Excellent question. Hooli isn't just another high tech company. Hooli isn't just about software. Hooli...Hooli is about people. Hooli is about innovative technology that makes a difference, transforming the world as we know it. Making the world a better place, through minimal message oriented transport layers. I firmly believe we can only achieve greatness if first we achieve goodness."

jacobr1

If computer use gets good enough (it isn't just yet ... but it is getting better _fast_) then it doesn't matter, you'll be able to browse the sites the same way a human does to bypass whatever shit they want to toss in your way. This makes their business models much more precarious. Also it makes removing the aggregator easier - you could search for the hotels, query their sites directly. Hotels and airlines have been trying to cut out middlemen for years, they just can't afford to lose the inbound traffic/sales it gives them. But the dynamic shifts every 5-10 years in small ways. Maybe this will be a big shift.

That said, even if the equilibrium changes from today (and I think it will) I still share your cynicism that enshittification will ensue in some form. One example right now is the increasing inability to trust any reviews from any service.

madrox

Nothing the author said is wrong, but I don’t know how much it matters or if it would’ve been better if it handled all this out of the gate. I think if MCP were more complicated no one would’ve adopted it.

Being pretty close to OAuth 1.0 and the group that shaped it I’ve seen how new standards emerge, and I think it’s been so long since new standards mattered that people forgot how they happen.

I was one of the first people to criticize MCP when it launched (my comment on the HN announcement specifically mentioned auth) but I respect the groundswell of support it got, and at the end of the day the standard that matters is the one people follow, even if it isn’t the best.

rcarmo

There is one thing I pointed out in https://taoofmac.com/space/notes/2025/03/22/1900 and seems to be missing from the article:

"... MCP tends to crowd the model context with too many options. There doesn’t seem to be a clear way to set priorities or a set of good examples to expose MCP server metadata–so your model API calls will just pack all the stuff an MCP server can do and shove it into the context, which is both wasteful of tokens and leads to erratic behavior from models."

sshh12

It's in there briefly in the llm limitations section.