OpenAI adds MCP support to Agents SDK
268 comments
· March 26, 2025 · keithwhor
_pdp_
I fully agree.
MCP is just too complex for what it is supposed to do. I don't see what the benefit is. It is the kind of thing that has the potential to be a huge time waste because it requires custom dev tools to develop and troubleshoot.
It is not even a protocol in the traditional sense - more of a convention.
And of course we will implement it, like everyone else, because it is gathering momentum, but I do not believe it is the right approach at all. A simpler HTTP-based OpenAPI service would have been a lot better and it is already well supported in all frameworks.
The only way I can make sense of MCP is in the context of STDIO.
keithwhor
The `stdio` approach for local services makes complete sense to me, including the use of JSON-RPC.
But for remote HTTP MCP servers there should be a dead simple solution. A couple years ago OpenAI launched plugins as `.well-known/ai-plugin.json`, where it'd contain a link to your API spec, ChatGPT could read it, and voila. So all you needed to implement was this endpoint and ChatGPT could read your whole API. It was pretty cool.
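From memory, the manifest looked roughly like this (field names reconstructed, details may be off):

```json
{
  "schema_version": "v1",
  "name_for_human": "Todo Plugin",
  "name_for_model": "todo",
  "description_for_human": "Manage your todo list.",
  "description_for_model": "Plugin for managing a user's todo list. Use it to add, list, and delete todos.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  },
  "logo_url": "https://example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://example.com/legal"
}
```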
ChatGPT Plugins failed, however. I'm confident it wasn't because of the tech stack; it was because the integration demand wasn't really there yet: companies were in the early stages of building their own LLM stacks, and ChatGPT desktop didn't exist. It also wasn't marketed as a developer-first global integration solution: little to no consistent developer advocacy was done around it. It was marketed towards consumers, and it was pretty unwieldy.
IMO the single-endpoint solution and adhering to existing paradigms is the simplest and most robust solution. For MCP, I'd advocate that this is what the `mcp/` endpoint should become.
Edit: Also tool calling in models circa 2023 was not nearly as good as it is now.
_pdp_
I agree. What OpenAI did was simple and beautiful.
Also, I think there is a fundamental misunderstanding that MCP services are plug and play. They are not. Function names and descriptions are literally prompts, so it is almost certain you would need to modify the names or descriptions to add some nuance to how you want these to be called. Since MCP servers are not really meant to be extensible in that sort of way, the only other alternative is to add more context into the prompt, which is not easy unless you have a ton of experience. Most of our customers fail at prompting.
The reason I like the ai-plugin.json approach is that you don't have to change the API to make the description of a function a little bit different. One day MCP might support this, but it will add another layer of complexity that could have been avoided with a remotely hosted JSON / YAML file.
mordymoop
It took me a minute to even understand this comment because for me the “obvious” use-case for MCP is local filesystem tasks, not web requests. Using MCP to manipulate files is my primary LLM use-case and has been ever since Anthropic released it and integrated it into Claude Desktop. I understand where you’re coming from, but I suspect that the idea here is to build something that is more “filesystem first.”
keithwhor
That makes sense. But if that's the case I think we should call a spade a spade and differentiate "Local-first MCP" and "Remote MCP"; because what (many, most?) companies are really trying to do is integrate with the latter.
Which is where you see this sort of feedback, where a bunch of us API engineers are like: "there's already a well-trodden path for doing all of this stuff. Can we just do that and agree that it's the standard?"
tlrobinson
100%. I know I’m in the “get off my lawn” phase of my career when I see things like MCP and LangChain, but I know I would have been excited about them earlier in my career.
soulofmischief
LangChain is an objectively terrible Frankenstein's monster of an API. If you were a good developer in your youth, you'd have still held it in contempt, and you'd treat MCP with caution.
The MCP API is pretty bad, too; it's just that a paradigm is starting to emerge regarding modularity, integration and agentic tooling, and MCP happens to be the only real shot in that direction at this particular moment.
davedx
I’m seriously considering getting out of IT because of it
pcarolan
Yeah, I had more luck just giving an AI the OpenAPI spec and letting it figure everything out. I like a lot about MCP (structure, tool guidance, etc), but couldn't it just have been a REST API and a webserver?
zambachi
I think people often think of their specific use-case and tend to forget the bigger picture. MCP does not force one transport or the other, and that is great—use any transport you want as long as it uses JSON-RPC as the payload.
The two built-in transports are also extremely minimalistic, and the SSE transport uses regular HTTP—no need for WebSockets or other heavier dependencies, because SSE events are lightweight and broadly supported.
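For illustration, a minimal sketch of the SSE side on the wire: standard event framing carrying an ordinary JSON-RPC message:

```
event: message
data: {"jsonrpc":"2.0","id":1,"result":{"tools":[]}}
```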
mbroecheler
Second that. A lot of our use cases are "remote tooling", i.e. calling APIs. Implementing an MCP server to wrap APIs seems very complex - both in terms of implementation and infrastructure.
We have found GraphQL to be a great "semantic" interface for API tooling definitions, since GraphQL schema allows for descriptions in the spec and is very human-readable. For "data-heavy" AI use cases, the flexibility of GraphQL is nice: you can expose different levels of "data-depth", which is very useful in controlling the cost (i.e. context window) and performance of LLM apps.
In case anybody else wants to call GraphQL APIs as tools in their chatbot/agents/LLM apps, we open sourced a library for the boilerplate code: https://github.com/DataSQRL/acorn.js
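To give a flavor of the descriptions, here's a hypothetical schema fragment (not from acorn.js) where the doc strings double as tool prompts:

```graphql
type Query {
  """
  Look up customers. This description is exactly what the model reads
  when deciding whether and how to call the field.
  """
  customers(
    "Account status to filter by, e.g. ACTIVE or CHURNED"
    status: String!
    "Caps the result size, i.e. the context-window cost"
    limit: Int = 10
  ): [Customer!]!
}

type Customer {
  id: ID!
  name: String!
}
```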
_pdp_
Oh wow. Amazing. I did not think of that. I am not a fan of GraphQL, but you might be onto something here. I have not checked the code, and perhaps this is not the right channel for this, but my read is that this library allows any generic GraphQL server to be exposed in this sort of way?
EGreg
I am actually working on such a thing, and I want to get it right. This is like the RSS vs. Atom or Jabber vs. XMPP wars, or OpenGraph vs. Twitter’s meta tags, etc. I want interoperability, which is cool, but I also want it to seamlessly interoperate with human roles and services.
What is the best way to connect with you? I would like to discuss ideas and protocols if you’re up for that.
rcarmo
Exactly my points: https://taoofmac.com/space/notes/2025/03/22/1900
pulkitsh1234
"Tool calling" is just one part of MCP, there are more things like "Sampling" which allow the server itself to initiate stuff on the client. As for tool calling, having a layer like MCP makes sense because there a lot of things which don't have a REST-API + may need direct access to the computer (filesystem, processes, etc).
Examples:
* Running SQL commands on a DB or a Redis instance.
* Launching Docker containers, SSHing to a server and running some command.
* Reading a file and extracting relevant information from it, like OCR.
* Controlling a remote browser using the WebDriver protocol, with some kind of persistent connection to a backend.
As for pure REST API use cases, I think MCP serves what Swagger/OpenAPI specs are meant to do, i.e. enforce some kind of format and give each endpoint a "name" + list of params which the LLM can invoke. The issue is that there is no standardised way to pass these API specs to LLMs as tools (maybe something can be built in this space). In the future, I can easily see some kind of library/abstraction that allows an MCP server to parse an existing API spec file and expose those APIs as tools, which can be combined with some local state on the computer to allow stateful interactions with a REST API.
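Something like this sketch, say (all shapes simplified and hypothetical, not a real library):

```typescript
// Sketch: turn each OpenAPI operation into an LLM tool definition.
interface ToolDef {
  name: string;        // maps from operationId
  description: string; // maps from summary/description
  inputSchema: object; // maps from the request body's JSON Schema
}

function openApiToTools(spec: any): ToolDef[] {
  const tools: ToolDef[] = [];
  for (const [path, methods] of Object.entries<any>(spec.paths ?? {})) {
    // Simplification: assumes every key under a path is an HTTP method.
    for (const [method, op] of Object.entries<any>(methods)) {
      tools.push({
        name: op.operationId ?? `${method}_${path}`,
        description: op.summary ?? op.description ?? "",
        inputSchema:
          op.requestBody?.content?.["application/json"]?.schema ??
          { type: "object" },
      });
    }
  }
  return tools;
}
```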
dan-kwiat
I just made a tool which parses any OpenAPI spec to MCP spec: https://www.open-mcp.org (literally just deployed so I don't know if the DNS has propagated globally yet..)
zomglings
I received a 500 response when I attempted to create an MCP server for an API.
I was using this URL: https://engineapi.moonstream.to/metatx/openapi.json
The response body:
{success: false, error: "Server URL must start with https:// or http://"}
orliesaurus
very cool, I tried to feed it the toolhouse.ai openapi spec and it worked VERY quickly!! wow
ants_everywhere
My bias is I generally think RPC is much nicer than REST. But what's kind of funny here is that we have
(1) an RPC to call a (remote) method called "tools/call", which is a method that
(2) calls a method called get_weather
Both methods have arguments. But the arguments of "tools/call" are called "params" and the arguments of "get_weather" are called "arguments".
I realize this is a common pattern when you have to shell out, e.g. in python's subprocess.run().
But it also seems like there could be a cleaner API with better types.
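Concretely, the nesting looks like this on the wire (the canonical weather example from the MCP spec):

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "location": "New York" }
  }
}
```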
keithwhor
I don’t disagree. I fought this battle for a long time — ran a company where I tried to simplify SDK development by making every endpoint a POST with JSON params; sorta like SOAP / just simple RPC. Why do you need all the HTTP methods when most SDKs simplify everything to `.retrieve()` etc.? Why not name the endpoints that way?
What I realized was that these specs are valuable because they’re stable over long periods of time and handle many sorts of edge cases. Also, from a systems integration perspective, everybody already knows and is trained in them. Over many years I accepted the wisdom of the commons.
A lot of tooling already exists to make development of these sorts of systems easy to implement and debug. Hence why I think for Remote MCP servers, HTTP as it exists is a great choice.
mattmanser
I don't feel that's really true, it's easy to forget how fast things have moved.
For a long time lots of servers didn't really support PUT or DELETE, and it was only in the early 2010s that support became common.
It's still a problem sometimes that you have to explicitly enable them (I'm looking at you IIS + WebDAV).
PATCH wasn't even added till 2010 and you still don't see it commonly used.
Perhaps we have different understandings of 'stable' and 'many years'.
I also agree with you on RPC, it's pretty ridiculous that some guys tried to boil every single API down to essentially 4 verbs. I remember when Google went all crazy on implementing pure REST and their APIs were atrocious.
And everyone still goes along with it even though it clearly doesn't work, so you always end up with a mix of REST and RPC in any non-trivial API.
But pure RPC doesn't really work as then you have to change every call to a POST, as you mention. Which is also confusing as everyone is now used to using the really restricted REST CRUD interface.
So now pure REST sucks and pure RPC sucks. Great job HTTP standards team!
To be fair to them, I know it's hard and at some point you can't fix your mistakes. These days I guess I've just accepted that almost all standards are going to suck a bit.
michaelsbradley
ad hoc RPC[1] that involves JSON request/response payloads and is wed to HTTP transport is arguably worse than conforming to the JSON-RPC 2.0 specification[2].
[1] if it’s not REST (even giving a pass on HATEOAS) then it’s probably, eventually, effectively RPC, and it’s still ad hoc even if it’s well documented
imtringued
The big irony behind HATEOAS is that LLMs are the mythical "evolvable agents" that are necessary to make HATEOAS work in the first place. HATEOAS was essentially built around human-level intelligence that can automatically crawl your endpoints and read documentation written in human language, and then its designers scratched their heads over why it didn't catch on.
Only browser like clients could conform to HATEOAS, because they essentially delegate all the hard parts (dealing with a dynamically changing structureless API) to a human.
michaelsbradley
Well, like I wrote, "giving a pass on HATEOAS".
With e.g. JSON over HTTP, you can implement an API that satisfies the stateless-ness constraint of REST and so on. Without hypermedia controls it would fit at Level 2 of the RMM, more or less.
In that shape, it would still be a different beast from RPC. And a disciplined team or API overlord could get it into that shape and keep it there, especially if they start out with that intention.
The problem I've seen many times is that a JSON over HTTP API can start out as, or devolve into, a messy mix of client-server interactions and wind up as ad hoc RPC that's difficult to maintain and very brittle.
So, if a team/project isn't committed to REST, and it's foreseeable that the API will end up dominated by RPC/-like interactions, then why not embrace that reality and do RPC properly? Conforming to a specification like JSON-RPC can be helpful in that regard.
ammmir
Yeah, I always thought MCP was a bit verbose. It reminds me of the WSDL and SOAP mess of the 2000s. Model tool calls are just RPCs into some other service, so JSON-RPC makes sense. Is there anything else that has wide adoption and good client support? XML-RPC? gRPC? Protobufs? I mean, it shouldn't need extra libraries to use. You can handroll a JSON-RPC request/response pretty easily from any programming language.
Regarding the verbosity, yeah, it's interesting how model providers make more money from more tokens used, and you/we end up paying for it somehow. When you're doing lots of tool calls, it adds up!
antupis
Why is this get_weather location "New York" always an example when people talk about tool calling?
richsong
1/ pre-trained models don't know current weather
2/ easy enough for people to understand
deadbabe
Because if you can make it New York City, you can make it anywhere.
pacjam
Totally agree - a well-defined REST API "standard" for tool listing and tool execution would have been much better. Could extend as needed to websockets for persistent connections / streaming data.
croes
Maybe MCP was developed with AI. LLMs tend to be overly verbose
bob1029
I am really struggling with what the value-add is with MCP. It feels like another distraction in the shell game of contemporary AI tech.
> MCP is an open protocol that standardizes how applications provide context to LLMs.
What is there to standardize? Last I checked, we are using a text-to-text transformer that operates on arbitrary, tokenized strings. Anything that seems fancier than tokens-to-tokens is an illusion constructed by the marketing wizards at these companies. Even things like tool/function calling are clever heuristics over plain-ass text.
> Currently, the MCP spec defines two kinds of servers, based on the transport mechanism they use: ...
This looks like micro services crossed with AI. I don't think many are going to have a happy time at the end of this adventure.
vessenes
If you're interested, I'd encourage you to implement an MCP integration and see if you change your mind.
For instance, I have a little 'software team in a box' tool. v1 integrated github and three different llms manually (react + python backend). This is fine. You can call github commands via CLI on the backend, and add functionality somewhat easily, depending on the LLM's knowledge.
Pain points -- if you want workflow to depend on multiple outputs from these pieces, (e.g. see that there's a pull request, and assess it, or see that a pull request is signed off on / merged, and update something) -- you must code most of these workflows manually.
v2, I wiped that out and have a simple git, github and architect MCP protocol written up. Now I can have Claude as a sort of mastermind, and just tell it "here are all the things you can do, please XXX". It wipes out most of the custom workflow coding and lets me just tell Claude what I'd like to do -- on the backend, my non-LLM MCP server can deal with things it's good at: API calls, security checks, etc.
polishdude20
So is the MCP server acting like a middleman between the llm and the application you want to control?
Like, could I give the MCP server the ability to say exec Unix code on my machine and then tell the LLM "here's the MCP server, this function can execute Unix code and get back the response".
Then I can tell the LLM, "create an application using the MCP server that will listen to a GitHub webhook and git pull when the webhook hits, and have it running"; the LLM would generate the commands necessary to do that and run them through the MCP server, which just executes the Unix code. And voila?
I've gotten an llm to create files and run system commands for me.
Is that the most barebones application?
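Something like this, I'd guess, using the official TypeScript SDK (a sketch; import paths from memory, and exposing unsandboxed shell execution like this is obviously dangerous):

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const sh = promisify(execFile);
const server = new McpServer({ name: "shell", version: "0.1.0" });

// One tool: run a command and hand its stdout back to the model.
server.tool(
  "run_command",
  "Execute a Unix command and return its stdout",
  { command: z.string() },
  async ({ command }) => {
    const { stdout } = await sh("sh", ["-c", command]);
    return { content: [{ type: "text" as const, text: stdout }] };
  }
);

// stdio transport: the client (e.g. Claude Desktop) spawns this process.
await server.connect(new StdioServerTransport());
```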
senko
That sounds like v1 was "tool calls llm", while v2 is "llm calls tool"?
The fact that the tool call is via mcp and not in-process function call seems to be an implementation detail?
vessenes
first sentence sounds right to me.
“Implementation detail” is doing a lot of work in the second sentence, though. There are whole startups like langchain that were trying to build out a reasonable agent framework integrated in such a way that the LLMs can drive. MCP makes that really easy — LLM training just has to happen once, against MCP spec, and I get client and LLM support for an iterative tool use scenario right in the LLM.
digdugdirk
Do you have the code available anywhere? I'm working on the same thing to learn how to utilize MCP, I'd love to see how someone else went about it.
vessenes
this will break, kill your computer and probably result in loss of life. Just saying. :)
That said, here's a gist: https://gist.github.com/vessenes/ec43b76965eed1b36b3467c598b...
nlarew
> What is there to standardize?
At a high level, the request format and endpoints. Instead of needing to write a bespoke connector for every type of context that matches their preferred API standards, I just tell my client that the server exists and the standard takes care of the rest.
Do you have similar doubts about something like gRPC?
> This looks like micro services crossed with AI.
Seems like a cynical take with no substance to me. What about a standard request protocol implies anything about separation of concerns, scaling, etc?
bob1029
> At a high level, the request format and endpoints.
I think we fundamentally disagree on what "request format" means in context of a large language model.
Spivak
Because it's not a request format for LLMs, it's a request format for client software that is instrumenting LLMs. If you make a connector to say HomeAssistant to turn on/off your lights you're exposing a tool definition which is really just a JSON schema. The agent will present that tool to the LLM as one it's allowed to use, validate that the LLM matched your change_light_state tool schema and send off the appropriate API call to your server.
The spec is genuinely a hot fucking mess that looks like a hobby project by an overeager junior dev but conceptually it's just a set of JSON schemas to represent common LLM things (prompts, tools, files) and some verbs.
The useful content of the spec is literally just https://github.com/modelcontextprotocol/specification/blob/m... and even then it's a bit much.
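For example, the change_light_state tool above boils down to something like this (hypothetical, HomeAssistant-flavored):

```json
{
  "name": "change_light_state",
  "description": "Turn a light on or off",
  "inputSchema": {
    "type": "object",
    "properties": {
      "entity_id": { "type": "string", "description": "Which light, e.g. light.kitchen" },
      "on": { "type": "boolean", "description": "Desired power state" }
    },
    "required": ["entity_id", "on"]
  }
}
```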
ripped_britches
Literally every protocol ever written is “clever heuristics over plain-ass {bits|bytes|text|etc}”
orangebread
I would think of MCP as a "plugin" for AI.
Right now your typical interaction in chat is limited with some exceptions of out of box tooling that most current platforms provide: file attachment, web search, etc.
MCP is a way to extend this toolbox, but the cooler thing is AI will be able to determine _at inference time_ what tools to use to fulfill the user's prompt.
Hope this helped!
oefnak
Well, you should check again, because it hasn't been text to text for a while. There are multimodal models now.
bob1029
If we want to get pedantic, in context of practical transformer models it has never been text to text. It has always been tokens to tokens. The tokens can represent anything. "multimodal" is a marketing term.
analyte123
You could say the same thing about almost any protocol, e.g. HTTP, which runs on arbitrary streams of bytes over TCP; and the headers, methods and status codes are just illusions on top of that.
talles
> Think of MCP like a USB-C port for AI applications.
That analogy may be helpful for mom, but not for me as a software engineer.
ondrsh
To really understand MCP you need to think about application design in a different way.
In traditional applications, you know at design-time which functionality will end up in the final product. For example, you might bundle AI tools into the application (e.g. by providing JSON schemas manually). Once you finish coding, you ship the application. Design-time is where most developers operate, and it's not where MCP excels. Yes, you can add tools via MCP servers at design-time, but you can also include them manually through JSON schemas and code (giving you more control because you're not restricted by the abstractions that MCP imposes).
MCP-native applications on the other hand can be shipped, and then the users can add tools to the application — at runtime. In other words, at design-time you don't know which tools your users will add (similar to how browser developers don't know which websites users will visit at runtime). This concept — combined with the fact that AI generalizes so well — makes designing this kind of application extremely fascinating, because you're constantly thinking about how users might end up enhancing your application as it runs.
As of today, the vast majority of developers aren't building applications of this kind, which is why there's confusion.
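Concretely, runtime discovery is just the client calling `tools/list` on a freshly added server (shapes per the MCP spec; the weather tool is illustrative):

```json
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }
```

and the server replies with everything the model can now do:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [{
      "name": "get_weather",
      "description": "Get the current weather for a location",
      "inputSchema": {
        "type": "object",
        "properties": { "location": { "type": "string" } },
        "required": ["location"]
      }
    }]
  }
}
```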
paradite
I think this is a good explanation of the client side of MCP. But most developers are not building MCP clients (I think?). Only a few companies like OpenAI, Anthropic, Cursor and Goose are building MCP clients.
Most developers are currently building MCP servers that wrap a 3rd party or wrap their own service. And in this case, they are still at deciding on the tools in design-time, not runtime.
Also, I want to mention that both Cursor and Claude desktop don't support dynamically toggling tools on/off within an MCP server, which means users can't really pick which tools to expose to the AI. The current implementation exposes all tools within an MCP server.
ondrsh
The concept of design-time vs. runtime applies to both clients and servers.
I believe you're implying that server developers can focus less on this concept (or sometimes even ignore it) when building a server. This is true.
However, the fact that end-users can now run MCP servers directly — rather than having to wait for developers to bundle them into applications — is a significant paradigm shift that directly benefits MCP server authors.
vykthur
This is a good characterisation of the functionality MCP might enable. Thanks.
In your opinion, what percentage of apps might benefit from this model where end users bring their own MCP tools to extend the capabilities of your app? What are some good examples of this - e.g., development tools like Cursor or WindSurf likely apply, but are there others, preferably with end users?
How is the user incentivized to upskill towards finding the right tool to "bring in", installing it, and then using it to solve their problem?
How do we think about the implications of bring your own tools, knowing that unlike plugin-based systems (e.g., Chrome/extensions), MCP servers can be unconstrained in behaviour - all running within your app?
ondrsh
> In your opinion, what percentage of apps might benefit from this model where end users bring their own MCP tools to extend the capabilities of your app.
Long term close to 100%. Basically all long-running, user-facing applications. I'm looking through my dock right now and I can imagine using AI tools in almost all of them. The email client could access Slack and Google Drive before drafting a reply, Linear could access Git, Email and Slack in an intelligent manner and so on. For Spotify I'm struggling right now, but I'm sure there'll soon be some kind of Shazam MCP server you can hum some tunes into.
> How is the user incentivized to upskill towards finding the right tool to "bring in", installing it and then using it to solve their problem.
This will be done automatically. There will be registries that LLMs will be able to look through. You just ask the LLM nicely to add a tool, it then looks one up and asks you for confirmation. Running servers locally is an issue right now because local deployment is non-trivial, but this could be solved via something like WASM.
> How do we think about the implications of bring your own tools, knowing that unlike plugin-based systems (e.g., Chrome/extensions), MCP servers can be unconstrained in behaviour - all running within your app
There are actually 3 different security issues here.
#1 is related to the code the MCP server is running, i.e. the tools themselves. When running MCP servers remotely this obviously won't be an issue, when running locally I hope WASM can solve this.
#2 is that MCP servers might be able to extract sensitive information via tool call arguments. Client applications should thus ask for confirmation for every tool call. This is the hardest to solve because in practice, people won't bother checking.
#3 is that client applications might be able to extract sensitive information from local servers via tool results (or resources). Since the user has to set up local servers themselves right now, this is not a huge issue now. Once LLMs set them up, they will need to ask for confirmation.
amerine
I can’t express how much I agree with your perspective. It’s a completely different/total shift in how we might deliver functionality and… composability to users.
Well said.
kblissett
Isn't this just the same paradigm as plugins?
ondrsh
Similar, but one level higher.
Plugins have pre-defined APIs. You code your application against the plugin API and plugin developers do the same. Functionality is being consumed directly through this API — this is level 1.
MCP is a meta-protocol. Think of it as an API that lets arbitrary plugins announce their APIs to the application at runtime. MCP thus lives one level above the plugin's API level. MCP is just used to exchange information about the level 1 API so that the LLM can then call the plugin's level 1 API at runtime.
This only works because LLMs can understand and interpret arbitrary APIs. Traditionally, developers needed to understand an API at design-time, but now LLMs can understand an API at runtime. And because this can now happen at runtime, users (instead of developers) can add arbitrary functionality to applications.
I hate plugging my own blog again but I wrote about that exact thing before, maybe it helps you: https://www.ondr.sh/blog/thoughts-on-mcp
freeone3000
Oh, it’s the new HATEOAS? A pluggable framework for automatic discoverability of HTTP APIs is incredibly useful, and not just for AI :)
ondrsh
Unfortunately, MCP is not HATEOAS. It doesn't need to be, because it's not web-like. I wish it were.
HATEOAS is great for web-like structures because each response includes not only the content, but also all actions the client can take (usually via links). This is critical for architectures without built-in structure — unlike Gopher, which has menus, and FTP and Telnet, which have stateful connections — because otherwise a client arriving at some random place has no indication of what to do next. MCP tackles this by providing a stateful connection (similar to FTP) and is now moving toward static entry points similar to Gopher menus.
I specifically wrote about why pure HATEOAS should come back instead of MCP: https://www.ondr.sh/blog/ai-web
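For anyone unfamiliar, a HATEOAS-style response carries its affordances inline; a generic illustration:

```json
{
  "order": { "id": 42, "status": "pending" },
  "_links": {
    "self":   { "href": "/orders/42" },
    "cancel": { "href": "/orders/42/cancel", "method": "POST" },
    "pay":    { "href": "/orders/42/payment", "method": "POST" }
  }
}
```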
TeMPOraL
No, you can't understand it until you understand that the world isn't all webshit and not everything is best used via REST.
(Not even webshit is best used by REST, as evidenced by approximately every "REST" API out there, designed as RPC over HTTP pretending it's not.)
aeonik
You might be able to say the user could "plug in" the new functionality. Or it allows them to "install" a new "application"?
sebazzz
So MCP to an application is like how a WebDriver interface is to a Web browser?
dotancohen
The full quote is better:
> MCP is an open protocol that standardizes how applications provide context
> to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C
> provides a standardized way to connect your devices to various peripherals
> and accessories, MCP provides a standardized way to connect AI models to
> different data sources and tools.
otabdeveloper4
No, it's actually a lot worse. "USB-C port for AI applications" sounds like something written by an insane person.
madeofpalk
I want to know in which way USB-C is applicable here. The complicated plug that's actually 30 different protocols, where it's difficult to understand what capabilities a plug/cable actually has?
TeMPOraL
Yes. 99% of that complexity is on the side of implementers - where it should be - and for actual use, approximately everything works well with everything else; specifics matter mostly when you're trying to plug devices that exercise the limits of advanced capabilities. Which sucks, yes, but mostly because implementers/vendors made it this way (would it hurt y'all to label your cables and devices properly, correctly and visibly?!).
jchw
Maybe they used an LLM to explain it. Gemini in particular is obsessed with these utterly useless analogies for everything, when I would prefer something closer to Wikipedia with more context. (Needless to say, I currently don't find LLMs useful for learning about things. That's a shame because that use case feels promising.)
lelandfe
I saw this ChatGPT-created analogy on a JS subreddit the other day:
> Imagine you have a robot in a room, and this robot can perform actions like turning on a light, opening a door, or picking up objects. Now, if you want to tell the robot to do something, you usually say something like, "Robot, pick up the pen!" or "Robot, open the door."
> In JavaScript, ‘this’ is like the "robot" in the room
Terrible.
colechristensen
LLMs are like an unlimited, poorly written encyclopedia. Often inaccurate or not entirely helpful, but will get you enough of an idea to find better sources. Sort of solving the "I don't know what I don't know" gap.
soulofmischief
In this regard, they have been extraordinarily fruitful for my research and studies.
votick
I thought this was a "your mom is compatible with everyone" joke
gnfargbl
It's trying to say "you can plug lots of things into this in a standardized way."
https://norahsakal.com/blog/mcp-vs-api-model-context-protoco...
elamje
The closest software analogy I’ve heard is like passing around a callable/function with a standard interface. An LLM can call the callable and work with the returned data without needing to go back and forth between your application logic.
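As a sketch (illustrative names, not any particular SDK):

```typescript
// The "callable with a standard interface" idea in miniature.
interface Tool {
  name: string;                          // stable identifier the model emits
  description: string;                   // natural-language prompt for the model
  inputSchema: object;                   // JSON Schema constraining the arguments
  call(args: unknown): Promise<string>;  // uniform invocation surface
}

// The host loop stays generic: look the tool up, call it, feed the
// result straight back into the model's context.
async function dispatch(tools: Tool[], name: string, args: unknown) {
  const tool = tools.find((t) => t.name === name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.call(args);
}
```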
riemannzeta
The weather data example in their documentation makes it really simple to grasp how it works as an interface for models:
https://modelcontextprotocol.io/quickstart/server
I don't think it's terribly difficult to drill down into their GitHub to see what's happening under the hood if you need that level of detail.
tcdent
Big question in my mind was if OpenAI was going to formally endorse this (since it was created by Anthropic) but we have our answer.
MCP is now the industry standard for connecting LLMs to external tools.
otabdeveloper4
> MCP is now the industry standard for connecting LLMs to external tools.
No it isn't. We're still five or ten years away from real standards.
(Anyways it's clear that the future is smaller self-hosted LLMs, so whatever standard eventually emerges will be based on that paradigm.)
liamwire
Sure, like how we all self-host our own email servers and photo albums? Honestly, I think you’re about as wrong as you could possibly be, both on timelines, and in that I’d argue the arc of consumer tech adoption bends towards centralisation most often.
Standards are already emerging, including MCP, and to say that simply because they’ll evolve and be replaced over time means they’re not ‘real’ now is ridiculous. Look at the early internet and web as examples.
Local models, even accounting for reasonable progress and device performance improvements, will always, inherently, be behind the eight ball compared to SOTA models. While they may be sufficient for the low hanging fruit, I’d not bet against the frontier models being every bit as compelling relatively speaking.
Using weasel words like ‘real’ and ‘x is the future’ is astoundingly arrogant, and anyone claiming with confidence that they’ve got any idea where we’re heading is almost assuredly wrong.
greyman
1) It isn't a standard yet, but what else, apart from filesystem-mcp, can be used for prompts like "write me a README.md for this repo" (like, really produce the file)?
2) For me it is not clear that the future is smaller self-hosted LLMs. As of today, what's most useful for me is to use the best models, and those are not self-hosted.
otabdeveloper4
Once we get used to the fact that LLMs exist they won't be sold on the gee-whiz "wow, a talking assistant just like in my sci-fi movies!" factor.
They'll be used for particular language classification and search tasks, and for that you'd want several lighter, faster, cheaper and more specialized models. Not one with an arbitrary "best" score that's based on tricking the Turing test.
fsndz
so we no longer need LangChain and stuff like that; that's a win. But MCP also feels a bit overrated: https://www.lycee.ai/blog/why-mcp-is-mostly-bullshit
creddit
I feel like that article doesn't really live up to its title. At the end, its basic point is that MCP isn't a magic bullet (I don't think anyone claimed it was?) and that it has a lot of hype. It also makes it clear why MCP is good (i.e. you don't need to rely on LangChain). Feels like its title should be "Why MCP is a Good Step but Doesn't Solve Agents" or something. But then, it wouldn't enable the millennial urge to use "shit".
esafak
I could not find the actual criticism in that article. What's the problem with MCP again? It's the first standard for agents.
mkagenius
An HTTP endpoint + function calling can do what MCP does; this extra, badly named layer is just jargon fetish.
fsndz
the problem of agents is not the lack of standards, but reliability (reliability of tool use and reliability of outcomes). MCP does not solve any of that.
paulgb
That article seems to miss the point by being incurious about _why_ there is hype around MCP instead of LangChain, LangGraph, SmolAgents, LlamaIndex, etc.
We've had tool call frameworks before, but we haven't had a way of making tools that our clients actually talked to. There was no way to build tools that other people could download and run locally without getting them to switch to a client that baked those tools in. It's like the difference between static linking and dynamic linking, but for tool calls.
tcdent
Tool calls in any of the agent frameworks are just wrappers around native functions.
MCP lets you handle the processing outside of the agent context, redistribute tools independently, and provide them as either private or public hosted services.
Basically, add an HTTP layer on the existing concept of tools. It is not a replacement for a framework, it is an enhancement of a well established pattern.
rvz
> That article seems to miss the point by being incurious about _why_ there is hype around MCP instead of LangChain, LangGrah, SmolAgents, LlamaIndex, etc.
The VCs that were invested in AI companies (like Cursor) would of course need to hype up something like MCP to get us to build the tooling, since little of it existed.
Cursor already makes $100M+. So why not get behind and integrate this chosen standard to make even more money with more MCP servers.
The last ingredient is to hype it all up on the internet to get everyone building.
A win for the VCs regardless even though it was suspiciously orchestrated as soon as Cursor integrated it.
taude
There's a reason there are MCP plugins for LangChain. Some companies will need massively customized workflows that LangChain is appropriate for, where it only needs to dig into a couple of potentially publicly accessible MCPs for things.
I could see a future where companies have their developer portal where they now have their APIs documented, pretty in Swagger, the samples, etc., but they'll (potentially) also have an MCP endpoint where they're safely exposing the data to an LLM. Your LangChain node step to get context could call out to some of these hosted/shared MCPs where you do standard stuff, like post to a Slack channel, grab some data from an SFDC instance, etc....
rvz
> But MCP also feels a bit overrated:
VCs invested in AI and agentic companies needed a way to get you guys to accelerate agents this year.
So why not "create" artificial hype for MCPs on the internet, since there were little to no MCP servers for LLMs to use despite it being several months old (November 2024) until Cursor integrated it.
This is the true reason why you see them screaming about MCPs everywhere.
fkyoureadthedoc
So the main criticism is a borderline conspiracy theory about VC's creating artificial hype for it?
jtrn
I hoped OpenAI would support OpenAPI for connecting to tools. Having created a couple of MCP servers, it feels like a less flexible and worse documented API to me. I can’t really see anything that is made better by MCP over OpenAPI. It’s a little bit less code for a lot less options. Give it some time and it will also get Swagger built in.
It’s solving a problem that was already robustly solved. So here we go with another standard.
dartos
> I can’t really see anything that is made better by MCP over OpenAPI
Well it’s transport agnostic, for one.
I think a big part of it is defining a stateful connection and codifying the concepts of prompts and tools.
Another issue with OpenAPI / swagger is that you still need to create a client per API, but with MCP it’s all uniform.
slt2021
The transport argument is irrelevant when HTTP can run over both TCP and UDP, and even if you want to use just one protocol, it can be proxied/tunneled via another protocol transparently via VPN/wireguard/proxy.
So the transport protocol is moot, especially from a latency perspective when most time is spent doing AI inference (seconds) rather than passing packets (milliseconds).
I really wish OpenAI just embraced OpenAPI and that would have instantly gained millions of existing sites available to ChatGPT with zero code change.
sethaurus
It's transport-agnostic in the sense that it works locally (over stdout), not just remotely. It's not just for web-services. That's why HTTP isn't baked into this.
nsonha
I feel like we should have transport-agnostic RPC by now, gRPC? And MCP is stateless too. And you don't have to create a client per API; it's up to the implementation.
dartos
I think MCP is a layer higher than gRPC. MCP can be implemented on gRPC.
MCP is definitely not stateless. It’s explicitly stateful…
See the 2nd bullet where it says “stateful connections” [1]
And I was saying that with OpenAPI, you need a client per API, or at least a series of http api calls.
1. https://spec.modelcontextprotocol.io/specification/2025-03-2...
Spivak
Are we reading the same documents? MCP implementations are all websockets, so it's not really transport agnostic. They try to talk about how you can use other transports, but it's just that it's JSON-RPC and if you're willing to code both ends you can do whatever… which is always true. And MCP is explicitly a stateful protocol [1].
https://spec.modelcontextprotocol.io/specification/2025-03-2...
* JSON-RPC message format
* Stateful connections
* Server and client capability negotiation
There's a draft about resolving this, maybe:
https://github.com/modelcontextprotocol/specification/pull/2...
PeterStuer
Whatever the current state, if everyone throws their shoulders behind a common interface, we all win.
taude
There's already a Swagger MCP service out there; I don't know how production-ready it is, but I saw something the other day when searching through GitHub. One of several implementations: https://github.com/dcolley/swagger-mcp
bjtitus
Emcee is a good implementation of this.
OrangeMusic
GraphQL looks like an even better choice - it really looks like they re-invented the wheel here.
samchon
Can't I do function calling in OpenAPI? I also feel like MCP is reinventing the wheel.
I have been converting OpenAPI documents into function calling schemas and doing tool calling since function calling first came out in 2023, but it's not easy to recreate a backend server to fit MCP.
Also, these days I'm building a compiler-driven framework specialized for function calling, but I'm a little cautious about whether MCP will support it. It enables zero-cost tool calling for TypeScript classes based on the compiler, and it also supports OpenAPI.
However, in the case of MCP, to fit it into this compiler-driven philosophy I'd need to create a backend framework for MCP development first, or create an add-on library for a famous framework like NestJS. I can do the development, but there's so much more to do compared to OpenAPI tool calling, so I'm a bit hesitant.
prometheon1
How should an MCP server like git work in your opinion? Should it then be written as a FastAPI server so that you have an openapi spec instead of just a CLI?
https://github.com/modelcontextprotocol/servers/blob/main/sr...
F7F7F7
You’re way over thinking the use cases for MCPs. That tells me you should stick to functions.
mkagenius
> That tells me you should stick to functions.
probably we all should ¯\_(ツ)_/¯
gronky_
They’re all in. They announced they’ll add support for it in the desktop app and the API in the coming months: https://x.com/OpenAIDevs/status/1904957755829481737
emmanueloga_
I'm surprised this was announced in a random tweet instead of a blog post with a release roadmap or something like that.
swyx
because its a lil embarrassing oai didnt come up with it
simonw
"Think of MCP like a USB-C port for AI applications."
Given the enormous amounts of pain I've heard are involved in actually implementing any form of USB, I think the MCP community may want to find a different analogy!
PufPufPuf
Since I've worked with the official MCP SDK, I find this analogy quite accurate
jauntywundrkind
I assume they're (for now at least) targeting the old HTTP+SSE version of MCP, and not the new Streaming HTTP version? https://github.com/modelcontextprotocol/specification/pull/2...
There's some other goodies too. OAuth 2.1 support, JSON-RPC Batching... https://github.com/modelcontextprotocol/specification/blob/m...
nomilk
What are people using MCPs for? I search on youtube and see a lot of videos explaining how MCPs work, but none showing practical uses for a programmer (aside from getting the weather via cursor).
antoinec
I have two that I use a lot:
- a Postgres one connected to my local db
- a browser (Playwright)

That way I can ask Cursor something like: "Find an object with xxx property in the db, open its page in the browser, fix console errors." And it's able to query my db, find the relevant object, open the browser, check logs, and fix the code based on errors.
Even simpler stuff:
- copy-pasting a broken local url into Cursor and asking it to fix all console errors. Works really well.
- or when you have a complex schema and need to find some specific kind of record, you can just ask Cursor "find me a user that has X transactions, matches Y condition, etc.". I found it much faster than me at finding relevant records.
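For anyone wanting to replicate this, the wiring is just one entry per server in Cursor's MCP config (the package names and connection string below are assumptions; substitute whatever servers you actually use):

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://localhost:5432/mydb"
      ]
    },
    "playwright": {
      "command": "npx",
      "args": ["-y", "@executeautomation/playwright-mcp-server"]
    }
  }
}
```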
consumer451
I only use one regularly, but I use it a lot. (Supabase)
Example use case: https://news.ycombinator.com/item?id=43466434
nomilk
I think I’m starting to see the potential.. So I could MCP Cursor to my local Postgres (or, after some practice to build confidence, prod Postgres). Then use Cursor to help debug.
This sounds risky but very useful if it works as intended!
georgeashworth
Yes, I have used that workflow to fix a tricky data corruption issue earlier this week.
A buggy feature left the DB in an invalid state. I described the issue to Claude + Postgres MCP, which both queried the DB to analyze the problem and then generated SQL scripts to fix it, plus validation and rollback scripts. Easy enough to do without the tooling... but with the tooling, it took probably a quarter or less of the time.
Xelynega
Is this not just hiding the "complexity" of SQL behind an LLM and hoping for the best?
If you know SQL and know what you're trying to find, what can the LLM do quickly that you couldn't do by just constructing a query to get what you want?
Alternatively if you don't know SQL, aren't you never going to learn it if every opportunity you have you bust out an LLM and hope for the best?
consumer451
As far as risk goes, I cannot imagine using an MCP with write capabilities to a DB; the Supabase one is read-only, which is perfect.
knowaveragejoe
I mainly use simple integrations:
- fetch, essentially curl any webpage or endpoint
- filesystem, read(and occasionally write) to local disk
- mcp-perplexity, so I can essentially have Claude sample from perplexity's various models and/or use it for web search. Somewhat superseded by Claude's own new web search capability, but perplexity's is generally better.
johnjungles
If you want to try out mcp (model context protocol) with little to no setup:
I built https://skeet.build/mcp where anyone can try out mcp for cursor and now OpenAI agents!
We did this because of a pain point I experienced as an engineer: having to deal with crummy MCP setup, lack of support, and the complexity of trying to stand up your own.
Mostly for workflows like:
* start a PR with a summary of what I just did
* slack or comment to Linear/Jira with a summary of what I pushed
* pull this issue from Sentry and fix it
* find a bug and create a Linear issue to fix it
* pull this Linear issue and do a first pass
* pull in this Notion doc with a PRD, then create an API reference for it based on this code
* Postgres or MySQL schemas for rapid model development

Everyone seems to go for the hype, but ease of use, practical pragmatic developer workflows, and high-quality polished MCP servers are what we’re focused on.
Lmk what you think!
polishdude20
This looks super cool and I can see myself using this!
Although the name is a bit unfortunate.
pfista
my favorite is implementing issues from linear
rgomez
Shamelessly promoting in here: I created an architecture that allows an AI agent to have those so-called "tools" available locally (under the user's control); it works with any kind of LLM, and with any kind of LLM server (in theory). I've been showing demos of it for months now. It works as middleware, in-stream, between the LLM server and the chat client, and it works very well. The project is open source; the repo is outdated, but simply because no one has expressed interest in looking into the code. Here is the repo: https://github.com/khromalabs/Ainara. There's a link to a video in there. Just yesterday I recorded a video showcasing DeepSeek V3 as the LLM backend (but it could be any model from OpenAI as well, or Anthropic, whatever).
nomel
The lack of interest may be from the crypto aspect:
> While the project will always remain open-source and aims to be a universal AI assistant tool, the officially developed 'skills' and 'recipes' (allowing AI to interact with the external world through Ainara's Orakle server) will primarily focus on cryptocurrency integrations. The project's official token will serve as the payment method for all related services.
rgomez
Thank you for the feedback... actually, I need to update that: the crypto part of my project will be closed source (a specific remote server), but the idea behind the project itself has been universal and open since the very beginning. I already developed dozens of skills, including a meta-search engine (searches several engines at once and combines the results dynamically, all balanced by the AI), which are open source as well. Crypto just showed itself as a way of funding the project, with no strings attached, and to this very day no one else has shown up.
Today MCP added Streamable HTTP [0], which is a huge step forward, as it doesn't require an "always-on" connection to remote HTTP servers.
However, if you look at the specification it's clear bringing the LSP-style paradigm to remote HTTP servers is adding a bunch of extra complexity. This is a tool call, for example:
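```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "location": "New York" }
  }
}
```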
Which traditionally would just be an HTTP POST to `/get_weather` with `{ "location": "New York" }`.

I've made the suggestion to remove some of this complexity [1] and fall back to just a traditional HTTP server, where a session can be negotiated with an `Authorization` header and we rely on traditional endpoints / OpenAPI + JSON Schema endpoint definitions. I think it would make server construction a lot easier, and web frameworks would not have to be materially updated to adhere to the spec -- perhaps just adding a single endpoint.
[0] https://spec.modelcontextprotocol.io/specification/2025-03-2...
[1] https://github.com/modelcontextprotocol/specification/issues...