
LM Studio is now an MCP Host

47 comments · June 25, 2025

chisleu

Just ordered a $12k mac studio w/ 512GB of integrated RAM.

Can't wait for it to arrive and crank up LM Studio. It's literally the first install. I'm going to download it with Safari.

LM Studio is newish, and its interface isn't perfect yet, but it's fantastic at what it does, which is bringing local LLMs to the masses without them having to know much.

There is another project that people should be aware of: https://github.com/exo-explore/exo

Exo is this radically cool tool that automatically clusters all hosts on your network running Exo and uses their combined GPUs for increased throughput.

As with HPC environments, you're going to want ultra-fast interconnects, but it's all IP-based.

zackify

I love LM Studio, but I'd never spend $12k like that. The memory bandwidth is too low, trust me.

Get the RTX Pro 6000 for $8.5k with double the bandwidth. It will be way better.

dchest

I'm using it on MacBook Air M1 / 8 GB RAM with Qwen3-4B to generate summaries and tags for my vibe-coded Bloomberg Terminal-style RSS reader :-) It works fine (the laptop gets hot and slow, but fine).

Probably should just use llama.cpp server/ollama and not waste a gig of memory on Electron, but I like GUIs.

minimaxir

8 GB of RAM with local LLMs in general is iffy: an 8-bit quantized Qwen3-4B is 4.2GB on disk and likely more in memory. 16 GB is usually the minimum to run decent models without resorting to heavy quantization.
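
As a rough illustration of that sizing, here's a back-of-the-envelope sketch: weights at bits-per-weight over 8 bytes each, plus an assumed flat allowance for KV cache and runtime overhead (these are assumptions, not measurements).

```python
# Back-of-the-envelope estimate of RAM needed for a quantized local model.
# The 1.5 GB overhead allowance for KV cache/runtime is an assumption.
def model_memory_gb(params_billion: float, bits_per_weight: float, overhead_gb: float = 1.5) -> float:
    weights_gb = params_billion * bits_per_weight / 8  # e.g. 4B params at 8-bit ~= 4 GB of weights
    return weights_gb + overhead_gb

for bits in (8, 4):
    print(f"4B model at {bits}-bit: ~{model_memory_gb(4, bits):.1f} GB")
```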

imranq

I'd love to host my own LLMs, but I keep getting held back by the quality and affordability of cloud LLMs. Why go local unless there's private data involved?

noman-land

I love LM Studio. It's a great tool. I'm waiting for another generation of Macbook Pros to do as you did :).

prettyblocks

I've been using OpenWebUI and am pretty happy with it. Why do you like LM Studio more?

prophesi

Not OP, but with LM Studio I get a chat interface out-of-the-box for local models, while with openwebui I'd need to configure it to point to an OpenAI API-compatible server (like LM Studio). It can also help determine which models will work well with your hardware.

LM Studio isn't FOSS though.

I did enjoy hooking up OpenWebUI to Firefox's experimental AI Chatbot (set browser.ml.chat.hideLocalhost to false and browser.ml.chat.provider to localhost:${openwebui-port}).

s1mplicissimus

I recently tried OpenWebUI, but it was so painful to get it running with a local model. The "first run experience" of LM Studio is pretty fire in comparison. Can't really speak to actually working with it yet, though; still waiting for the 8GB download.

truemotive

Open WebUI can leverage the built-in web server in LM Studio, just FYI in case you thought it was primarily a chat interface.

incognito124

> I'm going to download it with Safari

Oof you were NOT joking

noman-land

Safari to download LM Studio. LM Studio to download models. Models to download Firefox.

teaearlgraycold

The modern ninite

karmakaze

Nice. Ironically well suited for non-Apple Intelligence.

politelemon

The initial experience with LM Studio and MCP doesn't seem great; I think their docs could do with a happy-path demo for newcomers.

Upon installing, the first model offered is google/gemma-3-12b, which in fairness is pretty decent compared to others.

It's not obvious how to show the right sidebar they're talking about; it's the flask icon, which turns into a collapse icon when you click it.

I set MCP up with Playwright, asked it to read the top headline from HN, and it got stuck in an infinite loop of navigating to Hacker News but doing nothing with the output.

I wanted to try it out with a few other models, but figuring out how to download new models isn't obvious either; it turned out to be the search icon. Anyway, other models didn't fare much better; some outright ignored the tools despite supposedly supporting 'tool use'.

b0a04gl

Claude going MCP over remote kind of normalised the protocol for inference routing. Now with LM Studio running as a local MCP host, you can just tunnel it (cloudflared/ngrok), drop in a tiny gateway script, and boom, your laptop basically acts like an MCP node in a hybrid mesh. Short prompts hit Qwen locally, heavier ones go to Claude. With the same payload and interface we can actually get multi-host local inference clusters wired together by MCP; a rough sketch of that routing idea is below.
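
A minimal, hypothetical sketch of such a gateway script, assuming LM Studio's OpenAI-compatible server on its default port 1234, a placeholder remote endpoint, example model names, and a crude length-based routing rule:

```python
# Hypothetical gateway: route short prompts to a local LM Studio server,
# heavier ones to a remote OpenAI-compatible endpoint. The remote URL,
# model names, and the 500-character threshold are assumptions.
import os
import requests

LOCAL_URL = "http://localhost:1234/v1/chat/completions"            # LM Studio default port
REMOTE_URL = "https://example-remote-gateway/v1/chat/completions"  # placeholder endpoint

def route(prompt: str) -> str:
    """Crude length-based routing between a local and a remote backend."""
    use_local = len(prompt) < 500
    url = LOCAL_URL if use_local else REMOTE_URL
    headers = {} if use_local else {"Authorization": f"Bearer {os.environ['REMOTE_API_KEY']}"}
    payload = {
        "model": "qwen/qwen3-4b" if use_local else "remote-large-model",  # example model ids
        "messages": [{"role": "user", "content": prompt}],
    }
    resp = requests.post(url, json=payload, headers=headers, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(route("Summarize MCP in one sentence."))
```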

patates

What models are you using on LM Studio for what task and with how much memory?

I have a 48GB MacBook Pro, and Gemma 3 (one of the abliterated ones) fits my non-code use case perfectly (generating crime stories where the reader tries to guess the killer).

For code, I still call Google to use Gemini.

robbru

I've been using the Google Gemma QAT models in 4B, 12B, and 27B with LM Studio on my M1 Max. https://huggingface.co/lmstudio-community/gemma-3-12B-it-qat...

visiondude

LM Studio works surprisingly well on an M3 Ultra with 64GB, running 27B models.

Nice to have a local option, especially for some prompts.

minimaxir

LM Studio has quickly become the best way to run local LLMs on an Apple Silicon Mac: no offense to vllm/ollama and other terminal-based approaches, but LLMs have many levers for tweaking output and sometimes you need a UI to manage it. Now that LM Studio supports MLX models, it's one of the most efficient too.

I'm not bullish on MCP, but at the least this approach gives a good way to experiment with it for free.
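
For instance, once the local server is running, those output-tweaking levers can also be set programmatically. A small sketch against LM Studio's OpenAI-compatible endpoint; the port (1234 by default) and the model id are assumptions, use whatever you have loaded:

```python
# Sketch: tweaking sampling parameters against LM Studio's local
# OpenAI-compatible server. Port and model id are example values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    model="mlx-community/qwen3-4b",  # example MLX model id
    messages=[{"role": "user", "content": "Give me three tag ideas for an RSS item."}],
    temperature=0.4,   # lower = more deterministic output
    top_p=0.9,
    max_tokens=256,
)
print(resp.choices[0].message.content)
```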

zackify

Ollama doesn't even have a way to customize the context size per model and persist it. LM Studio does :)

Anaphylaxis

This isn't true. You can `ollama run {model}`, `/set parameter num_ctx {ctx}`, and then `/save`. It's recommended to `/save {model}:{ctx}` so it persists across model updates.
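
For what it's worth, the context size can also be set per request through Ollama's local HTTP API. A small sketch; the model name and context value are just example choices:

```python
# Sketch: setting num_ctx per request via Ollama's local HTTP API
# (default port 11434). Model name and context size are example values.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen3:4b",
        "messages": [{"role": "user", "content": "hello"}],
        "options": {"num_ctx": 8192},  # per-request context window
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```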

chisleu

> I'm not bullish on MCP

You gotta help me out. What do you see holding it back?

minimaxir

tl;dr: the current hype around it is a solution looking for a problem, and at a high level it's just a rebrand of the Tools paradigm.

mhast

It's "Tools as a service", so it's really trying to make tool calling easier to use.

pzo

I just wish they did some facelifting of the UI. Right now it's too colorful for me, with many different shades of similar colors. I wish they'd copy a color palette from Google AI Studio, or from Trae or PyCharm.

nix0n

LM Studio is quite good on Windows with Nvidia RTX also.

squanchingio

It'd be nice to have the MCP servers exposed through LM Studio's OpenAI-like endpoints.

api

I wish LM Studio had a pure daemon mode. It's better than Ollama in a lot of ways, but I'd rather be able to use BoltAI as the UI, as well as use it from Zed, VS Code, and aider.

What I like about Ollama is that it provides a self-hosted AI provider that can be used by a variety of things. LM Studio has that too, but you have to have the whole big chonky Electron UI running. Its UI is powerful, but a lot less nice than e.g. BoltAI for casual use.

rhet0rica

Oh, that horrible Electron UI. Under Windows it pegs a core on my CPU at all times!

If you're just working as a single user via the OpenAI protocol, you might want to consider koboldcpp. It bundles a GUI launcher, then starts in text-only mode. You can also tell it to just run a saved configuration, bypassing the GUI; I've successfully run it as a system service on Windows using nssm.

https://github.com/LostRuins/koboldcpp/releases

Though there are a lot of roleplay-centric gimmicks in its feature set, its context-shifting feature is singular. It caches the intermediate state used by your last query, extending it to build the next one. As a result you save on generation time with large contexts, and also any conversation that has been pushed out of the context window still indirectly influences the current exchange.

SparkyMcUnicorn

There's a "headless" checkbox in settings->developer

diggan

Still, you need to install and run the AppImage at least once to enable the "lms" CLI, which can be used afterwards. It would be nice to have a completely GUI-less installation/use method too.

gregorym

I use https://ollamac.com/ to run Ollama and it works great. It has MCP support also.

simonw

That's clearly your own product (it links to Koroworld in the footer and you've posted about that on Hacker News in the past).

Are you sharing any of your revenue from that $79 license fee with the https://ollama.com/ project that your app builds on top of?