Cloudflare Sandbox SDK

26 comments

·October 16, 2025

simonw

Looks like there's one feature missing from this that I care about: I'd like more finely grained control over what outbound internet connections code running on the box can make.

As far as I can tell it's all or nothing right now:

  this.ctx.container.start({
    enableInternet: false,
  });

I want to run untrusted code (from users or LLMs) in these containers, and I'd like to prevent someone malicious from using my container to launch attacks against other sites.

As such, I'd like to be able to allow-list just specific network endpoints. Maybe I'm OK with the container talking to an API I provide but not to the world at large. Or perhaps I'm OK with it fetching data from npm and PyPI but I don't want it to be able to access anything else (a common pattern these days, e.g. Claude's Code Interpreter does this).

ashishbijlani

I’m extending the Packj sandbox for agentic code execution [1]. You can specify an allowlist for network/fs.

1. https://github.com/ossillate-inc/packj/blob/main/packj/sandb...

paxys

This simple feature bumps up the complexity of such a firewall by several orders of magnitude, which is why no similar runtime (like Deno) offers it.

Networking as a whole can easily be controlled by the OS or any intermediate layer. For controlling access to specific sites you need to either filter it at the DNS level, which can be trivially bypassed, or bake something into the application binary itself. But if you are enabling untrusted code and giving that code access to a TCP channel then it is effectively impossible to restrict what it can or cannot access.

simonw

The most convincing implementation I've seen of this so far is to lock down access to just a single IP address, then run an HTTP proxy server at that IP address which can control what sites can be proxied to.

Then inject HTTP_PROXY and HTTPS_PROXY environment variables so tools running in the sandbox know what to use.
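To make the proxy idea concrete, here is a minimal sketch of the allow/deny decision such a proxy might make for each requested target. Everything in it is an assumption for illustration: the function names, the proxy address, and the wildcard syntax are made up and do not come from the Sandbox SDK or any particular proxy. The actual enforcement still comes from locking the sandbox's network down to the proxy's IP; the environment variables only tell well-behaved tools where to go.

```typescript
// Hypothetical allowlist check for a filtering HTTP proxy.
// All names here are illustrative, not part of any real SDK.

/** Split a CONNECT-style target such as "pypi.org:443" into host and port. */
function parseTarget(target: string): { host: string; port: number } | null {
  const idx = target.lastIndexOf(":");
  if (idx <= 0) return null; // no port, or empty host
  const host = target.slice(0, idx).toLowerCase();
  const port = Number(target.slice(idx + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) return null;
  return { host, port };
}

/** True if host equals an entry, or matches a "*.example.com" wildcard entry. */
function isHostAllowed(host: string, allowlist: readonly string[]): boolean {
  const h = host.toLowerCase();
  return allowlist.some((entry) => {
    const e = entry.toLowerCase();
    if (e.startsWith("*.")) {
      const suffix = e.slice(1); // ".example.com"
      // Matches "a.example.com", but not "example.com" or "evilexample.com".
      return h.endsWith(suffix) && h.length > suffix.length;
    }
    return h === e;
  });
}

/** Deny anything that cannot be parsed or is not on the allowlist. */
function decide(target: string, allowlist: readonly string[]): "allow" | "deny" {
  const parsed = parseTarget(target);
  if (parsed === null) return "deny";
  return isHostAllowed(parsed.host, allowlist) ? "allow" : "deny";
}

// Environment the sandboxed process would be started with.
// The address is a placeholder for the single IP the sandbox may reach.
const proxyEnv = {
  HTTP_PROXY: "http://10.0.0.2:3128",
  HTTPS_PROXY: "http://10.0.0.2:3128",
};

const allowlist = ["registry.npmjs.org", "pypi.org", "*.pythonhosted.org"];

console.log(proxyEnv, decide("pypi.org:443", allowlist));
```

Defaulting to "deny" on anything unparseable keeps the failure mode closed, which seems like the safer choice when the code behind the proxy is untrusted.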

navanchauhan

At least on macOS, there is a third way where you can control the network connection on the PID/binary level by setting up a network system extension and then setting up a content filter so you can allow/deny requests. It is pretty trivial to set this up, but the real challenge is usually in how you want to express your rules.

Little Snitch does this pretty well: https://www.obdev.at/products/littlesnitch/index.html

alooPotato

There is an open question about how file persistence works.

The docs claim they persist the filesystem even when they move the container to an idle state, but it's unclear exactly what that means - https://github.com/cloudflare/sandbox-sdk/issues/102

jasonriddle

This looks interesting.

Instead of having to code this up using TypeScript, is there an MCP server or API endpoint I can use? Basically, I want to connect an MCP server to an agent and tell it that it can run TypeScript code in order to solve a problem or verify something.

whoiskatrin

If anyone is curious, more details on our SDK can be found here actually https://github.com/cloudflare/sandbox-sdk

_pdp_

Looks nice.

We rolled out our own that does pretty much the same thing, but perhaps more, because our solution can also mount persistent storage that can be carried between multiple runners. It does take 1-5 seconds to boot the environment (Firecracker VMs). If this sandbox is faster I will instruct the team to consider it for fast startup.

This is also very similar to Vercel's sandbox thing. The same technology?

What I don't like about this approach is the github repo bootstrap setup. Is it more convenient compared to docker images pushed to some registry? Perhaps. But docker benefits from having all the artefacts prebuilt in advance, which in our case is quite a bit.

ATechGuy

> It does take 1-5 seconds to boot the environment (firecracker vms).

I'd say 1-5 secs is fast. Curious to know what use cases require faster boot up, and today suffer from this latency?

_pdp_

When your agent performs 20 tasks saving seconds here and there becomes a very big deal. I cannot even begin to describe how much time we've spent on optimising code paths to make the overall execution fast.

Last week I was on a call with a customer. They were running OpenAI side-by-side with our solution. I was pleased that we managed to fulfil the request in under a minute while OpenAI took 4.5 minutes.

The LLM is not the biggest contributor to latency in my opinion.

ATechGuy

Thanks! While I agree with you on "saving seconds" and overall latency argument, according to my understanding, most agentic use cases are asynchronous and VM boot up time may just be a tiny fraction of overall task execution time (e.g., deep research and similar long running tasks in the background).

_pdp_

I browsed through the docs but it does not seem to be possible to auto-destroy a sandbox after a certain amount of idle time. This forces whoever is implementing this to do their own cleanup. It is kind of a missed opportunity if you ask me, as this is a big pain. It is sold as fire and forget, but it seems that more serious workflows will also require a lot of supporting infrastructure.

alooPotato

You can easily set an alarm in the durable object to check if it should be killed and then call destroy yourself. Just a couple lines of code.
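A rough sketch of what that idle check could look like follows. The `AlarmScheduler` and `Destroyable` interfaces are hypothetical stand-ins for the Durable Object's alarm storage and the sandbox handle, and the function names are made up; the real SDK types and method names should be checked before relying on this. Only the decision logic is meant literally.

```typescript
// Sketch only: the two interfaces below are assumed stand-ins,
// not the actual Durable Object or Sandbox SDK types.
interface AlarmScheduler {
  setAlarm(atMs: number): Promise<void>;
}
interface Destroyable {
  destroy(): Promise<void>;
}

type IdleDecision =
  | { kind: "destroy" }
  | { kind: "recheck"; at: number };

/** Decide what to do when the alarm fires, given the last recorded activity. */
function decideOnAlarm(
  lastActivityMs: number,
  nowMs: number,
  idleTimeoutMs: number,
): IdleDecision {
  const idleForMs = nowMs - lastActivityMs;
  if (idleForMs >= idleTimeoutMs) {
    return { kind: "destroy" };
  }
  // Activity happened since the alarm was set: check again when the
  // timeout would expire relative to that latest activity.
  return { kind: "recheck", at: lastActivityMs + idleTimeoutMs };
}

/** Would be called from the object's alarm handler. */
async function onAlarm(
  lastActivityMs: number,
  idleTimeoutMs: number,
  scheduler: AlarmScheduler,
  sandbox: Destroyable,
  nowMs: number = Date.now(),
): Promise<IdleDecision> {
  const decision = decideOnAlarm(lastActivityMs, nowMs, idleTimeoutMs);
  if (decision.kind === "destroy") {
    await sandbox.destroy();
  } else {
    await scheduler.setAlarm(decision.at);
  }
  return decision;
}
```

The object would also need to record a last-activity timestamp on every request and schedule the first alarm when the sandbox starts; that bookkeeping is left out here.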

_pdp_

Nice. Thanks for the tip. I did not know that this was a thing. I will look it up.

eis

Cloudflare Containers (and therefore Sandbox) pricing is way too expensive. The pricing is also a bit cumbersome to understand: the units are inconsistent with the pricing of other Cloudflare products, and it is split between memory, CPU and disk instead of combined per instance. The worst part is that it is given in these tiny fractions per second.

Memory: $0.0000025 per additional GiB-second
vCPU: $0.000020 per additional vCPU-second
Disk: $0.00000007 per additional GB-second

The smaller instance types have super low processing power by getting a fraction of a vCPU. But if you calculate the monthly cost then it comes to:

Memory: $6.48 per GB
vCPU: $51.84 per vCPU (!!!)
Disk: $0.18 per GB

These prices are more expensive than the already expensive prices of the big cloud providers. For example a t2d-standard-2 on GCP with 2 vCPUs and 8GB with 16GB storage would cost $63.28 per month while the standard-3 instance on CF would cost a whopping $51.84 + $103.68 + $2.90 = $158.42, about 2.5x the price.
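For anyone who wants to check the arithmetic, here is a small sketch that reproduces the monthly figures from the per-second rates quoted above. It assumes a 30-day month, an instance that is billed for every second of that month, and no included allowance; the 2 vCPU / 8 GiB / 16 GB spec is inferred from the figures in this comment rather than taken from Cloudflare's documentation.

```typescript
// Assumption: 30-day month, instance running (and billed) the whole time.
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60; // 2,592,000

// Per-second rates as quoted in the comment above (USD).
const ratePerSecond = {
  memoryGiB: 0.0000025,
  vcpu: 0.00002,
  diskGB: 0.00000007,
};

interface InstanceSpec {
  vcpus: number;
  memoryGiB: number;
  diskGB: number;
}

/** Monthly cost breakdown in USD for an always-on instance. */
function monthlyCost(spec: InstanceSpec) {
  const memory = spec.memoryGiB * ratePerSecond.memoryGiB * SECONDS_PER_MONTH;
  const vcpu = spec.vcpus * ratePerSecond.vcpu * SECONDS_PER_MONTH;
  const disk = spec.diskGB * ratePerSecond.diskGB * SECONDS_PER_MONTH;
  return { memory, vcpu, disk, total: memory + vcpu + disk };
}

// Spec inferred from the comment's numbers: 2 vCPUs, 8 GiB memory, 16 GB disk.
const cf = monthlyCost({ vcpus: 2, memoryGiB: 8, diskGB: 16 });

// GCP figure taken as-is from the comment, not recomputed here.
const gcpMonthly = 63.28;
const ratio = cf.total / gcpMonthly;

console.log(cf.total.toFixed(2), ratio.toFixed(2));
```

The always-on assumption matters: a container that sleeps most of the time would be billed for far fewer seconds, so this is the upper bound rather than the typical bill.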

Cloudflare Containers also don't have persistent storage and are by design intended to shut down when not in use. But then I could also go for a spot VM on GCP, which would bring the price down to $9.27. That is less than 6% of the CF container cost, and I get persistent storage plus a ton of other features on top.

What am I missing?

ATechGuy

Startups would build on big tech, so are likely to add their margins. Have you looked into (bulk) discounts from GCP/AWS?

fishmicrowaver

Is there some sort of competition for awful looking websites going on?

fidotron

This bizarre anti-aesthetic has been pushed in the web devex space for a few years now to appeal to other web devex companies.

Svoka

I thought it was cute and easy to read.

fishmicrowaver

They didn't test it with FF apparently.

sim0n

Looks perfectly fine in FF 144.0 on Mac OS.

fidotron

Does this relate to workerd in any way or is it something else entirely?

ChrisArchitect

These CF website relaunches are just that right? Workers last week (https://workers.cloudflare.com) and now this one yesterday. I mean, if CF has something newsworthy here they should do a blog post announcing it because otherwise it's just a refreshed website. It's hard to tell if there's anything new here.

It's the same SDK stuff from earlier this year right? https://developers.cloudflare.com/changelog/2025-06-24-annou...

whoiskatrin

it barely had any features then, this version is full of new functionality: streaming logs, long running processes, code interpreter and lots of other things and full docs site as well

nickandbro

Amazing, I can use this immediately for my vim site at: https://vimgolf.ai

Great work CF building off of the containers platform and making it more tailored towards a common problem.