
Notes on the new Claude analysis JavaScript code execution tool

animal_spirits

That's an interesting idea: generate JavaScript and execute it client side rather than server side. I'm sure that saves Anthropic a ton of money by not having to spin up a server for each execution.

stanleydrew

Also means you're not having to do a bunch of isolation work to make the server-side execution environment safe.

simonw

I've been trying for a while now to figure out the right pattern for running untrusted JavaScript code in a browser sandbox that's controlled by a page, and it looks like Anthropic has figured that out. Hoping someone can reverse engineer exactly how they are doing this - their JavaScript code is too obfuscated for me to dig out the tricks, sadly.

spankalee

The key is running the untrusted code in a cross-origin iframe so you can rely on the same-origin policies and `sandbox`[1].

You can control the code in a number of ways - loading a trusted shim that sets up a postMessage handler is pretty common. If you're careful, you can do that in a way that untrusted code can't forge messages to look like they're from the trusted code.

Another way is to use two iframes to the untrusted origin. One only loads untrusted code, the other loads a control API that talks to the trusted code. You can then do the loading into the iframe with a service worker. This is how the Playground Elements work (they're a set of web components that let you safely embed a mini IDE for code samples) https://github.com/google/playground-elements

[1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/if...
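A minimal sketch of the shim-plus-postMessage pattern described above, assuming a single `srcdoc` iframe with `sandbox="allow-scripts"` (no `allow-same-origin`), which gives the frame an opaque origin. The names `SHIM_HTML`, `isFromFrame` and `runUntrusted` are made up for illustration; this is not how Claude or Playground Elements actually do it.

```javascript
// Hypothetical sketch: all names here are illustrative, not from any real product.

// Trusted shim loaded into the sandboxed frame. It waits for code from the
// embedding page, runs it, and posts the outcome back.
const SHIM_HTML = `<!doctype html><script>
  addEventListener("message", (event) => {
    if (event.source !== parent) return; // only accept code from the embedder
    let reply;
    try {
      reply = { ok: true, value: new Function(event.data.code)() };
    } catch (err) {
      reply = { ok: false, error: String(err) };
    }
    parent.postMessage(reply, "*");
  });
</script>`;

// Parent-side check: a sandboxed frame without allow-same-origin reports its
// origin as the string "null", so the window identity is what really matters.
function isFromFrame(event, frameWindow) {
  return event.origin === "null" && event.source === frameWindow;
}

// Browser-only wiring: create the frame, send the code, resolve with the reply.
function runUntrusted(code) {
  return new Promise((resolve) => {
    const frame = document.createElement("iframe");
    frame.setAttribute("sandbox", "allow-scripts"); // deliberately no allow-same-origin
    frame.srcdoc = SHIM_HTML;
    const onMessage = (event) => {
      if (!isFromFrame(event, frame.contentWindow)) return;
      removeEventListener("message", onMessage);
      frame.remove();
      resolve(event.data);
    };
    addEventListener("message", onMessage);
    frame.addEventListener("load", () => {
      // An opaque origin cannot be named, so the target origin has to be "*".
      frame.contentWindow.postMessage({ code }, "*");
    });
    document.body.append(frame);
  });
}
```

Note that this sketch does not solve the forgery problem: the untrusted code shares a window with the shim and can call `parent.postMessage` itself, so the parent should treat every reply as untrusted data. Separating the two, for example with the two-iframe setup, is what addresses that.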

TimTheTinker

You should check out how Figma plugins work. They have blog posts on all the tradeoffs they considered.

What I believe they settled on was a JS interpreter compiled to WASM -- it can run arbitrary JS but with very well-defined and restricted interfaces to the outside world (the browser's JS runtime environment).

dartos

Isn’t that how all JavaScript code runs in a browser?

TheRealPomax

Isn't what how all JS runs in the browser? There are different restrictions based on where JS comes from, and what context it gets loaded into.

aabhay

What are the attack vectors for a web browser js environment to do malicious things? All browser code is sandboxed via origin controls, and process isolation. It can’t even open an iframe and read the contents of that iframe.

njtransit

The attack vectors are either some type of credential or account compromise. Generally, these attacks fall under the cross-site scripting (XSS) umbrella. The browser exposes certain things to the JS context based on the origin. E.g. if you log in to facebook.com, facebook.com might set an authentication cookie that can be accessed in the JS context. Additionally, all outbound requests to facebook.com will include this authentication cookie. So, if you can execute JS in the context of facebook.com, you could steal this cookie or have the browser perform malicious actions that get implicitly authenticated.
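To make that concrete, here is a toy model (not a browser API) of cookies being scoped to an origin and attached automatically. `ToyBrowser` and the cookie value are invented for illustration, and the model ignores real-world mitigations such as HttpOnly, SameSite and CORS.

```javascript
// Toy model only: a real browser is far more nuanced than this.
class ToyBrowser {
  constructor() {
    this.cookies = new Map(); // origin -> cookie string
  }

  // The site sets a cookie for its own origin (e.g. after login).
  setCookie(origin, cookie) {
    this.cookies.set(origin, cookie);
  }

  // What a script running in `scriptOrigin` would see via document.cookie.
  readCookie(scriptOrigin) {
    return this.cookies.get(scriptOrigin) ?? "";
  }

  // In this model, a request carries whatever cookie belongs to the target origin.
  request(url) {
    const origin = new URL(url).origin;
    return { url, cookieHeader: this.cookies.get(origin) ?? "" };
  }
}

const browser = new ToyBrowser();
browser.setCookie("https://facebook.com", "session=abc123");

// Script injected into facebook.com (XSS) runs with that origin's privileges:
const stolen = browser.readCookie("https://facebook.com");
const forged = browser.request("https://facebook.com/api/post");

// A script on some other origin sees nothing:
const blocked = browser.readCookie("https://evil.example");
```

The point of the model is that the origin, not the author of the script, decides what the code can read and which requests get authenticated.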

TimTheTinker

It's a fine place to run code trusted by the server (or code trusted by the client within the scope of the app).

But for code not trusted by either, it's bad -- user data in the app can be compromised/exfiltrated.

Hence for third-party plugins for a web app, the built-in JS runtime doesn't have sufficient trust management capability.

mritchie712

duckdb-wasm[0] would be a good addition here. We use it in Definite[1] and I can't say enough good things about duckdb in general.

0 - https://github.com/duckdb/duckdb-wasm

1 - https://www.definite.app/

refulgentis

Interesting. I'm curious: what about it helps here specifically?

Approaching it naively and undercaffeinated, it sounds abstract, as in it would help the way any code could benefit from a persistence layer / DB.

Also I'm curious if it would require a special one-off integration to make it work, or could it write JS that just imported the library?

thenaturalist

Funnily enough, I test code generation both on unpaid Claude and ChatGPT.

When working with Python, I've found Sonnet (pre 3.5) to be quite superior to ChatGPT (mostly 4, sometimes 3.5) with regards to verbosity, structure and prompt / instruct comprehension.

I've switched to a JavaScript project two weeks ago and the tables have turned.

Sonnet 3.5 is much more verbose and I need to make corrections a few times, whereas ChatGPT's output is shorter and on point.

I'll follow closely to see whether this improves now that Claude are focussing on JS themselves.

koolala

JavaScript is the perfect language for this. I can't wait for a sandboxed coding environment to totally set AI loose.

mlejva

Shameless plug here. We're building exactly this at E2B [0] (I'm the CEO). Sandboxed cloud environments for running AI-generated code. We're fully open-source [1] as well.

[0] https://e2b.dev

[1] https://github.com/e2b-dev

croes

They could run a little crypto miner to get more profit

willsmith72

This is a great step, but to me not very useful until they move out of context. Still, I'm high on Anthropic and happy gen AI didn't turn into a winner-take-all market like everyone predicted in 2021.