We hacked Gemini's Python sandbox and leaked its source code (at least some)

150 comments

·March 28, 2025

topsycatt

That's the system I work on! Please feel free to ask any questions. All opinions are my own and do not represent those of my employer.

ryao

I imagine you need to make and destroy sandboxed environments quite often. How fast does your code create a sandboxed environment?

Do you make the environments on demand or do you make them preemptively so that one is ready to go the moment that it is needed?

If you make them on demand, have you tested ZFS snapshots to see if it can be done even faster using zfs clone?

topsycatt

Sorry for the delay in replying!

We actually use gVisor (as stated in the article) and it has a very nifty feature called checkpoint_restore (https://gvisor.dev/docs/user_guide/checkpoint_restore/) which lets us start up sandboxes extremely efficiently. Then the filesystem is just a CoW overlay.

ryao

Thanks for the response. I had misread the article’s description of gVisor and mistook it as something meant to protect the rest of the system rather than something that handled the filesystem part of the sandbox. It is an interesting tool.

dullcrisp

What’s ZFS? That doesn’t sound like a Google internal tool I’ve ever heard of.

x-complexity

https://en.wikipedia.org/wiki/ZFS

It's a filesystem, to put it simply.

2OEH8eoCRo0

Oh boy. Get ready for the zealots

blixt

Seconding this. Also curious if this is done with microkernels (I put Unikraft high on the list of tech I'd use for this kind of problem, or possibly the still-in-beta CodeSandbox SDK – and maybe E2B or Fly but didn't have as good experiences with those).

luke-stanley

I use ZFS, but isn't the situation the sandbox is in totally different? Why would it be optimal?

ryao

If you are making sandboxes, you need to put the files in place each time. With ZFS clones, you can keep referencing the same files repeatedly, so the amount of changes to memory needed to create an environment are minimized. Let’s say the sandbox is 1GB and each clone operation does less than 1MB of memory writes. Then you have a >1000x reduction in writing needed to make the environment.

Furthermore, ZFS ARC should treat each read operation of the same files as reading the same thing, while a sandbox made the traditional way would treat the files as unique, since they would be full copies of each other rather than references. ZFS on the other hand should only need to keep a single copy of the files cached for all environments. This reduces memory requirements dramatically. Unfortunately, the driver has double caching on mmap()’ed reads, but the duplication will only be on the actual files accessed and the copies will be from memory rather than disk. A modified driver (e.g. OSv style) would be able to eliminate the double caching for mmap’ed reads, but that is a future enhancement.

In any case, ZFS clones should have clear advantages over the more obvious way of extracting a tarball every time you need to make a new sandbox for a Python execution environment.

RunningDroid

I believe they were referring to the use of ZFS snapshots for a Copy-on-Write type setup

hnuser123456

Is the interactive python sandbox incompatible with thinking models? It seems like I can only get the interactive sandbox by using 2.0 flash, not 2.0 flash thinking or 2.5 pro.

topsycatt

That's a good question! It's not incompatible, it's just a matter of getting the flow right. I can't comment too much on that process but I'm excited for the possibilities there.

hnuser123456

Oh, I see Gemini can run code as part of the thinking process. I suppose the sandbox that happens in was the target of this research, while code editing in Gemini Canvas just has a button to export to Colab for running. The screenshots in the research show a "run" button for generated code in the chat, but I'm not seeing that exact interface.

In any case, I share your excitement.

TechDebtDevin

Have you by chance read this paper: https://agent-gen.github.io/

wunderwuzzi23

That's cool. I did something similar in the early days with Google Bard when data visualization was added, which I believe was when the ability to run code got introduced.

One question I always had was what the user "grte" stands for...

Btw. here the tricks I used back then to scrape the file system:

https://embracethered.com/blog/posts/2024/exploring-google-b...

waych

The "runtime" is a google internal distribution of libc + binutils that is used for linking binaries within the monolithic repo, "google3".

This decoupling of system libraries from the OS itself is necessary because it otherwise becomes unmanageable to ensure "google3 binaries" remain runnable on both workstations and production servers. Workstations and servers each have their own Linux distributions, and each also needs to change over time.

saagarjha

Of course, this meant that some tools got stuck on some old glibc from like 2007.

flawn

It says in the article - Google Runtime Environment

jemfinch

grte is probably "google runtime environment", I would imagine.

fragmede

Do you think "hacked Gemini and leaked its source code" is an accurate representation of what happened here?

topsycatt

I'm on the Google side of the equation. I think the title is a bit sensationalized, but that's the author's prerogative.

devdudect

When are we going to be able to run sandboxed php code?

koakuma-chan

> but that's the author's prerogative

You submitted this.

enoughalready

Have you contemplated running the python code in a virtual environment in the browser?

seydor

you re the hacker or the google?

topsycatt

The google

larodi

"im the google" is definitely a top 3 chart synthpop song by ladytron .)

sans_souse

Can a Mod please change thread title to I'm The Google. AMA.

lugao

[flagged]

onemoresoop

Question: how does it feel inside google in terms of losing their lunch to OpenAi? Losing here is very loose, I don’t think OpenAI won yet but seems to have made a leap ahead of google in terms of marker share and we know google was sitting on tons of breakthroughs and research. Any panicking or internal discontent at google’s product policies? No need to answer if you’re uncomforable that your employer may hold you responsible for what you write here.

Mindwipe

Does anyone at Google care that you're trying to replace Assistant with this in the next few months and it can't set a timer yet?

(I mean it will tell you it's set a timer but it doesn't talk to the native clock app so nothing ever goes off if you navigate away from the window.)

hnuser123456

I doubt the guy working on the code sandbox can do anything about the overall resource allocation towards ensuring all legacy assistant features still work as well as they used to. That being said, I was trying to navigate out of an unexpected construction zone and asked google to navigate me home, and it repeatedly tried to open the map on my watch and lock my phone screen. I had to pull over and use my thumbs to start navigation the old fashioned way.

iury-sza

I keep reading people complaining about this but I can't understand why. Gemini can 100% set timers and with much more subtle hints than assistant ever could. It just works. I don't get why people say it can't.

It can also play music or turn on my smart lamps, change their colors etc. I can't remember doing any special configuration for it to do that either.

Pixel 9 pro

jdiff

I certainly can't get it to reliably play music on my Pixel 8. Mostly it summons YT Music, only occasionally do I get my music player, and sometimes I merely get "I'm an LLM, I can't help you with that."

And you used to be able to say "Find my phone" and it would chime and max screen brightness until found. Tried that with Gemini once, and it went on with very detailed instructions on using Google or Apple's Find My Device website (depending on what type of phone I owned), maybe calling it from another device if it's not silenced, or perhaps accepting that my device was lost or stolen if none of the above worked. Did find it during that lengthy attempt at being helpful though.

Another fun example, weather. When Gemini's in control, "What's the weather like tonight?" gets a short ramble about how weather depends on climate, with some examples of what the weather might be like broadly in Canada, Japan, or the United States at night.

Unlike Assistant where you could learn to adapt to its unique phrasing preferences, you just flat out can never reliably predict what Gemini's going to do. In exchange for higher peak performance, the floor dropped out the bottom.

dgunay

I dislike Google's (mis)management of Assistant as much as the next guy, but this just has not been my experience. I can tell Gemini on my phone to set timers and it works just fine.

ChadNauseam

I have a rooted pixel with a flashed custom android ROM, which should be a nightmare scenario for gemini, and it can set timers just fine (and the timers show up in the native clock app)

arebop

The Assistant can't reliably set timers either, though I guess 80% is considerably better than 0. Still, I think it used to be better back before Google caught a glimpse of a different squirrel to chase.

7bit

It can't do shit, especially in some EU countries, where it can do even less shit.

Setting timers reminders, calendar events. Nothing. If they kill the assistant, I'll go Apple, no matter how much I hate it.

GrayShade

Just tested, you need to enable "Gemini Apps", but they remember your interactions for 3, 18 or 36 months instead of 3 days.

nosrepa

I just want the assistant voice. I hate the Gemini ones.

whatevertrevor

I'm with you on that. I prefer a human trying to sound like a robot instead of a robot trying to sound human.

jwlake

Is there any reason it's not documented?

simonw

I've been using a similar trick to scrape the visible internal source code of ChatGPT Code Interpreter into a GitHub repository for a while now: https://github.com/simonw/scrape-openai-code-interpreter

It's mostly useful for tracking what Python packages are available (and what versions): https://github.com/simonw/scrape-openai-code-interpreter/blo...

Zopieux

Meanwhile they could just decide to publish this list in a document somewhere and keep it automatically up to date with their infra.

But not, secrecy for the sake of secrecy.

aleksiy123

Tbh I doubt this is secrecy.

More likely just noone has taken the time and effort to do it.

12345hn6789

What would the benefit of doing this be?

simonw

It's documentation. Makes it much easier for people to know what kind of problems they can solve using Code Interpreter.

It's a bit absurd that the best available documentation for that feature exists in my hacky scraped GitHub repository.

fudged71

I just used this package list (and sandbox limitations) to synthesize a taxonomy of capabilities: https://gist.github.com/trbielec/a00a58fa97a232bef8984cc8d01...

lqstuart

So by “we hacked Gemini and leaked its source code” you really mean “we played with Gemini with the help of Google’s security team and didn’t leak anything”

worldsavior

Sad that I didn't read this comment before reading this article.

parliament32

> resulting in the unintended inclusion of highly confidential internal protos in the wild

I don't think they're all that confidential if they're all on github: https://github.com/ezequielpereira/GAE-RCE/tree/master/proto...

saagarjha

I mean, those were also disclosed via a vulnerability.

Brian_K_White

But it still means they aren't guilty of leaking/disclosing them.

It's not a valid point of criticism. The escape did not in fact "result" in the leak of confidential photos. That already happened somewhere else. This only resulted in the republishing of something already public.

Or another way, it's not merely that they were already public elsewhere, the imortant point is that the photos were not given to the ai in confidence, and so re-publishing them did not violate a confidence, any more than say github did.

I'm no ai apologist btw. I say all of these ais are committing mass copyright violation a million times a second all day every day since years ago now.

saagarjha

I’m not criticizing them

tgtweak

The definition of hacking is getting pretty loose. This looks like the sandbox is doing exactly what it's supposed to do and nothing sensitive was exfiltrated...

bluelightning2k

Cool write up. Although it's not exactly a huge vulnerability. I guess it says a lot about how security conscious Google is that they consider this to be significant. (You did mention that you knew the company's specific policy considered this highly confidential so it does count but it feels a little more like "technically considered a vulnerability" rather than clearly one.)

jll29

Running the built-in "strings" command to extract a few file names from a binary is hardly hacking/cracking.

Ironically, though, getting the source code of Gemini perhaps wouln't be valuable at all; but if you had found/obtained access to the corpus that the model was pre-trained with, that would have been kind of interesting (many folks have many questions about that...).

dvt

> but if you had found/obtained access to the corpus that the model was pre-trained with, that would have been kind of interesting

Definitionally, that input gets compressed into the weights. Pretty sure there's a proof somewhere that shows LLM training is basically a one-way (lossy) compression, so there's no way to go back afaik?

jdiff

Not the original, but a lossy facsimile that's Good Enough for almost anything. And as the short history of LLMs and other nets has shown us, they're often not even all that lossy.

jeffbee

I guess these guys didn't notice that all of these proto descriptors, and many others, were leaked on github 7 years ago.

https://github.com/ezequielpereira/GAE-RCE/tree/master/proto...

theLiminator

It's actually pretty interesting that this shows that Google is quite secure, I feel like most companies would not fare nearly as well.

kccqzy

Yes and especially the article mentions "With the help of the Google Security Team" so it's quite collaborative and not exactly black box hacking.

commandersaki

Their "LLM bugSWAT" events, held in vibrant locales like Las Vegas, are a testament to their commitment to proactive security red teaming.

I don't understand why security conferences are attracted to Vegas. In my opinion its a pretty gross place to conduct any conference.

lmm

Excluding uptight scolds is a feature not a bug. There's a lot of overlap between people who find Vegas objectionable and people who find red teaming objectionable (because why would any decent person know attacking/exploiting techniques).

commandersaki

The irony is that Vegas takes a dim view of those that take advantage of their gaming venues. The institutions that run it are quite aggressive when it comes to being attacked.

Anyways, security conferences such as BSides run all over the world in various cities where red teaming type activities is embraced. IMO it'd be nice to diversify from Vegas, preferably places with more scenery/greenery like Boulder or something.

zem

relatively cheap event space and hotels. it's hard to find a city to host a large conference.

desmosxxx

What don't you understand. Vegas is literally built for conferences.

hashstring

Real, I feel the exact same way.

numbsafari

You answered your own question.

scudsworth

reinvent is in vegas

ein0p

They hacked the sandbox, and leaked nothing. The article is entertaining though.

kccqzy

They leaked one file in the sandbox that contained lots of internal proto files. The security team reviewed everything in the sandbox and thought nothing in it is sensitive and gave the green light; apparently the review didn't catch this in the sandbox.

I guess this is a failing of the security review process, and possibly also how the blaze build system worked so well that people forgot a step existed because it was too automated.

charcircuit

>that contained lots of internal proto files

So does Google Chrome.

kccqzy

No it's not the same level of internal. There are internal proto files specific to Chromium and its API endpoints, and then there are internal proto files for google3. The latter can divulge secrets about Google's general server side architecture. The former only divulges secrets about server side components relevant to Chromium.

fpgaminer

Awww, I was looking forward to seeing some of the leak ;) Oh well. Nice find and breakdown!

Somewhat relatedly, it occurred to me recently just how important issues like prompt injection, etc are for LLMs. I've always brushed them off as unimportant to _me_ since I'm most interested in local LLMs. Who cares if a local LLM is weak to prompt injection or other shenanigans? It's my AI to do with as I please. If anything I want them to be, since it makes it easier to jailbreak them.

Then Operator and Deep Research came out and it finally made sense to me. When we finally have our own AI Agents running locally doing jobs for us, they're going to encounter random internet content. And the AI Agent obviously needs to read that content, or view the images. And if it's doing that, then it's vulnerable to prompt injection by third party.

Which, yeah, duh, stupid me. But ... is also a really fascinating idea to consider. A future where people have personal AIs, and those AIs can get hacked by reading the wrong thing from the wrong backalley of the internet, and suddenly they are taken over by a mind virus of sorts. What a wild future.

20after4

> reading the wrong thing from the wrong backalley of the internet, and suddenly they are taken over by a mind virus of sorts. What a wild future.

This already happens to people on the internet.

tcoff91

Yeah, the way some people lose it from the internet reminds me of Snow Crash.

HN

We hacked Gemini's Python sandbox and leaked its source code (at least some)

We hacked Gemini's Python sandbox and leaked its source code (at least some)