Heap-overflowing Llama.cpp to RCE

60 comments

·March 23, 2025

VladVladikoff

This is really incredible work. And the fact that you are 15 is blowing my mind. You have a really bright future ahead of you, and your parents must be really proud (at least I would be if you were my kid.) Hit me up if you want a summer internship finding security vulnerabilities at a hotel software startup (access control, property management, etc)

worldsavior

[flagged]

ziddoap

This is a crazy amount of assumptions to make about someone (and their family) that you know nothing about.

worldsavior

You're right, I don't know anything about his family initially, but it doesn't make me wrong. You can't do the kind of stuff he did without having a lot, a lot of time at hand. I'm not saying he is a problem, but I disagree with the path he has chosen.

asveikau

A lot of us reading your comment were coding at 15.

worldsavior

I was coding at 15 as well, but not as much as he is.

lynx97

Well, to some kids (I used to be one of them), the world outside is hostile, unfair and pretty much depressing. I used to basically just have my "keyboard", and it was a wonderful way of escaping the reality and dullness of human interactions as a visually disabled kid. My "keyboard" brought me a job that I love which also pays my bills and made me independent. Why would you have wanted to take my keyboard away? To force me to interact with other kids that basically hated me, abused me, or were basically dull and dumb? This condescending attitude of yours almost makes me cry.

worldsavior

I'm sorry if I have hurt you in some way, this is not my intention of course. I'm not trying to come across as a bully.

In your case I guess the situation is more applicable, but for someone that is not disabled in some way, I would say go and fight those other kids.

It's also a bit ironic. The internet is a much more cruel place than real life.

gosub100

Geohot removed the SIM lock on iPhones when he was 17

https://en.m.wikipedia.org/wiki/George_Hotz

Dude rooted the PS3 by directly reading the RAM chips (_from electrical probes_) 2 years later. I don't think his parents helped.

qwertox

His labor's so pure it'd make Marx weep.

yamrzou

This is amazing—made even more impressive by the fact that the author is just 15 years old!

Also, it's nice to see this mentioned:

> For this 10k-word write-up, I spent around a month finishing up the main parts, and refining/editing it took an extra while. Writing this is indeed a painful process. I spent the entire day on the weekend and 4-5 hours during the rest of the week working on it for around two weeks.

It's the kind of behind-the-scenes effort that often goes unspoken.

Sheeny96

10k words was the word count for my 3rd year undergraduate dissertation in the UK. Typically, this is tirelessly worked on over months. The quality of this far exceeds anything I produced during that time and anything I saw from my peers.

asveikau

I'm also detecting hints of non-native English in his writing which may make it even more effort. Though his Twitter account says he's based in Connecticut.

Edit: Wow, shocked that I'm being received so negatively about this, non native English is not "bad". It isn't meant as judgement but praise for what he's accomplished.

johnisgood

He's 15. Many adults whose native language is English can't write it correctly, or what are you referring to if not grammar errors? Can you give me examples?

asveikau

The way grammar works is that it is hard to describe the rules of your native language. I am parsing this as non native. My older kid is a few years younger than this guy, and her text messages to me sound more native than this corpus. No I can't describe it concretely. It's just how I parse word choice and sentence structure.

yamrzou

I tried to execute the PoC by running the following:

  git clone https://github.com/ggml-org/llama.cpp.git && cd llama.cpp
  git checkout c0d4843225eed38903ea71ef302a02fa0b27f048 # Checkout a revision prior to the exploit fix in 1d20e53c40c3cc848ba2b95f5bf7c075eeec8b19
  mkdir build-rpc && cd build-rpc
  cmake .. -DGGML_RPC=ON
  cmake --build . --config Release
  cd bin/
  ./rpc-server -p 50052

In a second terminal:

  nc -lvp 1337

Then running the exploit code in a third terminal (from llama.cpp/build-rpc/bin directory):

  pip install pwntools
  python exp.py # From https://gist.github.com/retr0reg/d13de3fde8f9d138fe1af48e59e630a9

It failed at Stage Three: Bypass boundary check via libggml and raised an EOFError. The RPC server exited with a segmentation fault. Any idea why?

retr0reg

It can be caused by either an unsuccessful leak or a wrong `libggml-base` offset. We're building a fake `ggml_backend_buffer` table from the leaked base + offset (the hard-coded `libggml-base` offset should be adjusted for the compiled release). However, this exploitation is not actually `libggml-base` version dependent: the partial-write space is always one byte, and after a successful leak you can identify the `libggml-base` version if you build every release's `libggml-base` and map the last two bytes to each version.

I am happy you read it and liked it; more glad you tried it yourself :D
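
The version-mapping idea above can be sketched roughly as follows. This is only an illustration, not code from the write-up or the PoC: the build labels and offsets in `KNOWN_OFFSETS` are hypothetical placeholders, and the helpers `candidate_builds` and `library_base` are made-up names. It also assumes the library base is page-aligned (4 KiB pages), so only the low 12 bits of a leaked pointer are guaranteed to survive ASLR; the mask is a parameter in case a different width applies on your target.

```python
# Assumption: the loader places the library on a 4 KiB page boundary,
# so the low 12 bits of any leaked pointer equal the low 12 bits of
# that symbol's offset inside the library.
PAGE_MASK = 0xFFF

# Hypothetical table (placeholder values, not real libggml-base offsets):
# build label -> offset of the leaked symbol in that build.
KNOWN_OFFSETS = {
    "build-a": 0x1A2B30,
    "build-b": 0x1A2C78,
    "build-c": 0x1A3B30,
}


def candidate_builds(leaked_ptr, known_offsets=KNOWN_OFFSETS, mask=PAGE_MASK):
    """Return the builds whose symbol offset matches the leak's low bits."""
    low = leaked_ptr & mask
    return sorted(
        name for name, off in known_offsets.items() if off & mask == low
    )


def library_base(leaked_ptr, offset):
    """Recover the library base, rejecting offsets that cannot be right."""
    base = leaked_ptr - offset
    if base & PAGE_MASK:
        # A misaligned base means the offset belongs to another build.
        raise ValueError("base is not page-aligned; offset likely wrong")
    return base


if __name__ == "__main__":
    leaked = 0x7F1234567000 + 0x1A2C78  # simulated leak for the demo
    print(candidate_builds(leaked))
    print(hex(library_base(leaked, KNOWN_OFFSETS["build-b"])))
```

Two builds can share the same low bits (as `build-a` and `build-c` do in this made-up table), so a match narrows the candidates rather than proving the version. The alignment check in `library_base` is a cheap way to notice a stale hard-coded offset before it leads to a crash like the one reported above.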

rboyd

sheesh. the visual aesthetics and script behavior on your blog are so tastefully executed. great job!

krackers

The smudges on the screenshots got me.

andrewSC

I'd honestly love to know what framework, theme, or stack is being used here! Looks incredible--great job!

evannotfound

Hi! I am the developer of Retr0's portfolio. I used nextjs for the framework, with framer motion + gsap for animation. The blog is powered by hashnode headless api with serverside rendering.

andrewSC

Awesome! Thank you for the follow up and great work!

evannotfound

thank you for your support!

zaphod420

Can anyone TL;DR this? Does this mean that it's possible for a maliciously crafted LLM to execute arbitrary code via an exploit in llama.cpp?

cadamsdotcom

Thanks for adding value today.

m00dy

[flagged]

behnamoh

prodigies are amazing, but I often wonder what they end up doing later in life when the intelligence gap between them and their peers converges to zero.

miki123211

If they're lucky, the gap converges to zero because they surround themselves with more and more intelligent people as time passes. That's a recipe for success.

If they're unlucky, the gap converges to zero because they get used to not having to do much work, "fail upwards" because of the raw intelligence, and then can't keep up when surrounded by similarly intelligent people who actually do the work.

Failing at something you were told you were extremely good at, and hence based your entire identity around, is extremely difficult and demoralizing. Some people can never really recover from that, and AFAIK depression / suicide isn't unheard of.

Definitely not a problem for this particular kid though, "lack of hard work" and "coasting" is evidently not what this person is about.

The middle scenario is kids that do the work, but stay in their community for economic / political / class / "born in the wrong place" reasons. Their talents are mostly squandered, but they might end up doing something very significant for the communities they're part of.

This used to be extremely common, a medieval peasant or ancient slave would most likely stay in their village, regardless of how much of a genius they were. The modern world made it much less so, and that's something worth celebrating.

msp26

Agreed, you put it well. It's really hard to develop a work ethic or learn how to study properly so late.

You lack all the foundational habits and are just used to things working out naturally. There are zero consequences, only positive outcomes despite doing the bare minimum. And then it all goes to shit.

pram

In my teens I was in an IRC community full of various hacker/script kiddie miscreants. Some of these people I would call actual geniuses.

The trajectory of everyone ranged from early Facebook employees, a CMU CS PhD, to one literally going to prison for an exploit lol. You can never tell where life will take you.

subscribed

Some do wonderfully well, for example lcamtuf or Joanna Rutkowski.

If they're self-driven like the original author, they'll be good; they don't necessarily need the gradient.

pragmatic8

Why do you presume that the intelligence gap would converge to zero?

ziddoap

Eventually everyone dies, thus becoming equally intelligent!

behnamoh

easy: how many genius people do you know who were also prodigies? early intelligence only gets you so far, the rest depends on hard work, passion, etc.

FeepingCreature

If your later work overshadows your earlier, you're not generally remembered as a prodigy.

om8

Not surprising, llama.cpp code is a mess.

It's sad that hacked things that emerge first are way more popular than properly done projects that come later.

retr0reg

In fact, the llama.cpp codebase is well-developed and actively maintained. It has undergone iterative security hardening; intensive low-level security checks have been implemented in both the core inference engine and RPC components.

This standard of security is what made the exploitation so challenging and rewarding.

vlovich123

It’s actively maintained, but I wouldn’t classify it as a clean codebase: not the abstractions it has within ggml, not the structure of llama.cpp, not its use of modern C++. It can’t even really make up its mind as to whether it should be C++ or C, and there’s a lot of dirt because of that. Heck, instead of using a submodule they’re copying ggml between projects, making it very difficult to keep track of what’s actually happening where and what the ground truth is. It’s sloppy engineering. Parts are better designed for sure.

None of that is meant to take away from your effort or the success of llama.cpp, but I have spent quite a bit of time reading and working with the internals across layers and have a good eye for quality C++ patterns.

PartiallyTyped

Thanks for the writeup! Was a very interesting read! I've subscribed and I am looking forward to your next exploits! ^_^

qskousen

Is there a comparable open source thing "done properly"?

tuveson

llama.rs, of course /s