I use zip bombs to protect my server

166 comments · April 28, 2025

layer8

Back when I was a stupid kid, I once did

    ln -s /dev/zero index.html
on my home page as a joke. Browsers at the time didn’t like that: they basically froze, sometimes taking the client system down with them.

Later on, browsers started to check for actual content I think, and would abort such requests.

bobmcnamara

I made a 64k×64k JPEG once by feeding the encoder the same line of macroblocks until it produced the entire image.

Years later I was finally able to open it.

opan

I had a ton of trouble opening a 10MB or so png a few weeks back. It was stitched together screenshots forming a map of some areas in a game, so it was quite large. Some stuff refused to open it at all as if the file was invalid, some would hang for minutes, some opened blurry. My first semi-success was Fossify Gallery on my phone from F-Droid. If I let it chug a bit, it'd show a blurry image, a while longer it'd focus. Then I'd try to zoom or pan and it'd blur for ages again. I guess it was aggressively lazy-loading. What worked in the end was GIMP. I had the thought that the image was probably made in an editor, so surely an editor could open it. The catch is that it took like 8GB of RAM, but then I could see clearly, zoom, and pan all I wanted. It made me wonder why there's not an image viewer that's just the viewer part of GIMP or something.

Among things that didn't work were qutebrowser, icecat, nsxiv, feh, imv, mpv. I did worry at first the file was corrupt, I was redownloading it, comparing hashes with a friend, etc. Makes for an interesting benchmark, I guess.

For others curious, here's the file: https://0x0.st/82Ap.png

I'd say just curl/wget it, don't expect it to load in a browser.

koolba

I hope you weren’t paying for bandwidth by the KiB.

santoshalper

Nah, back then we paid for bandwidth by the kb.

slicktux

That’s even worse! :)

m463

Sounds like the favicon.ico that would crash the browser.

I think this was it:

https://freedomhacker.net/annoying-favicon-crash-bug-firefox...

sandworm101

Divide by zero happens to everyone eventually.

https://medium.com/@bishr_tabbaa/when-smart-ships-divide-by-...

"On 21 September 1997, the USS Yorktown halted for almost three hours during training maneuvers off the coast of Cape Charles, Virginia due to a divide-by-zero error in a database application that propagated throughout the ship’s control systems."

"A technician tried to digitally calibrate and reset the fuel valve by entering a 0 value for one of the valve’s component properties into the SMCS Remote Database Manager (RDM)"

astolarz

Bad bot

fuzztester

I remember reading about that some years ago. It involved Windows NT.

https://www.google.com/search?q=windows+nt+bug+affects+ship

artursapek

[flagged]

bilekas

> At my old employer, a bot discovered a wordpress vulnerability and inserted a malicious script into our server

I know it's slightly off topic, but it's just so amusing (edit: reassuring) to know I'm not the only one who, after an hour of setting up WordPress, found a PHP shell magically deployed on my server.

protocolture

>Take over a wordpress site for a customer

>Oh look 3 separate php shells with random strings as a name

Never less than 3, but always guaranteed.

ianlevesque

Yes, never self host Wordpress if you value your sanity. Even if it’s not the first hour it will eventually happen when you forget a patch.

sunaookami

I've been hosting WordPress myself for 13 years now and have had no problems :) Just follow standard security practices and don't install a gazillion plugins.

carlosjobim

There's a lot of essential functionality missing from WordPress, meaning you have to install plugins, depending on what you need to do.

But it's such a bad platform that there really isn't any reason for anybody to use WordPress for anything. No matter your use case, there will be a better alternative to WordPress.

arcfour

Never use that junk if you value your sanity, I think you mean.

dx4100

There are ways to prevent it:

- Freeze all code after an update through permissions
- Don't make most directories writeable
- Don't allow file uploads, or limit file uploads to media

There are a few plugins that do this, but vanilla WP is dangerous.
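
A rough sketch of the permissions approach, assuming a stock install under /var/www/wordpress and a web server running as www-data (the paths, user names, and the uploads-only exception are assumptions, not a complete hardening guide):

    # freeze the code tree: owned by root, readable but not writable by the web server
    chown -R root:www-data /var/www/wordpress
    find /var/www/wordpress -type d -exec chmod 755 {} +
    find /var/www/wordpress -type f -exec chmod 644 {} +
    # keep only uploads writable so media still works; ideally also disable PHP execution there
    chown -R www-data:www-data /var/www/wordpress/wp-content/uploads

The trade-off is that one-click updates and plugin installs from the admin UI stop working, so updating becomes a deliberate unfreeze/refreeze step.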

colechristensen

>after 1 hour

I've used this teaching folks devops: here, deploy your first hello-world nginx server... huh, what are those strange requests in the log?

jeroenhd

These days, almost all browsers accept zstd and brotli, so these bombs can be even more effective today! [This](https://news.ycombinator.com/item?id=23496794) old comment showed an impressive 1.2M:1 compression ratio and [zstd seems to be doing even better](https://github.com/netty/netty/issues/14004).

Though, bots may not support modern compression standards. Then again, that may be a good way to block bots: every modern browser supports zstd, so just force that on non-whitelisted browser agents and you automatically confuse scrapers.
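
For illustration, building such bombs is just a matter of piping zeroes through the compressor of choice (the file names and the 10 GB size here are arbitrary, and exact ratios depend on tool versions and levels):

    # 10 GB of zeroes squeezed through three different codings
    dd if=/dev/zero bs=1M count=10240 | gzip -9         > bomb.gz    # roughly 10 MB
    dd if=/dev/zero bs=1M count=10240 | zstd -19        > bomb.zst   # typically much smaller
    dd if=/dev/zero bs=1M count=10240 | brotli -q 11 -c > bomb.br

The trick is then to serve the pre-compressed file as-is with a matching Content-Encoding (gzip, zstd, or br) chosen from the client's Accept-Encoding header, so the client does all the expansion.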

kevin_thibedeau

If you nest the gzip inside another gzip it gets even smaller since the blocks of compressed '0' data are themselves low entropy in the first generation gzip. Nested zst reduces the 10G file to 99 bytes.
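
A quick shell illustration of the nesting idea (exact sizes depend on the tool and compression level):

    # first generation: 10 GB of zeroes -> roughly 10 MB of very repetitive gzip blocks
    dd if=/dev/zero bs=1M count=10240 | gzip -9 > layer1.gz
    # second generation: those repetitive blocks compress dramatically again
    gzip -9 -c layer1.gz > layer2.gz.gz
    ls -l layer1.gz layer2.gz.gz

Keep in mind that a client only transparently undoes one layer per listed coding; HTTP does allow stacked codings like `Content-Encoding: gzip, gzip`, but client support for that is far less consistent than for a single layer.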

ChuckMcM

I sort of did this with ssh where I figured out how to crash an ssh client that was trying to guess the root password. What I got for my trouble was a number of script kiddies ddosing my poor little server. I switched to just identifying 'bad actors' who are clearly trying to do bad things and just banning their IP with firewall rules. That's becoming more challenging with IPv6 though.

Edit: And for folks who write their own web pages, you can always create zip bombs that are links on a web page that don't show up for humans (white text on white background with no highlight on hover/click anchors). Bots download those things to have a look (so do crawlers and AI scrapers)
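
A hypothetical version of such a trap link appended to a static page (the path, styling, and link text are made up for illustration):

    # humans won't see this, but naive scrapers will still follow the href
    echo '<a href="/trap/bomb.html" style="color:#fff;background:#fff;text-decoration:none" tabindex="-1">do not click this</a>' >> index.html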

1970-01-01

Why is it harder to firewall them with IPv6? It seems this would be the easier of the two to firewall.

carlhjerpe

Manual banning is about the same, since you just block a /56 or bigger, or entire providers or countries.

Automated banning is harder, you'd probably want a heuristic system and look up info on IPs.

IPv4 with NAT means you can "overban" too.
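
For example, with plain netfilter tools (documentation prefixes shown; substitute the real ranges):

    # IPv6: drop an entire /56 in one rule rather than chasing individual addresses
    ip6tables -A INPUT -s 2001:db8:1234:5600::/56 -j DROP
    # IPv4: a single address, which behind CGNAT may "overban" many unrelated users
    iptables -A INPUT -s 203.0.113.7 -j DROP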

malfist

Why wouldn't something like fail2ban work here? That's what it's built for, and it has been around for eons.

firesteelrain

I think they are suggesting the range of IPs to block is too high?

CBLT

Allow -> Tarpit -> Block should be done by ASN

echoangle

Maybe it’s easier to circumvent because getting a new IPv6 address is easier than with IPv4?

j_walter

Check this out if you want to stop this behavior...

https://github.com/skeeto/endlessh

leephillips

These links do show up for humans who might be using text browsers, (perhaps) screen readers, bookmarklets that list the links on a page, etc.

ChuckMcM

true, but you can make the link text 'do not click this' or 'not a real link' to let them know. I'm not sure if crawlers have started using LLMs to check pages or not, which would be a problem.

marcusb

Zip bombs are fun. I discovered a vulnerability in a security product once where it wouldn’t properly scan a file for malware if the file was or contained a zip archive greater than a certain size.

The practical effect of this was you could place a zip bomb in an Office XML document and this product would pass the OOXML file through even if it contained easily identifiable malware.

secfirstmd

Eh I got news for ya.

The file size problem is still an issue for many big name EDRs.

marcusb

Undoubtedly. If you go poking around most any security product (the product I was referring to was not in the EDR space,) you'll see these sorts of issues all over the place.

KTibow

It's worth noting that this is a gzip bomb (acts just like a normal compressed webpage), not a classical zip file that uses nested zips to knock out antiviruses.

kazinator

I deployed this, instead of my usual honeypot script.

It's not working very well.

In the web server log, I can see that the bots are not downloading the whole ten megabyte poison pill.

They are cutting off at various lengths. I haven't seen anything fetch more than around 1.5 MB of it so far.

Or is it working? Are they decoding it on the fly as a stream, and then crashing? E.g. if something is recorded as having read 1.5 MB, could it have decoded it to 1.5 GB in RAM, on the fly, and crashed?

There is no way to tell.

MoonGhost

Try a content labyrinth, i.e. infinitely generated content with a bunch of references to other generated pages. It may help against simple wget, at least until bots adapt.

PS: I'm on the bots side, but don't mind helping.

unnouinceput

Do they come back? If so, then they detect it and avoid it. If not, then they crashed and mission accomplished.

kazinator

I currently cannot tell without making a little configuration change, because as soon as an IP address is logged as having visited the trap URL (honeypot, or zipbomb or whatever), a log monitoring script bans that client.

Secondly, I know that most of these bots do not come back. The attacks do not reuse addresses against the same server in order to evade almost any conceivable filter rule that is predicated on a prior visit.
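
Not the author's actual setup, but a rough sketch of that trap-URL-to-ban pipeline using ipset (the log path, trap path, and set name are all hypothetical):

    # ban, for one day, any client that ever requests the trap URL
    ipset create trap_ban hash:ip timeout 86400 2>/dev/null
    iptables -I INPUT -m set --match-set trap_ban src -j DROP
    tail -F /var/log/nginx/access.log \
      | awk '$7 == "/trap.html" { print $1; fflush() }' \
      | while read -r ip; do ipset add trap_ban "$ip" 2>/dev/null; done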

monster_truck

I do something similar using a script I've cobbled together over the years. Once a year I'll check the 404 logs and add the most popular paths trying to exploit something (e.g. ancient phpMyAdmin vulns) to the shitlist. Requesting 3 of those URLs adds that host to a greylist that only accepts requests to a very limited set of legitimate paths.

wewewedxfgdf

I protected uploads on one of my applications by creating fixed size temporary disk partitions of like 10MB each and unzipping to those contains the fallout if someone uploads something too big.

warkdarrior

`unzip -p | head -c 10MB`

sidewndr46

What? You partitioned a disk rather than just not decompressing some comically large file?

gchamonlive

https://github.com/uint128-t/ZIPBOMB

  2048 yottabyte Zip Bomb

  This zip bomb uses overlapping files and recursion to achieve 7 layers with 256 files each, with the last being a 32GB file.

  It is only 266 KB on disk.

When you realise it's a zip bomb, it's already too late. Looking at the file size doesn't betray its contents. Maybe applying some heuristics with ClamAV? But even then it's not guaranteed. I think a small partition to isolate decompression is actually really smart. Wonder if we can achieve the same with overlays.
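
One way to get that isolation on Linux without touching the partition table is a size-capped tmpfs (the mount point and sizes here are arbitrary):

    # decompression hits ENOSPC at 64 MB instead of filling the real disk
    mkdir -p /mnt/unzip-jail
    mount -t tmpfs -o size=64m,nr_inodes=4096 tmpfs /mnt/unzip-jail
    unzip -o upload.zip -d /mnt/unzip-jail
    umount /mnt/unzip-jail

tmpfs is backed by RAM and swap, so the size cap also bounds memory use; a fixed-size loopback file formatted as a filesystem gives the same containment on disk.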

sidewndr46

What are you talking about? You get a compressed file. You start decompressing it. When the amount of bytes you've written exceeds some threshold (say 5 megabytes) just stop decompressing, discard the output so far & delete the original file. That is it.
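
That threshold idea in its simplest streaming form, along the lines of the one-liner above (the file names and the 5 MB cap are placeholders):

    # gzip case: stop writing after 5 MB of decompressed output, whatever the archive claims
    zcat upload.gz | head -c 5M > output.bin
    # zip case: same idea, streaming the archive contents
    unzip -p upload.zip | head -c 5M > output.bin
    # if output.bin sits exactly at the cap, the input exceeded the limit: reject it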

kccqzy

Seems like a good and simple strategy to me. No real partition needed; tmpfs is cheap on Linux. Maybe OP is using tools that do not easily allow tracking the number of uncompressed bytes.

wewewedxfgdf

Yes I'd rather deal with a simple out of disk space error than perform some acrobatics to "safely" unzip a potential zip bomb.

Also zip bombs are not comically large until you unzip them.

Also you can just unpack any sort of compressed file format without giving any thought to whether you are handling it safely.

fracus

I'm curious why a 10GB file of all zeroes would compress only to 10MB. I mean theoretically you could compress it to one byte. I suppose the compression happens on a stream of data instead of analyzing the whole, but I'd assume it would still do better than 10MB.

philsnow

A compressed file that is only one byte long can only represent maximally 256 different uncompressed files.

Signed, a kid in the 90s who downloaded some "wavelet compression" program from a BBS because it promised to compress all his WaReZ even more so he could then fit moar on his disk. He ran the compressor and hey golly that 500MB ISO fit into only 10MB of disk now! He found out later (after a defrag) that the "compressor" was just hiding data in unused disk sectors and storing references to them. He then learned about Shannon entropy from comp.compression.research and was enlightened.

marcusf

man, a comment that brings back memories. you and me both.

tom_

It has to cater for any possible input. Even with special case handling for this particular (generally uncommon) case of vast runs of the same value: the compressed data will probably be packetized somehow, and each packet can reproduce only so many repeats, so you'll need to repeat each packet enough times to reproduce the output. With 10 GB, it mounts up.

I tried this on my computer with a couple of other tools, after creating a file full of 0s as per the article.

gzip -9 turns it into 10,436,266 bytes in approx 1 minute.

xz -9 turns it into 1,568,052 bytes in approx 4 minutes.

bzip2 -9 turns it into 7,506 (!) bytes in approx 5 minutes.

I think OP should consider getting bzip2 on the case. 2 TBytes of 0s should compress nicely. And I'm long overdue an upgrade to my laptop... you probably won't be waiting long for the result on anything modern.
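
For reference, those numbers came from something along these lines, with the 10 GB file of zeroes created as in the article (sizes and timings will differ by machine and tool version):

    dd if=/dev/zero bs=1M count=10240 of=zeroes.bin   # 10 GB of zeroes
    gzip  -9 -k zeroes.bin   # -> zeroes.bin.gz,  ~10 MB
    xz    -9 -k zeroes.bin   # -> zeroes.bin.xz,  ~1.5 MB
    bzip2 -9 -k zeroes.bin   # -> zeroes.bin.bz2, ~7.5 KB

One caveat for the zip-bomb use case: gzip (and now zstd and brotli) are the codings clients decode transparently via Content-Encoding, while bzip2 is not, so its better ratio only matters if the bot downloads and unpacks the file itself.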

rtkwe

It'd have to be more than one byte. There's the central directory, the zip header, and the local header, then the file itself; you also need to tell it how many zeros to produce when decompressing the actual file. But most compression algorithms don't work like that, because they're designed for actual files rather than essentially blank files, so you end up larger than the absolute minimum.

malfist

I mean, if I make a new compression algorithm that says a 10GB file of zeros is represented with a single specific byte, that would technically be compression.

All depends on how much magic you want to shove into an "algorithm"

kulahan

There probably aren’t any perfectly lossless compression algorithms, I guess? Nothing would ever be all zeroes, so it might not be an edge case accounted for or something? I have no idea, just pulling at strings. Maybe someone smarter can jump in here.

mr_toad

No lossless algorithm can compress all strings; some will end up larger. This is a consequence of the pigeonhole principle.

ugurs

It requires at least a few bytes; there is no way to represent 10 GB of data in 8 bits.

msm_

But of course there is. Imagine the following compression scheme:

    0-253: output the input byte
    254 followed by 0: output 254
    254 followed by 1: output 255
    255: output 10GB of zeroes
Of course this is an artificial example, but theoretically it's perfectly sound. In fact, I think you could get there with the static Huffman trees supported by some formats, including gzip.

dagi3d

I get your point (and have no idea why it isn't compressed more), but is the theoretical value of 1 byte correct? With just a single byte, how does it know how big the file should be after being decompressed?

kulahan

It’s a zip bomb, so does the creator care? I just mean from a practical standpoint - overflows and crashes would be a fine result.

crazygringo

> For the most part, when they do, I never hear from them again. Why? Well, that's because they crash right after ingesting the file.

I would have figured the process/server would restart, and restart with your specific URL since that was the last one not completed.

What makes the bots avoid this site in the future? Are they really smart enough to hard-code a rule to check for crashes and avoid those sites in the future?

fdr

Seems like an exponential backoff rule would do the job: I'm sure crashes happen for all sorts of reasons, some of which are bugs in the bot, even on non-adversarial input.