Data Compression Nerds Hate This One Trick [video]

willvarfar

Reminds me of one of my most exhilarating old-timer stories. I once took 15 seconds off the boot time of a smartphone - halving the startup time! - by replacing the MNG (think PNG but multi-frame animation; no surprise it never took off) startup animation with a custom encode-runs-of-pixel-offsets format that nicely captured the way the logo swirled. I then encoded this as static data in a DLL, which the phone had a built-in system for compressing, and that worked well for this data, so it was doubly faster. This made me a complete legend for a while, although that phone model ended up never shipping.

lifthrasiir

MNG was unfortunately designed so badly that Mozilla had to create APNG to replace it. In retrospect, MNG was only capable of doing about 10% of what SVG 1.0 can do, on top of an incomprehensible binary format. (One may argue that SVG is also badly designed for this reason.)

juped

It's rare that I see such a fractally wrong comment. I guess APNG did come from Mozilla, which I think is the only true subpart of the above?

lifthrasiir

Slides: https://filmroellchen.eu/talks/QOI%20EH25/#/0/1

As that data compression nerd, I find QOI both a refresher and an annoyance. It serves as a good example that input modelling is very important, so much so that QOI almost beats PNG in spite of very suboptimal coding. But QOI also represents a missed opportunity because, well, we can do much better if we are willing to use something like QOI instead of standard formats! Weirdly enough, people seem to value only the simplicity when it comes to QOI, which I believe is not the main takeaway. Maybe QOIR [1] might result in a better balanced format in the future though...

[1] https://github.com/nigeltao/qoir/

Nanopolygon

QOI is just a simple filter. It cannot do full compression. In fact, in certain cases it can increase the size instead of compressing. It is unnecessarily overrated, of course, mostly because it is open source. The rest is irrelevant. There is another codec that is as fast as QOI (or even faster, and multi-core) but with a much higher compression ratio: HALIC (High Availability Lossless Image Compression). But because it's closed source, it definitely didn't get the attention and respect it should have gotten. And that's why I think its development stopped.

https://github.com/Hakan-Abbas/HALIC-High-Availability-Lossl...

staunton

Let's say you had a lossless image format that's 20% smaller (on average for pictures people send over networks) than PNG. Let's say it takes 10% more computing power than PNG. Do you stand to make money? What would it be used for?

I can't imagine people will start storing their family pictures in a new format they've never heard of which is not supported by any software they use for "just" 20% better compression. Do they even want lossless compression in the first place (if you don't ask them directly and call it that)?

lifthrasiir

That's the portability bit the presenter mentions, and it is a very important concern in practice. But how about recompression? For example, many PNG files are suboptimally compressed, partly because PNG is an old format and also because much software has been too dumb to produce a well-optimized PNG. In that case we may benefit from transparent recompression, which may be done either by using a better library like libdeflate or by internally using a separate format that can be quickly transformed from and back to PNG. In fact Dropbox did so for JPEG files [1]. When I said "so much better" I was thinking about such opportunities that benefit end users.

[1] https://github.com/dropbox/lepton
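
To make the first option concrete, here is a minimal sketch of recompressing a DEFLATE stream with more effort. It uses Python's standard `zlib` module as a stand-in (not libdeflate), and it works on a bare zlib stream rather than a real PNG: an actual PNG optimizer would also have to deal with chunks and scanline filters. The function name `recompress` is made up for illustration. Note that this keeps the decoded data identical but does not reproduce the original compressed bytes, so it is not the bit-exact round trip that Lepton does for JPEG.

```python
import zlib


def recompress(deflated: bytes, level: int = 9) -> bytes:
    """Re-encode a zlib stream at a higher effort level, losslessly.

    Returns the smaller of the original and the re-encoded stream.
    """
    raw = zlib.decompress(deflated)
    candidate = zlib.compress(raw, level)
    # Guard against any mismatch: the decoded payload must be unchanged.
    if zlib.decompress(candidate) != raw:
        raise ValueError("recompressed stream does not round-trip")
    return candidate if len(candidate) < len(deflated) else deflated


if __name__ == "__main__":
    payload = b"abcabcabc" * 1000
    lazy = zlib.compress(payload, 0)  # level 0 stores the data uncompressed
    tight = recompress(lazy)
    print(len(lazy), "->", len(tight))
```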

staunton

Dropbox apparently abandoned the project. Do you know what their takeaways were from trying to improve the JPEG storage?

For example, was it worth it in the end? Did they announce anything? Did they switch to another method or give up on the idea, or do we not know?

forkerenok

This is the website of the presented project for those who prefer text:

https://qoiformat.org/

An interesting and somewhat inspiring bit of trivia from the video: the creator barely understands modern image compression techniques (in their own words), but this hasn't stopped them from coming up with that impressive result.

Galanwe

In this day and age, storage is not as important as IO, so I hate it when I see benchmarks with compression ratios + CPU times. That's not helpful.

What I want to see is the total IO + CPU time across libraries for my specific IO and CPU constraints.

Sure, it makes benchmarks more involved to display, as scalars are not enough anymore, you need multiple curves, but that's meaningful at least.

To illustrate, if I have very fast IO, then I probably don't care about the compression ratio; it will be faster to download the raw payload and have 0 decompression cost.

On the other end of the spectrum, if I have very slow IO, I would gladly have a much slower decompression algorithm but higher compression ratio for a faster overall timing.

This is especially important because cloud storage, for instance, is rather cheap but slow. Caches/CDNs are very fast, but storage is expensive. Etc.
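
A rough sketch of that comparison, assuming a simple model where total time is transfer time of the compressed payload plus decode time. The codec names, ratios and throughputs below are invented for illustration and are not measurements of any real library.

```python
import math

# Hypothetical codecs: (compression ratio, decode throughput in bytes/s).
# "raw" means no compression, so decoding costs nothing.
CODECS = {
    "raw": (1.0, math.inf),
    "fast": (2.0, 2e9),
    "dense": (4.0, 2e8),
}


def total_time(raw_bytes, ratio, decode_bps, io_bps):
    """Seconds to fetch the compressed payload and then decode it."""
    transfer = raw_bytes / ratio / io_bps
    decode = raw_bytes / decode_bps
    return transfer + decode


def best_codec(raw_bytes, io_bps):
    """Name of the codec with the lowest total time at this IO speed."""
    return min(
        CODECS,
        key=lambda name: total_time(raw_bytes, *CODECS[name], io_bps),
    )


if __name__ == "__main__":
    size = 1e9  # 1 GB payload
    for io_bps in (1e6, 5e8, 1e10):
        print(f"{io_bps:.0e} B/s -> {best_codec(size, io_bps)}")
```

With these made-up numbers the winner changes from "dense" on the slow link to "fast" in the middle to "raw" on the very fast link, which is why a single ratio or CPU-time figure does not settle the question.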

_ZeD_

[flagged]

kleiba

How are these two things even remotely related?