
High-Performance PNG Decoding


8 comments

·March 23, 2025

pornel

In retrospect, it makes sense that the fastest general-purpose inflate implementations can be beaten by fine-tuning for the specific kind of non-text data found in PNG images.

The Rust PNG crate took the same approach: https://lib.rs/fdeflate
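
For readers who want to see where inflate sits in PNG decoding, here is a minimal sketch. It assumes an 8-bit grayscale, non-interlaced image and uses Python's general-purpose `zlib` module, not fdeflate or the decoder from the article. The helper names (`make_chunk`, `make_gray_png`, `iter_chunks`, `inflate_scanlines`) are made up for illustration. It builds a tiny PNG in memory, joins the IDAT payloads into one zlib stream, and inflates it into filtered scanlines.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(kind, data):
    """Serialize one chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_gray_png(width, height, rows):
    """Build a tiny 8-bit grayscale, non-interlaced PNG (hypothetical helper)."""
    # IHDR: width, height, bit depth 8, color type 0, compression 0,
    # filter method 0, interlace method 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline starts with a filter-type byte; 0 means "None".
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", zlib.compress(raw))
        + make_chunk(b"IEND", b"")
    )


def iter_chunks(png):
    """Yield (type, data) for each chunk; CRCs are skipped, not verified."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        kind = png[pos + 4 : pos + 8]
        data = png[pos + 8 : pos + 8 + length]
        yield kind, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def inflate_scanlines(png):
    """Join all IDAT payloads into one zlib stream and inflate it."""
    idat = b"".join(data for kind, data in iter_chunks(png) if kind == b"IDAT")
    return zlib.decompress(idat)


png = make_gray_png(3, 2, [[0, 1, 2], [3, 4, 5]])
scanlines = inflate_scanlines(png)
print(len(scanlines))  # height * (1 + width) bytes for this pixel format
```

The inflated bytes are still filtered scanlines, so a real decoder has to undo the per-row filters afterwards. A PNG-specific inflate can tune for that kind of data, while a general-purpose one cannot assume anything about its input.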

adzm

An interesting aspect of blend2d in general is its use of AsmJit to generate optimized, platform-specific asm at runtime.

snickerbockers

That's a neat article, but I'm honestly surprised that inflating zlib compression is a thing people optimize for. I wrote a PNG loader of my own nine years ago[1] because I've found that reinventing wheels is the best way to learn new things. I took the obvious approach of inflating the data one byte at a time, and I decided that GNU libc is good enough at buffering that I might as well fetch all the data one byte at a time using fgetc for ultimate convenience.

The year was 2016, but my PC at the time was a four-year-old AMD FX 8120 "Bulldozer", which was notable for being one of the cheaper options for 8-core CPUs in 2012. That core count came at the expense of extremely slow throughput: it hailed from an era when AMD couldn't beat Intel's throughput, so they'd try to compensate by investing in oddball features that only look good on paper and marking the price down a couple hundo so that their "high-end" products were effectively competing with Intel's "middle-market".

But I digress, we're not here to talk shit about how bad AMD used to be in 2012.

ANYWAYS, even old "Dozie"[2] could run circles around anything that would have existed anywhere in the world when PNG was first standardized back in '96, so I was not in any way surprised that my implementation could effortlessly load and display any image I threw at it. I didn't do performance testing or anything; it just didn't seem like there was any point in trying to compete with whatever millisecond-scale gainz libpng presumably had over my library when both of them are capable of loading the picture in less time than it takes me to recognize the picture on the screen in front of me.

Anyways, I'm curious if you considered the possibility of abusing the Adam7 interlacing to declare victory early? I guess most people probably don't bother checking that box in GIMP, but I'm of the opinion that as long as it's possible, there's nothing in the rulebook against deferring work or showing the user a subsampled image.

[1] This somehow coincided with the news media going off nonstop about the "DeflateGate" Super Bowl scandal, but that really was a coincidence. Or maybe it planted some subconscious ideation into my psyche, IDK. But either way, this is neither the first time nor the last time I have spent multiple months trying to implement something which I know is ultimately pointless to anyone but me.

[2] Remember, the CPU architecture was called "Bulldozer".
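
To put rough numbers on the Adam7 idea, here is a small sketch that computes how many pixels each of the seven passes delivers. The start offsets and steps are the standard Adam7 pattern from the PNG specification. The function names (`pass_size`, `cumulative_coverage`) are made up for illustration, and this only models pass geometry, not any particular decoder.

```python
# Adam7 pass geometry: (x_start, y_start, x_step, y_step) for passes 1..7.
ADAM7_PASSES = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def pass_size(width, height, pass_index):
    """Return (columns, rows) of the reduced image for one pass (0-based)."""
    x0, y0, dx, dy = ADAM7_PASSES[pass_index]
    # Ceiling division over the pixels at or beyond the start offset.
    cols = (width - x0 + dx - 1) // dx if width > x0 else 0
    rows = (height - y0 + dy - 1) // dy if height > y0 else 0
    return cols, rows


def cumulative_coverage(width, height):
    """Fraction of all pixels available after each of the seven passes."""
    total = width * height
    seen = 0
    fractions = []
    for i in range(len(ADAM7_PASSES)):
        cols, rows = pass_size(width, height, i)
        seen += cols * rows
        fractions.append(seen / total)
    return fractions


for number, fraction in enumerate(cumulative_coverage(64, 64), start=1):
    print(f"after pass {number}: {fraction:.4f} of the pixels")
```

For an image whose sides are multiples of 8, the first pass holds 1/64 of the pixels, so a viewer that stops there is showing an 8x-subsampled preview. Whether that saves meaningful time depends on the decoder, since the passes are stored one after another in the same zlib stream and the data before the point where you stop still has to be inflated.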

seritools

> I didn't do performance testing or anything; it just didn't seem like there was any point in trying to compete with whatever millisecond-scale gainz libpng presumably had over my library when both of them are capable of loading the picture in less time than it takes me to recognize the picture on the screen in front of me.

The gains become relevant in bulk processing.

jcelerier

> ANYWAYS, even old "Dozie"[2] could run circles around anything that would have existed anywhere in the world when PNG was first standardized back in '96, so I was not in any way surprised that my implementation could effortlessly load and display any image I threw at it. I didn't do performance testing or anything; it just didn't seem like there was any point in trying to compete with whatever millisecond-scale gainz libpng presumably had over my library when both of them are capable of loading the picture in less time than it takes me to recognize the picture on the screen in front of me.

I remember reading a post on r/gameprogramming a couple of weeks ago where someone used PNGs for their game assets, which led to multiple minutes of load time spent on PNG decoding on a modern workstation. It's definitely still really relevant.

motorest

> That's a neat article, but I'm honestly surprised that inflating zlib compression is a thing people optimize for.

Think of all the use cases that involve editing an image, especially in the context of the web. Does it surprise you that this is something people optimize for?

msephton

Please add some margins to your mobile style sheet. It's painful to read text that runs up against the screen edges. I made it about a quarter of the way through the article and then gave up.

motorest

The site is perfectly fine on mobile.