
Show HN: Meow – An Image File Format I made because PNGs and JPEGs suck for AI

76 comments

· June 15, 2025

One of the biggest sources of context LLMs can get from images is their metadata, but it's extremely underutilized. While PNG and JPEG both offer metadata, it gets stripped way too easily when sharing, is extremely limited for AI-based workflows, and offers minimal entries for things that are actually useful. Plus, these formats are ancient (1995 and 1992) - it's about time we get an upgrade for our AI era. Meet MEOW (Metadata-Encoded Optimized Webfile) - an open-source image file format which is basically PNG on steroids, and what I also like to call the purr-fect file format.

Instead of storing metadata alongside the image where it can be lost, MEOW ENCODES it directly inside the image pixels using LSB steganography - hiding data in the least significant bits, where your eyes can't tell the difference. This also doesn't increase the image size significantly, and as long as you stick to lossless compression, the data stays.
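For illustration, here is a minimal sketch of the LSB idea in Python (assuming NumPy and Pillow; the 2-bits-per-channel layout is an assumption for the example, not the actual MEOW codec):

    import numpy as np
    from PIL import Image

    def embed_lsb(image_path, payload: bytes, out_path):
        # Pack the payload into the 2 least significant bits of each channel.
        pixels = np.array(Image.open(image_path).convert("RGB"))
        bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
        symbols = bits.reshape(-1, 2)                  # 2 bits per channel value
        values = symbols[:, 0] * 2 + symbols[:, 1]
        flat = pixels.reshape(-1)                      # view into the pixel array
        if len(values) > len(flat):
            raise ValueError("payload too large for this image")
        flat[:len(values)] = (flat[:len(values)] & 0b11111100) | values
        Image.fromarray(pixels).save(out_path)         # lossless PNG, so the bits survive

    def extract_lsb(image_path, num_bytes: int) -> bytes:
        # Read num_bytes back out of the 2 LSBs (4 channel values per byte).
        flat = np.array(Image.open(image_path).convert("RGB")).reshape(-1)
        values = flat[:num_bytes * 4] & 0b11
        bits = np.stack([(values >> 1) & 1, values & 1], axis=1).reshape(-1)
        return np.packbits(bits).tobytes()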

What I noticed was that most "innovative" image file formats died because of lack of adoption, but MEOW is completely CROSS COMPATIBLE WITH PNGs: you can quite literally rename a .MEOW file to .PNG and open it in a normal image viewer.

Here's what gets baked right into every pixel:

- Edge Detection Maps - pre-computed boundaries so AI doesn't waste time figuring out where objects start and end.

- Texture Analysis Data - surface patterns, roughness, material properties already mapped out.

- Complexity Scores - tells AI models how much processing power different regions need.

- Attention Weight Maps - highlights where models should focus their compute (like faces, text, important objects).

- Object Relationship Data - spatial connections between detected elements.

- Future Proofing Space - reserved bits for whatever AI wants to add (or comments for training LoRAs or labelling).

Of course, all of these are editable and configurable while surviving compression, sharing, even screenshot-and-repost cycles :p

When you convert ANY image format to .meow, it automatically generates most AI-specific features and data from what it sees in the image, which makes it work way better.
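For a concrete sense of what auto-generating some of these features could look like, here is a hedged sketch (not the MEOW converter itself) that derives an edge map and per-block complexity scores, assuming OpenCV and NumPy:

    import cv2
    import numpy as np

    def generate_basic_features(path):
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)

        # Edge detection map: pre-computed object boundaries
        edge_map = cv2.Canny(gray, 100, 200)

        # Crude complexity score: variance of each 8x8 block
        h, w = gray.shape
        blocks = gray[:h // 8 * 8, :w // 8 * 8].reshape(h // 8, 8, w // 8, 8)
        complexity = blocks.astype(np.float32).var(axis=(1, 3))

        return {"edge_map": edge_map, "complexity": complexity}

Heavier features like attention weights and object relationships would presumably need actual models rather than classical filters like these.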

Would love thoughts, suggestions or ideas you all have for it :)

fao_

> Instead of storing metadata alongside the image where it can be lost, MEOW ENCODES it directly inside the image pixels using LSB steganography

That makes the data much more fragile than metadata fields, though? Any kind of image alteration or re-encoding (which almost all sites do to ensure better compression — discord, imgur, et al) is going to trash the metadata or make it utterly useless.

I'll be honest, I don't see the need for synthesizing a "new image format" because "these formats are ancient (1995 and 1992) - it's about time we get an upgrade" and "metadata [...] gets stripped way too easily" when the replacement you are advocating is not only the exact same format as a PNG, but its metadata embedding scheme is much more fragile in terms of being stripped randomly when uploaded somewhere. This seems very bizarre and ill-thought-out to me.

Anyway, if you want a "new image format" because "the old ones were developed 30 years ago", there's a plethora of new image formats to choose from that all support custom metadata, including: webp, jpeg 2000, HEIF, jpeg xl, and farbfeld (the one the suckless guys made).

I'll be honest... this is one of the most irritating parts of the new AI trend. Everyone is an "ideas guy" when they start programming, it's fine and normal to come up with "new ideas" that "nobody else has ever thought of" when you're a green-eared beginner and utterly inexperienced. The irritating part is what happens after the ideas phase.

What used to happen was you'd talk about this cool idea in IRC and people would either help you make it, or they would explain why it wasn't necessarily a great idea, and either way you would learn something in the process. When I was 12 and new to programming, I had the "genius idea" that if we could only "reverse the hash algorithm output to its input data" we would have the ultimate compression format... anyone with an inch of knowledge will smirk at this proposition! And so I learned from experts why this was impossible, and not believing them, I did my own research, and learned some more :)

Nowadays, an AI will just run with whatever you say — "why yes if it were possible to reverse a hash algorithm to its input we would have the ultimate compression format", and then if you bully it further, it will even write (utterly useless!) code for you to do that, and no real learning is had in the process because there's nobody there to step in and explain why this is a bad idea. The AI will absolutely hype you up, and if it doesn't you learn to go to an AI that does. And now within a day or two you can go from having a useless idea, to advertising that useless idea to other people, and soon I imagine you'll be able to go from advertising that useless idea to other people, to manufacturing it IRL, and at no point are you learning or growing as a person or as a programmer. But you are wasting your own time and everyone else's time in the process (whereas before, no time was wasted because you would learn something before you invested a lot of time and effort, rather than after).

thinkingQueen

Exactly. Not long ago, someone showed up on Hacker News who had, on his own, begun to rediscover the benefits of arithmetic coding. Naturally, he was convinced he’d come up with a brand-new entropy coding method. Well, no harm done, and it's nice that people study compression, but I was surprised how easily he got himself convinced of a discovery. Clearly he knew very little.

Overall, I think this is a positive "problem" to have :-)

magicalhippo

I've had several revolutionary discoveries during my time programming. In each case, after the euphoria had settled a bit, I asked myself: Why aren't we already doing this? Why isn't this already a thing? What am I missing?

And lo and behold, in each case I did find that it was either not novel at all or it had some major downside I had initially missed.

Still, fun to think about new ways of doing things, so I still go at it.

whoisyc

> webp, jpeg 2000, HEIF, jpeg xl, farbfeld

I think you just illustrated how difficult it is to propose a new standard. WebP was not supported by much image-related software (including the Adobe suite!) for years and earned a bad reputation, HEIF is also poorly supported, and JPEG XL was removed from Chrome despite being developed by Google and isn't supported by any other browser AFAIK. Never heard of farbfeld before.

If the backing from Apple and Google was not enough to drive the adoption of an image format, I fail to see how this thing can go anywhere.

ai_critic

Reality check:

Your extra data is a big JSON blob. Okay, fine.

File formats dating back to Targa (https://en.wikipedia.org/wiki/Truevision_TGA) support arbitrary text blobs if you're weird enough.

PNG itself has both EXIF data and a more general text chunk mechanism (both compressed and uncompressed, https://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html#C.An... , section 4.2.3, you probably want iTXt chunks).

exiftool will already let you do all of this, by the way. There's no reason to summon a non-standard file format into the world (especially when you're just making a weird version of PNG that won't survive resizing or quantization properly).
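For reference, a minimal sketch of the iTXt route with Pillow (the chunk key "ai_metadata" and the payload here are placeholders, not part of any spec):

    import json
    from PIL import Image
    from PIL.PngImagePlugin import PngInfo

    meta = PngInfo()
    meta.add_itxt("ai_metadata", json.dumps({"objects": ["cat"], "attention": [[0.1, 0.9]]}))

    Image.open("cat.png").save("cat_tagged.png", pnginfo=meta)

    # iTXt/tEXt chunks come back via the .text mapping on reload
    print(Image.open("cat_tagged.png").text["ai_metadata"])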

ai_critic

Here, two incantations:

> exiftool -config exiftool.config -overwrite_original -z '-_custom1<=meta.json' cat.png

and

> exiftool -config exiftool.config -G1 -Tag_custom1 cat.png

You can (with AI help no less) figure out what `exiftool.config` should look like. `meta.json` is just your JSON from github.

Now go draw the rest of the owl. :)

kuberwastaken

Hi! Thanks for checking it out, means a lot :)

Yes, it is a big JSON blob atm, haha, and it's definitely still a POC, but the idea is to avoid having a separate JSON file that adds to the complexity. While EXIF data works pretty well for most basic stuff, it's not enough for everything one might need for AI-specific stuff, especially things like attention maps and saliency regions.

I'm currently working on redundancy and error correction to deal with the resizing problem. Having a separate file format, even if it's a headache and adds another one to the list (well, another cute-sounding one at least), gives more customization options and makes it easier to associate the properties directly.

There's definitely a ton of work left to do, but I see a lot of potential in something like this (also, nice username)

ai_critic

> While EXIF data works pretty well for most basic stuff, it's not enough for everything one might need for AI specific stuff, especially for things like attention maps and saliency regions.

That's why I mentioned that you can put anything, including binary data--which includes images--into the chunks of a PNG. I think Pillow even supports this (there are some PRs, like https://github.com/python-pillow/Pillow/pull/4292, that suggest this).

Your problem domain is:

* Have something that looks like a PNG...

* ...that doesn't need supporting files outside itself...

* ...that can also store textual data (e.g., that JSON blob of bounding boxes and whatnot)...

* ...and can also store image data (e.g., attention maps and saliency regions).

What I'm telling you is that the PNG file format already supports all of this stuff, you just need to be smart enough to read the spec and apply the affordances it gives you.

> I'm currently working on redundancy and error correction to deal with the resizing problem. Having a separate file format, even if it's a headache and adds another one to the list (well, another cute-sounding one at least), gives more customization options and makes it easier to associate the properties directly.

In the 90s, we'd already spent vast sums of gold and blood and tears solving the "holy shit, how do we encode multiple things in images so that they can survive an image pipeline, be extensible to end users, and be compressed reliably" problem.

None of this has been new for three decades. Nothing you are going to do is going to be a value add over correctly using the file format you already have.

I promise that you aren't going to see anything particularly new or exciting in this AI goldrush that isn't an isomorphism of something much smarter, much better-paid people solved back when image formats were still a novel problem domain (again, in the 1990s).

vunderba

> it's not enough for everything one might need for AI specific stuff, especially for things like attention maps and saliency regions.

Why not exactly? ComfyUI encodes an absolutely bonkers amount of information (all arbitrary JSON) into workflow PNG files without any issues.
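For what it's worth, reading that ComfyUI metadata back is a few lines with Pillow, assuming ComfyUI's convention of "workflow" and "prompt" PNG text chunks:

    import json
    from PIL import Image

    info = Image.open("comfyui_output.png").info
    workflow = json.loads(info["workflow"]) if "workflow" in info else None
    prompt = json.loads(info["prompt"]) if "prompt" in info else None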

ai_critic

Indeed. And character cards for chatbots (like in SillyTavern) have supported this for years.

gavinray

Maybe I'm jaded, but I fail to see how a bespoke file format is a better solution than bundling a normal image and a JSON/XML document containing metadata that adheres to a defined specification.

It feels like creating a custom format with backwards PNG compatibility and using steganography to cram metadata inside is an inefficient and over-engineered alternative to a .tar.gz with "image.png" and "metadata.json"
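The bundling alternative described above needs nothing beyond the standard library; a minimal sketch with tarfile (file names are illustrative):

    import tarfile

    # Pack the untouched image plus a sidecar metadata file
    with tarfile.open("image_bundle.tar.gz", "w:gz") as tar:
        tar.add("image.png")
        tar.add("metadata.json")

    # Unpack / inspect later
    with tarfile.open("image_bundle.tar.gz", "r:gz") as tar:
        print(tar.getnames())   # ['image.png', 'metadata.json']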

kuberwastaken

That's fair, and it's how it's traditionally done, but the entire idea of this was to have everything you need in the image itself and reduce the complexity and extra files: no risk of losing the JSON, mismatched versions, or extra packaging steps.

I'm working on redundancy and error correction to make it better!

CharlesW

> …creating a custom format with backwards PNG compatibility and using steganography to cram metadata inside is an inefficient and over-engineered alternative to a .tar.gz with "image.png" and "metadata.json"

So, "perfect Show HN"? ¯\_(ツ)_/¯

xhkkffbf

Yes, separate metadata has great advantages, but it can get separated from the main file pretty easily. Many social media platforms and email sites will let you embed PNG files. But they won't let you embed an image with a separate metadata file that's always kept along with it.

When images get loose in the wild, this can be very helpful.

jbverschoor

Why not simply JXL? It has multiple channels, can store any metadata, and supports both lossy and lossless compression.

spookie

Or even DDS.

DanHulton

You have invented essentially an _incredible way_ to poison AI image datasets.

Step 1: Create .meow images of vegetables, with "per-pixel metadata" instead encoded to represent human faces.

Step 2: Get your images included in the data set of a generative image model.

Step 3: Laugh uproariously as every image of a person has vaguely-to-profoundly vegetal features.

whoisyc

This assumes people training AI are going to put in the effort to extract metadata from a poorly specified “format” with a barely coherent, buzzword-ridden README file. Realistically, they will just treat any .meow file as an opaque binary blob and any .png as a regular PNG file.

zdw

It would be better to use this as an additional extension before the normal extension like other tools that embed additional metadata do.

For example, Draw.io can embed original diagrams in .svg and .png files, and the pre-suffix is .drawio.svg or .drawio.png.

kuberwastaken

Hmm that's a great idea as well, I'll look into it, thank you :)

a2128

You're adding metadata, but what problems does this added metadata solve exactly? If your converter can automatically compute these image features, then AI training and inference pipelines can trivially do the same, so I don't see the point in needing a new file format that contains these.

Moreover, models and techniques get better over time, so these stored precomputed features are guaranteed to become obsolete. Even if they're there, simple to use in a pipeline, and everybody is using this file format, pipelines still won't use features that were precomputed years ago when state-of-the-art techniques give more accurate ones.

jtsylve

The answer may be in your question.

- This is currently solved by inference pipelines.

- Models and techniques improve over time.

The ability for different agents with different specialties to add additional content while being able to take advantage of existing context is what makes the pipeline work.

Storing that content in the format could allow us to continue to refine the information we get from the image over time. Each tool that touches the image can add new context or improve existing context and the image becomes more and more useful over time.

I like the idea.

kuberwastaken

Said it better than I could have

Also, the idea is to integrate the conversion processes/pipelines with other data that'll help with customized workflows.

ai_critic

> Each tool that touches the image can add new context or improve existing context and the image becomes more and more useful over time.

This is literally the problem solved by chunk-based file formats. "How do we use multiple authoring tools without stepping on each other" is a very old and solved problem.
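For the curious, a minimal sketch of what "adding your own chunk" looks like at the byte level; the chunk name exDa is made up here, chosen so the lowercase first and second letters mark it ancillary and private per the PNG spec:

    import struct
    import zlib

    def append_private_chunk(src, dst, chunk_type: bytes, payload: bytes):
        data = open(src, "rb").read()
        iend = data.rfind(b"IEND") - 4   # back up over IEND's length field
        chunk = struct.pack(">I", len(payload)) + chunk_type + payload
        chunk += struct.pack(">I", zlib.crc32(chunk_type + payload) & 0xFFFFFFFF)
        open(dst, "wb").write(data[:iend] + chunk + data[iend:])

    append_private_chunk("cat.png", "cat_chunked.png", b"exDa", b'{"tool": "labeler-v2"}')

Decoders ignore ancillary chunks they don't recognize, so tools that don't know about each other's chunks simply skip them.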

ahofmann

So converting the file to a lossy format, or resizing the image as a PNG, will destroy the encoded information? I see why one would want to use it, but I think it can only be useful in a controlled environment. As soon as someone else has access to the file, the information can easily get lost. Just like metadata.

bastawhiz

Modifying the image in any way (cropping, resizing, etc) destroys the metadata. This is necessary in basically every application that interacts with any kind of model that uses images, either for token count reasons, file size reasons, model limits, etc. (Source: I work at a genai startup)

At inference time, you don't control the inputs, so this is moot. At training time, you've already got lots of other metadata that you need to store and preserve that almost certainly won't fit in steganographically encoded format, and you've often got to manipulate the image before feeding it into your training pipeline. Most pipelines don't simply take arbitrary images (nor do you want them: plenty of images need to be modified to, for instance, remove letterboxing).

The other consideration is that steganography is actively introducing artifacts to your assets. If you're training on these images, you'll quickly find that your image generation model, for instance, cannot generate pure black. If you're adding what's effectively visual noise to every image you train on, the model will generate images with noise.
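A quick way to see the pure-black point, assuming the 2-bits-per-channel scheme sketched near the top of the thread:

    import numpy as np

    black = np.zeros((4, 4, 3), dtype=np.uint8)                         # perfectly black patch
    payload = np.random.randint(0, 4, size=black.size, dtype=np.uint8)  # random 2-bit symbols
    stego = (black.reshape(-1) & 0b11111100) | payload
    print(int(stego.min()), int(stego.max()))   # values now span roughly 0..3 where 0 used to be

Every formerly-zero channel now carries whatever the payload bits happen to be, and a model trained on such images learns that floor of noise.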

vunderba

Was just coming here to say this. Most graphic editors can easily preserve EXIF/IPTC data across edits.

Without an entirely dedicated editor or postprocessing plugin, steganography gets destroyed on modification.

moritzwarhier

Why not store metadata, along with a checksum of the png, in myPublicPhoto.png.meow?

Labeling and metadata are separate concerns. "Edge detection maps" etc. are implementation details of whatever you are doing with image data, and quite likely to be non-portable.

And non-removability / steganography of additional metadata is not a selling point at all?

So my thoughts are, this violates separation of concerns and seems badly thought-out.

It also mangles labeling, metadata and technicalities, and attempts to predict future requirements.

I don't understand the potential utility.

can16358p

Nice work!

Though I have one question: once 2 bits/channel are used for MEOW-specific data, leaving 6 bits/channel, I doubt it can still retain perfect image quality: either (if everything's re-encoded) the dynamic range is reduced by 75%, or the LSB changes introduce noise into the original image. Not too much noise, but still.
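Back-of-envelope numbers for both halves of that question (simple arithmetic, no MEOW specifics assumed):

    # Overwriting 2 LSBs changes a channel value by at most 3 out of 255
    max_error = (2**2 - 1) / 255      # ~0.012, i.e. about 1.2% amplitude noise
    # ...while the "clean" part of the signal drops from 256 to 64 levels
    levels_left = 2 ** (8 - 2)        # 64, the 75% reduction mentioned above
    print(max_error, levels_left)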

voxleone

Great idea and insight. If I understand correctly, it will allow you to embed metadata such as bounding box coordinates and class names, something I have also been working on[0] -- embedding computer vision annotation data directly into an image's EXIF tags rather than storing it in separate sidecar text files. The idea is to simplify the dataset's file structure. It could offer unexpected advantages — especially for smaller or proprietary datasets, or for fine-tuning tasks where managing separate annotation files adds unnecessary overhead.

[0] https://github.com/VoxleOne/XLabel

Edited for clarity