Qwen-Image: Crafting with native text rendering
August 4, 2025

rushingcreek
Not sure why this isn't a bigger deal: it seems like this is the first open-source model to beat gpt-image-1 in all respects while also beating Flux Kontext in terms of editing ability. This seems huge.
zamadatix
It's only been a few hours and the demo is constantly erroring out; people need more time to actually play with it before getting excited. Quantized GGUFs and various ComfyUI workflows will also be a big factor, since people will want to run this locally and it's pretty large compared to other models. Funnily enough, the main comparison to draw might be Alibaba vs. Alibaba: Wan 2.2 has been an extremely popular choice for image generation, so most people will want to know how big a leap Qwen-Image is from that rather than from Flux.
The best time to judge how good a new image model actually is seems to be about a week after launch. That's when enough pieces have fallen into place that people have had a chance to really mess with it and publish third-party pros and cons. Looking hopeful for this one, though!
rwmj
This may be obvious to people who do this regularly, but what kind of machine is required to run this? I downloaded and tried it on my Linux machine, which has a 16GB GPU and 64GB of RAM. That machine runs SD easily, but Qwen-Image ran out of memory both when I tried it on the GPU and on the CPU, so that's obviously not enough. Am I off by a factor of two? An order of magnitude? Do I need some crazy hardware?
mortsnort
I believe the VRAM needed is roughly the total size of the model files. If you look in the transformer folder you can see there are around nine ~5 GB files, so I would expect you need ~45 GB of VRAM on your GPU. Quantized versions usually get released or created eventually that run on much less VRAM, with some quality loss.
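As a rough sanity check on that estimate (a back-of-the-envelope sketch, not something from the model card): at 16-bit precision each parameter takes 2 bytes, so a ~20B-parameter transformer alone is on the order of 40 GB of weights before the text encoder and VAE are counted.

    # Back-of-the-envelope weight-memory estimate for a ~20B-parameter model.
    # Ignores the text encoder, VAE, activations, and framework overhead.
    PARAMS = 20e9

    for name, bytes_per_param in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
        print(f"{name:>9}: ~{PARAMS * bytes_per_param / 1024**3:.0f} GB")

    # bf16 comes out around 37 GB, consistent with the ~45 GB of shard files
    # once the other components are included; a 4-bit quant is closer to 9 GB.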
foobarqux
Why doesn't huggingface list the aggregate model size?
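The model page doesn't show an aggregate number, but the Hub API exposes per-file sizes, so a short script can add them up. A minimal sketch using huggingface_hub (repo id taken from the link below):

    # Sum the sizes of every file in the Qwen/Qwen-Image repo via the Hub API.
    from huggingface_hub import HfApi

    info = HfApi().model_info("Qwen/Qwen-Image", files_metadata=True)
    total = sum(f.size or 0 for f in info.siblings)
    print(f"{len(info.siblings)} files, {total / 1024**3:.1f} GB total")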
zippothrowaway
You're probably going to have to wait a couple of days for 4-bit quantized versions to pop up. It's 20B parameters.
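Until then, loading the full-precision weights is straightforward if you have the VRAM. A minimal sketch, assuming Qwen-Image is exposed through diffusers' generic DiffusionPipeline loader in a recent diffusers release (the prompt and step count are just placeholders):

    # Minimal sketch: load Qwen-Image via diffusers and generate one image.
    # Assumes a recent diffusers release with Qwen-Image support and enough
    # VRAM for the bf16 weights; quantized variants should need far less.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    # On smaller GPUs, pipe.enable_sequential_cpu_offload() trades speed for memory.

    image = pipe(
        prompt="A shop sign that reads 'Qwen-Image', photorealistic, evening light",
        num_inference_steps=50,
    ).images[0]
    image.save("qwen_image_sample.png")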
nickandbro
The fact that it doesn't change the images like 4o image gen does is incredible. Often when I try to tweak someone's clothing using 4o, it also tweaks their face. This one seems to apply those recognizable AI artifacts only to the elements that actually need editing.
herval
You can select the area you want edited on 4o, and it’ll keep the rest unchanged
djoldman
Check out Section 3.2, Data Filtering:
https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Q...
yjftsjthsd-h
Wow, the text/writing is amazing! Also the editing in general, but the text really stands out
artninja1988
Insane how many good Chinese open-source models they've been releasing. This really gives me hope.
anon191928
It will take years for people to adopt these, but Adobe is no longer the only option.
herval
Adobe has never been alone. Photoshop's AI stuff is consistently behind OSS models and workflows; it's just way more convenient.
https://huggingface.co/Qwen/Qwen-Image