
Releasing weights for FLUX.1 Krea

102 comments · July 31, 2025

dvrp

Hello everyone.

I'm the Co-founder and CTO of Krea. We're excited: we've wanted to release the weights for our model and share them with the HN community for a long time.

My team and I will be online throughout the day to answer any questions you may have.

mk_stjames

Any plans to work with the Flux 'Kontext' line, the editing models? I think the use cases of such prompted image editing are just wildly huge. Their demo blew my mind, although I haven't seen the quality of the open-weight version yet. It is also a 12B distill.

jackphilson

Hi. Thanks for this. What is your goal in doing so, from a business standpoint? Or is it purely altruistic?

dvrp

Haha, classic!

It’s simple: hackability and recruiting!

The open-source community hacking around it and playing with it PLUS talented engineers who may be interested in working with us already makes this release worth it. A single talented distributed systems engineer has a lot of impact here.

Also, the company ethos is around AI hackability/controllability, high-bar for talent, and AI for creatives - so this aligns perfectly.

The fact that Krea serves both in-house and 3rd-Party models tells you that we are not that bullish on models being a moat.

wjrb

I can say that it's definitely working on me! I hadn't heard of Krea before, and this is a great introduction to your work. Thanks for sharing it.

Western0

I need a model for languages other than English.

cubefox

Regarding the P(.|photo) vs P(.|minimal) example, how do you actually decide this conflict? It seems to me that photorealism should be a strong default "bias".

My reasoning: If the user types in "a cat reading a book" then it seems obvious that the result should look like a real cat which is actually reading a book. So it obviously shouldn't have an "AI style", but it also shouldn't produce something that looks like an illustration or painting or otherwise unrealistic. Without further context, a "cat" is a photorealistic cat, not an illustration or painting or cartoon of a cat.

In short, it seems that users who want something other than realism should be expected to mention it in the prompt. Or am I missing some other nuances here?

vunderba

Nice release. Ran some preliminary tests using the 12B txt2img Krea model. Its biggest wins seem to be raw speed (and possibly realism), but perhaps unsurprisingly it did not score any higher on the leaderboard for prompt adherence than the normal Flux.1D model.

https://genai-showdown.specr.net

On another note, there seems to be some indication that Wan 2.2 and future models might end up becoming significant players in the T2I space, though you'll probably need a metric ton of LoRAs to make up for the lack of image diversity.

dvrp

Can you point to a URL with the tests you’ve done?

Also, FWIW, this model's focus was on aesthetics rather than strict prompt adherence. Not to excuse the bad samples, but to emphasize one of the research goals.

It’s a thorny trade-off, but an important one if one wants to get rid of what’s sometimes known as “the flux look”.

Re: Wan 2.2, I've also seen people commenting about using Wan 2.2 for the base generation and Krea for the refiner pass, which I thought was interesting.

vunderba

The Image Showdown site actually does have Flux Krea images but they're hidden by default. If you open up the "Customize Models" dialog you can compare them against other Flux models (Flux.1 Dev and Kontext).

> FWIW, this model focus was around aesthetics

Agreed - whereas these tests are really focused on various GenAI image models' ability to follow complicated prompts, and are not as concerned with overall visual fidelity.

Regarding the "flux look" I'd be interested to see if Krea addresses both the waxy skin look AND the omnipresent shallow depth of field.

sangwulee

Hi! I'm lead researcher on Krea-1. FLUX.1 Krea is a 12B rectified flow model distilled from Krea-1, designed to be compatible with FLUX architecture. Happy to answer any technical questions :)

bsenftner

From a traditional media production background: media is produced in separate layers, which are then composited together to create a final deliverable still image, motion clip, and/or audio clip. This type of media production, creating elements that are then combined, is essential for expense management and quality control. Current AI image, video, and audio generation methods do not support any of that. ForgeUI did briefly, but that went away, which I suspect is because few understand large-scale media production requirements.

I guess my point being: do you have any (real) experienced media production people working with you? People that have experience working in actual feature film VFX, animated commercial, and multi-million dollar budget productions?

If you really want to make your efforts a wild success, simply support traditional media production. None of the other AI image/video/audio providers seem to understand this, and it is gargantuan: if your tools plugged into traditional media production, they would be adopted immediately. Currently, they are adopted tentatively, if at all, because they do not integrate with production tools or expectations.

oompty

The model looks incredible!

Regarding this part:

> Since flux-dev-raw is a guidance distilled model, we devise a custom loss to finetune the model directly on a classifier-free guided distribution.

Could you go more into detail on the specific loss used for this and any other possible tips for finetuning this that you might have? I remember the general open source ai art community had a hard time with finetuning the original distilled flux-dev so I'm very curious about that.
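(For reference, this is the standard classifier-free guidance combination I assume they mean, sketched in LaTeX; the actual Krea loss is only described in their blog post:)

  \hat{\epsilon}_\theta(x_t, c) = \epsilon_\theta(x_t, \varnothing) + w \bigl( \epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing) \bigr)

A guidance-distilled model bakes this two-pass combination (conditional plus unconditional, scaled by the guidance weight w) into a single forward pass, which is presumably why naive finetuning recipes built for undistilled models struggle with it.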

swyx

thanks for doing this!

what does " designed to be compatible with FLUX architecture" mean and why is that important?

sangwulee

FLUX.1 is one of the most popular open-weights text-to-image models. We distilled Krea-1 to the FLUX.1 [dev] model so that the community can adopt it seamlessly into the existing ecosystem. Any finetuning code, workflows, etc. that were built on top of FLUX.1 [dev] can be reused with our model :)
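Concretely, it's a drop-in swap in diffusers: only the checkpoint id changes. A minimal sketch (the repo is gated, so an authenticated Hugging Face login is assumed):

  import torch
  from diffusers import FluxPipeline

  # Same pipeline class as FLUX.1 [dev]; only the checkpoint id differs.
  pipe = FluxPipeline.from_pretrained(
      "black-forest-labs/FLUX.1-Krea-dev",
      torch_dtype=torch.bfloat16,
  )
  pipe.enable_model_cpu_offload()  # offload helps on GPUs with < 24 GB VRAM

  image = pipe(
      "a cat reading a book",
      guidance_scale=4.5,
      num_inference_steps=28,
  ).images[0]
  image.save("cat.png")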

swyx

do LoRAs conflict with your distillation?
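(mechanically, a LoRA trained against FLUX.1 [dev] loads onto the new checkpoint the same way via diffusers; a sketch below, with "some-user/some-flux-dev-lora" as a hypothetical repo id. whether its outputs still look right after your distillation is the part I'm asking about.)

  import torch
  from diffusers import FluxPipeline

  pipe = FluxPipeline.from_pretrained(
      "black-forest-labs/FLUX.1-Krea-dev", torch_dtype=torch.bfloat16
  )
  # load a LoRA that was trained against FLUX.1 [dev]
  pipe.load_lora_weights("some-user/some-flux-dev-lora")  # hypothetical id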

bangaladore

Can someone ELI5 why the safetensors file is 23.8 GB, given the 12B-parameter model? Does the model use closer to 24 GB or 12 GB of VRAM? I've always assumed 1 billion parameters = 1 GB of VRAM. Is this estimate inaccurate?

sangwulee

Quick napkin math assuming bfloat16 format: 1B params * 16 bits = 16B bits = 2 GB. Since it's a 12B parameter model, you get ~24 GB. Downcasting to bfloat16 from float32 comes with pretty minimal performance degradation, so we uploaded the weights in bfloat16 format.
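The same arithmetic spelled out (plain Python, no dependencies):

  # 12B parameters at 2 bytes each (bfloat16 = 16 bits)
  params = 12e9
  size_bytes = params * 2
  print(size_bytes / 1e9)    # 24.0  (decimal GB)
  print(size_bytes / 2**30)  # ~22.4 (binary GiB)

The reported 23.8 GB file is consistent with this; the exact parameter count is slightly under 12B.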

piperswe

A parameter can be any size float. Lots of downloadable models are FP8 (8 bits per parameter), but it appears this model is FP16 (16 bits per parameter)

Often, the training is done in FP16 then quantized down to FP8 or FP4 for distribution.

dragonwriter

I think they are bfloat16, not FP16, but they are both 16bpw formats, so it doesn't make a size difference.

iyn

Wiki article on bfloat16 for reference, since it was new to me: https://en.wikipedia.org/wiki/Bfloat16_floating-point_format

Tokumei-no-hito

pardon the ignorance but it's the first time I've heard of bfloat16.

i asked chat for an explanation and it said bfloat has a higher range (like fp32) but less precision.

what does that mean for image generation, and why was bfloat16 chosen over fp16?
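(you can see the tradeoff in numbers with PyTorch's finfo, a quick sketch:)

  import torch

  for dt in (torch.float16, torch.bfloat16, torch.float32):
      info = torch.finfo(dt)
      print(dt, "max:", info.max, "eps:", info.eps)

  # float16:  max ~6.6e4   eps ~9.8e-4  (narrow range, finer precision)
  # bfloat16: max ~3.4e38  eps ~7.8e-3  (fp32-like range, coarser precision)
  # float32:  max ~3.4e38  eps ~1.2e-7

the usual answer is that the fp32-like range avoids overflow during training without loss-scaling tricks, and for image generation the lost precision is rarely visible.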

petercooper

That's a good ballpark for something quantized to 8 bits per parameter. But you can 2x/4x that for 16 and 32 bit.

7734128

I've never seen a 32-bit model. There are bound to be a few of them, but it's hardly a normal precision.

zamadatix

Some of the most famous models were distributed as F32, e.g. GPT-2. As things have shifted more towards mass consumption of model weights it's become less and less common to see.

ilc

Tried a simple prompt, and got some pretty interesting results:

"Octopus DJ spinning the turntables at a rave."

The human-like hands the DJ sprouts are interesting, and no amount of prompting seems to stop them.

Opinionated, as the paper says.

earthicus

Describing it as "Octopus DJ with no fingers" got rid of the hands for me, but interestingly, also removed every anthropomorphized element of the octopus, so that it was literally just an octopus spinning turntables.

ilc

I still get octopus hands, even with just "Octopus DJ with no fingers" and nothing else.

Maybe you got a lucky roll :)

SubiculumCode

I've never gotten one to make what I am thinking of: a Galton board. At the top, several inches apart, are two holes from which balls drop. One drops blue balls, the other red balls. They form a merged distribution below in columns, demonstrating two overlapping normal distributions.

Imagine one of these: https://imgur.com/a/DiAOTzJ but with two spouts at the top dropping different colored balls.

Its attempts: https://imgur.com/a/uecXDzI

CGMthrowaway

Have you tried building one IRL? I can't find a video of a double one.

SubiculumCode

I have not. It definitely is not something in training sets :)

vipermu

hey hn! I'm one of the founders at Krea.

we prepared a blogpost about how we trained FLUX Krea if you're interested in learning more: https://www.krea.ai/blog/flux-krea-open-source-release

orphea

Off topic but did you really hide scroll bars on the website? Why...?

  .scrollbar-hide {
    -ms-overflow-style: none;  /* IE and legacy Edge */
    scrollbar-width: none;     /* Firefox */
  }

johnisgood

They probably did it because the website might look better without a scrollbar, but they should realize that many browsers already hide the scrollbar and only display it when you hover over it or start scrolling. That said, the scrollbar is always there for me (unless hidden by CSS), and I would not have minded it at all.

VladVladikoff

UI brought to you by vibe code

BoorishBears

nah, Krea is just from that side of design twitter where you don't uppercase letters and you can break the rules sometimes. very atypography-coded.

bluehark

Do you have an NVIDIA optimized version? Similar to how RTX accelerated FLUX.1 Kontext: https://blogs.nvidia.com/blog/rtx-ai-garage-flux-kontext-nim...

sangwulee

We have not added a separate RTX-accelerated version for FLUX.1 Krea, but the model is fully compatible with the existing FLUX.1 [dev] codebase. I don't think we made a separate ONNX export for it though. Doing a 4-8 bit quantized version with SVDQuant would be a nice follow-up so that the checkpoint is friendlier for consumer-grade hardware.
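Until an SVDQuant build exists, one generic route to a smaller footprint is 4-bit loading via diffusers + bitsandbytes. A sketch (not SVDQuant itself; assumes a recent diffusers with quantization support):

  import torch
  from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

  repo = "black-forest-labs/FLUX.1-Krea-dev"

  # Quantize only the 12B transformer; text encoders and VAE stay in bf16.
  transformer = FluxTransformer2DModel.from_pretrained(
      repo,
      subfolder="transformer",
      quantization_config=BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type="nf4",
          bnb_4bit_compute_dtype=torch.bfloat16,
      ),
      torch_dtype=torch.bfloat16,
  )
  pipe = FluxPipeline.from_pretrained(
      repo, transformer=transformer, torch_dtype=torch.bfloat16
  )
  pipe.enable_model_cpu_offload()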

TuringNYC

I'd recommend you offer a clearly documented pathway for companies to license commercial output usage rights if they get the results they seek. (I'll know soon enough!)

dvrp

You can find details about the license here: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/mai...

In a nutshell, it follows the same license as BFL's FLUX.1 [dev] model.

TuringNYC

I usually use https://github.com/axolotl-ai-cloud/axolotl on Lambda/Together for working with these types of models. Curious what others are using, and what the quickest way to get started is. They mention pre-training and post-training but sadly didn't provide any reference starter scripts.

dvrp

We actually have a GitHub repository to help with inference code.

Check this out: https://github.com/krea-ai/flux-krea

Let me see if we can add more details to the blog post. Thanks for the flag!

TuringNYC

Thanks! Yes, the inference is pretty straightforward, but the real opportunity IMHO is custom pre-training and post-training, given the open weights.

Western0

uv not working no clicking no torch

and

Cannot access gated repo for url https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/res.... Access to model black-forest-labs/FLUX.1-Krea-dev is restricted. You must have access to it and be authenticated to access it. Please log in.