
Gemini 2.5 Flash Image

fariszr

This is the GPT-4 moment for image editing models. Nano Banana, aka Gemini 2.5 Flash, is insanely good. It made a 171 Elo point jump in LMArena!

Just search "nano banana" on Twitter to see the crazy results. An example: https://x.com/D_studioproject/status/1958019251178267111

dcre

Alarming hands on the third one: it can't decide which way they're facing. But Gemini didn't introduce that, it's there in the base image.

koakuma-chan

Why is it called nano banana?

Jensson

Engineers often have silly project names internally, then some marketing team rewrites the name for public release.

ceroxylon

It seems like every combination of "nano banana" is registered as a domain with their own unique UI for image generation... are these all middle actors playing credit arbitrage using a popular model name?

bonoboTP

I'd assume they are just fake: they take your money and use a different model under the hood. They already existed before the public release, and I doubt their backend rolled the dice on LMArena until nano-banana popped up. And that was the only way to use it until today.

echelon

We already had this with gpt-image-1, this is just faster and looks better.

notsylver

I digitised our family photos, but a lot of them were damaged (shifted colours, spills, fingerprints on film, spots) in ways that are difficult to correct across so many images. I've been waiting for image gen to catch up enough to repair them all in bulk without changing details, especially faces. This looks very good at restoring images without altering details or adding them where they are missing, so it might finally be time.
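A bulk workflow like this would essentially be a loop over the scans. As a minimal sketch only: it assumes the google-genai Python SDK, the public `gemini-2.5-flash-image-preview` model name, and an API key in the environment; none of this is an endorsed restoration recipe.

```python
# Sketch of a bulk photo-restoration loop. The SDK call shape and model
# name are assumptions based on the public google-genai library.
from pathlib import Path

PROMPT = ("Restore this scanned photo: fix colour shifts, spots, and "
          "fingerprints, but do not alter faces or invent missing detail.")

def list_scans(folder: str) -> list[Path]:
    """Collect scanned images to restore, in a stable order."""
    exts = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}
    return sorted(p for p in Path(folder).iterdir()
                  if p.suffix.lower() in exts)

def restore_all(folder: str, out: str) -> None:
    from google import genai  # pip install google-genai (assumed SDK)
    from PIL import Image
    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    Path(out).mkdir(exist_ok=True)
    for scan in list_scans(folder):
        resp = client.models.generate_content(
            model="gemini-2.5-flash-image-preview",
            contents=[PROMPT, Image.open(scan)],
        )
        for part in resp.candidates[0].content.parts:
            if part.inline_data:  # returned image bytes, if any
                (Path(out) / scan.name).write_bytes(part.inline_data.data)
```

Keeping the originals and writing restorations to a separate folder makes it easy to spot-check faces for the kind of alteration the commenters below worry about.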

Almondsetat

All of the defects you have listed can be automatically fixed by using a film scanner with ICE and software that performs the scan and restoration automatically, like VueScan. Feeding hundreds (thousands?) of photos to an experimental proprietary cloud AI that will give you back subpar compressed pictures with who knows how many strange artifacts seems unnecessary.

zwog

Do you happen to know of some software to repair/improve video files? I'm in the process of digitizing a couple of Video 2000 and VHS cassettes of childhood memories for my mom, who is starting to suffer from dementia. I have a pretty streamlined setup for digitizing the videos, but I'd like to improve the quality a bit.

Barbing

Hope it works well for you!

In my eyes, one specific example they show (“Prompt: Restore photo”) deeply AI-ifies the woman’s face. Sure it’ll improve over time of course.

indigodaddy

Another question/concern for me: if I restore an old picture of my Gramma, will my Gramma (or a Gramma that looks strikingly similar) ever pop up on other people's "give me a random Gramma" prompts?

danielbln

That time had arrived a few months ago already with Flux Kontext (https://bfl.ai/models/flux-kontext).

matsemann

Half the time I ask Gemini to generate some image, it claims it doesn't have the capability. And in general I've found it hard to actually use the features Google announces: a third of them are in one product, some are in another which I can't use, and there's no indication of what or where I should pay to get access. So confusing.

Al-Khwarizmi

Yeah, in fact the website says "Try it in Gemini" and I'm not sure if I'm already trying it or not. If I choose Gemini 2.5 Flash in the regular Gemini UI, am I using this?

adidoit

Very impressive.

I have to say while I'm deeply impressed by these text to image models, there's a part of me that's also wary of their impact. Just look at the comments beneath the average Facebook post.

postalcoder

I have been testing google's SynthID for images and while it isn't perfect, it is very good, insofar that I felt some relief from that same creeping dread over what these images will do to perceived reality.

It survives a lot of transformations like compression, cropping, and resizing. It even survives alterations like color filtering and overpainting.
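A robustness check like the one described boils down to applying those transformations and re-running detection on each variant. The harness below uses Pillow; `detect_watermark` is a placeholder, since Google's SynthID detector for images is not a public Python API.

```python
# Harness for the robustness check described above: produce transformed
# variants, then run a (stubbed) watermark detector on each one.
import io
from PIL import Image

def transformed_variants(img: Image.Image) -> dict[str, Image.Image]:
    """The transformations the comment mentions: compression,
    cropping, and resizing."""
    w, h = img.size
    jpeg = io.BytesIO()
    img.save(jpeg, format="JPEG", quality=40)  # heavy recompression
    return {
        "jpeg_q40": Image.open(io.BytesIO(jpeg.getvalue())),
        "center_crop": img.crop((w // 4, h // 4, 3 * w // 4, 3 * h // 4)),
        "half_size": img.resize((max(1, w // 2), max(1, h // 2))),
    }

def detect_watermark(img: Image.Image) -> bool:
    raise NotImplementedError("stand-in for a real SynthID detector")
```

With a real detector plugged in, surviving all three variants is the behaviour the comment reports.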

sigmar

Facebook isn't going to implement detection, though. Many (if not most) of the viral pictures are AI-generated, and Facebook is incentivized to let its users get fooled to generate endless scrolling.

paul7986

Along with those being fooled, there are many comments saying this is fake, AI trash, etc. That portion of the commenters are teaching the ignorant, and soon no one will believe that what they see on the Internet is real.

knicholes

I got scammed out of $15k in BTC last weekend during the (failed) SpaceX launch. I believed the deepfake of Elon and transferred it over. The tech is very convincing, and the attacks are increasingly sophisticated.

yifanl

This presumes that you're okay with giving the real Elon your wallet but not a fake Elon, but why?

fxtentacle

Plot twist: It wasn't a deepfake.

You sent your wallet to the real Elon and he used it as he saw fit. ;)

UltraSane

On the balance of probabilities it being a scam is vastly more likely than Elon actually wanting to contact you. Why would Elon need $15k in bitcoin?

AbraKdabra

I don't mean to be rude, but this sounds like natural selection doing its work.

michelb

These SpaceX scams are rampant on youtube and highly, highly lucrative. It’s crazy and you have to be very vigilant, as whatever is promised lines up with Elon’s MO.

rangerelf

Why would anyone give them any money AT ALL?

It's not like they're poor or struggling.

Am I missing something?

nickthegreek

it requires zero vigilance if you don't play the game.

kamranjon

Would you consider writing a blog post about this experience? I'm incredibly interested in learning more details about how this unfolded.

paul7986

Well just go on this guy's lawn and you will find your answer lol

Imustaskforhelp

Please pardon me, since I don't know whether this is satirical or not. I'd appreciate it if you could clarify.

Because if this is real, then the world is cooked.

If not, then the only reason I believe it's a joke is that you are on Hacker News, which I hold to a fair standard. So either you are joking, or the tech has gotten so convincing that even people on Hacker News are getting scammed.

I have a lot of questions if it's true, and I'm sorry for your loss if this isn't satire, but I'd love it if you could tell me whether it's a satirical joke or not.

bauruine

I guess it was something like [0]. The Nigerian prince is now a deepfake Elon, but the concept is the same: you need to send some money to get way more back.

[0]: https://www.ncsc.admin.ch/ncsc/en/home/aktuell/im-fokus/2023...

amatajohn

the modern turing test:

am i getting scammed by a billionaire or an AI billionaire?

MitPitt

Facebook comments are obviously botted too

kemyd

I don't get the hype. Tested it with the same prompts I used with Midjourney, and the results are worse than in Midjourney a year ago. What am I missing?

bonoboTP

The hype is about image editing, not pure text-to-image. Upload an input image, say what you want changed, get the output. That's the idea. Much better preservation of characters and objects.

kemyd

Thanks for clarifying this. That makes a lot more sense.

cdrini

Hmm, I think the hype is mainly for image editing, not generating. Although note I haven't used it! How are you testing it?

ihsw

[dead]

mkl

That lamp example is pretty impressive (though it's hard to know how cherry-picked it is). The lamp is plugged in, it's lighting the things in the scene, it's casting shadows.

lifthrasiir

FYI, this is the famed nano-banana model, which has now been renamed to gemini-2.5-flash-image-preview in LMArena.

Mistletoe

https://medium.com/data-science-in-your-pocket/what-is-googl...

For people like me that don’t know what nano-banana is.

mock-possum

Wow I hate the ‘voice’ in that article - big if true though.

daemonologist

I suspect the "voice" is a language model with a bad system prompt. (Possibly the author's own words run through an LLM, to be charitable.)

postscapes1

This is what I came here to find out. Thanks.

bsenftner

All these image models are time vampires and need to be looked at with very suspicious eyes. Try to make a room - that's easy, now try to make multiple views of the same room - next to impossible. If one is intending to use these image models for anything that requires consistency of imagery, forget it.

abdusco

I love that it's substantially faster than ChatGPT's image generation. That takes ages; so slow that the app tells you not to wait and sends you a notification when the generation finishes.

andrewinardeer

"Generate an image of OpenAI investors after using Gemini 2.5 Flash Image"

radarsat1

I've had a task in mind for a while now that I've wanted to do with this latest crop of very capable instruction-following image editors.

Without going into detail, basically the task boils down to, "generate exactly image 1, but replace object A with the object depicted in image 2."

Where image 2 is some front-facing generic version, ideally I want the model to place this object perfectly in the scene, replacing the existing object, that I have identified ideally exactly by being able to specify its position, but otherwise by just being able to describe very well what to do.

For models that can't accept multiple images, I've tried a variation where I put a blue box around the object that I want to replace, and paste the object that I want it to put there at the bottom of the image on its own.

I've tried some older models, ChatGPT, qwen-image last week, and just now, this one. They all fail at it. To be fair, this model got pretty damn close: it replaced the wrong object in the scene, but it was close to the right position, and the object was perfectly oriented and lit. Still wrong, though. (Using the bounding box method, it should have been able to identify exactly what I wanted. Instead it removed the bounding box and replaced a different object in a different but nearby position.)

Are there any models that have been specifically trained to be able to infill or replace specific locations in an image with reference to an example image? Or is this just like a really esoteric task?

So far all the in-filling models I've found are only based on text inputs.
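For reference, the two-image variant of this task maps onto a single multimodal call in the public google-genai SDK; the model name and the prompt phrasing below are assumptions, and as the comment describes, results on this task are hit-or-miss.

```python
# Sketch of "replace object A in image 1 with the object in image 2"
# as one multi-image request. SDK call shape and model name assumed.
from PIL import Image

def build_request(scene: Image.Image, reference: Image.Image,
                  target: str) -> list:
    """Order the parts so the instruction can refer to image 1 / image 2."""
    prompt = (f"Generate exactly image 1, but replace {target} with the "
              f"object depicted in image 2. Keep everything else unchanged.")
    return [prompt, scene, reference]

def replace_object(scene_path: str, ref_path: str, target: str):
    from google import genai  # pip install google-genai (assumed SDK)
    client = genai.Client()
    contents = build_request(Image.open(scene_path),
                             Image.open(ref_path), target)
    return client.models.generate_content(
        model="gemini-2.5-flash-image-preview", contents=contents)
```

Describing the target in words ("the blue lamp on the left desk") plays to what these models can actually follow; none of the current APIs accept an explicit mask plus a reference image in one call, which is the gap the comment runs into.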

rushingcreek

Yes! There is a model called ACE++ from Alibaba that is specifically trained to replace masked areas with a reference image. We use it in https://phind.design. It does seem like a very esoteric and uncommon task though.

ceroxylon

I don't think it is that esoteric, that sounds like deepfake 101. If you don't mind answering, does Phind do anything to prevent / mitigate this?

beyonddream

“Internal server error

Sorry, there seems to be an error. Please try again soon.”

Never thought I would ever see this on a Google-owned website!

lionkor

A cheap quip would be "it's vibe-coded", but that might actually very well be the case at this point!

j_m_b

If this can do character consistency, that's huge. Just make it do the same for video...

ACCount37

It's probably built on reused "secret sauce" from the video generation models.