Skip to content(if available)orjump to list(if available)

Gemini 3

Gemini 3

51 comments

·November 18, 2025

tylervigen

I am personally impressed by the continued improvement in ARC-AGI-2, where Gemini 3 got 31.1% (vs ChatGPT 5.1's 17.6%). To me this is the kind of problem that does not lend itself well to LLMs - many of the puzzles test the kind of thing that humans intuit because of millions of years of evolution, but these concepts do not necessarily appear in written form (or when they do, it's not clear how they connect to specific ARC puzzles).

The fact that these models can keep getting better at this task given the setup of training is mind-boggling to me.

The ARC puzzles in question: https://arcprize.org/arc-agi/2/

grantpitt

Agreed, it also leads performance on arc-agi-1. Here's the leaderboard where you can toggle between arc-agi-1 and 2: https://arcprize.org/leaderboard

bnchrch

I've been so happy to see Google wake up.

Many can point to a long history of killed products and soured opinions but you can't deny theyve been the great balancing force (often for good) in the industry.

- Gmail vs Outlook

- Drive vs Word

- Android vs iOS

- Worklife balance and high pay vs the low salary grind of before.

Theyve done heaps for the industry. Im glad to see signs of life. Particularly in their P/E which was unjustly low for awhile.

digbybk

Ironically, OpenAI was conceived as a way to balance Google's dominance in AI.

qweiopqweiop

Forgot to mention absolutely milking every ounce of their users attention with Youtube, plus forcing Shorts!

ThrowawayR2

They've poisoned the internet with their monopoly on advertising, the air pollution of the online world, which is an transgression that far outweighs any good they might have done. Much of the negative social effects of the online world come from the need to drive more screen time, more engagement, more clicks, and more ad impressions firehosed into the faces of users for sweet, sweet, advertiser money. When Google finally defeats ad-blocking, yt-dlp, etc., remember this.

visarga

Yes, this is correct, and it happens everywhere. App Store, Play Store, YouTube, Meta, X, Amazon and even Uber - they all play in two-sided markets exploiting both its users and providers at the same time.

63stack

- Making money vs general computing

rvz

Google always has been there, its just that many didn't realize that DeepMind even existed and I said that they needed to be put to commercial use years ago. [0] and Google AI != DeepMind.

You are now seeing their valuation finally adjusting to that fact all thanks to DeepMind finally being put to use.

[0] https://news.ycombinator.com/item?id=34713073

stevesimmons

A nice Easter egg in the Gemini 3 docs [1]:

    If you are transferring a conversation trace from another model, ... to bypass strict validation in these specific scenarios, populate the field with this specific dummy string:

    "thoughtSignature": "context_engineering_is_the_way_to_go"
[1] https://ai.google.dev/gemini-api/docs/gemini-3?thinking=high...

bilekas

> The Gemini app surpasses 650 million users per month, more than 70% of our Cloud customers use our AI, 13 million developers have built with our generative models, and that is just a snippet of the impact we’re seeing

Not to be a negative nelly, but these numbers are definitely inflated due to Google literally pushing their AI into everything they can, much like M$. Can't even search google without getting an AI response. Surely you can't claim those numbers are legit.

lalitmaganti

> Gemini app surpasses 650 million users per month

Unless these numbers are just lies, I'm not sure how this is "pushing their AI into everything they can". Especially on iOS where every user is someone who went to App Store and downloaded it. Admittedly on Android, Gemini is preinstalled these days but it's still a choice that users are making to go there rather than being an existing product they happen to user otherwise.

Now OTOH "AI overviews now have two billion users" can definitely be criticised in the way you suggest.

aniforprez

I don't know for sure but they have to be counting users like me whose phone has had Gemini force installed on an update and I've only opened the app by accident while trying to figure out how to invoke the old actually useful Assistant app

realusername

> it's still a choice that users are making to go there rather than being an existing product they happen to user otherwise.

Yes and no, my power button got remapped to opening Gemini in an update...

I removed that but I can imagine that your average user doesn't.

joaogui1

It says Gemini App, not AI Overviews, AI Mode, etc

blinding-streak

Gemini app != Google search.

You're implying they're lying?

AstroBen

And you're implying they're being 100% truthful?

Marketing is always somewhere in the middle

coffeecoders

Feels like the same consolidation cycle we saw with mobile apps and browsers are playing out here. The winners aren’t necessarily those with the best models, but those who already control the surface where people live their digital lives.

Google injects AI Overviews directly into search, X pushes Grok into the feed, Apple wraps "intelligence" into Maps and on-device workflows, and Microsoft is quietly doing the same with Copilot across Windows and Office.

Open models and startups can innovate, but the platforms can immediately put their AI in front of billions of users without asking anyone to change behavior (not even typing a new URL).

Workaccount2

AI overviews has arguable done more harm than good for them, because people assume it's Gemini, but really it's some ultra light weight model made for handling millions of queries a minute, and has no shortage of stupid mistakes/hallucinations.

acoustics

Microsoft hasn't been very quiet about it, at least in my experience. Every time I boot up Windows I get some kind of blurb about an AI feature.

svantana

Grok got to hold the top spot of LMArena-text for all of ~24 hours, good for them [1]. With stylecontrol enabled, that is. Without stylecontrol, gemini held the fort.

[1] https://lmarena.ai/leaderboard/text

inkysigma

Is it just me or is that link broken because of the cloudflare outage?

Edit: nvm it looks to be up for me again

scrollop

Here it makes a text based video editor that works:

https://youtu.be/MPjOQIQO8eQ?si=wcrCSLYx3LjeYDfi&t=797

icyfox

Pretty happy the under 200k token pricing is staying in the same ballpark as Gemini 2.5 Pro:

Input: $1.25 -> $2.00 (1M tokens)

Output: $10.00 -> $12.00

Squeezes a bit more margin out of app layer companies, certainly, but there's a good chance that for tasks that really require a sota model it can be more than justified.

rudedogg

Every recent release has bumped the pricing significantly. If I was building a product and my margins weren’t incredible I’d be concerned. The input price almost doubled with this one.

gertrunde

"AI Overviews now have 2 billion users every month."

"Users"? Or people that get presented with it and ignore it?

singhrac

They're a bit less bad than they used to be. I'm not exactly happy about what this means to incentives (and rewards) for doing research and writing good content, but sometimes I ask a dumb question out of curiosity and Google overview will give it to me (e.g. "what's in flower food?"). I don't need GPT 5.1 Thinking for that.

recitedropper

"Since then, it’s been incredible to see how much people love it. AI Overviews now have 2 billion users every month."

To get to 2 billion a month they must be counting anyone who sees an AI overview as a user. Cringe. They should just go ahead and claim the "most quickly adopted product in history" as well.

I hope this comment gets ungrayed, to restore my faith in public forums. Not holding my breath though--way too much capital rests on these releases, and places like HN have too much sway in the tech community to not be manipulated.

thedelanyo

Reading the introductory passage - all I can say now is, Ai is here to stay.

casey2

The first paragraph is pure delusion. Why do investors like delusional CEOs so much? I would take it as a major red flag.