Gemini 3 for developers: New reasoning, agentic capabilities
40 comments
· November 18, 2025 · srameshc
amelius
Yeah, at this point I want to see the failure modes. Show me at least as many cases where it breaks. Otherwise, I'll assume it's an advertisement and I'll skip to the next headline. I'm not going to waste my time on it anymore.
Kiro
I agree, but if Gemini 3 is as good as people on HN said the preview was, then this is the wrong announcement to sleep on.
jstummbillig
I think it's fun to see what is not even considered magic anymore today.
ponyous
Just generated a bunch of 3D CAD models using Gemini 3.0 to see how it compares in spatial understanding and it's heaps better than anything currently out there - not only intelligence but also speed.
Will run extended benchmarks later, let me know if you want to see actual data.
lfx
Just hand-sketched what a 5-year-old would draw on paper: a house, trees, sun. Then asked it to generate a 3D model with three.js.
Results are amazing! 2.5 and 3 seem way, way ahead.
giancarlostoro
I'm not familiar enough with CAD. What type of format is it?
koakuma-chan
When I see CAD, I always think of Casting Assistant Device.
ponyous
It’s not a format, but in my mind it implies designs that are supposed to be functional as opposed to models that are meant for virtual games.
It generated a Blender script that makes the model.
adastra22
I would have used OpenSCAD for that purpose.
bilbo0s
Did your prompt instruct it to use Blender?
slackerIII
What's the easiest way to set up automatic code review for PRs for my team on GitHub using this model?
clusterhacks
I wish I could just pay for the model and self-host on local/rented hardware. I'm incredibly suspicious of companies totally trying to capture us with these tools.
wohoef
Curious to see it in action. Gemini 2.5 has already been very impressive as a study buddy for courses like set theory, information theory, and automata. Although I’m always a bit skeptical of these benchmarks. Seems quite unlikely that all of the questions remain out of their training data.
mccoyb
I truly do not understand what plan to use so I can use this model for longer than ~2 minutes.
Using Anthropic's or OpenAI's models is incredibly straightforward -- pay us per month, here's the button you press, great.
Where do I go for this with these Google models?
fschuett
Update VSCode to the latest version and click the small "Chat" button in the top bar. GitHub gives you like $20 for free per month, and I think they have a deal with the larger vendors because their pricing is insanely cheap. One week of vibe-coding costs me like $15. The only downside to Copilot is that you can't work on multiple projects at the same time because of rate-limiting.
kachapopopow
AI Studio. You get a bunch of usage free; if you want more, you buy credits (Google One subscriptions also give you some additional usage).
mccoyb
I see -- so this is the "paid" AI studio plan?
Does that have any relation to the Gemini plan thing: https://one.google.com/explore-plan/gemini-advanced?utm_sour...
?
kachapopopow
That's for the first-party Google integrations, not third-party ones. AI Studio just gives you an API key that you can use anywhere.
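For anyone wondering what "an API key that you can use anywhere" looks like in practice, here is a minimal sketch using only the Python standard library. It assumes the public Gemini REST endpoint and the `x-goog-api-key` header; the model id and the helper names `build_request` and `ask` are placeholders of mine, not something from this thread, so check the model list in AI Studio before relying on it.

```python
import json
import urllib.request

# Assumption: the public Gemini REST endpoint. The model id below is a
# placeholder; look up the exact id in AI Studio.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-3-pro-preview"


def build_request(prompt: str, api_key: str, model: str = MODEL) -> urllib.request.Request:
    """Build (but do not send) a generateContent request."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # the key AI Studio gives you
        },
        method="POST",
    )


def ask(prompt: str, api_key: str) -> str:
    """Send the request and return the text of the first candidate."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        reply = json.load(resp)
    return reply["candidates"][0]["content"]["parts"][0]["text"]


# Usage (needs network access and a real key):
#   print(ask("Say hello in one sentence.", "YOUR_API_KEY"))
```

The same key should work from any HTTP client or SDK, which is the point being made above: it is not tied to a particular editor or Google product.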
aliljet
Understanding precisely why Gemini 3 isn't front of the pack on SWE Bench is really what I was hoping to understand here. Especially for a blog post targeted at software developers...
cube2222
Yeah, they mention a benchmark I'm seeing for the first time (Terminal-Bench 2.0) that they're supposedly leading, while for some reason SWE Bench is down from Sonnet 4.5.
Curious to see some third-party testing of this model. Based on the benchmarks, it currently seems to improve primarily on "general non-coding and visual reasoning".
svantana
SWEBench-Verified is probably benchmaxxed at this stage. Claude isn't even the top performer, that honor goes to Doubao [1].
Also, the confidence interval for such a small dataset is about 3 percentage points, so these differences could just be down to chance.
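A rough back-of-the-envelope check of that interval, treating each task as an independent pass/fail trial and using the normal approximation. The 500-task count is the size of SWE-bench Verified; the 75% score is an assumed round number for illustration, not a figure from the announcement.

```python
import math


def ci_half_width(accuracy: float, n_tasks: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a pass rate.

    z = 1.96 corresponds to a 95% interval.
    """
    std_err = math.sqrt(accuracy * (1.0 - accuracy) / n_tasks)
    return z * std_err


# Assumptions: 500 tasks and a score of about 75%.
half = ci_half_width(0.75, 500)
print(f"+/- {100 * half:.1f} percentage points")  # prints "+/- 3.8 percentage points"
```

That lands in the same ballpark as the figure quoted above, so a gap of a point or two between models on this benchmark is within the noise under these assumptions.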
pawelduda
Why is this particular benchmark important?
deanc
Antigravity seems to be a bit overwhelmed. Unable to set up an account at the moment.
jordanpg
What is Gemini 3 under the hood? Is it still just a basic LLM based on transformers? Or are there all kinds of other ML technologies bolted on now? I feel like I've lost the plot.
anilgulecha
It's a mixture-of-experts model. Basically N smaller model pieces put together, and when inference occurs, only 1 is active at a time. Each model piece would be tuned/good in one area.
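To make the routing idea concrete, here is a toy sketch of top-k gating in plain Python. It says nothing about Gemini 3's actual internals, which aren't described in this thread. In the published sparse mixture-of-experts designs I'm aware of, a small learned gate scores the experts for each token and only the k best are run; `k=1` matches the "only 1 is active" description above, and larger k is also common. All names, sizes, and the scaling "experts" are made up for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(token, gate_weights, k=1):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]


def moe_layer(token, gate_weights, experts, k=1):
    """Mix the outputs of the selected experts; the others are never run."""
    out = [0.0] * len(token)
    for idx, weight in route(token, gate_weights, k):
        expert_out = experts[idx](token)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out


# Toy setup: 4 "experts" that just scale their input, with random gate weights.
random.seed(0)
DIM, N_EXPERTS = 3, 4
gate = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(N_EXPERTS)]
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

token = [0.5, -1.0, 2.0]
print(route(token, gate, k=1))  # a single active expert
print(route(token, gate, k=2))  # two active experts, weights sum to 1
print(moe_layer(token, gate, experts, k=2))
```

The routing decision here is made per input vector, so different tokens in the same prompt can end up on different experts.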
meowface
I am very ignorant in this field but I am pretty sure under the hood they are all still fundamentally built on the transformer architecture, or at least innovations on the original transformer architecture.
fosterfriends
Gemini 3 and 3 Pro are a good bit cheaper than Sonnet 4.5 as well. Big fan.
hubraumhugo
No gemini-3-flash yet, right? Any ETA on that mentioned? 2.5-flash has been amazing in terms of cost/value ratio.
I think I am in this AI fatigue phase. I am past all the hype with models, tools and agents, and back to a problem-and-solution approach: sometimes code gen with AI, sometimes thinking it through and asking for a piece of code. But not offloading to AI and buying all the BS, waiting for it to do magic with my codebase.