Introducing Gemma 3n
25 comments · June 26, 2025
wiradikusuma
I still don't understand the difference between Gemma and Gemini for on-device, since both don't need network access. From https://developer.android.com/ai/gemini-nano :
"Gemini Nano allows you to deliver rich generative AI experiences without needing a network connection or sending data to the cloud." -- replace Gemini with Gemma and the sentence is still valid.
tyushk
Licensing. You can't use Gemini Nano weights directly (at least commercially) and must interact with them through Android ML Kit or similar Google-approved runtimes.
You can use Gemma commercially using whatever runtime or framework you can get to run it.
jabroni_salad
Gemma is open source and Apache 2.0 licensed. If you want to include it with an app you have to package it yourself.
Gemini Nano is an Android API that you don't control at all.
nicce
> Gemma is open source and Apache 2.0 licensed
Closed source but open weight. Let's not ruin the definition of the term to the advantage of big companies.
impure
I suspect the difference is in the training data. Gemini is much more locked down, and if it tries to repeat something from the training data verbatim you will get a 'recitation error'.
readthenotes1
Perplexity.ai gave an easier-to-understand response than Gemini 2.5, afaict.
Gemini Nano is for Android only.
Gemma is available for other platforms and has multiple size options.
So it seems like Gemini Nano might be a very focused Gemma, to follow the biology metaphor instead of the Italian name interpretation.
impure
I've been playing around with E4B in AI Studio and it has been giving me really great results, much better than what you'd expect from an 8B model. In fact I'm thinking of trying to install it on a VPS so I can have an alternative to pricy APIs.
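If you do go the VPS route, one low-effort option is llama.cpp's built-in server, which exposes an OpenAI-compatible endpoint. A rough sketch, assuming the Unsloth GGUF mentioned further down in this thread (host, port, and quant choice are placeholders to adjust):
# serve the quantized E4B weights as an OpenAI-compatible API
./llama.cpp/llama-server -hf unsloth/gemma-3n-E4B-it-GGUF:UD-Q4_K_XL --host 0.0.0.0 --port 8080 --jinja
# then point any OpenAI-style client at http://<vps-ip>:8080/v1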
tgtweak
Any readily-available APKs for testing this on Android?
danielhanchen
Made some GGUFs if anyone wants to run them!
./llama.cpp/llama-cli -hf unsloth/gemma-3n-E4B-it-GGUF:UD-Q4_K_XL -ngl 99 --jinja --temp 0.0
./llama.cpp/llama-cli -hf unsloth/gemma-3n-E2B-it-GGUF:UD-Q4_K_XL -ngl 99 --jinja --temp 0.0
I'm also working on an inference + finetuning Colab demo! I'm very impressed since Gemma 3N has audio, text and vision! https://docs.unsloth.ai/basics/gemma-3n-how-to-run-and-fine-...
upghost
Literally was typing out "Unsloth, do your thing!!" but you are way ahead of me. You rock <3 <3 <3
Thank you!
Workaccount2
Anyone have any idea on the viability of running this on a Pi5 16GB? I have a few fun ideas if this can handle working with images (or even video?) well.
gardnr
The 4-bit quant weighs 4.25 GB, and then you need space for the rest of the inference process. So, yeah, you can definitely run the model on a Pi; you may just have to wait some time for results.
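For reference, a minimal CPU-only invocation on the Pi, as a sketch, reusing the Unsloth GGUF from the comment above (the thread count and prompt are placeholders; -ngl 0 keeps everything off the GPU):
./llama.cpp/llama-cli -hf unsloth/gemma-3n-E4B-it-GGUF:UD-Q4_K_XL -ngl 0 --threads 4 --jinja --temp 0.0 -p "Summarize the plot of Hamlet in two sentences."
On a 16 GB Pi the 4.25 GB of weights plus a few GB of KV cache and runtime overhead should fit comfortably. Image or audio input would need llama.cpp's separate multimodal tooling rather than plain llama-cli, assuming it supports this architecture at all.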
minimaxir
LM Studio has MLX variants of the model out: http://huggingface.co/lmstudio-community/gemma-3n-E4B-it-MLX...
However, it's still 8B parameters and there are no quantized models just yet.
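If you'd rather try it from the command line than through the LM Studio app, a minimal sketch with the mlx-lm package (the model id is whatever the truncated link above resolves to, and this assumes mlx-lm already supports the Gemma 3n architecture):
pip install mlx-lm
python -m mlx_lm.generate --model lmstudio-community/<repo-from-the-link-above> --prompt "Write a haiku about on-device models."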
turnsout
This looks amazing given the parameter sizes and capabilities (audio, visual, text). I like the idea of keeping simple tasks local. I’ll be curious to see if this can be run on an M1 machine…
Fergusonb
Sure it can. The easiest way is to get Ollama, then `ollama run gemma3n`. You can pair it with tools like simonw's LLM to pipe stuff to it.
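A small sketch of that pipeline (the model tag and plugin name are assumptions; check `ollama list` and the llm docs for the exact names on your machine):
ollama run gemma3n "Say hi"
llm install llm-ollama
cat meeting-notes.txt | llm -m gemma3n "Summarize these notes as three bullet points."
The llm-ollama plugin just points simonw's llm at the local Ollama server, so anything Ollama can run becomes pipeable from the shell.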
bigyabai
This should run fine on most hardware - CPU inference of the E2B model on my Pixel 8 Pro gives me ~9 tok/second of decode speed.
jacknews
WTF is wrong with y'all?
It's not under-SH-tanding
It's under-S-tanding.
svat
This is completely offtopic, but in case your question is genuine:
https://www.youtube.com/watch?v=F2X1pKEHIYw
> Why Some People Say SHTRONG (the CHRUTH), by Dr Geoff Lindsey