Hunyuan3D-2-Turbo: fast high-quality shape generation in ~1s on a 4090
28 comments
· March 20, 2025
ForTheKidz
> The game Myst is all about this magical writing script that allowed people to write entire worlds in books. That's where it feels like this is all going. Unity/Blender/Photoshop/etc. is ripe for putting an LLM over the entire UI and exposing the APIs to it.
This is probably the first pitch for using AI as leverage that's actually connected with me. I don't want to write my own movie (sounds fucking miserable), but I do want to watch yours!
baq
Look up blender and unity MCP videos. It’s working today.
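For context, MCP clients talk to tool servers over JSON-RPC 2.0 and invoke a server-side tool with a `tools/call` request. A minimal sketch of what such a request looks like on the wire; the `create_cube` tool name and its arguments are hypothetical, not taken from any real Blender or Unity MCP server:

```python
import json

# Shape of a JSON-RPC 2.0 "tools/call" request as used by MCP clients.
# The tool name and arguments below are made up for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_cube",  # hypothetical Blender-side tool
        "arguments": {"size": 2.0, "location": [0, 0, 0]},
    },
}
print(json.dumps(request, indent=2))
```

An MCP server would dispatch on `params.name`, run the corresponding editor API call, and return a JSON-RPC response with the same `id`.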
anonzzzies
You tried sharing your screen with Gemini instead of screenshots? I found it's sometimes really brilliant and sometimes terrible. It's mostly a win really.
awongh
What's the best img2mesh model out there right now, regardless of processing requirements?
Are any of them better or worse with mesh cleanliness? Thinking in terms of 3D printing...
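For printing, "cleanliness" mostly comes down to watertightness, which you can sanity-check without opening a modelling package: in a printable triangle mesh, every undirected edge must be shared by exactly two faces. A small pure-Python sketch of that check; the function and the sample tetrahedron are illustrative, not tied to any particular img2mesh tool:

```python
from collections import Counter

def is_watertight(faces):
    """True if every undirected edge is shared by exactly two faces.

    `faces` is a list of (i, j, k) vertex-index triangles. Edge counts
    other than 2 indicate holes or non-manifold fins, two common
    defects in generated meshes that break slicers.
    """
    edges = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges[tuple(sorted((u, v)))] += 1
    return all(count == 2 for count in edges.values())

# A tetrahedron (4 faces) is closed, so it passes the check.
tetra = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
print(is_watertight(tetra))      # True
print(is_watertight(tetra[:3]))  # False: dropping a face opens a hole
```

Real cleanup pipelines typically run a check like this and then repair the flagged regions before slicing.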
Y_Y
How are they extracting value here? Is this just space-race-4-turbo propagandising?
I see plenty of GitHub sites that are barely more than advertising, where some company tries to foss-wash their crapware, or tries to build a little text-colouring library that burrows into big projects as a sleeper dependency. But this isn't that.
What's the long game for these companies?
sruc
Nice model, but strange license. You are not allowed to use it in EU, UK, and South Korea.
“Territory” shall mean the worldwide territory, excluding the territory of the European Union, United Kingdom and South Korea.
You agree not to use Tencent Hunyuan 3D 2.0 or Model Derivatives: 1. Outside the Territory;
johaugum
Meta’s Llama models (and likely many others') have similar restrictions.
Since they don’t fully comply with EU AI regulations, Meta preemptively disallows their use in those regions to avoid legal complications:
“With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models”
https://github.com/meta-llama/llama-models/blob/main/models/...
ForTheKidz
Probably for domestic protection more than face value. Western licenses certainly have similar clauses to protect against liability for sanction violations. It's not like they can actually do much to prevent the EU from gaining from it.
North Korea? Maybe. UK? Who gives a shit
justlikereddit
Because the EU regulations on AI and much else can be summarized as
>"We're going to bleed you dry through lawfare taxation, not IF, we're going to fucking do it!!!!"
The UK? The only reason they don't publicly execute people for social media thought crime is that they abolished capital punishment.
The west is going to hell
littlestymaar
This is merely a “we don't take responsibility if this somehow violates EU rules around AI”, it's not something they can enforce in any way.
But even as a strategy, I don't think it would hold up if the Commission decided to fine Tencent for releasing it in violation of the regulation.
IMHO it's just the lawyers doing something to please the boss who asked them to “solve the problem” (which they can't, really).
quitit
Running my usual img2mesh tests on this.
1. It does a pretty good job; definitely a steady improvement.
2. The demos are quite generous compared to my own testing, but this kind of cherry-picking isn't unusual.
3. The mesh is reasonably clean. There are still some areas of total mayhem (but these are easy to fix in 3D modelling software).
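Some of that cleanup is scriptable before the mesh ever reaches modelling software: degenerate (near-zero-area) triangles, a frequent artifact in generated meshes, can be filtered with a cross-product area test. A pure-Python sketch; the helper name and epsilon are illustrative choices, not part of Hunyuan3D:

```python
def drop_degenerate(vertices, faces, eps=1e-12):
    """Remove triangles whose area is ~0 (collapsed or duplicate points).

    `vertices` is a list of (x, y, z) tuples, `faces` a list of
    vertex-index triples.
    """
    def area2(a, b, c):
        # Squared magnitude of the cross product of two edge vectors;
        # proportional to the squared triangle area.
        ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
        vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
        cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
        return cx * cx + cy * cy + cz * cz

    return [f for f in faces if area2(*(vertices[i] for i in f)) > eps]

verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
faces = [(0, 1, 2), (0, 1, 1)]        # second face is collapsed
print(drop_degenerate(verts, faces))  # [(0, 1, 2)]
```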
amelius
I don't understand why it is necessary to make it this fast.
Philpax
It helps with iteration - you can try out different concepts and variations quickly without having to wait, especially as you refine what you want and your understanding of what it's capable of.
Also, in general, why not?
amelius
> Also, in general, why not?
There are various reasons:
- Premature optimization will take away flexibility, and will thus affect your ability to change the code later.
- If you add features later that will affect performance, then since the users are used to the high performance, they might think your code is slow.
- There are always a thousand things to work on, so why spend effort on things that users, at this point, don't care much about?
lwansbrough
How long before we start getting these rigged using AI too? I’ve seen a few of these 3D models so far but none that do rigging.
leshokunin
Can we see meshes, exports in common apps as examples?
This looks better than the other one on the front page rn
dvrp
Agreed. That's why I posted it; I was surprised people were sleeping on this. It's because they posted something yesterday, so the link dedup logic ignored this one; that's why I linked to the commit instead.
There are mesh examples on the GitHub. I'll toy around with it.
dvrp
See also: https://github.com/Tencent/FlashVDM
boppo1
Can it run on a 4080 but slower, or is the vram a limitation?
dvrp
They don't mention that and I don't have one — can you try for yourself and let us know? I think you can get it from Huggingface or GH @ https://github.com/Tencent/Hunyuan3D-2
fancyfredbot
They mention "It takes 6 GB VRAM for shape generation and 24.5 GB for shape and texture generation in total."
So based on this your 4080 can do shape but not texture generation.
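Using just the two figures quoted above (6 GB for shape, 24.5 GB for shape plus texture), the fit check is a pair of threshold comparisons. A toy sketch; the function is illustrative, and the thresholds are only as accurate as the quoted README numbers:

```python
# VRAM requirements (GB) as quoted upthread for Hunyuan3D-2.
SHAPE_GB = 6.0
SHAPE_AND_TEXTURE_GB = 24.5

def stages_that_fit(vram_gb):
    """Return which generation stages fit in the given VRAM budget."""
    stages = []
    if vram_gb >= SHAPE_GB:
        stages.append("shape")
    if vram_gb >= SHAPE_AND_TEXTURE_GB:
        stages.append("texture")
    return stages

print(stages_that_fit(16))  # 4080 (16 GB): ['shape']
print(stages_that_fit(24))  # 4090 (24 GB): ['shape'], just under 24.5
```

Note that by these numbers even a 24 GB card is nominally just short for the full shape-plus-texture pipeline.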
boppo1
Nice, that's all I needed anyway.
thot_experiment
Almost certainly; I haven't tried the most recent models, but I have used hy3d2 and hy3d2-fast a lot and they're quite light to inference. You're gonna spend more time decoding the latent than you will inferencing. Takes about 6 GB VRAM on my machine; I can't imagine these will be heavier.
fixprix
I recently got into creating avatars for VR and have used AI to learn Unity/Blender ridiculously fast; I've only been at it a couple of weeks now. All the major models can answer basically any question. I can paste in screenshots of what I'm working on along with questions and it will tell me step by step what to do. I'll ask it what particular settings mean; there are so many settings in 3D programs, and it'll explain them all and suggest defaults. You can literally give Gemini UV maps and it'll generate textures for you, or this for 3D models. It feels like the jump before/after Stack Overflow.
The game Myst is all about this magical writing script that allowed people to write entire worlds in books. That's where it feels like this is all going. Unity/Blender/Photoshop/etc. is ripe for putting an LLM over the entire UI and exposing the APIs to it.