Bolt3D: Generating 3D Scenes in Seconds
59 comments · March 19, 2025

burgerone
What's the use case for putting AI into everything? Pretty much every AI product so far has been, and still is, subject to hallucinations and inaccuracies, and on top of that it's hugely computationally intensive. Sure, it's the best we have right now, and it lets us do things that were previously next to impossible with manual programming work, but it's far from being actually viable. And what would be the use case for turning a picture into an approximated 3D mesh that is only really complete from one angle? LIDAR already does a stunningly accurate job at that, reproducibly (though granted it can't retroactively be applied to existing photos).
graypegg
So I agree with you, but to be fair it is neat, and I think academia should be allowed to try things with little to no *immediate* commercial value. Being "neat" is enough IMO if there's enough resources to go around.
In the long run, yeah, this *exact* application is sort of pointless. I expected to see the lens parameters factored into the process. They're not. That means everything is not only dimensionally inaccurate, since there's no reference measurement, but also proportionally inaccurate relative to other things in the scene. You can actually see the effect of that on the "flower car" example (the entire shape of the car is warped). And that's before considering that every part of the scene not visible in the original photo is simply made up.
Maybe someone would use this to make game assets? But you'd need to fix them up a ton before using them. Other sibling comments make the point that there's no wireframes... so we can assume the polygon count here is insane.
Either way... it's just neat.
AlexeyBelov
> What's the use case for putting AI into everything?
Money.
scyzoryk_xyz
I’m imagining this approach being combined down the line with your typical photogrammetry approach for reinforcing quality.
diggan
Show. Us. The. Wireframes!
Every single time a new "Generate 3D" thing appears, they never show the wireframes of the objects/scenes up front; you always need to download and inspect things yourself. How is this not standard practice already?
Not displaying the wireframes at all, or even offering sample files so we could at least see them ourselves, just makes it look like you already know the generated results are unusable...
jsheard
They usually don't show the material channels either, which I assume is because there aren't any, and instead the lighting is statically baked into the asset. That works for a demo where you just wiggle the camera in a circle, but it'll immediately fall apart if the lighting environment changes or anything in the scene moves.
Legend2440
Think of it more like a 3D picture than an animation model.
There are no material channels or wireframes. It's a volumetric 3D representation, like a picture made up of color blobs.
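For intuition, each "color blob" is one 3D Gaussian with appearance attached. A minimal sketch of the per-splat record and its density (real formats such as the 3D Gaussian Splatting .ply layout store spherical-harmonic color coefficients and a rotated anisotropic covariance; the rotation is ignored here to keep it short):

```python
import math
from dataclasses import dataclass

@dataclass
class Splat:
    """One 'color blob': an anisotropic 3D Gaussian with appearance."""
    mean: tuple     # (x, y, z) center in world space
    scale: tuple    # (sx, sy, sz) std-devs along the local axes
    rotation: tuple # unit quaternion (w, x, y, z); unused in this sketch
    color: tuple    # (r, g, b) in [0, 1]
    opacity: float  # alpha in [0, 1]

def density(s: Splat, p: tuple) -> float:
    """Unnormalized density of the splat at point p (axis-aligned sketch)."""
    q = sum(((pi - mi) / sc) ** 2 for pi, mi, sc in zip(p, s.mean, s.scale))
    return s.opacity * math.exp(-0.5 * q)

blob = Splat((0, 0, 0), (0.1, 0.1, 0.1), (1, 0, 0, 0), (1.0, 0.5, 0.2), 0.9)
print(density(blob, (0, 0, 0)))  # peak density at the center: 0.9
```

A renderer alpha-composites millions of these blobs per pixel, which is why there is nothing wireframe-like to show.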
kookamamie
And thus unusable for most things 3D models and scenes would be used for today.
ajwin
My understanding is that it is not a mesh, it's Gaussian splatting. There are tools to convert splats into meshes, though.
diggan
Yeah, but isn't the expected outcome still to end up with actual 3D objects, not point clouds? Or did people start integrating point clouds into their 3D workflows already? Besides stuff like volumes and the like, I think most of us are still stuck working with polygons for 3D.
tracerbulletx
No geometry in the conventional sense. I did a demo of rendering a Gaussian splat in React Three Fiber here; you can open the linked splat file (it's hosted on Hugging Face) if you want to see the data format. https://codesandbox.io/p/sandbox/3d-gaussian-splat-in-react-... I also have this YouTube video about creating that demo: https://www.youtube.com/watch?v=6tVcCTazmzo
lmpdev
I use point clouds all the time in Rhino/LAStools/MeshLab.
I much prefer point clouds and NURBS over meshes.
Not everything is gamedev
Legend2440
You can convert splats into meshes using a simple marching cubes algorithm.
But the meshes produced are not easy to edit.
hwillis
If you wanted to show someone a walkaround of the Sistine Chapel or David, would you be better off using triangles and PBR and raycast lighting? You don't really gain anything from all that; you're doing a tremendous amount of computation just to recapture the particular lighting at an exact time. If you want the same detail that a few good pictures capture (tens of millions of pixels), you need many billions of triangles onscreen.
With splats you can have incredibly high fidelity with identical lighting and detail built in already. If you want to make a game or a movie, don't use splats. If you want to recreate a static scene from pictures, splats work very well.
text0404
splats augment 3D scenes, they don't replace them. i've seen them used for AR/VR, photogrammetry, and high-performance 3D. going from splats to a 3D model would be a downgrade in terms of performance.
lallysingh
Expected by whom? Other researchers in this space? That's the audience for this work.
dheera
Not necessarily.
If you're using it to render video you don't need to go into the mesh world.
eMPee584
Isn't https://svraster.github.io/ just superseding Gaussians? Voxels are also not meshes, but might they not prove even more useful for upcoming rendering engines?
text0404
splats don't have wireframes, and the linked page has an embedded WebGPU viewer.
quitit
I think there should be a standard set of images for comparison, because I've never seen a mesh generator readme that wasn't impressive. I test each one I get my hands on and the results are often disappointing.
slowtrek
Is anything like this available locally yet?
emmelaich
Here's the repo: https://github.com/szymanowiczs/splatter-image
Apparently you can clone it and run the demo locally, but it wasn't clear at a glance how much actually runs locally or what hardware is required.
echelon
Your link above (Splatter Image) is not the same code/paper/research as Bolt3D.
It's a previous paper by the lead author, from a year before they interned at Google Research and produced Bolt3D.
Bolt3D appears to be their intern research project, done in conjunction with a number of other Google and DeepMind researchers.
I suspect there will never be publicly available code for this.
kombine
This is previous work. Bolt3D uses the same principle of predicting a per-pixel Gaussian splatting representation, but it also trains a diffusion model, which is only feasible if you have substantial compute available.
Given that it's work done at Google, I don't expect them to release source code. But it will be reproduced by someone else soon enough.
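The "per-pixel Gaussian" idea from the Splatter Image line of work can be sketched as follows: for every pixel the network predicts a depth (plus other Gaussian parameters), and each pixel is back-projected through a pinhole camera into one 3D Gaussian. A hand-rolled toy sketch with made-up values, not the paper's code:

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) at given depth -> camera-space point."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)

def pixels_to_gaussians(depth_map, color_map, fx, fy, cx, cy):
    """One 3D Gaussian per pixel: center from predicted depth, color from the image.
    Scale grows with depth so distant splats cover a pixel's larger footprint."""
    gaussians = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            gaussians.append({
                "mean": unproject(u, v, d, fx, fy, cx, cy),
                "scale": d / fx,  # crude estimate of one pixel's footprint at depth d
                "color": color_map[v][u],
                "opacity": 1.0,
            })
    return gaussians

# Toy 2x2 "image"; in the real model the depth map is a network output.
depth = [[1.0, 1.0], [2.0, 2.0]]
color = [[(1, 0, 0), (0, 1, 0)], [(0, 0, 1), (1, 1, 0)]]
splats = pixels_to_gaussians(depth, color, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
print(len(splats))  # 4: one Gaussian per pixel
```

The diffusion model is what lets Bolt3D invent plausible Gaussians for the parts of the scene the input photo never saw, which is the expensive bit.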
emmelaich
True, I should've said related work.
ashikns
Isn't it generating in the browser using WebGPU?
gessha
I assume that’s for interactive viewing only, not for generation.
> Our method takes 6.25 seconds to reconstruct one scene on a single H100 NVIDIA GPU or 15 seconds on an A100.
dvrp
I mean, it's the same author but seems like co-authors are different.
How do you know it's the actual implementation?
noduerme
I'm very unclear as to what is supposed to be happening locally here, but as soon as the demo finishes loading, it crashes firefox on my phone.
lukan
WebGPU related, I assume? Firefox lags behind in support, and I assume they use it under the hood. Meaning I would try Chrome. (I know, I know.)
antonkar
We’ll hopefully convert an LLM into a 3D haunted house and finally democratize AI interpretability by having millions of gamers walking in them
oplane
Can 3D generation happen for terrain views on a map?
flykespice
Good good, now show me the topology
marianaenhn
ok, thx
curtisszmania
[dead]
It doesn't seem to work that well. Once you move off the primary camera axis, e.g. rotate around, you notice that many regions have only sparse resolution and there are gaps everywhere. It's totally unusable for anything.
Sure, it solves for the primary view, but it claims to be a 3D scene reconstruction/inference technique, and on that claim it only sort of works.
For example: https://i.postimg.cc/43tj36jv/Screenshot-2025-03-20-at-8-52-...