World Emulation via Neural Network
34 comments · April 25, 2025
das_keyboard
ajb
> I don't get this analogy at all. Instead of a human, information flows through a neural network, which alters the information.
These days most photos are also stored using lossy compression which alters the information.
You can think of this as a form of highly lossy compression of an image of this forest in time and space.
Most lossy compression is 'subtractive': detail is removed from the image in order to compress it, so the kinds of alteration are limited. However, there have been non-subtractive forms of compression (e.g., fractal compression) that have been criticised for making up details, which is certainly something a neural network will do. That said, if the network is trained only on this forest data, rather than being trained on other data and then fine-tuned, then in some sense it does represent only this forest rather than giving an 'informed impression' like a human artist would.
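To make the 'subtractive' idea concrete, here is a minimal sketch. It uses plain uniform quantization purely for illustration; this is not how any particular image codec works, and `quantize` is a hypothetical helper name.

```python
def quantize(pixels, step):
    """Subtractive lossy compression: snap each value to the centre of its bucket.

    Fine detail inside a bucket is thrown away, but nothing new is invented:
    every output value stays within step // 2 of the original.
    """
    return [(p // step) * step + step // 2 for p in pixels]


row = [12, 13, 15, 200, 202, 198]   # one row of 8-bit grey values
compressed = quantize(row, 16)
print(compressed)                   # -> [8, 8, 8, 200, 200, 200]
```

The small variations within each group are gone, but the error is bounded by the bucket size. A generative decoder has no such bound, which is the sense in which it can 'make up' detail.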
montebicyclelo
Awesome work / demo / blog
Link to the demo in case people miss it [1]
> using a customized camera app which also recorded my phone’s motion
Using phone's gyro as a proxy for "controls" is very clever
[1] https://madebyoll.in/posts/world_emulation_via_dnn/demo/
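For anyone wondering how motion readings could act as "controls", here is a rough sketch of one way to pair them with video frames by timestamp. It is an assumption about how such data might be aligned, not the author's actual pipeline; `pair_frames_with_motion` and the sample values are hypothetical.

```python
from bisect import bisect_left


def pair_frames_with_motion(frame_ms, motion_ms, motion_values):
    """For each frame timestamp, pick the motion sample closest in time.

    frame_ms and motion_ms are sorted lists of integer millisecond timestamps.
    """
    paired = []
    for t in frame_ms:
        i = bisect_left(motion_ms, t)
        # Only the samples just before and just after t can be the nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(motion_ms)]
        best = min(candidates, key=lambda j: abs(motion_ms[j] - t))
        paired.append(motion_values[best])
    return paired


frame_ms = [0, 33, 66]                       # roughly 30 fps video
motion_ms = [0, 10, 20, 30, 40, 50, 60, 70]  # 100 Hz motion sensor
motion_values = ["g0", "g1", "g2", "g3", "g4", "g5", "g6", "g7"]
print(pair_frames_with_motion(frame_ms, motion_ms, motion_values))
# -> ['g0', 'g3', 'g7']
```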
Valk3_
This might be a vague question, but what kind of intuition or knowledge do you need to work with this kind of thing, say if you want to make your own model? Is it just having experience with image generation and trying to incorporate relevant inputs that you would expect in a 3D world, like the control information you added for instance?
titouanch
This is very impressive for a hobby project. I was wondering if you were planning to release the source code. Being able to create client-hosted, low-requirement neural networks for world generation could be really useful for game dev or artistic projects.
AndrewKemendo
I think this is very interesting because you seem to have reinvented NeRF, if I’m understanding it correctly. I only did one pass through but it looks at first glance like a different approach entirely.
More interesting is that you made an easy to use environment authoring tool that (I haven’t tried it yet) seems really slick.
Both of those are impressive alone but together that’s very exciting.
bjornsing
NeRF is a more complex and constrained approach, based on a kind of ray tracing. But results are obviously similar.
nopakos
Next we should try "Excel emulation via Neural Network". We get rid of a lot of intermediate steps, calculations, user interface etc!
What could go wrong?
Jokes aside, this is insanely cool!
downboots
or for a large dataset of math identities and have the user draw one side
Imanari
Amazing work. Could you elaborate on the model architecture and the process that led you to using this architecture?
Macuyiko
The model seems to be viewable here:
https://netron.app/?url=https://madebyoll.in/posts/world_emu...
tehsauce
I love this! Your results seem comparable to the Counter-Strike or Minecraft models from a bit ago with massively less compute and data. It's particularly cool that it uses real world data. I've been wanting to do something like this for a while, like capturing a large dataset while backpacking in the Cascades :)
I didn't see it in an obvious place on your GitHub. Do you have any plans to open source the training code?
udia
Very nice work. Seems very similar to the Oasis Minecraft simulator.
ollin
Yup, definitely similar! There are a lot of video-game-emulation World Models floating around now, https://worldarcade.gg had a list. In the self-driving & robotics literature there have also been many WMs created for policy training and evaluation. I don't remember a prior WM built on first-person cell-phone video, but it's a simple enough concept that someone has probably done it for a student project or something :)
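For readers new to the term, the core loop of a world model can be sketched in a few lines: predict the next frame from the current frame plus a control input, then feed the prediction back in. This is a toy under stated assumptions, with `toy_model` standing in for a trained network and a single integer standing in for an image; none of these names come from the post.

```python
def rollout(model, first_frame, controls):
    """Autoregressive rollout: each predicted frame becomes the next input."""
    frames = [first_frame]
    for control in controls:
        frames.append(model(frames[-1], control))
    return frames


def toy_model(frame, control):
    # Stand-in for a neural network: the "frame" is just a position,
    # and the control shifts it left or right.
    return frame + control


print(rollout(toy_model, 0, [1, 1, -2]))  # -> [0, 1, 2, 0]
```

Because each output is fed back in, any prediction error compounds over long rollouts, which is one reason these worlds tend to drift.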
puchatek
This is great but I think I'll stick to mushrooms.
ulrikrasmussen
I also thought those wooden guard rails looked pretty spot on how they would look on 2C-B. The only thing that's missing is the overlay of geometric patterns on even surfaces.
bongodongobob
Yeah, the similarities to psychedelics with some of this stuff are remarkable.
ilaksh
It makes me think that maybe our visual perception is similar to what this program is doing in some ways.
I wonder if there are any computer vision projects that take a similar world emulation approach?
Imagine you collected the depth data also.
voidspark
Yes, the model is a U-Net, which is a type of Convolutional Neural Network (CNN), an architecture inspired by the structure of the visual cortex.
https://en.wikipedia.org/wiki/Convolutional_neural_network#H...
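If the U-Net shape is unfamiliar, here is a toy 1-D sketch of its data flow only: shrink the input, expand it again, and use a skip connection to reunite the coarse result with the original fine detail. There is no learning here; averaging and repetition are stand-ins for the convolution, pooling, and upsampling layers of a real U-Net, and all function names are hypothetical.

```python
def downsample(x):
    """Halve resolution by averaging neighbouring pairs (stand-in for conv + pooling)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double resolution by repeating each value (stand-in for a learned upsampling layer)."""
    return [v for v in x for _ in range(2)]


def unet_shape_demo(x):
    skip = x                         # full-resolution features saved for later
    bottleneck = downsample(x)       # encoder: coarse summary of the input
    decoded = upsample(bottleneck)   # decoder: back to full resolution
    # Skip connection: pair the coarse context with the original fine detail,
    # as a real U-Net does by concatenating channels.
    return list(zip(decoded, skip))


print(unet_shape_demo([1, 3, 5, 7]))
# -> [(2.0, 1), (2.0, 3), (6.0, 5), (6.0, 7)]
```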
LoganDark
For some reason, psilocybin causes me to randomly just lose consciousness, and LSD doesn't. Weird stuff.
gitroom
Gotta say, I've always wanted to try building something like this myself. That kind of grind pays off way more than shiny announcements imo.
> So, if traditional game worlds are paintings, neural worlds are photographs. Information flows from sensor to screen without passing through human hands.
I don't get this analogy at all. Instead of a human, information flows through a neural network, which alters the information.
> Every lifelike detail in the final world is only there because my phone recorded it.
I might be wrong here, but I don't think this is true. It might also be there because the network inferred that it is there based on previous data.
Imo this just takes the human out of an artistic process (creating video game worlds), and I'm not sure if this is worth archiving.