Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
2 comments

philipkglass · January 13, 2025
The differently styled images of "astronaut riding a horse" are great, but that has been a go-to example for image generation models for a while now. The introduction says that they train on 37 million real and synthetic images. Are astronauts riding horses now represented in the training data more than would have been possible 5 years ago?
If it's possible to get good, generalizable results from such (relatively) small datasets, I'd like to see what this approach can do when trained exclusively on non-synthetic, permissively licensed inputs. It might be possible to make a good "free of any possible future legal challenges" image generator from public domain content alone.
> The estimated training time for the end-to-end model on an 8×H100 machine is 2.6 days.
That's a $250,000 machine for the "micro budget." Or, if you don't want to run it locally, roughly $2,000 to rent someone else's machine to train the one model.
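Rough arithmetic behind that ~$2,000 figure, assuming a cloud rental rate of about $4 per H100 GPU-hour (the rate is an assumption; on-demand pricing varies widely by provider):

    # Back-of-the-envelope cloud cost for the reported 2.6-day run on 8x H100.
    days = 2.6
    gpus = 8
    usd_per_gpu_hour = 4.0  # assumed rental rate, not from the paper

    gpu_hours = days * 24 * gpus          # ~499 GPU-hours
    cost = gpu_hours * usd_per_gpu_hour   # ~$2,000

    print(f"{gpu_hours:.0f} GPU-hours -> ${cost:,.0f}")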