Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput

cahaya

Nice. It seems like I can't run this on my Apple silicon M-series chips, right?

addandsubtract

Great work! Can this technique also be used to run image diffusion models on lower-VRAM GPUs?