Nvidia DGX Spark
94 comments
August 24, 2025
hereme888
Y_Y
You are doing god's work.
In fact you're also doing the work Nvidia should have done when they put together their (imho) ridiculously imprecise spec sheet.
canucker2016
5090: 32GB RAM (Newegg & Amazon lowest prices seem to be $300+ over MSRP)
4090: 24GB RAM
Thor & Spark: 128GB RAM (probably at least 96GB usable by the GPU if they behave similar to the AMD Strix Halo APU)
aurareturn
It's not good value when you put it like that. It doesn't have a lot of compute or bandwidth. What it has is the ability to run DGX software for CUDA devs, I guess. Not a great inference machine either.
conradev
where does an RTX Pro 6000 Blackwell fall in this? I feel like that’s the next step up in performance (and about the same price as two Sparks)
qingcharles
I thought the 6000 was slightly lower throughput than 5090, but obviously has a shitload more RAM.
skhameneh
It's more throughput, but way less value, and there's still no NVLink on the 6000. Something like ~4x the price for ~20% more performance and 3x the VRAM.
There are two models that go by 6000; the RTX Pro 6000 (Blackwell) is the one that's currently relevant.
scosman
How does the process management comparison work for GPU vs full systems?
nodesocket
Once the updated Mac Studio with M4/M5 Ultra comes out, pretty much going to make the DGX irrelevant right?
wmf
Ultras are pretty expensive.
nodesocket
I mean the spark is $3,999 and current M3 Max 28-Core CPU 60-Core GPU is the same price. I would expect the refreshed studio will stay around the same price.
syntaxing
While a completely different price point, I have a Jetson Orin Nano. Some people forget the kernels are more or less set in stone for products like these. I could rebuild my own Jetpack kernel, but it's not straightforward to update something like CUDA or any other module. Unless you're a business whose product relies on this hardware, I find it hard to justify buying this for consumer applications.
coredog64
Came in here to say the same thing. Have bought 3 Nvidia dev boards and never again as you quickly get left behind. You're then stuck compiling everything from scratch.
larodi
My experience with the Jetson Nano was that its Ubuntu had to be debloated first (with a 3rd-party script) before we could get their NN-something library to run the image recognition designed for this device.
These seem to be highly experimental boards, even though they are super powerful for their form factor.
nightski
Am I missing something or does the comparably priced (technically cheaper) Jetson Thor have double the PFLOPs of the Spark with the same memory capacity and similar bandwidth?
Apes
My understanding is the DGX Spark is optimized for training / fine tuning and the Jetson Thor is optimized for running inference.
Architecturally, the DGX Spark has a far better cache setup to feed the GPU, and offers NVLink support.
AlotOfReading
There's a lot of segmentation going on in the Blackwell generation from what I'm told.
modeless
Also Thor is actually getting sent out to robotics companies already. Did anyone outside Nvidia get a DGX Spark yet?
cherioo
The mainstream options seem to be
Ryzen AI Max 395+, ~120 tops (fp8?), 128GB RAM, $1999
Nvidia DGX Spark, ~1000 tops fp4, 128GB RAM, $3999
Mac Studio max spec, ~120 tflops (fp16?), 512GB RAM, 3x bandwidth, $9499
The DGX Spark appears to offer the most tokens per second, but it's less useful/valuable as an everyday PC.
UncleOxidant
> Ryzen AI Max 395+, ~120 tops (fp8?), 128GB RAM, $1999
Just got my Framework PC last week. It's easy to set up to run LLMs locally - you have to use Fedora 42, though, because it has the latest drivers. It was super easy to get qwen3-coder-30b (8-bit quant) running in LM Studio at 36 tok/sec.
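For anyone who'd rather script it than use LM Studio, a minimal llama-cpp-python sketch does roughly the same thing (the GGUF filename here is an assumption; point it at whatever quant you downloaded):

    # Minimal local-inference sketch using llama-cpp-python
    # (pip install llama-cpp-python). Model path is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/qwen3-coder-30b-q8_0.gguf",  # assumed filename
        n_gpu_layers=-1,  # offload every layer to the GPU
        n_ctx=8192,
    )
    out = llm("Write a Python function that reverses a string.", max_tokens=256)
    print(out["choices"][0]["text"])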
jauntywundrkind
The Nvidia Spark is $4,000. Or it will be, supposedly, whenever it comes out.
Also notably, Strix Halo and DGX Spark both have ~275GB/s memory bandwidth. Not always, but in many machine-learning cases, it feels like that's going to be the limiting factor.
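As a back-of-envelope check on that claim: decode speed for a dense model is roughly capped at memory bandwidth divided by the bytes of weights read per generated token. A minimal sketch (the model sizes are illustrative assumptions):

    # Bandwidth-bound ceiling on decode speed for a dense model,
    # assuming all weights are re-read once per generated token.
    def max_tokens_per_sec(bandwidth_gb_s, params_b, bits_per_weight):
        weight_gb = params_b * bits_per_weight / 8
        return bandwidth_gb_s / weight_gb

    print(max_tokens_per_sec(273, 30, 4))  # 30B @ 4-bit on 273GB/s: ~18 tok/s
    print(max_tokens_per_sec(819, 30, 4))  # same model at 819GB/s (M3 Ultra): ~55 tok/s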
rjzzleep
Maybe the real value of the DGX Spark is to work on Switch 2 emulation: ARM + Nvidia GPU. Start with Switch 2 emulation on this machine and then optimize for others. (Yeah, I know, kind of an expensive toy.)
aurareturn
> Mac Studio max spec, ~120 tflops (fp16?), 384GB RAM, 3x bandwidth, $9499

512GB. The DGX has 273GB/s bandwidth, so it wouldn't offer the most tokens/s.
rz2k
Perhaps they are referring to the default GPU allocation, which is 75% of the unified memory, but it is trivial to increase.
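If I recall the knob correctly, and assuming they mean the Mac's default, that cap on Apple Silicon is the iogpu.wired_limit_mb sysctl, so raising it is a one-liner. A sketch (the 112GiB target is just an example, and the setting resets on reboot):

    # Sketch: raise the GPU-wired memory cap on an Apple Silicon Mac.
    # Value is in MB; 114688 MB = 112 GiB out of 128 GiB unified memory.
    import subprocess

    subprocess.run(["sudo", "sysctl", "iogpu.wired_limit_mb=114688"], check=True)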
jauntywundrkind
The GPU memory allocation refers to how capacity is allotted, not bandwidth. Sounds like the same 256-bit/quad-channel 8000MHz LPDDR5 you can get today with Strix Halo.
echelon
tokens/s/$ then.
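Taking that metric literally, combining the bandwidth ceilings from the earlier sketch with the list prices in this thread (all assumptions, not measurements):

    # Bandwidth-bound tok/s per dollar for a 15GB (30B @ 4-bit) model.
    machines = {
        "Strix Halo": (273, 1999),
        "DGX Spark": (273, 3999),
        "M3 Ultra (512GB)": (819, 9499),
    }
    for name, (bw_gb_s, price) in machines.items():
        toks = bw_gb_s / 15  # ceiling from the earlier sketch
        print(f"{name}: {toks:.1f} tok/s, {toks / price * 1000:.2f} tok/s per $1000")

By that crude measure the Strix Halo box comes out ahead, which matches the sentiment elsewhere in the thread.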
garyfirestorm
What did I miss? This was revealed in May; I don't see anything new in that link since then.
wmf
Not much. There was a presentation yesterday but it's mostly what we already knew: https://www.servethehome.com/nvidia-outlines-gb10-soc-archit...
monster_truck
Paper launch. The people I know there who I've asked about it haven't even seen one yet.
fh973
Ordered one in spring. Delivery time was pushed from July to September. Apparently they had a bug in the HDMI output.
ComplexSystems
The RAM bandwidth is so low that you can barely train, run inference, or do anything else on it. I think the only use case they have in mind for this is fine-tuning pretrained models.
wmf
It's the same as Strix Halo and M4 Max that people are going gaga about, so either everyone is wrong or it's fine.
gardnr
Memory Bandwidth:
Nvidia DGX: 273 GB/s
M4 Max: (up to) 546 GB/s
M3 Ultra: 819 GB/s
RTX 5090: ~1.8 TB/s
RTX PRO 6000 Blackwell: ~1.8 TB/s
7thpower
The other ones are not framed as an “AI Supercomputer on your desk”, but instead are framed as powerful computers that can also handle AI workloads.
aurareturn
M4 Max has more than double the bandwidth.
Strix Halo has the same and I agree it’s overrated.
Rohansi
I would expect/hope that DGX would be able to make better use of its bandwidth than the M4 Max. Will need to wait and see benchmarks.
KingOfCoders
I think it depends on your model size.
Fits into 32GB: 5090.
Fits into 64GB-96GB: Mac Studio.
Fits into 128GB: for now the 395+ on $/token/s; the Mac Studio if you don't care about $ but don't have unlimited money for an Hxxx.
This could be great for models that fit in 128GB when you want the best $/token/s (if it is faster than a 395+).
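A quick way to see which bucket a model lands in is to estimate its quantized footprint; a rough sketch (rule of thumb: params × bits/8, plus some headroom for KV cache, both assumptions):

    # Rough fit check: quantized weight size plus KV-cache headroom.
    def fits(params_b, bits, ram_gb, headroom_gb=8):
        size_gb = params_b * bits / 8
        return size_gb + headroom_gb <= ram_gb

    print(fits(70, 4, 32))    # 70B @ 4-bit vs a 32GB 5090: False (~35GB + cache)
    print(fits(70, 8, 96))    # 70B @ 8-bit vs 96GB: True (~70GB + cache)
    print(fits(120, 4, 128))  # 120B @ 4-bit vs 128GB: True (~60GB + cache)

numpad0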
Does anyone know why the official pages don't mention FP16 performance (250 TFLOPS)?
maz1b
Dunno, doesn't seem that good to me. Granted, I recognize the pace of advancement, but FWIW, at the present time... yeah.
I'd rather just get an M3 Ultra. Have an M2 Ultra on the desk, and an M3 Ultra sitting on the desk waiting to be opened. Might need to sell it and shell out the cash for the max ram option. Pricey, but seems worthwhile.
FP4-sparse (TFLOPS) | Price | $/TF4s
5090: 3352 | 1999 | 0.60
Thor: 2070 | 3499 | 1.69
Spark: 1000 | 3999 | 4.00
____________
FP8-dense (TFLOPS) | Price | $/TF8d (4090s have no FP4)
4090 : 661 | 1599 | 2.42
4090 Laptop: 343 | varies | -
____________
Geekbench 6 (compute score) | Price | $/100k
4090: 317800 | 1599 | 503
5090: 387800 | 1999 | 516
M4 Max: 180700 | 1999 | 1106
M3 Ultra: 259700 | 3999 | 1540
____________
Apple NPU TOPS (not GPU-comparable)
M4 Max: 38
M3 Ultra: 36
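For anyone extending these tables, the per-unit columns are just price divided by throughput; a small sketch reproducing them (numbers copied from above):

    # Reproduce the $/TFLOPS and $/100k-Geekbench columns above.
    fp4_sparse = {"5090": (3352, 1999), "Thor": (2070, 3499), "Spark": (1000, 3999)}
    for name, (tflops, price) in fp4_sparse.items():
        print(f"{name}: ${price / tflops:.2f} per FP4-sparse TFLOP")

    gb6 = {"4090": (317800, 1599), "5090": (387800, 1999),
           "M4 Max": (180700, 1999), "M3 Ultra": (259700, 3999)}
    for name, (score, price) in gb6.items():
        print(f"{name}: ${price / score * 100_000:.0f} per 100k Geekbench 6 compute")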