Nvidia DGX Spark
94 comments
August 24, 2025
hereme888
Y_Y
You are doing god's work.
In fact you're also doing the work Nvidia should have done when they put together their (imho) ridiculously imprecise spec sheet.
canucker2016
5090: 32GB RAM (Newegg & Amazon lowest prices seem to be $300+ over MSRP)
4090: 24GB RAM
Thor & Spark: 128GB RAM (probably at least 96GB usable by the GPU if they behave similar to the AMD Strix Halo APU)
aurareturn
It's not good value when you put it like that. It doesn't have a lot of compute or bandwidth. What it has is the ability to run DGX software for CUDA devs, I guess. Not a great inference machine either.
conradev
where does an RTX Pro 6000 Blackwell fall in this? I feel like that’s the next step up in performance (and about the same price as two Sparks)
qingcharles
I thought the 6000 was slightly lower throughput than 5090, but obviously has a shitload more RAM.
skhameneh
It's more throughput, but way less value, and there's still no NVLink on the 6000. Something like ~4x the price for ~20% more performance and 3x the VRAM.
There are two models that go by 6000; the RTX Pro 6000 (Blackwell) is the one that's currently relevant.
scosman
How does the process management comparison work for GPU vs full systems?
nodesocket
Once the updated Mac Studio with M4/M5 Ultra comes out, pretty much going to make the DGX irrelevant right?
wmf
Ultras are pretty expensive.
nodesocket
I mean the spark is $3,999 and current M3 Max 28-Core CPU 60-Core GPU is the same price. I would expect the refreshed studio will stay around the same price.
syntaxing
While a completely different price point, I have a Jetson Orin Nano. Some people forget the kernels are more or less set in stone for products like these. I could rebuild my own Jetpack kernel, but it's not straightforward to update something like CUDA or any other module. Unless you're a business whose product relies on this hardware, I find it hard to justify buying this for consumer applications.
coredog64
Came in here to say the same thing. Have bought 3 Nvidia dev boards and never again as you quickly get left behind. You're then stuck compiling everything from scratch.
larodi
My experience with the Jetson Nano was that its Ubuntu had to be debloated first (with a 3rd-party script) before we could get their NN-something library to run the image recognition designed for this device.
These seem to be highly experimental boards, even though they are super powerful for their form factor.
nightski
Am I missing something or does the comparably priced (technically cheaper) Jetson Thor have double the PFLOPs of the Spark with the same memory capacity and similar bandwidth?
Apes
My understanding is the DGX Spark is optimized for training / fine tuning and the Jetson Thor is optimized for running inference.
Architecturally, the DGX Spark has a far better cache setup to feed the GPU, and offers NVLink support.
AlotOfReading
There's a lot of segmentation going on in the Blackwell generation from what I'm told.
modeless
Also Thor is actually getting sent out to robotics companies already. Did anyone outside Nvidia get a DGX Spark yet?
cherioo
The mainstream options seem to be
Ryzen AI Max 395+, ~120 tops (fp8?), 128GB RAM, $1999
Nvidia DGX Spark, ~1000 tops fp4, 128GB RAM, $3999
Mac Studio max spec, ~120 tflops (fp16?), 512GB RAM, 3x bandwidth, $9499
The DGX Spark appears to offer the most tokens per second, but it's less useful/valuable as an everyday PC.
UncleOxidant
> Ryzen AI Max 395+, ~120 tops (fp8?), 128GB RAM, $1999
Just got my Framework PC last week. It's easy to set up to run LLMs locally - you have to use Fedora 42, though, because it has the latest drivers. It was super easy to get qwen3-coder-30b (8-bit quant) running in LM Studio at 36 tok/sec.
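For anyone who'd rather script it than use LM Studio, a minimal llama-cpp-python sketch does roughly the same thing (the GGUF filename here is an assumption; point it at whatever quant you downloaded):

    # Minimal local-inference sketch using llama-cpp-python
    # (pip install llama-cpp-python). Model path is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/qwen3-coder-30b-q8_0.gguf",  # assumed filename
        n_gpu_layers=-1,  # offload every layer to the GPU
        n_ctx=8192,
    )
    out = llm("Write a Python function that reverses a string.", max_tokens=256)
    print(out["choices"][0]["text"])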
jauntywundrkind
The Nvidia Spark is $4,000. Or it will be, supposedly, whenever it comes out.
Also notably, Strix Halo and DGX Spark both have ~275GB/s memory bandwidth. Not always, but in many machine-learning cases, it feels like that's going to be the limiting factor.
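As a back-of-envelope check on that claim: decode speed for a dense model is roughly capped at memory bandwidth divided by the bytes of weights read per generated token. A minimal sketch (the model sizes are illustrative assumptions):

    # Bandwidth-bound ceiling on decode speed for a dense model,
    # assuming all weights are re-read once per generated token.
    def max_tokens_per_sec(bandwidth_gb_s, params_b, bits_per_weight):
        weight_gb = params_b * bits_per_weight / 8
        return bandwidth_gb_s / weight_gb

    print(max_tokens_per_sec(273, 30, 4))  # 30B @ 4-bit on 273GB/s: ~18 tok/s
    print(max_tokens_per_sec(819, 30, 4))  # same model at 819GB/s (M3 Ultra): ~55 tok/s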
rjzzleep
Maybe the real value of the DGX Spark is to work on Switch 2 emulation: ARM + Nvidia GPU. Start with Switch 2 emulation on this machine and then optimize for others. (Yeah, I know, kind of an expensive toy.)
aurareturn
> Mac Studio max spec, ~120 tflops (fp16?), 384GB RAM, 3x bandwidth, $9499

512GB. The DGX has 273GB/s bandwidth, so it wouldn't offer the most tokens/s.
rz2k
Perhaps they are referring to the default GPU allocation, which is 75% of the unified memory, but it is trivial to increase.
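If I recall the knob correctly, and assuming they mean the Mac's default, that cap on Apple Silicon is the iogpu.wired_limit_mb sysctl, so raising it is a one-liner. A sketch (the 112GiB target is just an example, and the setting resets on reboot):

    # Sketch: raise the GPU-wired memory cap on an Apple Silicon Mac.
    # Value is in MB; 114688 MB = 112 GiB out of 128 GiB unified memory.
    import subprocess

    subprocess.run(["sudo", "sysctl", "iogpu.wired_limit_mb=114688"], check=True)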
jauntywundrkind
The GPU memory allocation refers to how capacity is allotted, not bandwidth. Sounds like the same 256-bit/quad-channel 8000MHz LPDDR5 you can get today with Strix Halo.
echelon
tokens/s/$ then.
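Taking that metric literally, combining the bandwidth ceilings from the earlier sketch with the list prices in this thread (all assumptions, not measurements):

    # Bandwidth-bound tok/s per dollar for a 15GB (30B @ 4-bit) model.
    machines = {
        "Strix Halo": (273, 1999),
        "DGX Spark": (273, 3999),
        "M3 Ultra (512GB)": (819, 9499),
    }
    for name, (bw_gb_s, price) in machines.items():
        toks = bw_gb_s / 15  # ceiling from the earlier sketch
        print(f"{name}: {toks:.1f} tok/s, {toks / price * 1000:.2f} tok/s per $1000")

By that crude measure the Strix Halo box comes out ahead, which matches the sentiment elsewhere in the thread.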
garyfirestorm
What did I miss? This was revealed in May; I don't see anything new in that link since then.
wmf
Not much. There was a presentation yesterday but it's mostly what we already knew: https://www.servethehome.com/nvidia-outlines-gb10-soc-archit...
monster_truck
Paper launch. The people I know there who I've asked about it haven't even seen one yet.
fh973
Ordered one in spring. Delivery time was pushed from July to September. Apparently they had a bug in the HDMI output.
ComplexSystems
The RAM bandwidth is so low that you can barely train, run inference, or do anything else on it. I think the only use case they have in mind for this is fine-tuning pretrained models.
wmf
It's the same as Strix Halo and M4 Max that people are going gaga about, so either everyone is wrong or it's fine.
gardnr
Memory Bandwidth:
Nvidia DGX: 273 GB/s
M4 Max: (up to) 546 GB/s
M3 Ultra: 819 GB/s
RTX 5090: ~1.8 TB/s
RTX PRO 6000 Blackwell: ~1.8 TB/s
7thpower
The other ones are not framed as an “AI Supercomputer on your desk”, but instead are framed as powerful computers that can also handle AI workloads.
aurareturn
M4 Max has more than double the bandwidth.
Strix Halo has the same and I agree it’s overrated.
Rohansi
I would expect/hope that DGX would be able to make better use of its bandwidth than the M4 Max. Will need to wait and see benchmarks.
KingOfCoders
I think it depends on your model size.
Fits into 32GB: 5090.
Fits into 64GB-96GB: Mac Studio.
Fits into 128GB: for now the 395+ on $/token/s; the Mac Studio if you don't care about $ but don't have unlimited money for an Hxxx.
This could be great for models that fit in 128GB when you want the best $/token/s (if it is faster than a 395+).
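A quick way to see which bucket a model lands in is to estimate its quantized footprint; a rough sketch (rule of thumb: params × bits/8, plus some headroom for KV cache, both assumptions):

    # Rough fit check: quantized weight size plus KV-cache headroom.
    def fits(params_b, bits, ram_gb, headroom_gb=8):
        size_gb = params_b * bits / 8
        return size_gb + headroom_gb <= ram_gb

    print(fits(70, 4, 32))    # 70B @ 4-bit vs a 32GB 5090: False (~35GB + cache)
    print(fits(70, 8, 96))    # 70B @ 8-bit vs 96GB: True (~70GB + cache)
    print(fits(120, 4, 128))  # 120B @ 4-bit vs 128GB: True (~60GB + cache)

numpad0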
Does anyone know why the official pages don't mention FP16 performance (250 TFLOPS)?
maz1b
Dunno, doesn't seem that good to me. Granted, I recognize the pace of advancement, but FWIW, at the present time... yeah.
I'd rather just get an M3 Ultra. Have an M2 Ultra on the desk, and an M3 Ultra sitting on the desk waiting to be opened. Might need to sell it and shell out the cash for the max ram option. Pricey, but seems worthwhile.
FP4-sparse (TFLOPS) | Price | $/TF4s
5090: 3352 | 1999 | 0.60
Thor: 2070 | 3499 | 1.69
Spark: 1000 | 3999 | 4.00
____________
FP8-dense (TFLOPS) | Price | $/TF8d (4090s have no FP4)
4090 : 661 | 1599 | 2.42
4090 Laptop: 343 | varies | -
____________
Geekbench 6 (compute score) | Price | $/100k
4090: 317800 | 1599 | 503
5090: 387800 | 1999 | 516
M4 Max: 180700 | 1999 | 1106
M3 Ultra: 259700 | 3999 | 1540
____________
Apple NPU TOPS (not GPU-comparable)
M4 Max: 38
M3 Ultra: 36
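For anyone extending these tables, the per-unit columns are just price divided by throughput; a small sketch reproducing them (numbers copied from above):

    # Reproduce the $/TFLOPS and $/100k-Geekbench columns above.
    fp4_sparse = {"5090": (3352, 1999), "Thor": (2070, 3499), "Spark": (1000, 3999)}
    for name, (tflops, price) in fp4_sparse.items():
        print(f"{name}: ${price / tflops:.2f} per FP4-sparse TFLOP")

    gb6 = {"4090": (317800, 1599), "5090": (387800, 1999),
           "M4 Max": (180700, 1999), "M3 Ultra": (259700, 3999)}
    for name, (score, price) in gb6.items():
        print(f"{name}: ${price / score * 100_000:.0f} per 100k Geekbench 6 compute")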