
Nvidia's RTX Pro 6000 has 96GB of VRAM and 600W of power

standardly

Keep in mind the significant cost savings, given that it doubles as a home heating solution

jauntywundrkind

Just keep in mind that a heat pump will often be 300-400% efficient at adding heat. This is 100% efficient and for once that's not actually very good.
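The efficiency gap works out to a big difference in heat per watt of electricity. A rough sketch, with the heat pump's COP (coefficient of performance) taken as an illustrative mid-range assumption:

```python
# Rough comparison: heat delivered per watt of electricity for a
# 600W resistive load (the GPU) vs. a heat pump.
# The COP figure for the heat pump is an assumed, illustrative value.

gpu_watts = 600          # electrical draw, all dissipated as heat
cop_resistive = 1.0      # resistive heating: 1W in -> 1W of heat
cop_heat_pump = 3.5      # typical mid-range heat pump COP (assumption)

heat_from_gpu = gpu_watts * cop_resistive    # 600 W of heat
heat_from_pump = gpu_watts * cop_heat_pump   # 2100 W of heat for the same draw

print(f"GPU as heater: {heat_from_gpu:.0f} W of heat")
print(f"Heat pump at same draw: {heat_from_pump:.0f} W of heat")
```

So for the same 600W off the wall, a heat pump moves roughly 3-4x as much heat into the room as the GPU dissipates.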

chneu

Every room in my house with a desktop computer is consistently +5°F warmer than the rest. There's some difference because of monitors and sun exposure, but yeah.

I actually did have to alter my HVAC automation to account for it, lol.

pwr22

The more you buy, the more you save!

Joel_Mckay

We had a GPU cluster in an office building, and were asked to vacate because the HVAC for the entire floor was overwhelmed. It smelled like a tropical beach dumpster most of the time, and the lights would dim when a new job was queued.

ML is so boring, =3

bryanlarsen

No price indicated. If you have to ask, you're not the market.

whatever1

A blank check to Nvidia will get you a spot in their buying invitation raffle.

caycep

Wasn't the Ada edition like $6k or so?

ein0p

Good luck finding it for less than $9K, and it has half the VRAM and an older chip. I predict an MSRP of no less than $15K, with IRL prices above $20K.

blitzar

Ouch, I was just asking for a friend.

pacetherace

Send the bill to my manager

hooloovoo_zoo

It still has to compete with renting actual professional cards.

42lux

These are workstation cards, and a lot of professionals need them for work other than ML. The elitist attitude is pretty amusing, though.

hooloovoo_zoo

It's not elitist. The Nvidia 'pro' cards (Quadro etc.) have always been a slightly unlocked, wildly more expensive version of the consumer cards. The V100, A100, and H100 are meaningful hardware upgrades over the consumer line.

blitzar

> need them to do other work than ML

I need more frames in CS2.

jdprgm

vram mafia

1024core

What keeps the memory limited to 96GB? Could one put 512GB of memory on a card? I'm curious what the limiting factor is.

jsheard

GDDR memory buses are so fast that the RAM chips have to be packed tightly around the GPU core to maintain signal integrity, so the limit is more or less how many chips they can physically fit multiplied by the biggest chip capacity their suppliers can provide.

codedokode

But theoretically, RAM chips do not need to be synchronous with each other. Moreover, the data lanes on the chip do not need to be synchronous; you can treat each lane as an independent serial channel. And GDDR latency is high enough that longer lanes wouldn't change anything.

etiam

Does that mean it's perfectly feasible to have more if one accepts a higher latency? Seems like there could be plenty of use cases where that's preferable.

Lramseyer

Not exactly. The name of the game with GDDR memory is "speed on the cheap." To do this, it uses a parallel bus with data rates pushed to the max. Not much headroom for things that could compromise signal integrity like socketed parts, or even board traces longer than they absolutely need to be. That's why the DRAM modules are close to the GPU and they're always soldered down.

Also, the latency with GDDR7 is pretty terrible. It uses PAM3 signaling with a cursed packet encoding scheme. At least they were nice enough to add in a static data scrambler this time around! The lack of RLL was kind of a pain in GDDR6.

codedokode

GDDR chips already have very high latency.

lazide

Also limited by heat dissipation.

pavlov

Apple sells a Mac Studio with the M3 Ultra chip and 512GB VRAM (unified memory between CPU and GPU). It costs $9,500.

Their secret is that the memory is manufactured within the chip package.

jayd16

LPDDR5X ~550GB/s vs GDDR7 which is ~1.8 TB/s

angoragoats

This doesn’t really answer the parent’s question.

That the memory is on a PCB close to the CPU/GPU certainly helps with signal integrity, but it is not the relevant factor here. The Apple platform has high memory bandwidth compared to x86 PCs because the CPU has a wide memory bus. You can get similar memory bandwidth out of high-end Epyc and Xeon CPUs, which use standard DIMMs but with many more memory channels than a regular desktop computer.

codedokode

Why can't NVIDIA manufacture 512Gb chips and put 16 of them on the board?

jsheard

Apple's architecture comes with its own trade-offs: it gives them huge capacity and pretty good bandwidth, but not nearly as much as Nvidia's architectures have. The M3 Ultra is 800GB/s, the RTX 5090 is 1.8TB/s, and the H200 is 4.8TB/s(!). Huge capacity with middling bandwidth is in vogue because it's a good fit for AI inference, but AI training and most other applications of GPUs need as much bandwidth as they can get.
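Those headline numbers fall out of a simple formula: peak bandwidth is roughly bus width (in bytes) times per-pin data rate. A sketch, with the bus widths and per-pin rates below being assumed figures for illustration:

```python
# Peak theoretical bandwidth ~= (bus width in bytes) x (per-pin data rate).
# Bus widths and per-pin data rates are assumed figures for illustration.

def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    """Peak theoretical bandwidth in GB/s."""
    return (bus_width_bits / 8) * gbps_per_pin

m3_ultra = bandwidth_gb_s(1024, 6.4)  # ~1024-bit LPDDR5X at ~6.4 Gbps/pin
rtx_5090 = bandwidth_gb_s(512, 28)    # 512-bit GDDR7 at ~28 Gbps/pin

print(f"M3 Ultra: ~{m3_ultra:.0f} GB/s")  # ~819 GB/s
print(f"RTX 5090: ~{rtx_5090:.0f} GB/s")  # ~1792 GB/s
```

Which lines up with the ~800GB/s and ~1.8TB/s figures above: Apple gets there with a very wide bus of slower LPDDR, Nvidia with a narrower bus of much faster GDDR.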

ein0p

It's actually not within the chip's package; it's soldered to the board. It's just regular, fairly high-spec LPDDR5X IIRC; there are just a TON of memory channels.

xxs

The statement is correct; it's not on the substrate like AMD's 3D cache, and it doesn't use an interposer like HBM.

You can think of it as a small PCB with the CPU die and the memory soldered very close by (plus, as mentioned, the extra memory channels).

pavlov

It's not in the package? TIL... My misunderstanding seems to be common across the interwebs.

I remember Apple used to show slides depicting the M1 SoC as one unit containing a CPU, GPU, Neural Engine, cache, and DRAM all together. But slides shown at an Apple event definitely qualify for artistic license.

utf_8x

While there are definitely physical limits, the core limitation here is greed: they would sell fewer cards. It's the same reason their consumer cards are limited to ridiculously low amounts of VRAM (16GB on an RTX 5080, only 8GB on the RTX 4060, etc.), so if you want to do any serious AI you have to buy their overpriced enterprise cards.

GuuD

Bandwidth, and GPU real estate in terms of area. The biggest GDDR7 chips are 3GB with a 32-bit interface, and the card has a 512-bit bus. And even this is going to moonlight as a space heater.
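Those numbers account for the 96GB exactly, if you add one assumption: clamshell mode (two chips sharing each 32-bit channel), which is how workstation cards typically double capacity over their gaming siblings. A sketch of the arithmetic:

```python
# Back-of-envelope VRAM capacity from the bus-width figures above.
# Assumptions: 512-bit bus, 32-bit GDDR7 chips at 3GB each, and
# "clamshell" mode (two chips per channel, assumed for the Pro card).

bus_width_bits = 512
chip_width_bits = 32
chip_capacity_gb = 3
chips_per_channel = 2  # clamshell (assumption)

channels = bus_width_bits // chip_width_bits  # 16 channels
total_gb = channels * chips_per_channel * chip_capacity_gb

print(f"{channels} channels x {chips_per_channel} chips x "
      f"{chip_capacity_gb}GB = {total_gb}GB")
```

So 96GB is simply the most memory the bus and the biggest available GDDR7 dies allow; going higher would need a wider bus or denser chips.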

codedokode

Why can Apple pack something like 64 Gb on the CPU chip and NVIDIA cannot?

bryanlarsen

Apple uses LPDDR4, which comes in densities of up to 16Gb AFAICT. So they can have ~5x as much memory using the same bus width.

wmf

LPDDR dies can stack but GDDR cannot?

blitzar

Planned obsolescence, got to sell 7xxx cards somehow.

YetAnotherNick

What's the use case for 512GB of memory that you cannot achieve through multiple GPUs? Maybe you can make it a bit cheaper since you don't require multiple chips, but I would say it's only a maybe, because the chip is not the costliest part for Nvidia to manufacture; the memory is[1].

[1]: https://www.nextplatform.com/2024/02/27/he-who-can-pay-top-d...

immibis

One card with twice the memory would let you run the model in half as many cards, at half the speed.

magicalhippo

Will be fun to see if it has all the ROPs it should[1], or if NVIDIA has really gone all in on making this the worst product launch ever...

[1]: https://www.techpowerup.com/332884/nvidia-geforce-rtx-50-car...

jsheard

What a beast. Like past generations, there is a variant with a blower-style cooler which is limited to 300W, but now they're also doing variants modelled after their gaming cards with even higher TDPs. Triple the memory of the gaming flagship, too; you used to only get double.

zamadatix

This looks like the same TDP as the gaming flagship of the same chip (5090, also GB202 based).

jsheard

The 5090 reference design is 575W. Not a huge difference, but the workstation card is slightly more.

zokier

As far as I can tell, it uses the same infamous power connector as 5090. I wonder if there are any differences there, maybe some additional balancing/safety features?

kelseyfrog

I just hope whatever organ I’m selling to afford this GPU is one of the paired ones.

theandrewbailey

Too bad SLI isn't a thing anymore.

0cf8612b2e1e

Is this a real 600W, or 750W in burst mode? I'm too accustomed to the TDP lies from CPUs.

xxs

The T is for thermal; it's mostly about the need to dissipate that much heat on average, not peak (or transient) power.

torginus

> 600W

is this a sign that semiconductor scaling is completely dead now?

adrian_b

Not yet, but the NVIDIA RTX 5000 (Blackwell) series does not use a newer manufacturing process, so its energy efficiency is slightly worse than that of the RTX 4000 SUPER (Ada) series, which remains the most efficient GPU line (e.g. the RTX 4080 SUPER).

The RTX 5000 (Blackwell) series has increased performance only by using bigger chips and higher power consumption. The RTX Pro Blackwell series uses the same chips as the consumer series.

Hiko0

600W of power? Would you sell a car with "35l/100km of power"?

noqc

what on earth is wrong with watts as a unit of power?

quickthrowman

Would you rather have it say 50A @ 12V? The headline is written incorrectly (because the article writer didn’t write it), but the article says ‘needs 600W of power’.

Headlines are misleading, film at 11.