
I built a dual RTX 3090 rig for local AI in 2025 (and lessons learned)

AJRF

Those GPUs are so close to each other; doesn't the heat cause instability?

suladead

I built pretty much this exact rig myself, but now it's gathering dust. Any other uses for this other than local LLMs?

tensorlibb

I'm a huge fan of OpenRouter and their interface for solid LLMs, but I recently jumped into fine-tuning / modifying my own vision models for FPV drone detection (just for fun), and my daily workstation and its 2080 just weren't good enough.

Even in 2025, it's cool how solid a setup dual 3090s still is. NVLink is an absolute must, but it's incredibly powerful. I'm able to run the latest Mistral thinking models and relatively powerful YOLO-based VLMs like the ones RoboFlow is based on.

Curious if anyone else is still using 3090s or has feedback on scaling up to 4-6 3090s.

Thanks everyone ;)

vladgur

I am exploring options just for fun.

A used 3090 is around $900 on eBay; a used RTX 6000 Ada is around $5k.

4 3090s are slower at inference and worse at training than 1 RTX 6000.

4x3090 would consume 1400W at load.

An RTX 6000 would consume 300W at load.

If, god forbid, you live in California and your power averages 45 cents per kWh, 4x3090 would cost $1500+ more per year to operate than a single RTX 6000 [0].

[0] Back-of-the-napkin/ChatGPT calculation, assuming the GPUs run at load for 8 hours per day.
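
For anyone who wants to redo the napkin math, here's a rough sketch (Python) using the same assumptions as above: 1400W vs 300W, 8 hours a day at load, $0.45/kWh.

    # Back-of-the-napkin yearly electricity cost, using the numbers above.
    # 1400 W for 4x 3090 vs 300 W for one RTX 6000, 8 h/day at load, $0.45/kWh.
    HOURS_PER_YEAR = 8 * 365      # hours at load per year
    RATE = 0.45                   # $ per kWh

    def yearly_cost(watts):
        return watts / 1000 * HOURS_PER_YEAR * RATE

    quad_3090 = yearly_cost(1400)   # ~$1,840
    rtx_6000 = yearly_cost(300)     # ~$394
    print(f"difference: ${quad_3090 - rtx_6000:,.0f} per year")  # ~$1,445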

Note: I own a PC with a 3090, but if I had to build an AI training workstation, I would seriously consider cost to operate and resale value (per component).

logicallee

>I am exploring options just for fun.

Since you're exploring options just for fun, out of curiosity: would you rent it out whenever you're not using it yourself, so it's not just sitting idle? (It could be noisy, though.) You'd be able to use your computer for other work at the same time and stop whenever you wanted to use it yourself.

jacquesm

I've built a rig with 14 of them. NVLink is not 'an absolute must'; it can be useful depending on the model, the application software you use, and whether you're training or inferring.

The most important figure is the power consumed per token generated. You can optimize for that and get to a reasonably efficient system, or you can maximize token generation speed and end up with twice the power consumption for very little gain. You will also likely need a way to get rid of excess heat, and all those fans get loud. I stuck the system in my garage; that made the noise much more manageable.
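
To make that metric concrete, here is a tiny sketch of joules per token; the wattages and throughput figures are made-up examples, not measurements from this rig.

    # Energy per generated token, the figure worth optimizing for.
    # The wattages and tokens/sec below are made-up examples, not measurements.
    def joules_per_token(power_watts, tokens_per_sec):
        return power_watts / tokens_per_sec   # watts / (tokens/s) = J per token

    tuned = joules_per_token(1800, 30)       # power-limited cards: 60 J/token
    max_speed = joules_per_token(3600, 35)   # everything uncapped: ~103 J/token
    # twice the power draw for a ~17% speed gain -- much worse J/token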

breakds

I am curious about the setup of 14 GPUs - what kind of platform (motherboard) do you use to support so many PCIe lanes? And do you even have a chassis? Is it rack-mounted? Thanks!

jacquesm

I used a large Supermicro server chassis, a dual Xeon motherboard with seven 8-lane PCI Express slots, all the RAM it would take (bought second hand), splitters, and four massive power supplies. I extended the server chassis with aluminum angle riveted onto the base. It could be rack-mounted, but I'd hate to be the person lifting it in. The 3090s were a mix: 10 of the same type (small, and with blower-style fans on them) and 4 much larger ones that were kind of hard to accommodate (much wider and longer). I've linked to the splitter board manufacturer in another comment in this thread. That's the 'hard to get' component, but once you have those and good cables to go with them, the remaining setup problems are mostly power and heat management.

fxtentacle

The 3090 is a sweet spot for training. It's the first generation with seriously fast VRAM, and the last generation before Nvidia blocked NVLink. If you need to copy parameters between GPUs during training, the 3090 can be up to 70% faster than a 4090 or 5090, because the latter two are limited by PCI Express bandwidth.
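
A rough sketch of why the link matters when syncing gradients between cards; the bandwidth figures are ballpark assumptions (roughly 50 GB/s per direction for the 3090's NVLink bridge, roughly 25 GB/s realistic over PCIe 4.0 x16), not benchmarked numbers.

    # Rough time to move one full set of fp16 gradients between two cards.
    # Bandwidths are ballpark assumptions, not measured values.
    def sync_seconds(n_params, bytes_per_param, gb_per_sec):
        return n_params * bytes_per_param / (gb_per_sec * 1e9)

    n = 7e9   # a 7B-parameter model, fp16 gradients
    print(sync_seconds(n, 2, 50))   # NVLink: ~0.28 s per full sync
    print(sync_seconds(n, 2, 25))   # PCIe:   ~0.56 s per full sync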

jacquesm

To be fair though, the 4090 and 5090 are much more capable of saturating PCI Express than the 3090 is. Even at 4 lanes per card the 3090 rarely manages to saturate the links, so it still handsomely pays off to split down to 4 lanes and add more cards.

I used:

https://c-payne.com/

Very high quality and manageable prices.

CraigJPerry

If it's just for detection, wouldn't audio be cheaper to process?

I'm imagining a cluster of directional microphones, and then I don't know whether it's better to perform some sort of band-pass filtering first, since it's so computationally cheap, or to just feed everything into the model directly. No idea.
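
Something like this is what I have in mind for the cheap band-pass step (a SciPy sketch; the 200-8000 Hz passband is a placeholder guess at where prop noise sits, not a tuned value):

    # Cheap band-pass pre-filter sketch with SciPy. The 200-8000 Hz passband
    # is a placeholder guess at where prop noise sits, not a tuned value.
    import numpy as np
    from scipy.signal import butter, sosfilt

    def bandpass(audio, sample_rate, low_hz=200.0, high_hz=8000.0, order=4):
        sos = butter(order, [low_hz, high_hz], btype="bandpass",
                     fs=sample_rate, output="sos")
        return sosfilt(sos, audio)

    sr = 44_100
    filtered = bandpass(np.random.randn(sr), sr)   # one second of mic input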

I guess my first thought was just that the sound from a drone is likely detectable reliably at a greater distance than the visual; they're so small, and a 180-degree by 180-degree hemisphere of pixels is a lot to process.

Fun problem either way.

ivape

Are you using it enough to see it on your electric bill? 4-6 cards would have to start showing up on that bill at some point.

jszymborski

I just don't get why the RTX 4090 is still so expensive on the used market. New RTX 5090s are almost as expensive!

tayo42

Are these just for AI now? Or are games pushing video cards that much?

renewiltord

They're dropping. I'm trying to offload 8x 4090s and I'll average $1500 I think.

username12349

total cost?

bigiain

It says $3090 (maybe easy to miss since it also talks about RTX 3090s?)

jszymborski

It's written quite large on the page: just over $3K.