
AMD's CDNA 4 Architecture Announcement

jauntywundrkind

Faster small-matrix math, for AI. Yup, that seems like a good fit for what folks want.

Supercharging the Local Data Share (LDS) that's shared by threads is really cool to hear about: capacity goes from 64KB to 160KB, and writes into LDS go from a 32B max to 128B, increasing throughput. There's also hardware support for transposes, to help get data into the right shape for its next use.
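For anyone who hasn't touched GPU kernels: the LDS is what HIP/CUDA code sees as `__shared__` memory, and the classic thing you do with it is a tiled transpose. A minimal sketch of that pattern (generic CUDA/HIP-style code, not CDNA 4-specific; the new hardware transpose and wider LDS writes are meant to speed up exactly this kind of staging):

```cuda
#include <cuda_runtime.h>

#define TILE 32

// Tiled matrix transpose through shared memory (the LDS on AMD hardware).
// Each block stages a TILE x TILE tile in shared memory, then writes it
// back out with rows and columns swapped.
__global__ void transpose(float *out, const float *in, int width, int height) {
    // +1 column of padding avoids shared-memory bank conflicts
    // when reading the tile column-wise below.
    __shared__ float tile[TILE][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < width && y < height)
        tile[threadIdx.y][threadIdx.x] = in[y * width + x];

    __syncthreads();

    // Write out with block and thread coordinates swapped,
    // reading the staged tile transposed.
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < height && y < width)
        out[y * height + x] = tile[threadIdx.x][threadIdx.y];
}
```

The point of the CDNA 4 changes, as I read the announcement, is that both halves of this dance (the wide writes into the LDS and the reshaping on the way out) get faster or move into hardware entirely.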

Really, really curious to see what the unified next-gen UDNA architecture looks like, if they really stick to merging the compute (CDNA) and Radeon (RDNA) lines as promised. If consumers end up getting multi-die compute solutions, that would be neat and also intimidatingly hard (lots of energy spent keeping bits in sync across cores for coherency). Been wondering ever since Navi 4X had its flagship cancelled a while back. I sort of expect this won't scale as nicely as Epyc being a bunch of Ryzen dies. https://wccftech.com/amd-enthusiast-radeon-rx-8000-gpus-alle...

robjeiter

When looking at inference, is AMD already on par with Nvidia?

moondistance

Yes, for many applications.

Meta, OpenAI, Crusoe, and xAI recently announced large purchases of MI300 chips for inference.

MI400, which will be available next year, also looks to be at least on par with Nvidia's roadmap.

moondistance

(This is also why AMD popped 10% at open yesterday: this is a new development, and talks from their 2025 "Advancing AI" event were published late last week and over the weekend.)

christkv

Is the software stack still lacking?

OneDeuxTriSeiGo

Yeah, it's still a few years behind, but it's getting better. They are hiring software and tooling engineers like crazy. I keep tabs on the job openings companies have in our area, and every time I check AMD they have tons of new slots for software, firmware, and tooling (and this has been the case for ~3 years now).

They've been playing catch-up since "the bad old days," when they had to let a bunch of people go to avoid going under, but it looks like they are getting back up to speed. Now it's just a matter of giving all those new engineers a few years to get their software world in order.

moondistance

Yes, big time, but there continues to be lots of progress.

Most importantly, models are maturing, and this means less custom optimization is required.

bee_rider

Machine learning is, of course, a massive market and everybody’s focus.

But does AMD just own the whole HPC stack at this point? (Or would they, if the software were there?)

At least the individual nodes. What’s their equivalent to Infiniband?

phonon

OneDeuxTriSeiGo

It's also worth noting that Ultra Ethernet isn't just an AMD thing. The steering committee for the UEC is made up of basically every hardware manufacturer in the space except Nvidia. And Nvidia is a general contributor as well (presumably so they don't get left behind).

https://ultraethernet.org/

jauntywundrkind

Also, Ultra Ethernet went 1.0 (6 days ago) and had a decent-sized comment thread: https://news.ycombinator.com/item?id=44249190

wmf

Cray Slingshot is even faster than Infiniband.

Now that Nvidia is removing FP64 I assume AMD will have 100% of the HPC market until Fujitsu Monaka comes out.
