AMD: Microcode Signature Verification Vulnerability

wrs

"A test payload for Milan and Genoa CPUs that makes the RDRAND instruction return 4"... Turns out kernel RNG belt-and-suspenders was justified?

rincebrain

"in theory" Linux, at least, was supposed to use rdrand's output as _one_ source of entropy, not _the only_ source.

No idea what Windows et al. do for this, or if that's still true, but I believe the above description is how the argument was originally settled.

Also, tbh, if you can patch arbitrary instruction behavior, just replacing rdrand seems like far too ham-fisted a tool given the level of versatility in your hands...

Tuna-Fish

Linux's rdrand use is proof against it returning bad output, but it is not proof against malicious microcode. The reason is that malicious microcode can examine register contents and alter the value it returns so that mixing it into the previously gathered entropy produces the desired evil value.
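
A toy sketch of the idea, assuming a plain XOR mix for illustration (the kernel actually hashes inputs together, but microcode that can observe the pool at mix time defeats that just the same):

    /* Toy model: microcode that sees the current pool value can choose
     * its "random" output so the XOR mix lands on an attacker-chosen
     * value. */
    #include <stdint.h>

    uint64_t evil_rdrand(uint64_t pool, uint64_t target)
    {
        return pool ^ target;    /* pool ^ (pool ^ target) == target */
    }

    uint64_t mix(uint64_t pool, uint64_t input)
    {
        return pool ^ input;     /* mixing yields exactly target */
    }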

rincebrain

Sure, but as stated, if you don't trust the CPU at all, rdrand becomes the least of your concerns.

jstasiak

An article mentioning this kind of attack: https://blog.cr.yp.to/20140205-entropy.html

dist-epoch

Windows collects hardware device entropy and mixes it into the pool.

It also persists the pool across reboots so it doesn't start empty.

wahern

For organizations like the NSA, a simple hack like this is the opposite of ham-fisted, though presumably they'd try to ensure the output would pass common statistical tests. The utility in quietly breaking cryptographic protocols is presumably a major reason why it was chosen for the proof-of-concept.

rincebrain

Sure, but that was my point: just overriding rdrand would be much more ham-fisted than any practical attack I would expect someone to weaponize this into outside of a PoC.

dooglius

Intel's theory, I believe, was that it could become the only source. The argument in favor, I think, is the same one you are making now: anything that could hack the chip deeply enough to break rdrand is powerful enough to achieve the same goals another way.

rincebrain

I'm sure Intel loved that idea, given that I don't think RDRAND showed up on AMD chips for 3 years after Intel launched support for it, and that would let them look much better on a number of benchmarks for that duration...

monocasa

I wouldn't focus so much on the rdrand part. That's more a proof that they have crafted custom microcode than a claim that rdrand is broken.

wrs

I think the argument for entropy mixing was that RDRAND could be broken, not that it was broken. On these processors, apparently yes it could.

c2xlZXB5Cg1

saagarjha

I assume the Google engineers have a sense of humor

Lammy

I realize the OP is explicitly (w/r/t 4) a reference to XKCD, but See Also: Dilbert 2001-10-25 :)

https://web.archive.org/web/20011027002011/http://www.dilber...

https://dilbert-viewer.herokuapp.com/2001-10-25

eptcyka

That's a reference to the PS3's RNG.

Karliss

I always thought the same when seeing that XKCD comic, but it turns out it was the opposite: the PS3 hack (2010) came 3 years after the XKCD comic (2007); they even used it as one of the slides during the 27C3 presentation. https://media.ccc.de/v/27c3-4087-en-console_hacking_2010#t=2...

nicman23

oh fuck, I had forgotten what a blast that presentation was

Aardwolf

Can a CPU implement RDRAND however it wants, or are there specifications that constrain it in some way?

E.g. if a CPU were to put a microscopic lavalamp and camera inside it and use a hash of pictures taken from this lavalamp as the result of RDRAND: would that be compliant?

I know a microscopic lavalamp in a CPU is not physically feasible; what I'm asking is whether RDRAND is broken by design, or whether a proper physics-based RNG could be used to implement it if a CPU maker wanted to.

bestouff

Yes! I wonder what the Windows implementation of the equivalent API is.

loeg

> Windows 10 has many entropy sources; these work together to ensure that the OS has good entropy. Different entropy sources guarantee good entropy in different situations; by using them all the best coverage is attained.

> Interrupt Timings

> The primary entropy source in Windows 10 is the interrupt timings. On each interrupt to a CPU the interrupt handler gets the Time Stamp Counter (TSC) from the CPU. This is typically a counter that runs on the CPU clock frequency; on X86 and X64 CPUs this is done using the RDTSC instruction.

> ...

> The Intel RDRAND instruction is an on-demand high quality source of random data.

> If the RDRAND instruction is present, Winload gathers 256 bits of entropy from the RDRAND instruction. Similarly, our kernel-mode code creates a high-pull source that provides 512 bits of entropy from the RDRAND instruction for each reseed. (As a high source, the first 256 bits are always put in pool 0; providing 512 bits ensures that the other pools also get entropy from this source.)

> Due to some unfortunate design decisions in the internal RDRAND logic, the RDRAND instruction only provides random numbers with a 128-bit security level. The Win10 code tries to work around this limitation by gathering a large amount of output from RDRAND, which should trigger a reseed of the RDRAND-internal PRNG to get more entropy. Whilst this solves the problem in most cases, it is possible for another thread to gather similar outputs from RDRAND, which means that a 256-bit security level cannot be guaranteed.

> Based on our feedback about this problem, Intel implemented the RDSEED instruction that gives direct access to the internal entropy source. When the RDSEED instruction is present, it is used in preference to RDRAND instruction which avoids the problem and provides the full desired guarantees. For each reseed, we gather 128 output bytes from RDSEED, hash them with SHA-512 to produce 64 output bytes. As explained before, 32 of these go into pool 0 and the others into the ‘next’ pool for this entropy source.

https://aka.ms/win10rng
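
A rough sketch of that reseed step as quoted above; illustrative only, with sha512() as an assumed helper rather than anything from Windows internals:

    #include <stdint.h>
    #include <string.h>
    #include <immintrin.h>   /* _rdseed64_step; build with -mrdseed */

    /* Assumed helper, e.g. from a crypto library; not defined here. */
    void sha512(const uint8_t *in, size_t len, uint8_t out[64]);

    /* Reseed as described: 128 bytes of RDSEED output, hashed with
     * SHA-512; 32 digest bytes go to pool 0, 32 to the 'next' pool. */
    void reseed(uint8_t pool0[32], uint8_t next_pool[32])
    {
        uint8_t buf[128], digest[64];
        for (int i = 0; i < 128; i += 8) {
            unsigned long long r;
            while (!_rdseed64_step(&r))  /* RDSEED can transiently fail */
                ;
            memcpy(&buf[i], &r, sizeof r);
        }
        sha512(buf, sizeof buf, digest);
        memcpy(pool0, digest, 32);
        memcpy(next_pool, digest + 32, 32);
    }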

cluckindan

Run the patch and find out.

xmodem

Security implications aside, the ability to load custom microcode onto these chips could have fascinating implications for reverse engineering and understanding them better.

mindslight

Security implications? This is a win for distributed security - it's a "vulnerability" in the sense of the end of a movie where the evil overlord's plan falls apart.

> This vulnerability could be used by an adversary to compromise confidential computing workloads protected by the newest version of AMD Secure Encrypted Virtualization, SEV-SNP or to compromise Dynamic Root of Trust Measurement.

I don't know whether the people who write this upside-down corpo newspeak are so coked up on the authoritarian paradigm that they've lost touch with reality, or if they're just paid well enough by the corpos not to care about making society worse, or what. But I'll translate:

This "vulnerability" might be used by the owner of a computer to inspect what their computer is actually doing or to defend themselves against coercion aiming to control the software they're running.

ngneer

I am all for Right to Repair, and am upset with the "evil overlords" as much as the next guy. But to present the treacherous computing argument all on its own is, well, incomplete. I am a huge fan of Ross Anderson, RIP. However, the reality is that while DRTM was originally envisioned primarily for DRM applications, it is nowadays leveraged to protect PCs against bootkits and the like. YOU may like to own your computer through and through, down to the microcode, but for most people, their computer is viewed as a product that they trust vendors to secure for them. If ultimate control is so important to you, can you not buy any machine you like, including one based on an open-source RISC-V processor?

https://community.amd.com/t5/business/amd-and-microsoft-secu...

mindslight

> If ultimate control is so important to you, can you not buy any machine you like, including one based on a RISC-V open-source processor?

This argument (or the individualist approach in general) no longer works in the context of remote attestation ("Dynamic Root of Trust Measurement"). As soon as the "average person" has a computer that can be counted on to betray what software they're running, remote parties will start turning the screws of coercion and making everyone use such a machine.

j16sdiz

I can't follow the logic.

In the context of remote attestation, they can revoke the key, and vulnerabilities like these won't help.

mindslight

I was talking about the direct capabilities.

You're talking about the first order reaction.

The reaction to that reaction is that the noose of remote attestation develops more slowly. As things currently stand, they can't revoke the attestation keys of all the affected processors, with websites (etc.) just telling the large number of people with those computers that they need to buy new ones. Rather, the technological authoritarians have to keep waiting for a working scheme before they can push the expectation that remote attestation is required.

AnotherGoodName

Also, I'm curious if there's an opportunity for an all-out perf variant from homebrewers.

E.g. throw away all Spectre mitigations, find all the hacks to get each instruction's timing down, etc.

basementcat

While you're at it, allow for a few more ULPs (units in the last place) of error for floating-point ops.

basementcat

Kidding aside, my understanding is this sort of thing cannot be microcode patched.

But I would be pleased to be proven wrong.

pabs3

I wonder how much of the built-in microcode you could replace with Free-as-in-Freedom equivalents. I expect not all of it, due to the available SRAM?

p_l

Unless you want to break the CPU, huge portions would have to be bit-for-bit equal, which makes the exercise have far lower value (is it "free as in freedom" if it's effectively a bitwise copy?) while adding legal issues (it would not just be a derivative work, it would be mostly an exact copy).

userbinator

This reminds me of a decade-old utility called "Bulldozer Conditioner" that claimed to increase performance of certain AMD CPUs dramatically (and was verified by benchmarks), yet the author was extremely cagey about the technical details of how it was accomplished. AFAIK no one publicly RE'd and posted information on it that I could find, and I never got around to doing it either, but now I wonder if he had figured out how to modify and optimise(!) the microcode.

monocasa

I was curious, so I took a look at Bulldozer Conditioner. It seems it works by flipping some chicken bits in undocumented (or not very well documented) MSRs; the same kind of registers that were used to disable certain processor optimizations in service of Spectre mitigations.

neuroelectron

It would be interesting to have an open source microcode community.

hedora

As an end user, I wonder how my cloud provider can prove to me that they installed AMD's fix and are not simply running a malicious version of the microcode on their CPU that claims to have the fix.

bri3d

It's in the CVE: "AMD SEV-SNP users can verify the fix by confirming TCB values for SNP in their attestation reports."

You can read about how this works here: https://www.amd.com/content/dam/amd/en/documents/epyc-techni...

If you aren't using SEV-SNP / attested compute, you have bigger fish to fry anyway since you have no actual trust in your hypervisor.
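
As a concrete illustration, a minimal sketch of pulling the microcode SVN out of the TCB_VERSION value in an SNP attestation report (the byte layout here is my reading of the SEV-SNP ABI spec and should be checked against AMD's current documentation):

    #include <stdint.h>

    /* TCB_VERSION is a packed 64-bit value; bits 63:56 are assumed to
     * carry the microcode SVN per the SEV-SNP ABI spec. */
    static uint8_t tcb_microcode_svn(uint64_t tcb_version)
    {
        return (uint8_t)(tcb_version >> 56);
    }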

kccqzy

While I agree with you, from experience most people and most workloads definitely aren't using SEV-SNP. The hypervisor and the cloud provider are always assumed to be honest.

I personally evaluated this technology a few years ago, and even just moving a small part of our team's workload to SEV-SNP faced so much resistance from the VP level. I'm not sure if that's more of a problem with bureaucracy at my old employer or a general problem.

bri3d

Oh, 100%. My point was just: if you're worried that you can't audit your cloud provider's microcode patch level because you aren't using SEV-SNP, you have a threat model where their microcode patch level is not particularly relevant to you to begin with.

kyrra

The large providers (Amazon, Google, Microsoft) buy so many CPUs that, for issues like this, they tend to be given the fixes before they are released to the public. So I'd wager that those three have already patched their entire fleets.

bornfreddy

Try to install your own patch for RDRAND and check that it is returning 4? Of course, getting 4 multiple times doesn't mean you have succeeded [0].

[0] https://duckduckgo.com/?q=dilbert+random+generator+nine+nine... (couldn't find a good link to the comic)
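
Something like this quick userspace check would do; all-4s output suggests the PoC payload is active, though as noted it proves nothing on its own:

    #include <stdio.h>
    #include <immintrin.h>   /* _rdrand64_step; build with -mrdrnd */

    int main(void)
    {
        unsigned long long r;
        for (int i = 0; i < 8; i++) {
            while (!_rdrand64_step(&r))  /* retry on carry-flag failure */
                ;
            printf("%llu\n", r);         /* a healthy RDRAND: 8 different values */
        }
        return 0;
    }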

tedunangst

The exploit doesn't work in a VM.

wmf

In theory the PSP should probably attest the microcode but I don't know if that exists.

bpye

SEV-SNP VMs can obtain an attestation report [0].

[0] - https://www.amd.com/content/dam/amd/en/documents/epyc-techni...

anon2025b

What does it actually attest though?

The running microcode's revision ID?

Or the running microcode's ROM version plus loaded patch lines plus active match registers plus whatever settings were adjusted in config registers during the act of loading?

That is, attest the actual and complete config that is running, or some pointless subset that instills a false sense of security?

It would be good for AMD (and Intel etc.) to provide better details here.

RachelF

Will the OS fix the microcode, or will a BIOS flash be required?

alberth

Don't microcode updates require a restart as well?

formerly_proven

Microcode updates aren't persistent; they're loaded into on-CPU SRAM by the firmware and/or the kernel.

monocasa

They do not.

rincebrain

I don't think you can, necessarily, except by basically declaring bankruptcy on the old trust root on the systems and teaching everyone not to trust the old root.

As long as the vulnerability doesn't let them actually extract the secrets necessary to simulate completely arbitrary operations including with any future keys, I _think_ you can trust the new attestation chain afterward?

I've not been paid to work on this, though, and it would be pretty easy to have accidentally built it in a way where this is a world-ending event, and truly paranoid workloads in the future are going to insist on only using silicon that can't have ever been compromised by this either way.

cluckindan

*hugs the Q6600* Don't you ever die on me, you precious space heater.

account42

"This vulnerability allows an adversary with local administrator privileges (ring 0 from outside a VM) to load malicious microcode patches."

"Vulnerability"

These restrictions should never have been in place in the first place.

ChocolateGod

If the attacker has ring 0 outside a VM, don't they have full access to the memory and execution state anyway?

adrian_b

On AMD server CPUs, the administrator of the cloud/datacenter is supposed to not have access to the encrypted memory used by customers' VMs.

This vulnerability breaks this assumption.

ZiiS

"This vulnerability allows a local administrator (ring 0 from outside a VM) to load clean microcode patches free of the vendor's malicious features."

nubinetwork

I don't know how exploitable this really is, as a lot of Linux systems load microcode at boot time... once it's been loaded, I don't think it's possible to load another one (outside of rebooting).

homebrewer

It is possible, but it's generally not a good idea.

https://wiki.archlinux.org/title/Microcode#Late_loading

https://docs.kernel.org/arch/x86/microcode.html#late-loading

although quotes from this article claim that it's fine specifically on AMD systems:

https://www.phoronix.com/news/AMD-Late-Loading-Microcode
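
For reference, late loading boils down to writing "1" to the kernel's reload file (per the kernel documentation linked above); a trivial sketch:

    #include <stdio.h>

    /* Trigger a late microcode load from /lib/firmware (needs root; see
     * the caveats about late loading in the links above). */
    int main(void)
    {
        FILE *f = fopen("/sys/devices/system/cpu/microcode/reload", "w");
        if (!f) { perror("reload"); return 1; }
        fputs("1", f);
        return fclose(f) ? 1 : 0;
    }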

rincebrain

To my understanding, part of the reason late loading was a problem was that Intel wanted to killswitch feature bits like SGX; since Linux saves the feature-bit state when it inits the CPUs, and other things might change codepaths based on that, if you then killswitch a bit later, boom might go the dynamite.

(I believe this example would also still break on AMD-based systems, AMD just hasn't killswitched a CPUID feature flag yet AFAIR...)

Kab1r

Does anyone know if this is the same vulnerability that ASUS leaked in a beta BIOS?

bri3d

To the very best of my knowledge, yes, it is.

SubzeroCarnage

Reminder that AMD has stopped providing microcode updates for consumer platforms via linux-firmware.

index of linux-firmware, 41 cpus supported: https://github.com/divestedcg/real-ucode/blob/master/index-a...

index of my real-ucode project, 106 cpus supported: https://github.com/divestedcg/real-ucode/blob/master/index-a...

Sadly, unless you have this recent AGESA update, you can no longer load recent microcode due to this fix, which very well means that a substantial number of models whose vendors don't provide a BIOS update for it (since this goes back to Zen 1) will not be able to load any future fixes via microcode.

9cb14c1ec0

Move over Android ROM enthusiasts, real geeks run their own microcode.

techwiz137

How do they know the internal microcode structure of instructions, or even the format?

dboreham

> CPU uses an insecure hash function in the signature validation

Do we know what this "insecure hash function" is/was?

dmitrygr

$5 says CRC32

ngneer

More likely a SHA variant that is susceptible to a second-preimage or collision attack.

thayne

High seems a bit extreme to me. If you have something malicious running in ring 0, you are already in big trouble.

ec109685

The idea of confidential compute / AMD Secure Encrypted Virtualization is that even with root outside the VM, you can't read the memory of the workloads running within a VM. Additionally, those inner workloads can attest that they are running in a secure environment.