Reverse engineering the 386 processor's prefetch queue circuitry

myself248

I remember reading about naive circuits like ripple-carry, where a signal has to propagate across the whole width of a register before it's valid. These seem like they'd only work in systems with very slow clocks relative to the logic itself.
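To make the ripple-carry delay concrete, here is a rough Python sketch. It is not taken from the article, and the function name and the "longest chain" measure are made up for illustration. It adds two numbers one bit at a time and reports the longest run of consecutive stages that produced a carry, which is a rough stand-in for how far a carry has to travel before the result settles.

```python
def ripple_carry_add(a, b, width=8):
    """Add two unsigned integers bit by bit, as a ripple-carry adder would.

    Returns (sum, carry_out, longest_chain). longest_chain is the longest
    run of consecutive bit positions that produced a carry; it is a
    hypothetical proxy for propagation delay, not a real timing model.
    """
    carry = 0
    result = 0
    chain = 0
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        # Full adder: the sum bit and the carry passed to the next stage.
        s = x ^ y ^ carry
        carry = (x & y) | (carry & (x ^ y))
        result |= s << i
        # Track how many adjacent stages in a row produced a carry.
        chain = chain + 1 if carry else 0
        longest = max(longest, chain)
    return result, carry, longest


# Worst case: the carry ripples through all 8 stages.
print(ripple_carry_add(0xFF, 0x01))
# Carries are generated but never travel more than one stage.
print(ripple_carry_add(0x55, 0x55))
```

In the first call every stage has to wait for the one below it, which is the case that faster carry schemes are meant to shorten.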

In this writeup, something that jumps out at me is the use of the equality bus and the Manchester carry chain, and I'm sure there are more similar tricks to do things quickly.

When did the transition happen? Or were the shortcuts always used, and the naive implementations exist only in textbooks?

kens

Well, the Manchester carry chain dates back to 1959. Even the 6502 uses carry skip to increment the PC. As word sizes became larger and transistors became cheaper, implementations became more complex and optimized. And mainframes have been using these tricks forever.

kens

Author here. I hope you're not tired of the 386... Let me know if you have any questions.

sitkack

I'll never tire of any analysis you do. But if you are taking requests, I'd love two chips.

The AMD 29000 series, a RISC chip with many architectural advances that eventually morphed into the K5.

And the Inmos Transputer, a Forth-like chip with built-in scheduling and networking, designed to be networked together into large systems.

https://en.wikipedia.org/wiki/AMD_Am29000

https://en.wikipedia.org/wiki/Transputer

kens

Those would be interesting chips to examine, if I ever get through my current projects :-)

Zeetah

If you are doing requests, I'd love to see the M68k series analyzed.

sitkack

At what number of layers does it become difficult to reverse engineer a processor from die photos? I would think that at some point, functionality would be too obscured to be able to understand the internal operation.

Do they ever put a solid metal top layer?

kens

I've been able to handle the Pentium with 3 metal layers. The trick is that I can remove metal layers to see what is underneath, either chemically or with sanding. Shrinking feature size is a bigger problem since an optical microscope only goes down to about 800 nm.

I haven't seen any chips with a solid metal top layer, since that wouldn't be very useful. Some chips have thick power and ground distribution on the top layer, so the top is essentially solid. Secure chips often cover the top layer with a wire that goes back and forth, so the wire will break if you try to get underneath for probing.

anyfoo

Never, the 386 is way too important.

neuroelectron

Ok, now do 486.

kens

I'm not as interested in the 486; I went straight to the Pentium: https://www.righto.com/2025/03/pentium-multiplier-adder-reve...

guerrilla

I totally agree with your methodology. Stick to the classic leaps.

neuroelectron

Fair enough. But why?

siliconunit

Very nice analysis! Personally I'm a DEC Alpha fan, but I guess that's too big an endeavor (or maybe a selected portion?)

kens

So many chips, so little time :-)

lysace

I miss those dramatic performance leaps in the 80s. 10x in 5 years, give or take.

Now we get like 2x in a decade (single core).

rasz

There was no performance improvement clock for clock between the 286 and the 386 when running contemporary 16-bit code: https://www.vogons.org/viewtopic.php?t=46350