
A Photonic SRAM with Embedded XOR Logic for Ultra-Fast In-Memory Computing

Scene_Cast2

Something I've never quite understood is where in-memory computing approaches lie on the spectrum of mainstream vs. niche. What are the proposed use cases?

I understand that you can get highly power-efficient XORs, for example. But if we go down this path, would they help with a matrix multiply? Or the bias term of an FFN? Would there be any improvement (i.e. is there anything to offload) in regular business logic? Should I think of it as a more efficient but highly limited DSP? Or as a fixed-function accelerator replacement (e.g. "we want to encrypt this segment of memory")?
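(To make that last example concrete, here's roughly the loop I mean: stream-cipher-style encryption is just a bulk XOR of a memory segment with a keystream, exactly the kind of work an in-memory XOR could in principle absorb. A minimal Python sketch with illustrative names:)

    # Hypothetical sketch: the loop an in-memory XOR engine could absorb.
    # Stream-cipher-style encryption is a bulk XOR of data with a keystream.
    def xor_encrypt_segment(data: bytes, keystream: bytes) -> bytes:
        assert len(keystream) >= len(data)
        # One XOR per output byte; the cost today is dominated by moving
        # data and keystream through the memory hierarchy, not the XOR.
        return bytes(d ^ k for d, k in zip(data, keystream))

    segment = b"hello, memory"
    key = bytes(range(len(segment)))
    # XOR is its own inverse, so encrypting twice round-trips the data.
    assert xor_encrypt_segment(xor_encrypt_segment(segment, key), key) == segment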

roflmaostc

The main promises of optical computing are lower energy consumption, lower latency, and higher single-core speed.

For example, in this work: Lin, Z., Shastri, B.J., Yu, S. et al., "120 GOPS Photonic tensor core in thin-film lithium niobate for inference and in situ training," Nat Commun 15, 9081 (2024), https://doi.org/10.1038/s41467-024-53261-x

they achieve a "weight update speed of 60 GHz", which is much faster than the average ~3-4 GHz CPU clock.
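(Back-of-envelope, just dividing the two headline numbers:)

    # Rough comparison of the two headline rates (illustrative, not from the paper):
    photonic_update_hz = 60e9   # "weight update speed of 60 GHz" claimed above
    cpu_clock_hz = 3.5e9        # ballpark clock of a mainstream CPU core
    ratio = photonic_update_hz / cpu_clock_hz
    print(f"~{ratio:.0f}x more update slots per second")   # ~17x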

GloamingNiblets

The von Neumann architecture is not ideal for all use cases; ML training and inference are hugely memory-bound, and a ton of energy is spent moving network weights around for just a few OPs. Our own squishy neural networks can be viewed as a form of in-memory computing: synapses both store network properties and execute the computation (there's no need to read synapse weights out for calculation elsewhere).
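(To put a rough number on "just a few OPs": the matrix-vector products that dominate inference do about one operation per weight byte moved. Illustrative sizes:)

    # Rough arithmetic-intensity estimate for y = W @ x (illustrative sizes):
    rows, cols = 4096, 4096
    flops = 2 * rows * cols          # one multiply + one add per weight
    bytes_moved = 2 * rows * cols    # each fp16 weight (2 bytes) read once
    print(flops / bytes_moved, "FLOPs per byte moved")   # 1.0
    # Modern chips need tens-to-hundreds of FLOPs per byte to be
    # compute-bound, so this op is dominated by moving the weights around.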

It's still very niche but could offer enormous power savings for ML inference.

larodi

Sooner or later we'll get NRAM (neural RAM) as an extension, which is basically a neuromorphic lattice that can be wired at a very low level, perhaps even at the photonic level, and then the whole AI thing trains/lives in it.

IBM is experimenting in this direction, or at least claims to: https://www.ibm.com/think/topics/neuromorphic-computing

There is another CPU that was recently featured here that also has a lattice, sort of an FPGA but very fast, where different modules are loaded with tasks and each one pumps data to some other, with an orchestrator deciding how and what goes into each of them.

oneseven

You're referring to Evolution; it seems to be a CGRA (coarse-grained reconfigurable array).

https://news.ycombinator.com/item?id=44685050

woodrowbarlow

Perhaps one use case is fully homomorphic encryption (FHE), which performs computations on encrypted data without ever decrypting it. This paper appears to be about how in-memory processing can improve the performance of FHE: https://arxiv.org/abs/2311.16293
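(A toy illustration of why XOR is a natural primitive here; this is a one-time pad, not real FHE, but it shows the homomorphic property: XOR of ciphertexts equals XOR of plaintexts under the combined keystream.)

    # Toy sketch (one-time pad, not actual FHE): XOR encryption is itself
    # homomorphic under XOR, so ciphertexts can be combined without keys.
    import secrets

    m1, m2 = 0b1010, 0b0110
    k1, k2 = secrets.randbits(4), secrets.randbits(4)   # per-message keys
    c1, c2 = m1 ^ k1, m2 ^ k2                           # encrypt

    # XOR on ciphertexts == XOR on plaintexts, under the XORed keystream:
    assert (c1 ^ c2) ^ (k1 ^ k2) == m1 ^ m2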

rapatel0

Geez, if this works, it makes TCAMs free.

Ouch, found the killer: it takes up 0.1 mm^2 in area. That's a showstopper. Hopefully they can scale it down or use it for server infra.
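(For context on why free XORs matter here: a TCAM lookup is essentially a wide masked XOR-compare against every stored entry in parallel. A minimal software sketch, with illustrative names:)

    # Rough sketch of what a TCAM does per lookup (names are illustrative):
    # an entry matches if (key XOR value) has no set bits inside the
    # entry's care mask -- one wide XOR per stored entry, all in parallel.
    def tcam_lookup(key: int, entries: list[tuple[int, int]]) -> int | None:
        for index, (value, care_mask) in enumerate(entries):
            if (key ^ value) & care_mask == 0:
                return index             # first matching entry wins
        return None

    # 8-bit example: entry 1 matches any key of the form 0b1010xxxx.
    table = [(0b11110000, 0xFF), (0b10100000, 0xF0)]
    assert tcam_lookup(0b10101111, table) == 1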

latchkey

This might not be used in actual computing the way you're thinking; it might end up in a network switch or transceiver, increasing speeds and reducing power usage.