Don’t Look Up: Sensitive internal links in the clear on GEO satellites [pdf]
satcom.sysnet.ucsd.edu
NanoChat – The best ChatGPT that $100 can buy
github.com
Dutch government takes control of Chinese-owned chipmaker Nexperia
cnbc.com
Why Study Programming Languages
people.csail.mit.edu
Why the push for Agentic when models can barely follow a simple instruction?
forum.cursor.com
Palisades Fire suspect's ChatGPT history to be used as evidence
rollingstone.com
No science, no startups: The innovation engine we're switching off
steveblank.com
Copy-and-Patch: A Copy-and-Patch Tutorial
transactional.blog
Ultrasound is ushering a new era of surgery-free cancer treatment
bbc.com
Sony PlayStation 2 fixing frenzy
retrohax.net
America is getting an AI gold rush instead of a factory boom
washingtonpost.com
First device based on 'optical thermodynamics' can route light without switches
phys.org
Show HN: SQLite Online – 11 years of solo development, 11K daily users
sqliteonline.com
KDE celebrates the 29th birthday and kicks off the yearly fundraiser
kde.org
Modern iOS Security Features – A Deep Dive into SPTM, TXM, and Exclaves
arxiv.org
Smartphones and being present
herman.bearblog.dev
JIT: So you want to be faster than an interpreter on modern CPUs
pinaraf.info
LLMs are getting better at character-level text manipulation
blog.burkert.me
DDoS Botnet Aisuru Blankets US ISPs in Record DDoS
krebsonsecurity.com
NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference
lmsys.org
vali, a C library for Varlink
emersion.fr
Strudel REPL – a music live coding environment living in the browser
strudel.cc
New York Times, AP, Newsmax and others say they won't sign new Pentagon rules
apnews.com
When GPUs started being used for deep learning (after AlexNet), they were not at all matmul machines. They were machines that excelled at most kinds of heavily parallel workloads. That still holds today, with the exception of tensor cores, which are additional hardware blocks designed to accelerate this specific task.
Matrix multiplication didn't "win" because hardware was designed for it. It won because matrix multiplication is a fundamental part of linear algebra and is very effective in deep learning (most of the functions you might want to write for deep learning can be expressed as a matmul); hardware acceleration came later. Additionally, matrix multiplication is a good fit for physics: you can design the hardware so that data movement is minimized, and most of the chip area and power go into actual computation rather than into moving data around.
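As a rough illustration of that point (a hypothetical numpy sketch, not from the original comment): a fully connected layer is literally a matmul plus a bias, and a 2D convolution can be lowered to a matmul via im2col.

    # Hypothetical sketch: common deep-learning ops expressed as matmuls.
    import numpy as np

    rng = np.random.default_rng(0)

    # 1) Fully connected layer: a matmul plus a bias.
    batch, d_in, d_out = 32, 256, 128
    x = rng.standard_normal((batch, d_in))
    w = rng.standard_normal((d_in, d_out))
    b = rng.standard_normal(d_out)
    fc_out = x @ w + b                              # (batch, d_out)

    # 2) 2D convolution lowered to a matmul via im2col: unfold every
    #    receptive field into a row, then multiply by the flattened filters.
    n, c, h, w_img = 8, 3, 16, 16                   # input in NCHW layout
    k, kh, kw = 4, 3, 3                             # 4 filters of size 3x3
    imgs = rng.standard_normal((n, c, h, w_img))
    filters = rng.standard_normal((k, c, kh, kw))

    out_h, out_w = h - kh + 1, w_img - kw + 1
    cols = np.empty((n, out_h * out_w, c * kh * kw))
    for i in range(out_h):
        for j in range(out_w):
            patch = imgs[:, :, i:i + kh, j:j + kw]  # (n, c, kh, kw)
            cols[:, i * out_w + j, :] = patch.reshape(n, -1)

    conv_out = cols @ filters.reshape(k, -1).T      # (n, out_h*out_w, k)
    conv_out = conv_out.transpose(0, 2, 1).reshape(n, k, out_h, out_w)
    print(fc_out.shape, conv_out.shape)

The same trick covers most other layers (attention, for instance, is two batched matmuls plus a softmax), which is part of why a matmul-heavy chip covers so much of deep learning.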
Fundamentally, you also want your algorithm to be compatible with real-world physics. Heavy parallelism is required because you cannot physically build a fast chip that processes long chains of dependent operations: signals simply cannot propagate through transistors fast enough. Even CPUs, which present a non-parallel programming model, have to rely on expensive tricks like speculative out-of-order execution to extract parallelism from "sequential" code in order to make it fast.
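A toy sketch of that distinction (hypothetical example, not from the comment): the first loop below is a dependency chain that no amount of hardware can overlap, while the second restructures the same reduction into independent partial sums, the kind of structure out-of-order execution, SIMD units, and GPUs can actually run in parallel.

    # Hypothetical sketch: dependent chain vs. independent partial sums.
    import numpy as np

    x = np.random.default_rng(1).standard_normal(1 << 16)

    # Dependency chain: every iteration needs the previous result, so the
    # work is serial by construction.
    acc = 0.0
    for v in x:
        acc += v

    # Independent partial sums: eight chains with no dependencies between
    # them, combined at the end -- the structure that vectorized and
    # parallel reductions rely on.
    lanes = x.reshape(8, -1).sum(axis=1)
    acc_parallel = lanes.sum()

    print(acc, acc_parallel)   # same result up to float rounding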
In general though, I personally wish chips were designed with programmability in mind. A fixed-function matrix multiplier might be slightly more efficient than a parallel computing chip built from smaller matrix multipliers, but the latter would be significantly more programmable, and you could design much more interesting (and potentially more efficient) algorithms for it.