
The Annotated Transformer


4 comments · August 24, 2025

internetguy

wow - this is really well made! i've been doing research w/ Transformer-based audio/speech models, and this is written with incredible detail. Attention as a concept is already quite unintuitive for beginners due to its non-linearity, and this explains it very well.

roadside_picnic

> Attention as a concept itself is already quite unintuitive

Once you realize that Attention is really just a re-framing of Kernel Smoothing it becomes wildly more intuitive [0]. It also allows you to view Transformers as basically learning a bunch of stacked Kernels which leaves them in a surprisingly close neighborhood to Gaussian Processes.

0. http://bactra.org/notebooks/nn-attention-and-transformers.ht...
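The correspondence the comment points to can be sketched in a few lines. This is an illustrative example (not from the linked article): scaled dot-product attention is exactly a Nadaraya-Watson kernel smoother whose kernel is the exponential of the scaled dot product between query and key.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Standard scaled dot-product attention.
    d = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d)) @ V

def kernel_smoother(q, keys, values, kernel):
    # Nadaraya-Watson: a weighted average of values,
    # with weights proportional to kernel(q, key).
    w = np.array([kernel(q, k) for k in keys])
    return (w / w.sum()) @ values

rng = np.random.default_rng(0)
d = 4
Q = rng.normal(size=(3, d))   # 3 queries
K = rng.normal(size=(5, d))   # 5 keys
V = rng.normal(size=(5, 2))   # 5 values

# With an exponential dot-product kernel, the smoother
# reproduces attention exactly.
kern = lambda q, k: np.exp(q @ k / np.sqrt(d))
smoothed = np.stack([kernel_smoother(q, K, V, kern) for q in Q])
assert np.allclose(attention(Q, K, V), smoothed)
```

Swapping in a different kernel (e.g. a Gaussian on the query-key distance) gives other smoothers from the same family, which is the re-framing that connects stacked attention layers to stacked kernels.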

adityamwagh

It’s a very popular article that has been around for a long time!

gdiamos

It's so good it is worth revisiting often