I Designed and Printed a Custom Nose Guard to Help My Dog with DLE
snoutcover.com
Learning Music with Strudel
terryds.notion.site
Mistral 3 family of models released
mistral.ai
Nixtml: Static website and blog generator written in Nix
github.com
Addressing the adding situation
xania.org
100,000 TPS over a billion rows: the unreasonable effectiveness of SQLite
andersmurphy.com
Poka Labs (YC S24) Is Hiring a Founding Engineer
ycombinator.com
Advent of Compiler Optimisations 2025
xania.org
Python Data Science Handbook
jakevdp.github.io
Lowtype: Elegant Types in Ruby
codeberg.org
Show HN: Marmot – Single-binary data catalog (no Kafka, no Elasticsearch)
github.com
A series of vignettes from my childhood and early career
jasonscheirer.com
Apple Releases Open Weights Video Model
starflow-v.github.io
What will enter the public domain in 2026?
publicdomainreview.org
YouTube increases FreeBASIC performance (2019)
freebasic.net
4.3M Browsers Infected: Inside ShadyPanda's 7-Year Malware Campaign
koi.ai
Comparing AWS Lambda ARM64 vs. x86_64 Performance Across Runtimes in Late 2025
chrisebert.net
Lazier Binary Decision Diagrams for set-theoretic types
elixir-lang.org
Apple to beat Samsung in smartphone shipments for first time in 14 years
sherwood.news
Beej's Guide to Learning Computer Science
beej.us
How Brian Eno Created Ambient 1: Music for Airports (2019)
reverbmachine.com
Show HN: RunMat – runtime with auto CPU/GPU routing for dense math
github.com
Progress on TypeScript 7 – December 2025
devblogs.microsoft.com
Hi, I’m Nabeel. In August I released RunMat as an open-source runtime for MATLAB code that was already much faster than GNU Octave on the workloads I tried. https://news.ycombinator.com/item?id=44972919
Since then, I’ve taken it further with RunMat Accelerate: the runtime now automatically fuses operations and routes work between the CPU and the GPU. You write ordinary MATLAB-style code, and RunMat decides where each piece of the computation runs. No CUDA, no kernel code.
Under the hood, it builds a graph of your array math, fuses long chains into a few kernels, keeps data on the GPU when that helps, and falls back to CPU JIT / BLAS for small cases.
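To make that concrete, here’s an illustrative snippet (my own example, not one of the benchmark scripts) of the kind of chain the fusion pass targets:

    % Four elementwise ops over a large array. RunMat's graph builder
    % can fuse the whole chain into a single GPU kernel instead of
    % launching one kernel (and materializing one temporary) per op.
    x = rand(1, 1e8);            % large input -> eligible for GPU routing
    y = tanh(cos(exp(sin(x))));  % fused into one pass over x
    m = mean(y);                 % the reduction can stay on-device

On a small array, the same code would instead take the CPU JIT / BLAS path.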
On an Apple M2 Max (32 GB), here are some current benchmarks (median of several runs):
* 5M-path Monte Carlo (sketched below): RunMat ≈ 0.61 s, PyTorch ≈ 1.70 s, NumPy ≈ 79.9 s → ~2.8× faster than PyTorch, ~130× faster than NumPy.

* 64 × 4K image preprocessing pipeline (mean/std, normalize, gain/bias, gamma, MSE): RunMat ≈ 0.68 s, PyTorch ≈ 1.20 s, NumPy ≈ 7.0 s → ~1.8× faster than PyTorch, ~10× faster than NumPy.

* 1B-point elementwise chain (sin / exp / cos / tanh mix): RunMat ≈ 0.14 s, PyTorch ≈ 20.8 s, NumPy ≈ 11.9 s → ~140× faster than PyTorch, ~80× faster than NumPy.
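For flavor, here’s a rough sketch of the shape of the Monte Carlo workload (illustrative parameters only; the exact benchmark script is in the repo):

    % Price a European call over 5M simulated paths (one-step GBM).
    % Dense array math end to end, so it maps well onto fused kernels.
    npaths = 5e6;
    z = randn(npaths, 1);                           % standard normal draws
    s = 100 * exp((0.05 - 0.5*0.2^2) + 0.2 .* z);   % terminal prices, T = 1
    payoff = max(s - 100, 0);                       % call payoff, strike 100
    price = exp(-0.05) * mean(payoff);              % discounted expectation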
If you want more detail on how the fusion and CPU/GPU routing work, I wrote up a longer post here: https://runmat.org/blog/runmat-accel-intro-blog
You can run the same benchmarks yourself from the GitHub repo this post links to. Feedback, bug reports, and “here’s where it breaks or is slow” examples are very welcome.