Faster Argmin on Floats

algorithmiker.github.io

3 comments

·September 18, 2025

why_only_15

This trick is very useful on Nvidia GPUs for calculating mins and maxes in some cases, e.g. atomic mins (better u32 support than f32) or warp-wide mins with `redux.sync` (only supports u32, not f32).
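A minimal Rust sketch of the bit-reinterpretation idea this comment relies on: for non-negative, non-NaN `f32` values, the raw bit pattern read as a `u32` sorts in the same order as the floats. The helper name `ordered_key` is hypothetical, and the sign handling is a commonly used mapping added here as an assumption, not something the article or the comment confirms.

```rust
/// Hypothetical helper: maps an f32 to a u32 key whose unsigned order
/// matches the numeric order of the floats (NaN is not handled).
pub fn ordered_key(x: f32) -> u32 {
    let bits = x.to_bits();
    if bits & 0x8000_0000 != 0 {
        // Negative: flip every bit so larger magnitudes become smaller keys.
        !bits
    } else {
        // Non-negative: set the sign bit so these sort above all negatives.
        bits | 0x8000_0000
    }
}
```

With keys like these, an unsigned integer min (atomic or warp-wide) can stand in for a float min. If the inputs are known to be non-negative, plain `x.to_bits()` is already order-preserving.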

TheDudeMan

How fast is it if you write a plain for loop that keeps track of the index and value of the smallest element (possibly treating the values as ints)?
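The loop being asked about might look like the following Rust sketch. It assumes every input is non-negative and not NaN, so that comparing raw bit patterns as `u32` agrees with comparing the floats; the function name `argmin_loop` is hypothetical, and no timing is claimed for it.

```rust
/// Hypothetical hand-written argmin: tracks the index and the integer key
/// of the smallest element seen so far.
/// Assumes all inputs are non-negative and not NaN.
pub fn argmin_loop(xs: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, u32)> = None;
    for (i, x) in xs.iter().enumerate() {
        let key = x.to_bits(); // reinterpret the float as an integer
        match best {
            // Keep the earlier element on ties.
            Some((_, best_key)) if best_key <= key => {}
            _ => best = Some((i, key)),
        }
    }
    best.map(|(i, _)| i)
}
```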

nine_k

I'd hazard a guess that it would be the same: the compiler would produce a loop out of `.iter()`, expose the loop index via `.enumerate()`, and keep track of that index in `.min_by()`. I suppose the lambda would be inlined, maybe even along with the comparisons.
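For reference, the iterator chain described here can be sketched in Rust as follows. This is an assumed baseline, not necessarily the article's exact code: the name `argmin_iter` is hypothetical, and `f32::total_cmp` is used as the comparator so that the closure is total even for NaN.

```rust
/// Hypothetical iterator-based argmin using .iter(), .enumerate()
/// and .min_by(), returning the index of the smallest element.
pub fn argmin_iter(xs: &[f32]) -> Option<usize> {
    xs.iter()
        .enumerate()
        // Compare by value only; the index just rides along.
        .min_by(|(_, a), (_, b)| a.total_cmp(b))
        .map(|(i, _)| i)
}
```

Whether this compiles to the same machine code as a hand-written loop depends on the compiler version and optimization level, so it is worth checking the generated assembly rather than assuming.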

I wonder whether that could be made faster using AVX instructions; they can find the minimum value among several u32 values, but not immediately its index.
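One way to work around "the minimum but not its index" is a two-pass approach, sketched below in Rust without explicit intrinsics. The first pass is a plain integer min reduction, which compilers can often auto-vectorize (whether AVX is actually emitted depends on target flags and is not guaranteed); the second pass searches for the first position holding that value. The name `argmin_two_pass` is hypothetical, and the sketch assumes non-negative, non-NaN inputs.

```rust
/// Hypothetical two-pass argmin: reduce to the minimum key first,
/// then locate the first element with that key.
/// Assumes all inputs are non-negative and not NaN.
pub fn argmin_two_pass(xs: &[f32]) -> Option<usize> {
    // Pass 1: minimum over the u32 bit patterns (None if xs is empty).
    let min_key = xs.iter().map(|x| x.to_bits()).min()?;
    // Pass 2: index of the first element whose bits equal the minimum.
    xs.iter().position(|x| x.to_bits() == min_key)
}
```

The trade-off is reading the data twice in exchange for a simpler reduction in the hot loop.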