SWE-Bench Pro
September 22, 2025 · gpt5
stri8ed
Not a chance. Even if American companies did abide by it, there is no reason Chinese companies would. And good luck definitively proving that a model was trained on it.
stephendause
This is a key question in my opinion. It's one of the things that make benchmarking the SWE capabilities of LLMs difficult. It's usually impossible to know whether the LLM has seen a problem before, and coming up with new, representative problem sets is time-consuming.
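(For what it's worth, the cheap heuristic people reach for is an n-gram overlap check between a benchmark problem and whatever slice of the training corpus is visible, which is both noisy and easy to defeat. A minimal sketch of that kind of check, purely illustrative and not anything from SWE-Bench Pro; the function names, the 5-gram size, and the example strings are made up:)

```python
# Naive word-level n-gram overlap check, sometimes used as a rough signal of
# benchmark contamination. High overlap is only weak evidence of leakage, and
# low overlap proves nothing: paraphrased or partially-seen data slips through.

def ngrams(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of word-level n-grams in `text`."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(problem_text: str, corpus_text: str, n: int = 5) -> float:
    """Fraction of the problem's n-grams that also appear in the corpus sample."""
    problem_grams = ngrams(problem_text, n)
    if not problem_grams:
        return 0.0
    return len(problem_grams & ngrams(corpus_text, n)) / len(problem_grams)

if __name__ == "__main__":
    # Hypothetical benchmark issue and a hypothetical crawled bug report.
    issue = "Fix the off-by-one error in the pagination helper when page size is zero"
    crawl = "bug report: off-by-one error in the pagination helper when page size is zero causes a crash"
    print(f"5-gram overlap: {overlap_ratio(issue, crawl):.2f}")
```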
ej88
https://scale.com/leaderboard/swe_bench_pro_commercial
I definitely trust the totally private dataset more.
siliconc0w
Looks like the associated article is: https://scale.com/research/swe_bench_pro (link in the repo is wrong)
Slightly tangential question: they say they have protected the public test set with a strong copyleft license to prevent training private models on it.
Does that actually work? Hasn't AI training so far simply ignored all license and copyright restrictions?