LangExtract: Python library for extracting structured data from language models

4 comments · July 30, 2025

hm-nah

Oly Chit! This is a BIG deal! Sub-page citations…in-context RAG…built-in HTML UI…this is like the holy grail of deterministic text extraction. I’m trying this ASAP Rocky.

constantinum

There is also Unstract (open source), which helps with structured data extraction. Key differences:

1. Unstract has a pre-processing layer (OCR) that converts documents into LLM-readable formats, which helps improve accuracy and control costs.

2. Unstract also connects to your existing data sources, making it an out-of-the-box ETL tool.

https://github.com/Zipstack/unstract

fudged71

Any idea how it compares with docetl?

oriettaxx

impressive, really