Anthropic Economic Index: Insights from Claude 3.7 Sonnet

simonw

I loaded the CSVs from the release_2025_03_27 folder into a SQLite database so I could explore them in Datasette Lite:

    cd /tmp
    git clone https://huggingface.co/datasets/Anthropic/EconomicIndex
    cd EconomicIndex
    git lfs pull
    cd release_2025_03_27
    for file in *.csv; do sqlite-utils insert data.db "${file%.*}" "$file" --csv -d; done
https://lite.datasette.io/?metadata=https://gist.github.com/...

pram

Claude is amazing for UNIX janitoring. Having it go through installing torch and various other things, automatically making venvs, and so on, made me realize how much of my life I've wasted dealing with stuff like cpan and pip and npm.

varjag

I suspect the "Education, Instruction and Library" bar on the graph mostly amounts to cheating and, more broadly, defeating the point of unsupervised study. But it's nice we have a metric for that now.

nine_k

I think this is overly cynical. There are a number of ways an LLM can help actual education and research, and people are willing to use the LLMs this way, as opposed to cheating.

An LLM can be a really great (written) language coach / sparring partner. It could give you a lot of sensible, valuable practice in a foreign language it knows well, in a free form, while noticing, explaining, and fixing your mistakes.

An LLM may be great at searching, analyzing, and summarizing large numbers of natural-language texts. It could be a great paper research assistant, helping you find relevant articles and books, or relevant passages in them, finding similarities, contrasts, prerequisites and consequences, agreement and objections.

Unsupervised study with unwilling subjects is a pretty poor idea, LLMs or not; they will spend their effort on cheating, not learning. If you want different outcomes, work on motivation first and the rest will fall into place.

simonw

I'm trying to find tasks that look like they're equivalent to cheating on homework, but it's not clear to me which categories might apply.

https://lite.datasette.io/?metadata=https://gist.github.com/... shows everything in task_pct_v2 which mentions "writ" but excludes "code" or "program" to try to filter out the code stuff.
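
For anyone who wants to run that kind of filter locally rather than in Datasette Lite, here is a rough sketch using Python's built-in sqlite3 module. It is not the query behind the link above. It assumes task_pct_v2 has a text column called task_name and a numeric column called pct (both names are assumptions), and the rows in the demo table are invented purely for illustration.

```python
import sqlite3

# Assumed schema: task_pct_v2(task_name TEXT, pct REAL).
# The column names are an assumption, not confirmed by the thread.
FILTER_SQL = """
SELECT task_name, pct
FROM task_pct_v2
WHERE task_name LIKE '%writ%'
  AND task_name NOT LIKE '%code%'
  AND task_name NOT LIKE '%program%'
ORDER BY pct DESC
"""


def writing_tasks(conn):
    """Return (task_name, pct) rows that mention 'writ' but not 'code' or 'program'."""
    return conn.execute(FILTER_SQL).fetchall()


def demo_connection():
    """Build an in-memory table with made-up rows so the sketch runs on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_pct_v2 (task_name TEXT, pct REAL)")
    conn.executemany(
        "INSERT INTO task_pct_v2 VALUES (?, ?)",
        [
            ("write essays on assigned topics", 0.5),
            ("write, test, and debug computer programs", 1.2),
            ("review written reports for accuracy", 0.3),
            ("prepare budgets", 0.1),
        ],
    )
    return conn


if __name__ == "__main__":
    for name, pct in writing_tasks(demo_connection()):
        print(f"{pct:.2f}  {name}")
```

To try it against the database built earlier in the thread, swap demo_connection() for sqlite3.connect("data.db") and adjust the column names to whatever the table actually contains. SQLite's LIKE is case-insensitive for ASCII by default, so "Writ" and "writ" both match.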

cadamsdotcom

Exciting to see these numbers.

Anthropic if you’re listening - how about a breakdown by number & taxonomy of tools being made available to Claude? Or number of tool invocations in a given session?

That’ll show how much use it’s getting via agentic systems vs. as a chatbot. By proxy it should show how quickly agentic use is growing.

amelius

Anyone using LLMs to simulate economic systems?