FAWK: LLMs can write a language interpreter
November 21, 2025
ikari_pl
Today, Gemini wrote a python script for me, that connects to Fibaro API (local home automation system), and renames all the rooms and devices to English automatically.
Worked on the first run. Well, the second: the first run is a dry run by default, printing a beautiful table; the actual run requires a CLI arg, and it also makes a backup.
It was a complete solution.
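The dry-run-by-default shape is a nice pattern in its own right. A minimal sketch, with a hypothetical rename table standing in for the real script's calls to the Fibaro HTTP API:

```ruby
require 'json'

# Hypothetical name mapping; the real script fetched rooms and devices
# from the Fibaro HTTP API and translated their names to English.
RENAMES = { "Salon" => "Living Room", "Kuchnia" => "Kitchen" }

# Dry run by default: just report what would change.
# With apply: true, back up the old names first, then rename
# (the actual API calls are omitted in this sketch).
def rename_rooms(renames, apply: false)
  unless apply
    return ["DRY RUN (pass --apply to write changes)"] +
           renames.map { |old, new| "  #{old} -> #{new}" }
  end
  File.write("fibaro_backup.json", JSON.generate(renames))  # backup before mutating
  renames.map { |old, new| "renamed #{old} -> #{new}" }
end

puts rename_rooms(RENAMES, apply: ARGV.include?("--apply"))
```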
igravious
I've gotten Claude Code to port Ruby 3.4.7 to Cosmopolitan: https://github.com/jart/cosmopolitan
I kid you not. Took between a week and ten days. Cost about €10. After that I became a firm convert.
I'm still getting my head around how incredible that is. I tell friends and family and they're like "ok, so?"
RealityVoid
I am incredibly curious how you did that. Did you just tell it "port Ruby to Cosmopolitan" and let it crank away for a week? Or what did you do?
I'll use these tools, and at times they give good results. But I would not trust one to work that long on a problem by itself.
rogual
It seems like AIs work how non-programmers already thought computers worked.
UltraSane
I've been surprised by how often Sonnet 4.5 writes working code the first try.
troupo
I've found it to depend on the phase of the moon.
It goes from genius to idiot and back in the blink of an eye.
Razengan
Yet when I asked Claude to write a TextMate grammar file for syntax highlighting for a new language, it often couldn't get some things right. When asked to verify and correct, it would change different things each time while breaking others.
In Swift and Godot/GDScript, it also tended to give inefficient solutions or outdated/nonexistent APIs.
Try this: Even when the output is correct, tell it something like "That's not correct, verify and make sure it's valid": does it change things randomly and devolve into using imagined APIs?
I think coding-by-AI is still only good for things you already know about, to reduce typing time for boilerplate etc. After seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!
ikari_pl
working, configurable via command-line arguments, nice to use, well modularized code.
badsectoracula
A related test i did around the beginning of the year: i came up with a simple stack-oriented language and asked an LLM to solve a simple problem (calculate the squared distance between two points, the coordinates of which are already in the stack) and had it figure out the details.
The part i found neat was that i used a local LLM (some quantized version of QwQ, from around December i think) that had a thinking mode, so i was able to follow the thought process. Since it was running locally (and it wasn't a MoE model) it was slow enough for me to follow in realtime, and i found it fun watching the LLM trying to understand the language.
One other interesting part is the language description had a mistake but the LLM managed to figure things out anyway.
Here is the transcript, including a simple C interpreter for the language and a test for it at the end with the code the LLM produced:
https://app.filen.io/#/d/28cb8e0d-627a-405f-b836-489e4682822...
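For flavour, here is roughly what such an experiment looks like, with a minimal interpreter for a Forth-like toy stack language. This is my own guessed syntax, not the commenter's actual language (that, and the mistake in its spec, are in the linked transcript):

```ruby
# Minimal interpreter for a toy stack language: operands live on a
# stack, words pop arguments and push results.
def run(program, stack)
  program.split.each do |word|
    case word
    when "dup"  then stack.push(stack.last)
    when "swap" then a, b = stack.pop, stack.pop; stack.push(a, b)
    when "rot"  then c, b, a = stack.pop, stack.pop, stack.pop; stack.push(b, c, a)
    when "+"    then a = stack.pop; stack.push(stack.pop + a)
    when "-"    then a = stack.pop; stack.push(stack.pop - a)
    when "*"    then a = stack.pop; stack.push(stack.pop * a)
    else             stack.push(Integer(word))  # anything else is a number literal
    end
  end
  stack
end

# The squared-distance task: x1, y1, x2, y2 are already on the stack.
# Computes (y2 - y1)^2 + (x2 - x1)^2.
p run("rot - dup * rot rot swap - dup * +", [1, 2, 4, 6])  # => [25]
```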
vidarh
It's a fun post, and I love language experiments with LLMs (I'm close to hitting the weekly limit of my Claude Max subscription because I have a near-constantly running session working on my Ruby compiler; Claude can fix -- albeit with messy code sometimes -- issues that requires complex tracing of backtraces with gdb, and fix complex parser interactions almost entirely unaided as long as it has a test suite to run).
But here's the Ruby version of one of the scripts:
    BEGIN {
      result = [1, 2, 3, 4, 5]
        .filter { |x| x % 2 == 0 }
        .map { |x| x * x }
        .reduce { |acc, x| acc + x }
      puts "Result: #{result}"
    }
The point being that running a script with the "-n" switch still runs BEGIN/END blocks, and puts an implicit "while gets ... end" around the rest. Adding "-a" auto-splits each line like awk. Adding "-p" also prints $_ at the end of each iteration. So here's a more typical Awk-like experience:

    ruby -pe '$_.upcase!' somefile.txt

($_ holds the whole line.)

Or, to extract the second field:

    ruby -F, -ane 'puts $F[1]' somefile.txt

-F sets the default character to split on, and -a adds an implicit $F = $_.split.
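As a rough mental model, that second one-liner behaves like the following plain Ruby (a simplified sketch; the real switch handling is inside Ruby itself, and a StringIO here stands in for the input file):

```ruby
require 'stringio'

# Roughly what `ruby -F, -ane 'puts $F[1]'` expands to.
input  = StringIO.new("a,b,c\nd,e,f\n")  # stand-in for somefile.txt
output = []
while (line = input.gets)                # the implicit loop added by -n
  fields = line.chomp.split(",")         # -a with -F, : auto-split into $F
  output << fields[1]                    # the script body: second field
end
puts output
```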
That is not to detract from what he's doing, because it's fun. But if your goal is just a better Awk, then Ruby is usually a better Awk, and so, for that matter, is Perl. For most things where an Awk script doesn't fit on the command line, the only real reason to use Awk is that it is more likely to be available.
UltraSane
So I have had to work very hard to use $80 worth of my $250 free Claude code credits. What am I doing wrong?
qsort
The money shot: https://github.com/Janiczek/fawk
Purely interpretive implementation of the kind you'd write in school; still, above and beyond anything I'd have any right to complain about.
skydhash
Commendable effort, but I expected at least a demo, which would showcase working code (even if it's hacky). It's like someone talking about a piece of sheet music without playing it once.
johnisgood
See https://github.com/Janiczek/fawk and .fawk files in https://github.com/Janiczek/fawk/tree/main/tests.
epolanski
Even more, it's like talking about a sheet without seeing the sheet itself.
Blamklmo
[dead]
slybot
I did AoC 2021 up to day 10 using awk; it was fun but not easy, and I couldn't proceed further: https://github.com/nusretipek/Advent-of-Code-2021
oguz-ismail2
[dead]
oguz-ismail
[dead]
jamesu
A few months ago I used ChatGPT to rewrite a bison based parser to recursive descent and was pretty surprised how well it held up - though I still needed to keep prompting the AI to fix things or add elements it skipped, and in the end I probably rewrote 20% of it because I wasn't happy with its strange use of C++ features making certain parts hard to follow.
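For anyone unfamiliar with the style, here is a sketch of recursive descent on a generic arithmetic grammar (not the commenter's actual parser): one method per grammar rule, with operator precedence encoded by which rule calls which.

```ruby
# Recursive-descent parser/evaluator for integer arithmetic.
class Calc
  def initialize(src)
    @toks = src.scan(/\d+|[-+*\/()]/)  # trivial tokenizer
  end

  def parse
    v = expr
    raise "trailing input" unless @toks.empty?
    v
  end

  private

  # expr := term (('+' | '-') term)*
  def expr
    v = term
    while %w[+ -].include?(@toks.first)
      v = @toks.shift == "+" ? v + term : v - term
    end
    v
  end

  # term := factor (('*' | '/') factor)*
  def term
    v = factor
    while %w[* /].include?(@toks.first)
      v = @toks.shift == "*" ? v * factor : v / factor
    end
    v
  end

  # factor := NUMBER | '(' expr ')'
  def factor
    if @toks.first == "("
      @toks.shift
      v = expr
      raise "expected )" unless @toks.shift == ")"
      v
    else
      Integer(@toks.shift)
    end
  end
end

p Calc.new("2+3*(4-1)").parse  # => 11
```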
artpar
I wrote two:

jslike (an acorn-based parser)
https://github.com/artpar/jslike
https://www.npmjs.com/package/jslike

wang-lang (i couldn't get ASI to work like JavaScript in this nearley-based grammar)
https://www.npmjs.com/package/wang-lang
Y_Y
I've been trying to get LLMs to make Racket "hashlangs"† for years now, both for simple almost-lisps and for honest-to-god different languages, like C. It's definitely possible, raco has packages‡ for C, Python, J, Lua, etc.
Anyway so far I haven't been able to get any nice result from any of the obvious models, hopefully they're finally smart enough.
keepamovin
Yes! I'm currently using copilot + antigravity to implement a language with ergonomic syntax and semantics that lowers cleanly to machine code targeting multiple platforms, with a focus on safety, determinism, auditability and fail-fast bugs. It's more work than I thought but the LLMs are very capable.
I was dreaming of a JS to machine code, but then thought, why not just start from scratch and have what I want? It's a lot of fun.
lionkor
Curious why you do this with AI instead of just writing it yourself?
You should be able to whip up a lexer, parser, and compiler in a couple of weeks.
My_Name
Because he did it in a day, not a few weeks.
If I want to go from Bristol to Swindon, I could walk there in about 12 hours. It's totally possible to do it on foot. Or I could use a car and be there in an hour: there and back, with a full work day in between, all in one day. Using the tool doesn't change what you can do, it speeds up getting the end result.
bgwalter
There is no end result. It's a toy language based on a couple of examples without a grammar where apparently the LLM used its standard (plagiarized) parser/lexer code and reiterated until the examples passed.
Automating one of the fun parts of CS is just weird.
So with this awesome "productivity" we now can have 10,000 new toy languages per day on GitHub instead of just 100?
epolanski
I'm not the previous user, but I imagine that weeks of investment might be a commitment one does not have.
I have implemented an interpreter for a very basic stack-based language (you can imagine it being one of the simplest interpreters you can have) and it took me a lot of time and effort to have something solid and functional.
Thus I can absolutely relate to the idea of having an LLM that has seen many interpreters lay the groundwork for you, letting you play with your ideas as quickly as possible while putting off the details until necessary.
64718283661
What's the point of making something like this if you don't get to deeply understand what you're doing?
afpx
How deep do you need to know?
"Imagination is more important than knowledge."
At least for me that fits. I have quite enough graduate-level knowledge of physics, math, and computer science to rarely be stumped by a research paper. That may get me scorn from those tested on those subjects. Yet, I'm still an effective ignoramus.
johnisgood
I have made a lot of things using LLMs and I fully understood everything. It is doable.
My_Name
What's the point of owning a car if you don't build it by hand yourself?
Anyway, all it will do is stop you from being able to run as well as you could when you had to go everywhere on foot.
purple_turtle
What is the point of a car that changes colour to blue on Mondays and explodes on the first Friday of every year?
If neither you nor anyone else can fix it without more cost than building a proper one?
TeodorDyakov
So you are using a tool to help you write code because you don't enjoy coding, in order to make a tool used for coding (a programming language). Why?
victorbjorklund
There are lots of different things people can find interesting. Some people love the typing of loops. Some people love the design of the architecture etc. That’s like saying ”how can you enjoy woodworking if you use a CNC machine to automate parts of it”
killerstorm
Coding has many aspects: conceptual understanding of problem domain, design, decomposition, etc, and then typing code, debugging. Can you imagine person might enjoy conceptual part more and skip over some typing exercises?
bgwalter
The whole blog post does not mention the word "grammar". As presented, it is examples based and the LLM spit out its plagiarized code and beat it into shape until the examples passed.
We do not know whether the implied grammar is conflict free. We don't know anything.
It certainly does not look like enjoying the conceptual part.
cl3misch
For the same reason we have Advent of Code: for fun!
I mean, he's not solving the puzzles with AI. He's creating his own toy language to solve the puzzles in.
> I only interacted with the agent by telling it to implement a thing and write tests for it, and I only really reviewed the tests.
Did you also review the code that runs the tests?