
Small Programs and Languages

40 comments · June 6, 2025

tromp

> Kolmogorov complexity (wikipedia.org) is: "… the length of a shortest computer program (in a predetermined programming language) that produces the object as output."

> Small programs are interesting. But I’m also interested in small programming languages. Generally speaking, the smaller the language, the less expressive it is.

It was the study of Kolmogorov complexity that motivated me to come up with the smallest and most expressive language for writing minimal-size programs, and to define the various flavors of Kolmogorov complexity in terms of it [1].

[1] https://gist.github.com/tromp/86b3184f852f65bfb814e3ab0987d8...
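For readers unfamiliar with BLC, the standard self-delimiting code for lambda terms in De Bruijn notation is small enough to sketch in a few lines. This is an illustrative Python sketch (the term classes are hypothetical names, not tromp's implementation): a variable with index i becomes 1^i 0, an abstraction becomes 00 followed by the body, and an application becomes 01 followed by both subterms.

```python
# Sketch of the standard Binary Lambda Calculus (BLC) code for lambda
# terms in De Bruijn notation:
#   variable with index i  ->  1^i 0
#   abstraction (lambda)   ->  00 <body>
#   application            ->  01 <function> <argument>
from dataclasses import dataclass

@dataclass
class Var:
    i: int        # De Bruijn index, 1-based: i-th enclosing binder

@dataclass
class Lam:
    body: object

@dataclass
class App:
    f: object
    x: object

def encode(t):
    if isinstance(t, Var):
        return "1" * t.i + "0"
    if isinstance(t, Lam):
        return "00" + encode(t.body)
    return "01" + encode(t.f) + encode(t.x)

print(encode(Lam(Var(1))))        # identity λx.x   -> "0010"
print(encode(Lam(Lam(Var(2)))))   # K = λx.λy.x     -> "0000110"
```

Note that the code is prefix-free: no encoding is a prefix of another, so programs delimit themselves without any length header.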

js8

Hi, I like your work. Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?

I read your paper, and you show that a simple encoding of the S and K combinators seems to yield longer interpreters. However…

What I don't fully understand is that the complexity measure seems to depend on the binary encoding. So in theory, you could have a primitive combinator such that the interpreter is really short to express, but the combinators for pair, true, and false are rather long (when expressed in that primitive), thus hiding complexity in them.

It makes me wonder if you should only measure the complexity of the interpreter combinator, and not also the complexity of the "quoting" combinator, which would transform an already encoded combinator into a doubly encoded one, forcing you to represent the pair, true, and false combinators as well.

tromp

> Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?

No, I don't think so. An iota-based code (1 for application, 0 for iota) loses out badly to the SK-based code (1 for application, 00 for K, 01 for S). It would instead be better to pick a slightly larger basis, picking up some elements like I, B, C, or W.
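The size gap can be made concrete. Using the standard definitions of the basic combinators in terms of iota (I = ιι, K = ι(ι(ιι)), S = ι(ι(ι(ιι)))), a short sketch (illustrative Python, term trees as nested tuples) counts the bits each combinator costs under the iota code:

```python
# Term trees as nested tuples (f, x); the atom IOTA is the iota combinator.
IOTA = "iota"

def app(f, x):
    return (f, x)

def encode_iota(t):
    # '1' prefixes an application; '0' stands for iota itself.
    if isinstance(t, tuple):
        return "1" + encode_iota(t[0]) + encode_iota(t[1])
    return "0"

# Standard definitions of I, K, S in terms of iota:
#   I = ιι,  K = ι(ι(ιι)),  S = ι(ι(ι(ιι)))
I = app(IOTA, IOTA)
K = app(IOTA, app(IOTA, I))
S = app(IOTA, K)

print(encode_iota(K), len(encode_iota(K)))  # 1010100   7 bits (vs "00", 2 bits, in the SK code)
print(encode_iota(S), len(encode_iota(S)))  # 101010100 9 bits (vs "01", 2 bits)
```

So every occurrence of S or K in a program costs 7 to 9 bits under the iota code against 2 bits under the SK code, which is why the smaller basis loses overall.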

> So in theory, you could have a primitive combinator such that the interpreter is really short to express but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.

Of course, we're not just optimizing for self-interpreter size, but for overall size across a large set of tasks, such as the important theorems of AIT, or such as the milestones in a busy beaver (surpassing Graham's Number / TREE(3) / Loader's Number [1]).

The terms you mention, namely pair, true, and false, are not tasks but the tiniest of data structures that you commonly use in programming tasks. Optimizing just for those is not as effective as optimizing for larger tasks that need to balance lots of data types and control structures.

[1] https://oeis.org/A333479

veqq

Here's an attempt to implement it: https://github.com/aartaka/cl-blc

tromp

Thanks! Implementations of BLC in many other languages can be found at [1].

[1] https://rosettacode.org/wiki/Universal_Lambda_Machine

karmakaze

Fun read. I grew up with small everything by default writing machine/assembly on 8-bits.

One thing about small capable languages is that they're sort of proto-languages: each is used to create a specific flavour/dialect with programming abstractions, which are then used to implement a program. This is true of all languages, but especially the tiny ones like assembly language or Lisp. This, I believe, is why Lisps never got more popular: each project is its own dialect, not applicable elsewhere. It's why Clojure, a larger, more opinionated language with batteries included, is more useful elsewhere.

antilisp

(Author of Antilisp, a Lisp for metaprogramming in non-Lisp languages) Not as small a project, but writing a feature-complete implementation of a small language is a very good learning experience that I would recommend to everyone.

There are only so many language implementers, and many opportunities to innovate in language design become obvious when it's your turn to decide what to implement.

For example, I made macros and axioms first-class citizens of the language, as special cases of the function type, because using an interpreter allowed me to decide how things should be evaluated at runtime. I accidentally found a trivial way to bootstrap (quote) in my language. I also found a way to make a much more general reader macro system by simply writing a modular value reader that can be extended by adding new elements to a list of parsers. This makes non-Lisp syntaxes like JSON much easier to implement than with the CL reader macro system, where reader macros are bound to some special parameters.
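The "list of parsers" reader design described above can be sketched generically. This is a hypothetical illustration in Python, not Antilisp's actual API: each reader takes the source and a position and returns a (value, next-position) pair, or None if it doesn't match, and extending the syntax means appending another reader to the list.

```python
# Hypothetical sketch of a modular value reader: a list of parser
# functions tried in order. None of these names are Antilisp's real API.
READERS = []

def read_value(s, i=0):
    """Skip whitespace, then try each registered reader in turn."""
    while i < len(s) and s[i].isspace():
        i += 1
    for reader in READERS:
        result = reader(s, i)
        if result is not None:
            return result          # (value, next_position)
    raise SyntaxError(f"no reader matched at position {i}")

def read_int(s, i):
    j = i
    while j < len(s) and s[j].isdigit():
        j += 1
    return (int(s[i:j]), j) if j > i else None

def read_symbol(s, i):
    j = i
    while j < len(s) and not s[j].isspace() and s[j] not in "()":
        j += 1
    return (s[i:j], j) if j > i else None

def read_list(s, i):
    if i >= len(s) or s[i] != "(":
        return None
    items, i = [], i + 1
    while s[i] != ")":
        if s[i].isspace():
            i += 1
        else:
            value, i = read_value(s, i)   # recurse through the same list
            items.append(value)
    return (items, i + 1)

# Extending the syntax is just appending to the list; a JSON-style
# string reader, say, could be registered the same way.
READERS.extend([read_list, read_int, read_symbol])

print(read_value("(1 2 (3 foo))")[0])   # [1, 2, [3, 'foo']]
```

Because the reader set is an ordinary list rather than characters bound in a readtable, swapping in a different surface syntax is just swapping the list.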

Note for context: in my case, the Lisp interpreter is under 3 KLoC, and most of the abstractions are bootstrapped from macros and reflection primitives, or primitives that are shadowed by higher-level builtins.

J_McQuade

Off topic, but this is the first I've heard of Antilisp and I love the idea - seems aimed at the sort of problems that I've definitely solved with a tangle of unmaintainable elisp before. Now I just need to forget about the whole thing before I talk myself into writing a Helm module for it... or an HCL one... or... NO.

lioeters

I relate to that feeling of having to resist the siren call of Lisp.

antilisp

Thanks! Antilisp is still WIP, so I would not recommend spending time writing modules for complex languages yet. I am (slowly) fixing things and writing more/better wrappers so that users don't have to dedicate their own time before deriving value from the language.

forinti

That's what made computer magazines so much fun in the 1980s.

Every month there would be small projects (fractals, colliding galaxies, small games) that would teach you a new subject and be small enough that you could type them in.

colanderman

Also the Beagle Bros software and books for the Apple // series. They even had a semi-regular "contest" where readers would submit "two-liners", which they'd then judge and distribute via floppy. That culture grew and cemented my love of the hobby; I loved it and miss it.

WillAdams

Thank you for mentioning Beagle Bros.! (that's a name I haven't heard in a long while)

To save folks a search:

https://beagle.applearchives.com/

cmontella

Haha, yeah! When I was a kid, before we could afford a computer, I would type magazine programs out on a typewriter. I remember that I had to keep starting over because I would make a mistake, but that was okay, because the programs were short enough that I was confident I could get through without any mistakes eventually.

asplake

Something I was looking for recently: is there a tiny, "just for fun" language in the ML line, or something similarly based on typed lambda calculus? Or something smaller, something more Lisp-like but curried?

chubot

There are some mini typed functional languages here - https://plzoo.andrej.com/

Although they are implemented in OCaml, which feels like more magic than, say, a Forth or Lisp written in C. In C you can still "see" how it runs on the hardware.

I think Standard ML is actually the "usable mini ML"

These types of languages will always be bigger than Lisp/Forth/Tcl, because they require a lexer / parser / type checker / GC runtime.

---

Although I would like to see someone do the smallest possible C version of an ML

Something like c4 - C in 4 Functions (https://github.com/rswier/c4) - which actually has a type checker! (which is hard to see on first reading)

asplake

That’s great, thanks! I see there is a mini Haskell there too. But yes, a smallest possible ML is what I had in mind.

jinwoo68

Gleam[1] is influenced a lot by ML and is a very simple language. But it's not a "just for fun" language. I like it a lot.

[1] https://gleam.run/

asplake

Yes, I’ve played with Gleam. A nice small language, but bigger than I had in mind. My interest is mainly in looking under the covers, which in Gleam’s case involves other language platforms, so it's less interesting in this context.

taolson

I don't know if these match your definition of "tiny", but there's Miranda:

https://www.cs.kent.ac.uk/people/staff/dat/miranda/

and I wrote a pure, lazy, functional language and self-hosted compiler based upon Miranda:

https://github.com/taolson/Admiran

asplake

Nice!

kryptiskt

SASL (the predecessor to Miranda) isn't statically typed, but it's very small and very much feels like a language in the ML tradition.

anthk

Nils M Holm (of T3X) has a micro ML language too.

http://t3x.org/mlite/index.html

dunham

Meta II is also an interesting small language for writing compilers (including itself).

wikipedia: https://en.wikipedia.org/wiki/META_II

paper: https://hcs64.com/files/pd1-3-schorre.pdf

xkriva11

I recommend this Lua implementation. Truly beautiful and more accessible. https://github.com/melvinzhang/meta2-lua

xelxebar

I love these explorations! Thanks for sharing.

> Hui can list the entire interpreter in the appendix because it’s just that single page, 122 lines of incredibly cryptic C, which you can view here:

I count 41 lines, but more interestingly and by pure luck, I just happened to make a post about this less than an hour ago:

https://blog.wilsonb.com/posts/2025-06-06-readable-code-is-u...

bradain

I really dislike the tiny lisp implementation mentioned in the article. Part of the beauty of a real "lisp in lisp" interpreter is that it is meta-circular -- that is, the interpreter is able to interpret itself.

Using pmatch and some other tricks makes the interpreter shorter but prevents meta-circular evaluation, in which case you could have just defined the Lisp interpreter as being eval.

agentultra

Smol coding and size coding are fun. If you like this sort of stuff check out the demo scene, lovebyte, etc.

I got nerd sniped by a conversation with a colleague and am, in my spare time, working on the smallest event store database I can write.

It’s a nice way to take a break from the enterprise software world.

norir

I find the notion of Kolmogorov complexity both interesting and potentially a bit misleading. The reason it is misleading is the extra implicit context that is not present in the artifact but is necessary for its interpretation. Sure, it is possible to write a ray tracer on a business card, but you can't write the C compiler or runtime on a business card, so I don't agree that it is a business-card-sized concept when stripped to its essence.

Now, you could take the Guy Steele "Growing a Language" route and try to define a self-consistent program that defines every word that is used, but even then you will have to rely on some axioms that cannot be defined solely within the artifact. So the question becomes: what is the size of the axioms?

In other words, Kolmogorov complexity does point to a real thing -- and I strongly agree that smaller generally corresponds to lower complexity given the same set of axioms -- but the deeper sense of complexity seems inherently unmeasurable to me.

Or put yet another way, complexity can only be measured relationally.

tromp

When Kolmogorov complexity is expressed directly in terms of the pure untyped lambda calculus, the oldest and simplest model of computation (except perhaps for SK combinatory logic, which is less expressive bit-for-bit), you cannot really complain of hidden complexity in the axioms. Even numbers have to be built from scratch.
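"Even numbers have to be built from scratch" can be illustrated with Church numerals, which encode the number n as the function applying its argument n times. This sketch uses Python lambdas as a stand-in for pure lambda terms:

```python
# Church numerals: n is represented as n-fold function application.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Arithmetic falls out of the representation:
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    # Decode by counting applications of an increment function.
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(add(two)(three)))   # 5
print(to_int(mul(two)(three)))   # 6
```

Everything here is ordinary lambda abstraction and application, which is exactly why a complexity measure grounded in the pure lambda calculus hides no numeric primitives.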

LegionMammal978

Yeah, examples of 'small' interpreters (including 'small' self-interpreters) have always made me uneasy when they depend on a powerful runtime environment. E.g., in the limit, you can create a language where the 0-byte program is a self-interpreter, and all other programs are interpreted as Python or whatever. You can then point to "look how small the description is!" when it's really just sleight of hand. To put it as a hot take, bootstrapping is very overrated.

Of course, every human definition will similarly hide complexity to some extent, so there's no real way to win.

elia_42

Really interesting reflection on small languages and the relationship between simplicity and expressiveness. Also interesting is how the theme broadens with the beautiful, philosophical closing quotations about those who like to enclose themselves and build their own little world, a world that does not exclude the real one, since for them the real world depends on it.

codr7

I've been working on a substrate/bootstrap for small languages in multiple host languages lately:

https://github.com/codr7/shi