My first verified imperative program

48 comments

·July 7, 2025

Joker_vD

Naturally, this proof only works for arbitrary-precision integers: when you use fixed-precision integers, the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1] or (if you insist on C semantics) [UINT_MAX, 1].

Hopefully the proof would break if one tried to transfer it over?

DavidVoid

> the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1]

If you have INT_MIN along with any other negative number in the array then your program has undefined behavior in C. Signed integer overflow is UB (but unsigned overflow is not).

Jtsummers

> the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1]

`INT_MIN + -1` is not 0 so it should report false in that case.

For UINT_MAX, the algorithm would need to be reconsidered, though, since it's written with signed integers in mind.

> Hopefully the proof would break if one tried to transfer it over?

Hopefully. The proof would have to be modified to account for the actual types. If you're using bounded integers you'd need to write a different proof.

junon

INT_MIN - 1 is undefined behavior in C.

Jtsummers

Sure, can you find an example where INT_MIN - 1 results in 0 though?

derdi

> For UINT_MAX, the algorithm would need to be reconsidered, though, since it's written with signed integers in mind.

The algorithm is written assuming that unary - produces the additive inverse. That is also true for C's unsigned integers. -1U == UINT_MAX, -UINT_MAX == 1U. It Just Works.

necunpri

This is the strength of typing, right?

If I can specify the type of my input I can ensure the verification.

andrepd

But false is the correct result for those cases. Addition is addition and overflow is undefined (= can assume that doesn't happen), it's not addition modulo 2^n.

Joker_vD

We are not talking about C here. Imagine it was e.g. Java, or C#, or Rust in release mode, or heck, even Lean itself but with fixed-precision integers.

zelphirkalt

Wait, so much effort and it doesn't even consider this widely known issue? That would mean, that even though all this effort has been spent, a decent programmer still has a better idea of whether something is correct than the proof system used here. And worse this might lull one into thinking, that it must be correct, while actually for a simple case it breaks.

Jtsummers

> a decent programmer still has a better idea of whether something is correct than the proof system used here.

The proof is correct in the language it's written for, Lean. If you change the context (axioms) of a proof then the proof may be invalidated. This is not a surprising thing to anyone who spends a second thinking about it.

Joker_vD

> anyone who spends a second thinking about it.

Except most programmers don't spend even a second to think about it, and we end up with "int mid = (low + high) / 2;" bugs in standard implementations of binary search in e.g. Java. And that implementation even had a written proof accompanying it!

munchler

Lean is awesome and this is an impressive new feature, but I can't help but notice that the proof is significantly longer and more complex than the program itself. I wonder how well this will scale to real-world programs.

armchairhacker

Real-world programs can be verified by formally proving properties on a small part of the code (called the kernel) in a way that transitively guarantees those for the remaining code.

For example, Rust's borrow checker guarantees* memory safety of any code written in Rust, even a 10M+ LOC project. Another example is sel4, a formally-verified micro-kernel (https://sel4.systems/About/seL4-whitepaper.pdf).

* Technically not; even if the code doesn't use `unsafe`, not only is Rust's borrow checker not formally verified, there are soundness holes (https://github.com/rust-lang/rust/issues?q=is%3Aopen%20is%3A...). However, in theory it's possible to formally prove that a subset of Rust can only encode memory-safe programs, and in practice Rust's borrow checker is so effective that a 10M+ LOC project without unsafe still probably won't have memory issues.

serbuvlad

What's a memory issue?

If I access beyond the end of an array in Rust, the panic handler runs and starts unwinding my stack. If I access beyond the end of an array in C++ with .at() the excwption handler runs and starts unwining my stack. If I access beyond the end of an array in C the SIGSEGV handler may (*) run and I could, if I wanted to, start unwinding my stack.

Ah, but in C, sometimes if I access the wrong memory, I get garbadge instead of a panic.

Sure, and if I store my data in a Rust array and store indexes into that array around the place as sort of weak references (something I've seen Rust programmers use and talk about all the time), I can easily fetch the wrong data too.

Rust provides a robust type system and a borrow checker which avoids a lot of common problems at the expence of adhering to a particular programming style. That's fine. That's worth advocating for.

But it's no pannacea. Not even close.

My favorite memory about this is a programmer lambasting Go's strings (which are basically immutable byte vectors) for not enforcing UTF-8, like Rust strings.

He then said that this means that in Go you can print filenames to the screen that can break your terminal session because of this if they contain invalid UTF-8, which Rust forces you to escape explicitly. The irony, of couse, is that the characters that can break your terminal session are perfectly valid UTF-8.

Rust's type safety convinced this guy that his Rust program was immune to a problem that it was simply not immune to.

amw-zero

Research points to there being a quadratic relationship between automated proof and code size: https://trustworthy.systems/publications/nictaabstracts/Mati....

Specifically, the relationship is between the _specification_ and the proof, and it was done for proofs written in Isabelle and not Lean.

The good news is that more and more automation is possible for proofs, so the effort to produce each proof line will likely go down over time. Still, the largest full program we've fully verified is much less than 100,000 LOC. seL4 (verified operating system) is around 10,000 lines IIRC.

grumbelbart

Long-term this would be done using LLMs. It would also solve LLMs' code quality issues - they could simply proof that the code works right.

codebje

Maybe very long term. I turn off code assistants when doing Lean proofs because the success rate for just suggestions is close to zero.

teiferer

[dead]

teiferer

[dead]

null

[deleted]

jeremyscanvic

That's really neat! I'm very excited for the future of Lean.

null

[deleted]

norir

My brain has been slowly trained to reject imperative programming. This example could be rewritten in a tail recursive manner using an immutable set which would be simpler to verify for correctness even without a formal verifier.

I have found that while there is a learning curve to programming using only recursion for looping, code quality does go significantly up under this restriction.

Here is why I personally think tail recursion is better than looping: with tail recursion, you are forced to explicitly reenter the loop. Right off the bat, this makes it difficult to inadvertently write an infinite loop. The early exit problem is also eliminated because you just return instead of making a recursive call. Moreover, using recursion generally forces you to name the function that loops which gives more documentation than a generic for construct. A halfway decent compiler can also easily detect tail recursion and rewrite it as a loop (and inline if the recursive function is only used in one place) so there need not to be any runtime performance cost of tail recursion instead of looping.

Unfortunately many languages do not support tail call optimization or nested function definitions and also have excessively wordy function definition syntax which makes loops more convenient to write in those languages. This conditions one to think in loops rather than tail recursion. Personally I think Lean would be better if it didn't give in and support imperative code and instead helped users learn how to think recursively instead.

taeric

This feels overly strong? I've certainly messed up my fair share of recursive calls.

I don't know why, but I have actually gotten a bit stronger on the imperative divide in recent years. To the point that I found writing, basically, a GOTO based implementation of an idea in lisp to be easier than trying to do it using either loops or recursion. Which, really surprised me.

I /think/ a lot of the difference comes down to how localized the thinking is. If I'm able to shrink the impact of what I want to do down to a few arguments, then recursion helps a ton. If I'm describing a constrained set of repetitive actions, loops. If I'm trying to hold things somewhat static as I perform different reduction and such, GOTO works.

I think "functional" gets a bit of a massive boost by advocates that a lot of functional is presented as declarative. But that doesn't have to be the case. Nor can that help you, if someone else hasn't done the actual implementation.

We can get a long way with very mechanical transformations, in the form of compilation. But the thinking can still have some very imperative aspects.

tuveson

> I've certainly messed up my fair share of recursive calls.

It’s a common enough problem that the “why is my program crashing” website is basically named after it.

ngruhn

Another case for recursion is that you have to think of the base case. With loops people pathologically forget to handle 0, [], [[]], "", etc.

louthy

I agree, but also folds, traversals, list-comprehensions, and recursion-schemes work well and can be even more resistent to common bugs than regular recursion.

Although it’s hard to fault the simple elegance of recursion!

kevindamm

Which languages do support TCO at this point? From my recollection we have

* Scheme

* Haskell

* Elixir

* Erlang

* OCaml

* F#

* Scala

* (not Clojure)

* the JVM could remove tail-recursive calls, but IIRC this still hasn't been added for security reasons

* Racket

* Zig

* Lua

* Common Lisp, under certain compilers/interpreters

* Rust? (depends)

* Swift? (sometimes)

geoffhill

Both Clang and GCC have musttail attributes than can force tail calls at specific return statements in C/C++.

louthy

> * F#

The .NET CLR supports the ‘.tail’ opcode which means that any .NET based language could support it. I’m hoping one day the C# team will get around to it. It seems like such low hanging fruit.

taeric

I don't understand the security reasons on not removing tail calls. Any chance you have a good place to read up on that?

kevindamm

It was raised in one of the initial proposals, back in 2002

https://bugs.java.com/bugdatabase/view_bug?bug_id=4726340

but that looks like a dead link and no wayback archive..

IIRC, basically it's because some parts of the JVM use stack unwinding to figure out what userland code is calling certain system code.. also the current stack frame has metadata about lock status used for allowing re-entrant locks that you lose if you elide the entire recursive call (which the initial proposal did by only removing the few bytecode instructions that set up the callstack frame and return from it).

A more informal proposal from ~2016 allows for soft tail calls and hard (annotated) tail calls, with some restrictions that evidently avoid issues with system calls and lock/reentry maintenance:

https://web.archive.org/web/20161112163441/https://blogs.ora...

And a video by one of the JVM architects at Oracle about adding TCO for Scala

https://www.youtube.com/watch?v=2y5Pv4yN0b0&t=1h02m18s

Also previously featured here on HN, a way to do it that avoids security concerns, by using goto instead of strictly deleting bytecode instructions:

https://news.ycombinator.com/item?id=22945725

nhubbard

Kotlin as well, through the ‘tailrec’ marker on a function.

kevindamm

ah, thanks, good to know.. but does that make it optional? I kind of like how ocaml requires a letrec annotation on any recursive definition and I don't know when you wouldn't want to add tailrec

zabzonk

C++, depending on compiler and other stuff.

alexisread

Freeforth (implicit) and Ableforth (deliberately explicit)

teiferer

[dead]

null

[deleted]

null

[deleted]

null

[deleted]

b0a04gl

[dead]

SUPERX_112

nice