
Why don't you use dependent types?

42 comments · November 2, 2025

lacker

Dependent types are very useful for some things. For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type, and typecheck that.

The Curry-Howard correspondence, using a dependent type system to have "proofs" be equivalent to "types", is powerful, but it can be really confusing. From a human point of view, there is a huge difference between "Your proof is wrong" and "You wrote a statement that fails typechecking".

Intuitively, when you make an error with types, it should be something fairly trivial that you just read the error and fix it up. When you make an error in a proof, it's understandable if it's very complicated and requires thought to fix. The natural UI is different.

So I agree with the author that the greatest benefit of Lean is not its type system per se, but its community. Specifically the fact that Lean's library of mathematics, mathlib, is organized like an open source community with pull requests, whereas Isabelle's library of mathematics, the AFP, is organized like a scientific journal with referees.

I'm working on a dependent type system at the moment for a new theorem prover - Acorn, at https://acornprover.org - and my hope is to combine the good points of both Lean and Isabelle. It's nice that Lean has the power to cleanly express the simple dependent types that mathematicians often use, like vector spaces or quotients. But if you use dependent types too much then it does get complicated to debug what's happening.

Sharlin

> For example, I wish Python had the ability to express "a 10 x 5 matrix of float32s" as a type

To clarify, as long as 5 and 10 are constants, this is entirely possible in C++ and Rust^1, neither of which are dependently typed (or at most are dependently typed in a very weak sense). In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known. A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time.

^1 And weakly in many other languages whose builtin array types have compile-time bounds. But C++ and Rust let user-defined generic types abstract over constant values.
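
A minimal sketch of what this looks like with Rust's const generics (the names here are illustrative, not from any particular library): the dimensions live in the type, so a 10 x 5 and a 5 x 10 matrix are different, incompatible types.

    struct Matrix<const R: usize, const C: usize> {
        data: [[f32; C]; R],
    }

    impl<const R: usize, const C: usize> Matrix<R, C> {
        fn zeros() -> Self {
            Matrix { data: [[0.0; C]; R] }
        }
    }

    // Only accepts a 10 x 5 matrix of f32s; passing a Matrix<5, 10> is a
    // compile-time error.
    fn consume(_m: Matrix<10, 5>) {}

    fn main() {
        let m: Matrix<10, 5> = Matrix::zeros();
        consume(m);
        // consume(Matrix::<5, 10>::zeros()); // would not compile
    }

What this cannot do, per the point above, is accept an index that is only known at runtime without a runtime check.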

thomasahle

You can do that in Python using https://github.com/patrick-kidger/torchtyping

It looks like this:

    from torchtyping import TensorType

    def batch_outer_product(x: TensorType["batch", "x_channels"],
                            y: TensorType["batch", "y_channels"]
                            ) -> TensorType["batch", "x_channels", "y_channels"]:
        return x.unsqueeze(-1) * y.unsqueeze(-2)

There's also https://github.com/thomasahle/tensorgrad which uses sympy for "axis" dimension variables:

    import sympy as sp
    import tensorgrad as tg  # assuming the package imports under this name

    b, x, y = sp.symbols("b x y")
    X = tg.Variable("X", b, x)
    Y = tg.Variable("Y", b, y)
    W = tg.Variable("W", x, y)
    XWmY = X @ W - Y

thesz

"In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known."

I remember being able to use telescopes [1] in Haskell a long time ago, around 2012 or so.

[1] https://www.pls-lab.org/en/telescope

Haskell was not and is not properly dependently typed.

saghm

> In general, neither can ensure at compile time that an index only known at runtime is in bounds, even if the bounds themselves are statically known.

Yeah, this seems like matrices might not be a great first example for explaining the value of dependent types. It's entirely possible to define a matrix whose index for each dimension is a generic type that can't express values outside the expected range; it would just be fairly cumbersome, and the usual issues would creep back in if you needed to go from "normal" integers back to indexes (although not if you only needed to convert the indexes to normal integers).

I find that the potential utility of dependent types is clearer when thinking about types where the "dimensions" are mutable, which isn't usually how I'd expect most people to use matrices. Even a simple example shows this: the current length of a list can be part of the type, so you define a method to get the first element only on non-empty lists rather than needing it to return an optional value. While you could sort of implement this in a similar way as described above with a custom integer-like type, the limitations of that kind of approach for a theoretically unbounded length are a lot more apparent than for a matrix with constant-sized dimensions.
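
For the non-empty-list case, a rough non-dependent approximation in Rust (type and method names hypothetical) just bakes the invariant into the representation:

    // Non-emptiness is part of the data layout, so `first` needs no Option.
    struct NonEmpty<T> {
        head: T,
        tail: Vec<T>,
    }

    impl<T> NonEmpty<T> {
        fn new(head: T, tail: Vec<T>) -> Self {
            NonEmpty { head, tail }
        }

        // Always valid: the type cannot represent an empty list.
        fn first(&self) -> &T {
            &self.head
        }

        fn len(&self) -> usize {
            1 + self.tail.len()
        }
    }

But as soon as the length itself has to appear in the type and track how the list changes, this trick stops scaling, which is where dependent types come in.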

zozbot234

Yes, the point of dependent types is that they give you the ability to do some sort of almost arbitrary (though not strictly Turing-complete) "computation" as part of type checking, which essentially dispenses with the phase separation between "compiling" and "running" code - or at least makes compile-time computation unusually powerful. So if you want to replicate the properties of dependent typing in these existing languages you'll need to leverage their existing facilities for compile-time metaprogramming.

> A proper dependently-typed language can prevent runtime out-of-bound errors even if neither the indices nor the bounds are known at type check time.

Yes, but the way this is done is by threading a proof "this index is within bounds" throughout the code. At runtime (e.g. within 'extracted' code, if you're using common dependently-typed systems), this simply amounts to relying on a kind of capability or ghost 'token' that attests to the validity of that code. You "manufacture" the capability as part of an explicit runtime check when needed (e.g. if the "index" or "bounds" come from user input) and simply rely on it as part of the code.
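
A rough Rust analogue of that "token" idea (the names are hypothetical; a real dependently typed system would discharge the obligation statically rather than by construction):

    // The only way to obtain a BoundedIndex<N> is through the runtime check
    // in `new`, so any function receiving one can rely on i < N.
    #[derive(Clone, Copy)]
    struct BoundedIndex<const N: usize> {
        i: usize, // invariant: i < N, established at construction
    }

    impl<const N: usize> BoundedIndex<N> {
        // The one place where the bound is actually checked,
        // e.g. when the index comes from user input.
        fn new(i: usize) -> Option<Self> {
            if i < N { Some(BoundedIndex { i }) } else { None }
        }
    }

    fn get<const N: usize>(arr: &[f32; N], idx: BoundedIndex<N>) -> f32 {
        arr[idx.i] // in bounds by the invariant carried with the token
    }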

gpderetta

You can do arbitrary computations as part of type checking in C++, yet I don't think it should be considered dependently typed.

It seems to me that dependent typing strictly requires going from runtime values to types.

(You can parametrize types by runtime values in C++ in a trivial sense, by enumerating a finite set at compile time and then picking the correct type at runtime according to a runtime value; still, I don't think it counts, as the set of valid types would be finite).

tmtvl

In Common Lisp:

  (defparameter *test-array*
    (make-array '(10 5)
                :element-type 'Float
                :initial-element 0.0))

  (typep *test-array* '(Array Float (10 5)))

And the type check will return true.

lacker

That's a good point; for example, in Eigen you can do

Eigen::Matrix<float, 10, 5>

I just really want it in Python because that's where I do most of my matrix manipulation nowadays. I guess you also would really like it to handle non-constants. It would be nice if these complicated library functions like

torch.nn.MultiheadAttention(embed_dim, num_heads, dropout=0.0, bias=True, add_bias_kv=False, add_zero_attn=False, kdim=None, vdim=None, batch_first=False, device=None, dtype=None)

would actually typecheck that kdim and vdim are correct, and ensure that I correctly pass a K x V matrix and not a V x K one.

zozbot234

Python is a dynamic language where everything happens at run time, including checks of variable types. So you can already do this.

etbebl

The latter is what would be most useful imo. Even Matlab can type check matrix sizes with constants these days, but I often wish I could use variables to express relationships between the sizes of different dimensions of inputs to a function.

uecker

C has such types and can guarantee that there is no out-of-bounds access at run-time in the scenarios you describe: https://godbolt.org/z/f7Tz7EvfE This is one reason why I think that C - despite all the naysayers - is actually perfectly positioned to address bounds-safe programming.

Often in dependently-typed languages one also tries to prove at compile time that the dynamic index is inside the dynamic bound at run time, but this depends.

Sharlin

-fsanitize=bounds uses a runtime sanitizer check, surely? The program compiles fine. In a (strongly) dependently typed language, something like the following would refuse to typecheck:

  int foo(int i) {
      int bar[4] = { 1, 2, 3, 4 };
      return bar[i];
  }

The type checker would demand a proof that i is in bounds, for example:

  int foo(int i) {
      int bar[4] = { 1, 2, 3, 4 };
      if (0 <= i && i < 4)
          return bar[i];
      else
          return 0;
  }

In languages with an Option type this could of course be written without dependent types in a way that's still correct by construction, for example Rust:

  fn foo(i: usize) -> i32 {
      let bar = [1, 2, 3, 4];
      bar.get(i)        // returns Option<&i32>, not a raw i32
         .copied()      // Option<&i32> -> Option<i32>
         .unwrap_or(0)  // provide a default, now we always have an i32
  }

But ultimately, memory safety here is only guaranteed by the library, not by the type system.

turndown

I thought that this was a GCC extension (you need to use #define n 10 instead of int n = 10). Is this not the case anymore?

aatd86

> using a dependent type system to have "proofs" be equivalent to "types"

Do you mean propositions as types? And proofs as programs?

Or do you see it at a somewhat higher order, since values are proofs of a type, and types can perhaps be proofs at a higher level if they are first-class citizens in the language?

(Which makes me think that dependent types are really a stairway between two type systems?)

Just wondering, since I had never personally seen it described in that fashion.

Other question: what is the runtime treatment of dependent types? Can they be instantiated on the spot by runtime values, a bit like defining/declaring functions that return arrays of runtime-known length? Does it involve two stages of type checking, compile time and runtime? Implementation-wise, doesn't it require too many indirections?

pron

It's interesting just how much of the debate in modern logic boils down to aesthetic preferences. On the other hand, I guess that if there were overwhelming practical advantages, there wouldn't be much to debate...

BTW, here's a "discussion paper" by Paulson and Leslie Lamport about typing in specification languages from 1999: https://www.cl.cam.ac.uk/~lp15/papers/Reports/lamport-paulso.... Paulson represents the "pro-type" view, but note that since that paper was written, there have been developments in mechanised theorem proving of untyped formalisms, including in Lamport's own (TLA+).

paulddraper

I wouldn’t call it “aesthetics” per se.

More like “No free lunch.”

You can gain advantages, e.g. more complete compile-time guarantees, but at the cost of disadvantages, e.g. greater program complexity or longer compile times.

The subjectivity is whether the tradeoff is “worth” it.

Gajurgensen

Very interesting. My takeaway is that Dr. Paulson's answer to the question is that there is not anything necessarily wrong with dependent types, but that he doesn't believe they are necessary.

I would have liked to read more about Lean's alleged performance issues, and the issues around intensional equality. For the latter, I understand one can run into the need for heterogeneous equality (https://lean-lang.org/doc/reference/latest/Basic-Proposition...) when types are propositionally equal, but not definitionally equal. It has been some time since I worked seriously in a dependently-typed language, but I recall coming to the conclusion that dependent types are best used as little as possible, for exactly this reason. If something may be stated as a theorem after the fact instead of putting it in the type, that was my preference.

Certainly there is something strongly aesthetically appealing about dependent type theory. The unification of programs and proofs and the natural emergence of independent proof objects. I am open to the idea that overly-dogmatic insistence on a type-theoretic basis to a theorem prover could lead to pragmatic issues, but I'd need to see more examples to be convinced there is a better foundation.

Anyway, I agree with Dr. Paulson's point that dependent types aren't necessary to verify interesting systems. He talked more of pure mathematics, but I am more interested in software verification. I work heavily in ACL2 which, not only does it not have dependent types, it doesn't have static typing at all! It is, however, also a first-order logic, and both of these facts can sometimes be frustrating. Various libraries have been introduced to simulate typing and higher-order reasoning.

hibikir

Talking about non-necessity is IMO a cop-out: I bet we can verify systems with even fewer features than he is using, or just a different set of features that gets him to the same spot. The interesting question is always whether a feature is useful enough.

You get into types at the end. And sure, we don't need static types. Just like, outside of verification, we don't need garbage collection, or bounds checking, or even loops. But are the features useful? What takes us to the goal faster? And remember that this also changes depending on who is doing the task. A lot of differences in tooling selection, across all kinds of work, come down to preference, not general utility, and they sure have nothing to do with necessity.

cwzwarich

The bigger problem with HOL (or simple type theory) is not the lack of dependencies, but rather the lack of logical strength. Simple type theory is equivalent in logical strength to bounded Zermelo set theory (i.e. ZF without Foundation or Replacement, and with Separation restricted to formulas with bounded quantifiers). This is unfortunately too weak to formalize post-WW2 mathematics in the same style as is done by ordinary mathematicians. Similarly, it does not offer a great way to deal with the size issues that arise in e.g. category theory.

zozbot234

You can always add axioms to improve logical strength. For example, one common approach for dealing with size issues in set theory is positing so-called 'inaccessible cardinals' which amount to something quite similar to the 'universes' of type theory.

cwzwarich

Adding axioms to simple type theory is more awkward than adding them to a set theory like ZFC. One approach to universes I’ve seen in Isabelle/HOL world is to postulate the existence of a universe as a model of set theory. But then you’re stuck reasoning semantically about a model of set theory. Nobody has scaled this up to a large pluralist math library like Mathlib.

stevan

> But people have regularly asked why Isabelle dispenses with proof objects. The two questions are essentially the same, because proof objects are intrinsic to all the usual type theories. They are also completely unnecessary and a huge waste of space.

I believe proof by reflection relies on proof objects? Georges Gonthier's proof of the four-colour theorem crucially uses proof by reflection.

zozbot234

Proof by reflection is accomplished by running some arbitrary program during proof checking that has been proven to only return a "true" result if the goal is true. You can do the exact same thing in an LCF system, and in fact that's arguably what a complex LCF "tactic" amounts to in the first place. If anything, the viability of proof by reflection simply shows that the divide with LCF-like checkers is not really that large.
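
A minimal Lean 4 sketch of the pattern (much smaller in scale than Gonthier's reflection, but the same shape): the `decide` tactic replaces a hand-written proof term with a computation. The elaborator finds a `Decidable` instance for the goal, the kernel evaluates it to `true`, and a fixed soundness lemma (`of_decide_eq_true`) turns that boolean result into a proof.

    -- The stored proof term is essentially "run the decision procedure";
    -- no explicit argument is written out.
    example : 2 ^ 10 = 1024 := by decide

    -- `native_decide` runs compiled code instead, trusting the compiler,
    -- much like trusting a complex LCF tactic.
    example : 104729 % 7 = 2 := by native_decide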

griffzhowl

Great, I love this stuff.

See here for a summary of the many results of the author and team's research project on formalization:

https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/

Especially interesting for me is the work on formalizing quantum computing algorithms and theorems (open access):

https://link.springer.com/article/10.1007/s10817-020-09584-7

obeavs

So, we've been down this rabbithole at Phosphor (phosphor.co) and have explored/made a couple of really big technology bets on it.

The most unique/useful applications of it in production are based on combining dependent types with database/graph queries. This lets you take something like RDF, which is neat in a lot of ways but has a lot of limitations, and add typing and logic to the queries, in order to generally reimagine how you think about querying databases.

For those interested in exploring this space from a "I'd like to build something real with this" angle, I'd strongly recommend checking out TypeDB (typedb.com). It's been in development for about a decade, is faster than MongoDB for vast swaths of things, and is one of the most ergonomic frameworks we've found for designing complex data applications (Phosphor's core is similar in many ways to Palantir's ontology concept). We went into it assuming that we were exploring a brand new technology, and have found it to work pretty comprehensively in all kinds of production settings.

ubercore

Can you expand on

"We build non-dilutive growth engines for industrial and climate technology companies by creating high quality development pipelines for institutional capital."

obeavs

Sure. Would contextualize by saying that infrastructure is a financial product: climate/industrial projects are sited in the physical world and have a hard upfront cost to produce a long term stream of cash flows, which, from a finance perspective, makes it look a lot like debt (e.g. I pay par value in order to achieve [x] cash flows with [y] risk).

When you drive past a solar project on the side of the road, you see the solar technology producing energy. But in order for a bank to fund $100M to construct the project, it has to be "developed" as if it were a long-term financial product across 15 or so major agreements (power offtake, lease agreement, property tax negotiations, etc). The fragmentation of tools and context among all the various counterparties involved in pulling this sort of thing together into a creditworthy package for funding is enormously inefficient, and as a result processes which should be parallelizable can't be parallelized, introducing large amounts of risk into the project development process.

While all kinds of asset class-specific tools exist for solar or real estate or whatever, most of them are extremely limited in function, because almost all of those things abstract down into a narrative that you're communicating to a given party at any given time (including your own investment committee), and a vast swath of factual information represented by deterministic financial calculations and hundreds if not thousands of pages of legal documentation.

We build technology to centralize/coordinate/version control these workflows in order to unlock an order of magnitude more efficiency across that entire process in its totality. But instead of selling software, we sell those development + financing outcomes (which is where _all_ of the value is in this space), because we're actually able to scale that work far more effectively than anyone else right now.

jaggederest

Reminds me a lot of AngelList, which was initially nominally just a mailing list that connected angels and early-stage startups, but eventually found that the bottleneck was special purpose vehicles and automated the hard legal work of making many individual funding vehicles, and thus was behind the scenes actually a legal services company, if you squint.

zozbot234

The claim that dependently typed languages are inherently reliant on fully written-out proof objects looks quite wrong to me. You could easily imagine a proof term with opaque typed "holes" (written `_`) where the content of each "hole" is simply replaced by an LCF-like proof script that was somehow proven (in entirely unspecified ways, having to do with the peculiar programming language that the LCF-like checker uses for its implementation - so the soundness boundary has been expanded a lot, and we have given up on having an easily checkable 'kernel'!) to generate some term of the correct type, starting from its environment. Since the content is opaque, no other part of the proof development can tell what exactly was in the hole, and we can dispense with writing that part of the proof term out.

whatshisface

That doesn't sound that easy.

zozbot234

If you mean that implementing the LCF architecture OP advocates for or evaluating any one implementation of it for soundness isn't easy, I absolutely agree. But assuming that you've solved that part, making use of it within a system that otherwise uses dependent types is not that hard.

heikkilevanto

I hate titles like "Why don't you use blah-blah". Usually because blah-blah might be an acceptable (maybe good?) solution to a problem which I don't have. Let me ask in return: Why should I even care about blah-blah. If the first (two?) paragraphs don't give a clear answer to that, never mind!

anonzzzies

You just prove things / use dependently typed languages / TLA+ where it makes sense, not for everything. The latter might make sense if it's mostly automated, maybe, but it takes really painful, elaborate work to get full coverage, and for sure most stuff really doesn't need that. I always think these formal methods + unit/integration tests cover so much that you are already far more robust than most on earth.

golemotron

The juice isn't worth the squeeze.

fluffypony

Agree with this. The punchline here is not "dependent types bad", it is "choose your battles". Isabelle/HOL pushed frighteningly far without proof objects or dependent types, from schemes to BSG, and never hit the mythical wall. What moved the needle was automation, libraries, and legible proofs, not a fancier core calculus. Lean is great, but if the toolchain bogs down and equality games leak into your day, your fancy types are like Tesla FSD: impressive demo energy, unpredictable commute (no offense to anyone who uses it regularly). Knowing when not to use them is the real superpower imho.

If you need finely indexed invariants, sure, reach for DT. For the other 95%, HOL plus type classes and locales, backed by a small kernel and big libraries, will get you to production faster and with fewer regrets. Milner's LCF insight still pays the bills. And yes, croissants are delicious, but optional axioms are a risky breakfast.