Zig's comptime is bonkers good
237 comments
January 7, 2025
noelwelsh
MatthiasPortzel
I'm a pretty big fan of Zig--I've been following it and writing it on-and-off for a couple of years. I think that comptime has a few use-cases where it is very cool. Generics, initializing complex data-structures at compile-time, and target-specific code-generation are the big three where comptime shines.
However, in other situations seeing "comptime" in Zig code makes me go "oh no" because, like Lisp macros, it's very easy to use comptime to avoid a problem that doesn't exist or wouldn't exist if you structured other parts of your code better. For example, the OP's example of iterating the fields of a struct to sum the values is unfortunately characteristic of how people use comptime in the wild--when they would often be better served by using a data-structure that is actually iterable (e.g. std.enums.EnumArray).
bunderbunder
This feels like it's a constant problem with all more advanced language features. I've had the same reaction to uses of Lisp macros, C-style macros, Java compiler extensions, Ruby's method_missing, Python monkey patching, JavaScript prototype inheritance, monads, inheritance...
Maybe the real WTF is the friends we made along the way. <3 <3 <3
paulddraper
That’s because it’s a human problem not a technology one.
Can only be fixed by fixing humans.
arccy
this is why Go is so great....
PaulHoule
Lately I read Graham's On Lisp and at first felt it was one of the greatest programming books I'd ever read, so close to perfect that little things (like making me look "nconc" up in the CL manual, when so far he'd introduced everything he talked about) made me want to go through and do just a little editing. And his explanation of how continuations work isn't very clear to me, which is a problem because I can't find a better one online. (The only way I think I'll understand continuations is if I write the explanation I want to read.)
Then I start thinking things like: "if he were using Clojure he wouldn't be having the problems with nconc that he talks about", "I can work most of the examples in Python because the magic is mostly in the functions, not in the macros", and "I'm disappointed that he doesn't do anything that really transforms the tree".
(It's still a great book that's worth reading, but anything about Lisp has to be seen in the context of a world that has moved on... Almost every example in https://www.amazon.com/Paradigms-Artificial-Intelligence-Pro... can be easily coded up in Python, because it was the garbage collection, hashtables at your fingertips, and first-class functions that changed the world, not the parens.)
Lately I've been thinking about the gradient from the various tricks such as internal DSLs and simple forms of metaprogramming which are weak beer compared to what you can do if you know how compilers work.
lispm
> if he was using Clojure he wouldn't be having the problems with nconc that he talks about
Yeah, one would write the implementation in Java.
Common Lisp (and Lisp in general) often aspires to be written in itself, efficiently. Thus it has all the operations that a hosted language may get from the imperative/mutable/object-oriented language underneath. That's why CL implementations may have type declarations, type inference, various optimizations, stack allocation, TCO and other features directly in the language implementation. See for example the SBCL manual. https://sbcl.org/manual/index.html
For example the SBCL implementation is largely written in itself, whereas Clojure runs on top of a virtual machine written in a few zillion lines of C/C++ and Java. Even the core compiler is written in 10KLOC of Java code. https://github.com/clojure/clojure/blob/master/src/jvm/cloju...
Whereas the SBCL compiler is largely written in Common Lisp, incl. the machine code backends for various platforms. https://github.com/sbcl/sbcl/tree/master/src/compiler
The original Clojure developer made the conscious decision to inherit the JIT compiler from the JVM, write the Clojure compiler in Java and reuse the JVM in general -> this reuses a lot of technology maintained by others and makes integration into the Java ecosystem easier.
The language implementations differ: Lots of CL + C and Assembler compared to a much smaller amount of Clojure with lots of Java and C/C++.
CL has a lot of low-level, mutable and imperative features for a reason. It was designed so that people could write efficient software largely in Lisp itself.
marhee
Interesting points.
> Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.
Do you mean reasoning about a function in the sense of just understanding what a function does (or can do), i.e. in the view of the practical programmer, or reasoning about the function in a typed theoretical system (e.g. typed lambda calculus or maybe even more exotic)? Or maybe a bit of both? There is certainly a concern from the theoretical viewpoint, but how important is that for a practical programming language?
For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation. While there are many mundane issues with C++ templates, breaking parametricity is not a very big deal in practice. In contrast, it enables optimisations that are not otherwise possible (for templates). Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool is normally represented using an int or char). Maybe this is even required by the standard, I don't recall. My point is that it makes sense for C++ to allow this, I think.
noelwelsh
In terms of implementation, you can view parametricity as meaning that within the body of a function with a generic type, the only operations that can be applied to values of that type are also arguments to that function.
This means you cannot write
fn sort<A>(elts: Vec<A>): Vec<A>
because you cannot compare values of type A within the implementation of sort with this definition. You can write
fn sort<A>(elts: Vec<A>, lessThan: (A, A) -> Bool): Vec<A>
because a comparison function is now a parameter to sort.
This helps both the programmer and the compiler. The practical upshot is that functions are modular: they specify everything they require. It follows from this that if you can compile a call to a function there is a subset of errors that cannot occur.
In a language without parametricity, functions can work with only a subset of possible calls. If we take the first definition of sort, it means a call to sort could fail at compile-time, or worse, at run-time, because the body of the function doesn't have a case that knows how to compare elements of that particular type. This leads to a language that is full of special cases and arbitrary decisions.
Javascript / Typescript is an example of a language without parametricity. sort in Javascript has what are, to me, insane semantics: converting values to strings and comparing them lexicographically. (See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) This in turn can lead to amazing bugs, which are only prevented by the programmer remembering to do the right thing. Remembering to do the right thing is fine in the small but it doesn't scale.
Breaking parametricity definitely has uses. The question becomes one about the tradeoffs one makes. That's why I'd rather have a discussion about those tradeoffs than just "comptime good" or "parametricity good". Better yet are neat ideas that capture the good parts of both. (E.g. type classes / implicit parameters reduce the notational overhead of calling functions with constrained generic types, but these bring their own tradeoffs around modularity and coherence.)
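For illustration, the second signature from the comment above could be sketched in Rust; `sort_by` and its insertion-sort body are hypothetical, the point being only that the function can do nothing to the elements beyond what its `less_than` argument allows:

```rust
// A sort that keeps parametricity: the only operation available on
// values of type A is the comparison function passed in by the caller.
// (Insertion sort for brevity; a hypothetical sketch, not library code.)
fn sort_by<A: Clone>(elts: &[A], less_than: impl Fn(&A, &A) -> bool) -> Vec<A> {
    let mut out: Vec<A> = Vec::new();
    for e in elts {
        // Find the insertion point using only the supplied comparison.
        let pos = out.iter().position(|x| less_than(e, x)).unwrap_or(out.len());
        out.insert(pos, e.clone());
    }
    out
}

fn main() {
    let sorted = sort_by(&[3, 1, 2], |a, b| a < b);
    assert_eq!(sorted, vec![1, 2, 3]);
}
```

Because the comparison is a parameter, the signature alone tells a caller everything the function needs.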
dmvdoug
Do you have a blog or other site where you post your writing? Your explanations are quite good and easy to follow for someone like me, an interested/curious onlooker.
James_K
Functions can crash anyway. I don't see how what you describe is different from a function on integers that errors on inputs that are too big. The programmer has to actively choose to make a function break parametricity, and they can equally choose not to do that.
nimish
This is _also_ doable with the ability to constrain generics.
sort<A> where A implements Comparable
Simpler explanation IMO.
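In Rust terms, the constrained version of the comment above might look like this (a sketch; the `Clone` bound is added only so the example returns a fresh vector):

```rust
// Constrained generics: the `A: Ord` bound states in the signature
// exactly what the body requires, so any call that satisfies the
// bound is guaranteed to compile.
fn sort<A: Ord + Clone>(elts: &[A]) -> Vec<A> {
    let mut out = elts.to_vec();
    out.sort(); // slice::sort requires Ord, which the bound supplies
    out
}

fn main() {
    assert_eq!(sort(&[3, 1, 2]), vec![1, 2, 3]);
}
```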
norir
Fair point about parametricity. A language could, in the macro expansion, do the equivalent of a Scala implicit lookup for a sorting function for the type, and return an error at macro expansion time if it can't find one. That avoids the "doing the right thing requires discipline" problem, but I agree it is still less clear from the type signature alone what the type requirements are.
SkiFire13
> For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation.
C++ breaks parametricity even with normal templates, since you can e.g. call a method that exists/is valid only on some instantiations of the template.
The issue is that the compiler can't help you check whether your template type checks or not; you will only figure out when you instantiate it with a concrete type. Things get worse when you call a templated function from within another templated function, since the error can then be arbitrarily many levels deep.
> My point is that in makes sense for C++ to allow this, I think.
Whether or not it makes sense, it's a big pain point, and some are trying to move away from it (see e.g. Carbon's approach to generics).
marhee
> C++ breaks parametricity even with normal templates
I might be wrong here, but as I understand it "parametricity" means loosely that all instantiations use the same function body. To quote wikipedia:
"parametricity is an abstract uniformity property enjoyed by parametrically polymorphic functions, which captures the intuition that all instances of a polymorphic function act the same way"
In this view, C++ does not break parametricity with "normal" (i.e. non-specialised) templates. Of course, C++ does not type check a template body against its parameters (unless concepts/traits are used), leading to the problems you describe, but it's a different thing as far as I understand.
nialv7
one thing you can reason about a function is: does it exist at all? if you don't have parametricity, you can't even be sure about that. in Rust, as long as your type satisfies a generic function's bounds, you can be sure instantiating that function with this type will compile; in C++ you don't have that luxury.
Quekid5
> Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool normally is represented using an int or char).
Your example is considered a misfeature and demonstrates why breaking parametricity is a problem: the specialized vector<bool> is not a standard STL container even though vector<anythingelse> is. That's at best confusing -- and can lead to very confusing problems in generic code. (In this specific case, C++11's "auto" and AAA lessens some of the issues, but even then it can cause hard-to-diagnose performance problems even when the code compiles.)
See https://stackoverflow.com/a/17797560 for more details.
HelloNurse
The C++ vector<bool> specialization is bad because breaking many important implicit contracts, such as taking the address of vector<> elements, makes it practically unusable where a normal vector<> is expected. But it isn't specialized incorrectly in a formally meaningful sense: all errors are outside the class (unsatisfied expectations from client code) and implicit (particularly for C++ as it was at the time).
beached_whale
vector<bool> may not be required to store space-optimized bool values, but the interface is different enough, and the guarantees different enough, that it is largely thought of as a mistake. For one, the const reference type is bool and not bool const &. Plus other members like flip… Mostly the issue is in generic code expecting a normal vector.
ScottRedig
Hi, article author here. I was motivated to write this post after having trouble articulating some of its points while at a meetup, so that's why the goal of this post was focused on explaining things, and not being critical.
So, to at least address your points here:
* I do agree this is a direct trade-off with Zig style comptime, versus more statically defined function signatures. I don't think this affects all code, only code which does such reasoning with types, so it's a trade-off between reasoning and expressivity that you can make depending on your needs. On the other hand, per the post's view 0, I have found that just going in and reading the source code easily answers the questions I have when the type signature doesn't. I don't think I've ever been confused about how to use something for more than the time it takes to read a few dozen lines of code.
* Your specific example for recursive generic types poses a problem because a name being used in the declaration causes a "dependency loop detected" error. There are ways around this. The generics example in the post for example references itself. If you had a concrete example showing a case where this does something, I could perhaps show you the zig code that does it.
* Type checking happens during comptime. Eg, this code:
pub fn main() void {
@compileLog("Hi");
const a: u32 = "42";
_ = a;
@compileLog("Bye");
}
Gives this error: when_typecheck.zig:3:17: error: expected type 'u32', found '*const [2:0]u8'
const a: u32 = "42";
^~~~
Compile Log Output:
@as(*const [2:0]u8, "Hi")
So the first @compileLog statement was run by comptime, but then the type check error stopped it from continuing to the second @compileLog statement. If you dig into the Zig issues, there are some subtle ways the type checking between comptime and runtime can cause problems. However it takes some pretty esoteric code to hit them, and they're easily resolved. Also, they're well known by the core team and I expect them to be addressed before 1.0.
* I'm not sure what you mean by hygiene, can you elaborate?
mananaysiempre
“Hygiene” in the context of macro systems refers to the user’s code and the macro’s inserted code being unable to capture each other’s variables (either at all or without explicit action on part of the macro author). If, say, you’re writing a macro and your generated code declares a variable called ‘x’ for its own purposes, you most probably don’t want that variable to interfere with a chunk of user’s code you received that uses an ‘x’ from an enclosing scope, even if naïvely the user’s ‘x’ is shadowed by the macro’s ‘x’ at the insertion point of the chunk.
It’s possible but tedious and error-prone to avoid this problem by hand by generating unique identifier names for all macro-defined runtime variables (this usually goes by the Lisp name GENSYM). But what you actually want, arguably, is an extended notion of lexical scope where it also applies to the macro’s text and macro user’s program as written instead of the macroexpanded output, so the macro’s and user’s variables can’t interfere with each other simply because they appear in completely different places of the program—again, as written, not as macroexpanded. That’s possible to implement, and many Scheme implementations do it for example, but it’s tricky. And it becomes less clear-cut what this even means when the macro is allowed to peer into the user’s code and change pieces inside.
(Sorry for the lack of examples; I don’t know enough to write one in Zig, and I’m not sure giving one in Scheme would be helpful.)
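Since the comment above lacks an example, here is a small one in Rust rather than Scheme or Zig (`add_one` and `demo` are made-up names); Rust's `macro_rules!` is hygienic in exactly the sense described:

```rust
// Hygiene: the macro's `x` and the caller's `x` live in different
// scopes even though they collide textually after expansion.
macro_rules! add_one {
    ($e:expr) => {{
        let x = 1; // the macro's own x
        x + $e     // `$e` still sees the caller's x, not this one
    }};
}

fn demo() -> i32 {
    let x = 10;
    // Expands to { let x = 1; x + <caller's x> } = 1 + 10.
    add_one!(x)
}

fn main() {
    assert_eq!(demo(), 11);
}
```

An unhygienic expansion would have shadowed the caller's `x` and produced 2 instead of 11.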
throwawaymaths
zig comptime is not a macro system and you can't really generate code in a way that makes hygiene a thing to worry about (there is no AST manipulation, you can't "create variables"). the only sort of codegen you can do is via explicit conditionals (switch, if) or loops conditioned on compile time accessible values.
that's still powerful, you could probably build a compile time ABNF parser, for example.
Validark
Zig disallows ALL shadowing (basically variable name collisions where in the absence of the second variable declaration the first declaration would be reachable by the same identifier name).
Generating a text file via a writer with the intent to compile it as source code is no worse in Zig than it is in any other language out there. If that's what you want to do with your life, go ahead.
jmull
> being able to reason about functions just from their type signature.
This has nothing to do with compile-time execution, though. You can reason about a function from its declaration if it has a clear logical purpose, is well named, and has well named parameters. You can consider any part of a parameter the programmer can specify as part of the name, including label, type name, etc.
> There is a good discussion of some issues here: https://typesanitizer.com/blog/zig-generics.html
That's actually not a great article. While I agree with the conclusion stated in the title, it's a kind of "debate team" approach to argumentation which tries to win points rather than make meaningful arguments.
The better way to frame the debate is flexibility vs complexity. A fixed function generics system in a language is simpler (if well designed) than a programmable one, but less flexible. The more flexibility you give a generics system, the more complex it becomes, and the closer it becomes to a programming language in its own right. The nice thing about zig's approach is that the meta-programming language is practically the same thing as the regular programming language (which, itself, is a simple language). That minimizes the incremental complexity cost.
It does introduce an extra complexity though: it's harder for the programmer to keep straight what code is executing at compile time vs runtime because the code is interleaved and the context clues are minimal. I wonder if a "comptime shader" could be added to the language server/editor plugin that puts a different background color on comptime code.
jasode
>You can _reason_ about a function from its declaration if it has a clear logical purpose, is well named, and has well named parameters.
I think "reason" in gp's context is "compile-time reasoning" as in the compiler's deterministic algorithm to parse the code and assign properties etc. This has downstream effects with generating compiler errors, etc.
It's not about the human programmer's ability to reason, so any "improved naming" of function names or parameters still won't help the compiler out, because it's still just an arbitrary "symbol" in the eyes of the parser.
jmull
Downstream effects with generating compiler errors is still about the human programmer's ability to reason about the code, and error messages can only reference the identifier names provided.
The compiler doesn't do anything you, the programmer, don't tell it to do. You tell it what to do by writing code using a certain syntax, connecting identifiers, keywords, and symbols. That's it. If the meaning isn't in the identifiers you provide and how you connect them together with keywords and symbols, it isn't in there at all. The compiler doesn't care what identifier names you use, but that's true whether the identifier is for a parameter label, type name, function name or any other kind of name. The programmer gives those meaning to human readers by choosing meaningful names.
Anyway, zig's compile errors seem OK to me so far.
Actually, the zig comptime programmer can do better than a non-programmable compiler when it comes to error messages. You can detect arbitrary logical errors and provide your own compiler error messages.
noelwelsh
I elaborated on parametricity in this comment: https://news.ycombinator.com/item?id=42621239
There are many ways one can reason about functions, and I think all of us use multiple methods. Parametricity provides one way to do so. One nice feature is that it's supported by the programming language, unlike, say, names.
jmull
I saw that. But I don't think it has bearing on zig comptime.
zig generates a compile error when you try to pass a non-conforming type to a generic function that places conditions/restrictions on that type (such as by calling a certain predicate on instances of that type).
It's probably important to note that parametricity is a property of specific solution spaces, and not really in the ultimate problem domain (writing correct and reliable software for specific contexts), so isn't necessarily meaningful here.
anonymoushn
> type Example = Something[Example]
You can't use the binding early like this, but inside of the type definition you can use the @This() builtin to get a value that's the type you're in, and you can presumably do whatever you like with it.
The type system barely does anything, so it's not very interesting when type checking runs. comptime code is type checked and executed. Normal code is typechecked and not executed.
comptime is not a macro system. It doesn't have the ability to be unhygienic. It can cleverly monomorphize code, or it can unroll code, or it can omit code, but I don't think it can generate code.
shakna
Until version 0.12.0 (April 2024), you could make arbitrary syscalls, allowing you to generate code at comptime, and promote vars between comptime and runtime. [0] Before then, you could do some rather funky things with pointers and memory, and was very much not hygienic.
[0] https://ziglang.org/download/0.12.0/release-notes.html#Compt...
miki123211
And I would add:
* Documentation. In a sufficiently-powerful comptime system, you can write a function that takes in a path to a .proto file and returns the types defined in that file. How should this function be documented? What happens when you click a reference to such a generated type in the documentation viewer?
* IDE autocompletions, go to definition, type hinting etc. A similar problem, especially when you're working on some half-written code and actual compilation isn't possible yet.
adonovan
Also: security. Does this feature imply that merely building someone else’s program executes their code on your machine?
badsectoracula
Considering that pretty much every non-toy project isn't built by directly calling the compiler, but through build tools like make, cmake, autotools, etc., or even scripts like `build.sh` that can call arbitrary commands, and that even IDEs have had functionality to let you call arbitrary commands before and after builds (since the 90s at least), I do not see this as a realistic concern worth limiting a language's functionality over.
mpalmer
Syscalls aren't available to comptime code
moonlion_eth
I think the reason people are so in love with zig comptime is because even rust lovers realize that rust macros are a pile of poo poo
MathMonkeyMan
Scheme has a "hygienic" macro system that allows you to do arbitrary computation and code alteration at compile time.
The language doesn't see wide adoption in industry, so maybe its most important lessons have yet to be learned, but one problem with meta-programming is that it turns part of your program into a compiler.
This happens to an extent in every language. When you're writing a library, you're solving the problem "I want users to be able to write THIS and have it be the same as if they had written THAT." A compiler. Meta-programming facilities just expand how different THIS and THAT can be.
Understanding compilers is hard. So, that's at least one potential issue with compile-time programming.
Validark
By your definition practically any code is a compiler unless you literally typed out every individual thing the machine should do, one by one.
"Understanding compilers is hard."
I think this is just unnecessarily pessimistic or embracing incompetence as the norm. It's really not hard to understand the concept of an "inline" loop. And so what if I do write a compiler so that when I do `print("%d", x)` it just gives me a piece of code that converts `x` to a "digit" number and doesn't include float handling? That's not hard to understand.
WalterBright
D had it 17 years ago! D features steadily move into other languages.
> Here the comptime keyword indicates that the block it precedes will run during the compile.
D doesn't use a keyword to trigger it. What triggers it is being a "const expression". Naturally, const expressions must be evaluatable at compile time. For example:
int sum(int a, int b) => a + b;
void test()
{
int s = sum(3, 4); // runs at run time
enum e = sum(3, 4); // runs at compile time
}
By avoiding use of non-constant globals, I/O and calling system functions like malloc(), quite a large percentage of functions can be run at compile time without any changes. Even memory can be allocated with it (using D's automatic memory management).
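A rough analogue of D's "const expression triggers compile-time evaluation" rule exists in Rust's `const fn`, where the same function runs at compile time in a const context and at run time otherwise (a sketch, not D code):

```rust
// The same function body serves both compile-time and run-time calls.
const fn sum(a: i32, b: i32) -> i32 {
    a + b
}

// A const item forces compile-time evaluation, like D's `enum e = ...`.
const E: i32 = sum(3, 4);

fn main() {
    let s = sum(3, 4); // ordinary run-time call, like D's `int s = ...`
    assert_eq!(s, E);
    assert_eq!(E, 7);
}
```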
WalterBright
Here's one of my favorite uses for it. I used to write a separate program to generate static tables. With compile time function execution, this was no longer necessary. Here's an example:
__gshared uint[256] tytab = tytab_init;
extern (D) private enum tytab_init =
() {
uint[256] tab;
foreach (i; TXptr) { tab[i] |= TYFLptr; }
foreach (i; TXptr_nflat) { tab[i] |= TYFLptr; }
foreach (i; TXreal) { tab[i] |= TYFLreal; }
/* more lines removed for brevity */
return tab;
} ();
The initializer for the array `tytab` is returned by a lambda that computes the array and then returns it. A link to the full glory of it:
https://github.com/dlang/dmd/blob/master/compiler/src/dmd/ba...
Another common use for CTFE is to use it to create a DSL.
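The same static-table trick can be sketched in Rust's const evaluation; the flag value and the "even index" rule here are made up for illustration, standing in for the D snippet's type flags:

```rust
// Compile-time table generation: the whole 256-entry table is computed
// by the const initializer, no separate generator program needed.
const FLAG_EVEN: u32 = 1;

const TAB: [u32; 256] = {
    let mut tab = [0u32; 256];
    let mut i = 0usize;
    while i < 256 {
        if i % 2 == 0 {
            tab[i] |= FLAG_EVEN; // mark even indices, as a stand-in rule
        }
        i += 1;
    }
    tab
};

fn main() {
    assert_eq!(TAB[0], FLAG_EVEN);
    assert_eq!(TAB[1], 0);
}
```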
optymizer
Walter, I'll take any chance I can get to say: thank you for creating D! One thing I was wondering about is the limits of compile time execution.
How does the D compiler ensure correctness if the machine the compiler runs on is different from the machine the program will execute on?
For example, how does the compiler know that "int s = sum(100000, 1000000)" is the same value on every x86 machine?
I'm thinking there could be subtle differences between generations of CPU, how can a compiler guarantee that a computation on the host machine will result in the same value on the target machine in practice, or is it assuming that host and target are sufficiently similar, as long as the architecture matches? (which is fine, I'm wondering as to what approaches exist)
WalterBright
> thank you for creating D!
My pleasure!
> is the same value on every x86 machine?
It's the same value on all machines, because integer types are fixed size (not implementation dependent) and 2's complement arithmetic is mandated.
Floating point results can vary, however, due to different orders in which constants are evaluated. The x87, for example, evaluates to a higher precision and then rounds it only when writing to memory.
chainingsolid
I'll second thanking you for making D. I still haven't found a language with more compile time capabilities that I can/would actually use. So I'm still using D.
Any thoughts on adding something like Zig's setFloatMode(strict)? I have a project idea or two where, for some of the computation, I need determinism first and performance second. But I very much still need the performance floating point can provide.
WalterBright
D's ImportC also can do CTFE with C code!
int sum(int a, int b) { return a + b; }
_Static_assert(sum(3, 4) == 7, "look ma, check at compile time!");
Why doesn't the C Standard add this? It works great!
flohofwoe
Tbf, Zig allows that too (calling the same function in a runtime and comptime context):
fn square(num: i32) i32 {
return num * num;
}
pub fn main() void {
_ = square(2);
_ = comptime square(3);
}
...and the comptime invocation will produce a compile error if anything isn't comptime-compatible (which IMHO is an important feature, because it rings the alarm bells if code that's expected to run at comptime accidentally moves into runtime because some input args have changed from comptime- to runtime-evaluated).
skocznymroczny
Zig looks interesting, I just wish it had operator overloading. I don't really buy most of the arguments against operator overloading. A common argument is that with operator overloading you don't know what actually happens under the hood. Which doesn't work, because you might as well create a function named "add" which does multiplication. Another argument is iostreams in C++ or boost::spirit as examples of operator overloading abuse. But I haven't really seen that happen in other languages that have operator overloading, it seems to be C++ specific.
hiccuphippo
You don't know the amount of magic that goes on behind the scenes in Python and PHP with the __ functions. I think Zig's approach is refreshing. Being able to follow the code trumps the seconds wasted typing the extra code.
magicalhippo
Depends on domain I think. In some cases it can be very beneficial to keep the code close to the source, say math equations, to ensure they've been correctly implemented.
In this case the operators should be unsurprising, so they do what one would expect based on the source domain. Multiplying a vector and a scalar for example should return the scaled vector, but one should most likely not implement multiplication between vectors as that would likely cause confusion.
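The "unsurprising" vector-times-scalar case above could look like this in a language with overloading, sketched in Rust (`Vec3` is a made-up type for illustration):

```rust
use std::ops::Mul;

// Operator overloading kept close to the source domain:
// vector * scalar returns the scaled vector, nothing else.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Vec3 {
    x: f64,
    y: f64,
    z: f64,
}

impl Mul<f64> for Vec3 {
    type Output = Vec3;
    fn mul(self, s: f64) -> Vec3 {
        Vec3 { x: self.x * s, y: self.y * s, z: self.z * s }
    }
}

fn main() {
    let v = Vec3 { x: 1.0, y: 2.0, z: 3.0 };
    assert_eq!(v * 2.0, Vec3 { x: 2.0, y: 4.0, z: 6.0 });
}
```

Note that only `Vec3 * f64` is defined; multiplication between two vectors is deliberately left out, matching the comment's point about avoiding confusing overloads.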
melodyogonna
I don't know about PHP, what amount of magic goes in behind Python's dunder methods? You can open it and see
akkad33
There are many gotchas to Python dunder methods. An example: there are a bunch of functions that can be called when you do something like `s.d`, where s is an object. Does it call "getattr" on the object, getattr on the class, get a property, or execute a descriptor? It is very hard to tell unless you're an expert.
zoogeny
In my humble opinion, a lot of the dislike of operator overloading is related to unexpected runtime performance.
My ideal solution would be for the language to introduce custom operators that clearly indicate an overload. Something like a prefix/postfix (e.g. `let c = a |+| b`). That way it is clear to the person viewing the code that the |+| operation is actually a function call.
This is still open to abuse but I think it at least removes one of the major concerns.
LAC-Tech
I feel like the OCaml solution would fit Zig's use case well.
In ocaml you can redefine operators... but only in the context of another module.
So if I re-define + in some module Vec3, I can do:
Vec3.(a + b + c + d)
Or even: let open Vec3 in
a + b + c + d
So there you go, no "where did the + operator come from?" questions when reading the source, and still much nicer than: a.add(b).add(c).add(d)
I doubt Zig will change though. The language is starting to crystallize and anything that solved this challenge would be massive.
ptrwis
Maybe such operators for basic linear algebra (for arrays of numbers) should just be built into the language instead of overloading operations. I'm not sure whether such a proposal doesn't already exist.
spiffyk
There is a specialized `@Vector` builtin for SIMD operations like this.
bigpingo
Yeah I never got the aversion to operator overloading either.
"+ can do anything!" As you said, so can plus().
"Hidden function calls?" Have they never programmed a soft float or microcontroller without a div instruction? Function calls for every floating point op.
mk12
The problem is not that + calls a function. The problem is that + could call one of many different functions, i.e. it is overloaded. Zig does not allow overloading plus() based on the argument types. When you see plus(), you know there is exactly one function named “plus” in scope and it calls that.
kps
Not if `plus` is a pointer. Then `plus()` is a conditional branch where the condition can be arbitrarily far away in space (dynamically scoped) and time. That's why I think invisible indirection is a mistake. (C used to require `(*plus)()`.)
elcritch
Ah ‘fieldNames’, looks very similar to Nim’s ‘fieldPairs’. It’s an incredibly handy construct! It makes efficient serialization a breeze. I recently implemented a compile-time thread-safety check on types using ‘fieldPairs’ in about 20 lines.
This needs to become a standard feature of programming languages IMHO.
It’s actually one of the biggest things I find lacking in Rust which is limited to non-typed macros (last I tried). It’s so limiting not to have it. You just have to hope serde is implemented on the structs in a crate. You can’t even make your own structs with the same fields in Rust programmatically.
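For readers unfamiliar with the Zig side of this, a minimal sketch of compile-time field iteration (the struct and function names here are made up for illustration; the mechanism is `std.meta.fields` plus `@field`):

```zig
const std = @import("std");

const Point = struct { x: i32, y: i32, z: i32 };

// The `inline for` is unrolled at compile time, once per field,
// so this works for any struct type without runtime reflection.
fn sumFields(comptime T: type, value: T) i64 {
    var total: i64 = 0;
    inline for (std.meta.fields(T)) |f| {
        total += @field(value, f.name);
    }
    return total;
}

test "sum struct fields" {
    try std.testing.expectEqual(@as(i64, 6), sumFields(Point, .{ .x = 1, .y = 2, .z = 3 }));
}
```

The same loop shape is what serializers typically use: instead of summing, emit each field's name and value.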
drogus
At some point there was a discussion about compile-time reflection, which I guess could include functionality like that, but I think the topic died along with some kind of drama around it. Quite a bummer, because things like serde would have been so much easier to implement with compile-time reflection.
cb321
Another example applying compile-time reflection is something like https://github.com/c-blake/cligen { but it helps if your host prog.lang has named parameters like Python's foo(a=1, b=2) }.
ptrwis
With comp-time reflection you can build frameworks like ORMs or web frameworks. The only trade-off is that you have to include such a library in the form of source code.
pakkarde
After having written a somewhat complete C parser library, I don't really get the big deal about needing meta programming in the language itself. If I want to generate structs, serialization, properties, instrumentation, etc., I just write a regular C program that processes some source files and outputs source files, and run that first in my build script.
How do you people debug and test these meta programs? Mine are just regular C programs that uses the exact same debuggers and tools as anything else.
coldtea
>I don't really get the big deal about needing meta programming in the language itself. If I want to generate structs, serialization, properties, instrumentation, etc, I just write a regular C program that processes some source files and output source files and run that first in by build script.
This describes exactly what people don't want to do.
dboreham
But exactly why?
jerf
If you just walked up to me out of the blue and asked "what computer language do you know is the worst for processing strings?", well, technically I might answer "assembler", but if you excluded that, my next answer would be C.
Furthermore, you want some sort of AST representation, at one level of convenience or another (I count this compgen-style "being 'in' the AST" as part of that, even if it doesn't necessarily directly manipulate AST nodes), and C isn't particularly great at manipulating those either, in a lot of different ways.
A consequence of C being the definitive language that pretty much every other language has had to react to, one way or another through however many layers of indirection, for the past 40+ years, is that pretty much every language created since then is better than C at these things. C's pretty long in the tooth now, even with the various polishings it has received over the years.
0x696C6961
Because after enough hands have touched a codegen script, debugging it becomes impossible.
pjc50
C# (strictly, Roslyn/dotnet) provides this in a pretty nice way: because the compiler is itself written in the language, you can just drop in plugins which have (readonly!) access to the AST and emit C# source.
Debugging... well, you have to do a bit more work to set up a nice test framework, but you can then run the compiler with your plugin from inside your standard unit test framework, inside the interactive debugger.
modernerd
Yes, this is the same approach Ryan Fleury and others advocate, and it's perfectly good:
> Arbitrary compile-time execution in C:
> cl /nologo /Zi metaprogram.c && metaprogram.exe
> cl /nologo /Zi program.c
> Compile-time code runs at native speed, can be debugged, and is completely procedural & arbitrary
> You do not need your compiler to execute code for you
zamalek
The only benefit that some (certainly more rare) compilers can provide is type metadata/compile-time reflection. Otherwise, totally.
chikere232
MS DOS choice of / for commandline arguments and \ for paths always hurts my eyes
nox101
I don't know about zig, but the power of lisp is that you're manipulating the s-expressions, or to put it another way, you're manipulating the AST. To do that in C you'd need to write a full C parser for your C program that processes source files.
Certhas
I used to do that in Python with the numba jit: write Python code that generates Python code that then gets compiled.
It's a fragile horrible mess, and the need to do this was a major reason for me to switch away from Python. It's a bit like asking why we don't just pass all arguments to functions as strings. Yeah, people write stringly typed code, but it should rarely be necessary, and your language should provide means to avoid it.
jmull
Whether you consider it a big deal or not is up to you, but with zig's approach you don't have to write/maintain a separate parser, nor worry about whether it's complete enough to process your source files.
I don't know a lot about debugging zig comptime, though. I use printf-style debugging and the built-in unit test blocks. That's all I've needed so far. (Perhaps that's all there is.)
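A sketch of what that looks like in practice: comptime logic can be exercised from an ordinary `test` block, so `zig test` plus print/assert debugging covers much of the loop (the function here is a toy, named for illustration):

```zig
const std = @import("std");

// A toy comptime-friendly function: count the fields of a struct type.
fn fieldCount(comptime T: type) usize {
    return std.meta.fields(T).len;
}

test "comptime result checked like any other value" {
    const Pair = struct { a: u8, b: u8 };
    // Evaluated entirely at compile time...
    comptime std.debug.assert(fieldCount(Pair) == 2);
    // ...but also checkable with the ordinary test machinery.
    try std.testing.expectEqual(@as(usize, 2), fieldCount(Pair));
}
```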
agentkilo
Well put. I always have the feeling that any language which has an `eval` function or an invokable compiler can do meta program. That said, I think the "big deal" is in UX/DX. It's really nice to have meta programming support built-in to the language when you need it.
benob
> How do you people debug and test these meta programs?
I couldn't find any other answer than using @compileLog to print-debug [1]. In lisp, apparently some implementations allow tracing macros [2]. Couldn't find anything about Nim's macro debugging capabilities.
This whole thing looks like a severe limitation that is not balanced by the benefit of having all code in the same place. Do you know other languages that provide sensible meta-programming facilities?
[1] https://www.reddit.com/r/Zig/comments/jkol30/is_there_a_way_... [2] https://stackoverflow.com/questions/44872280/macros-and-how-...
disentanglement
In lisp, macros are just ordinary functions whose input and output is an AST. So you can debug them as you would any other function, by tracing, print debugging, unit tests or even stepping through them in a debugger.
michaelsbradley
To debug macros in Nim, you'll likely need to print arguments and expansions at compile-time, inspect the output, change things to see what happens, repeat...
https://nim-lang.org/docs/macros.html#toStrLit%2CNimNode
https://nim-lang.org/docs/macros.html#astGenRepr%2CNimNode
https://nim-lang.org/docs/macros.html#dumpAstGen.m%2Cuntyped
https://nim-lang.org/docs/macros.html#treeRepr%2CNimNode
koe123
Another interesting pattern is the ability to generate structs at compile time.
I've run experiments where a neural net is implemented by exporting a JSON file from PyTorch, reading it in using @embedFile, and generating a struct with a specific “run” method.
This in theory allows the compiler to optimize the neural network directly (I haven't proven a great benefit from this though). Also, the whole network lived on the stack, which means not having any dynamic allocation (not sure if this is good?).
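A stripped-down sketch of the embedding half of that idea (the file name is hypothetical, and real comptime JSON parsing is considerably more involved):

```zig
// Hypothetical: "weights.bin" exported from PyTorch at build time.
// @embedFile bakes the file into the binary as a comptime-known
// array (*const [N:0]u8), so its length is usable in types.
const weights = @embedFile("weights.bin");

// An array sized at compile time from the embedded data: the whole
// network state can live in static memory, no allocator required.
var network_buf: [weights.len]u8 = weights.*;
```

Because the sizes are comptime-known, the compiler sees the full network layout and can in principle specialize the “run” method against it.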
anonymoushn
I've done this sort of thing by writing a code generator in python instead of using comptime. I'm not confident that comptime zig is particularly fast, and I don't want to run the json parser that generates the struct all the time.
koe123
Another thing I tried as an alternative is using ZON (Zig Object Notation) instead of JSON, which can natively be included directly as a source file. It involved writing a custom Python exporter though (read: I gave up).
mk12
FWIW the goal for comptime Zig execution is to be at least as fast as Python. I can’t find it now but I remember Andrew saying this in one of his talks at some point.
DanielHB
I believe that Zig build system can cache comptime processes, so if the JSON didn't change it doesn't run again.
Validark
I think if you integrate with the build system, yes, Zig can do things only when the file changed. But I'm not sure that Zig has figured out incremental comptime yet. That's way harder to accomplish.
0x1ceb00da
How does this affect the compile times?
koe123
They become quite long, but it was surprisingly tolerable. I recall it vaguely, but a 100MB neural network was on the order of minutes with all optimizations turned on. I guess it would be fair to say it scaled more or less linearly with the file size (from what I saw). Moreover, I work in essentially a tinyml field, so my neural networks are on the order of 1 to 2 MB for the most part. For me it would've been reasonable!
I guess in theory you could compile once into a static library and just link that into a main program. Also there will be incremental compilation in zig I believe, maybe that helps? Not sure on the details there.
erichocean
It's nothing like C++ templates.
pjmlp
While interesting, this is one of the cases, where I agree with "D did it first" kind of comments.
sixthDot
sure, and hygienically, it's not a preprocessor thing.
Tiberium
If you're surprised by Zig's comptime, you should definitely take a look at Nim which also has compile-time code evaluation, plus a full AST macro system.
foretop_yardarm
Nim is a fun language but I wouldn't consider it for "serious" work. It has the same issues as other niche languages (i.e. the ecosystem), plus: a polarising maintainer (most core contributors don't seem to last long) and primary funding from a crypto company (if you care about that). Then again, 10 years ago none of that would have bothered me.
planetis
All these organizations[1] using nim in production must disagree with you then.
[1]: https://github.com/nim-lang/Nim/wiki/Organizations-using-Nim
zamalek
Zig has the feature of not having exceptions. I see that Nim is trying to move away from them, but exceptions color functions, which means that you have to account for them even if you don't use functions that throw them[1]. Life is too short to deal with invisible control flow.
cb321
Whether you want to handle every error is context dependent. StatusIM has long running servers & clients as primary products and so tilt away from exceptions, but for a CLI utility you might want the convenience of a stack trace instead. I've seen this many times in Python CL apps, for example.
Alternatively, there is also a Nim effects tracking system that lets the compiler help you track the hidden control flow for you. So, at the top of a module, you can say {.push raises: [].} to make sure that you handled all exceptions somewhere. So, it may not be as "Wild West" as other exceptions systems that you are used to.
As with so many aspects, Nim is Choice. For many choice is good. For others they want language designers to constrain choice a lot (Go is probably a big recent example, because fresh out of school kids need to be kept away from sharper tools or similar rationales). A lot of these prog.lang. battles mirror bigger societal debates/divides between centralized controls and more laissez-faire arrangements. Nim is more in the Voltaire/Spiderman's Uncle Ben "With great power comes great responsibility" camp, but how much power you use is usually "up to you" (well, and things you choose to depend upon).
zamalek
> {.push raises: [].}
Will this transitively enforce exception handling? i.e. if a 3rd-party dependency that I am using calls into another dependency that raises exceptions, but doesn't handle them in any way (including not using that pragma), will Nim assert that? Otherwise, that's precisely the function coloring problem I mentioned: if you can't statically assert that a callee, or its descendant callees, doesn't throw an exception, then you have to assume that it will.
SMP-UX
Zig is overall pretty good as a language and it does what it needs to: staying in the lane of the purpose is very important. It is why I do not particularly care for some languages being used just because.
bryango
I hope we can have something that combines the meta-programming capabilities of Zig with the vast ecosystem, community and safety of Rust.
Looking at the language design, I really prefer Zig to Rust, but as an incompetent, amateur programmer, I couldn't write anything in Zig that's actually useful (or reliable), at least for now.
norman784
I agree. I briefly tried Zig and quickly gave up because, as someone used to Rust, the compiler wasn't helping me find those issues at compile time. I know that Zig doesn't make those promises, but for me it's a deal breaker, so I suppose Zig isn't the language for me.
On the other hand, I do like the concept of comptime vs Rust macros.
raptorfactor
Please keep the Rust community away from Zig. (I joke. Mostly...)
Validark
[flagged]
drogus
"the community" meaning one weirdo commenter you've seen on HN? Cause I can assure you no people I know in the Rust community think that way.
Personally I would really like to code some stuff in Zig if I had more time. It's not really appealing to me in many ways (like I prefer to spend a bit more on designing types for my programs and have safety guartantees), so I wouldn't probably ue it long term, but I admit stuff like comptime is interesting.
ArtixFox
bruh, zig's VP called rust users safety coomers. It's internet shitposting, who cares.
source: I was there when it happened and it was GLORIOUS.
LAC-Tech
LOL hey Artix.
That made me laugh from Loris too. I can't believe it became a big deal, it was a funny comment.
And I believe it was one single lone frustrated rustacean... a frustacean, perhaps? ... who was making the unhinged comments.
LAC-Tech
The Rust Discord is one of my favourite places on the internet. I've met so many helpful, friendly, interesting and incredibly smart people there. And this is coming from someone who's had a lot of frustrations on my rust journey, and who does evil things like use small unsafe blocks instead of bringing in dependencies - they even refused to ban me for this!
There are definitely a few loud, unpleasant voices in the rust community. But quite frankly the Rust discord was a lot more pleasant than the Zig one.
YMMV, I do have a fondness for Zig, and I honestly did find Loris's "safety coomer" comment really funny. But I've had such good experiences with rustaceans that I feel I must defend their honour every time it's besmirched like this.
bryancoxwell
I beg your pardon
brylie
Is anyone here using Zig for audio plugin development? It seems like a good candidate as an alternative to C++ but lacks the ecosystem (like JUCE). Are there any ongoing efforts to bring DSP/audio plugin development to Zig?
hiccuphippo
IIRC Andrew Kelley's original goal for developing Zig was to build a DAW.
p0nce
I'm using D for audio plugins and we use CTFE extensively (named comptime in Zig). Zig might be a slightly better fit because of the easier C and C++ interop and targeting, but I'm not sure about the COM and OOP story.
melodyogonna
Mojo's compiletime metaprogramming [1] is inspired by Zig's. Though Mojo takes things further by implementing a fully-featured generic programming system.
Validark
What can you do in Mojo that you can't do in Zig?
kstrauser
Pay a company for the privilege of being allowed to develop with it. It has a commercial license, where Zig is MIT’d.
I haven’t written a line of either. I could see using Zig, but there’s no plausible scenario where I’d ever write Mojo. Weird proprietary languages tend to be a career pigeonhole: “you’ve been doing what for the last 5 years?”
thefaux
Weird proprietary languages _can_ also be much better for a particular task than anything else and can thus be smart business. Someone who will dismiss something they don't know on the grounds that is weird and proprietary is not someone I'd want to work with. But of course if this is how a lot of people think then there may be no choice but for most people to try and stick with the tried and true.
melodyogonna
Nobody is paying anybody to use Mojo, its main issue is cross-platform support, specifically lack of native Windows support.
Like I always say, most languages start off closed, incubated for some years by a tiny group, before being opened. Mojo is no different, in fact, Modular have given a pretty solid timeline about when they plan to open source the compiler - https://youtu.be/XYzp5rzlXqM?si=nmvghH3KWX6SrDzz&t=1025
melodyogonna
GPU kernels.
It would be nice to have a more in-depth discussion of the issues that have been found with compile-time programming, rather than uncritical acclaim. Staged programming is not new, and people have run into many issues and design tradeoffs in that time. (E.g. the same stuff has been done in Lisps for decades, though most Lisps don't have a type system, which makes things a bit more complicated.)
Some of the issues that come to mind:
* Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.
* It's not clear to me how Zig handles recursive generic types. Generally, type systems are lazy to allow recursion. So I can write something like
type Example = Something[Example]
(Yes, this is useful.)
* Type checking and compile-time computation can interact in interesting ways. Does type checking take place before compile-time code runs, after it runs, or can they be interleaved? Different choices give different trade-offs. It's not clear to me what Zig does and hence what tradeoffs it makes.
* The article suggests that compile-time code can generate code (not just values) but doesn't discuss hygiene.
There is a good discussion of some issues here: https://typesanitizer.com/blog/zig-generics.html
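For what it's worth, on the recursive-types point: Zig's usual pattern is to let the struct returned by a type-returning function refer to itself by name, with the recursion going through an optional pointer. A sketch along the lines of the linked-list example in the Zig language reference:

```zig
// A generic type is just a function from type to type; results are
// memoized, so LinkedList(u8) always denotes the same type.
fn LinkedList(comptime T: type) type {
    return struct {
        pub const Node = struct {
            value: T,
            // Self-reference is fine because it goes through a
            // (nullable) pointer, so the size is still computable.
            next: ?*Node,
        };
        first: ?*Node = null,
    };
}

const ListU8 = LinkedList(u8);
```

Whether this covers all the cases a lazy nominal type system allows (e.g. `type Example = Something[Example]` at the top level) is a separate question.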