
Obvious things C should do

309 comments · January 11, 2025

TheNewAndy

Header files are one of the things I miss the most when using languages that aren't C. Having a very clear distinction between public and private, and between interface and implementation, is one of my favourite things about C code (at least the way I write it).

Being able to just read through a library's .h files to know how to use it is really nice. Typically, my .h files don't really look like my .c files because all the documentation for how to use the thing lives in the .h file (and isn't duplicated in the .c file). It would be entirely possible to put this documentation into the .c file, but it makes reading the interface much less pleasant for someone using it.

kouteiheika

> Header files are one of the things I miss the most when using languages that aren't C. Having a very clear distinction between public and private, and between interface and implementation, is one of my favourite things about C code (at least the way I write it).

I always found this argument baffling, because the way some other languages solve this problem is with tooling, which is a much better way to do it in my opinion.

Take Rust for example. You want to see the interface of a given library and see how to use it? Easy. Type `cargo doc --open` and you're done. You get a nice interface with a fully searchable API covering the whole public API, and it's all automatic; you don't have to manually maintain it, nor duplicate code between your header and your source file.

TheNewAndy

This is probably something where it comes down to preference and familiarity. I would much prefer a simple text file for documentation that I can grep, open in my text editor, modify easily without switching context (oh, I should have been more explicit in the documentation I wrote - let me just fix that now), etc. All the features you mentioned "nice interface, fully searchable API interface, whole public API" are exactly what you get if you open a well written header file in any old text editor.

I used to be a big fan of doxygen etc, but for the stuff I've worked on, I've found that "pretty" documentation is way less important than "useful" documentation, and that the reformatting done by these tools tends to lead towards worse documentation with the people I have worked with ("Oh, I need to make sure every function argument has documentation, so I will just reword the name of the argument"). Since moving away from doxygen I have stopped seeing this behaviour from people - I haven't tried to get a really good explanation as to why, but the quality of documentation has definitely improved, and my (unproven) theory is that keeping the presentation as plain as possible means that the focus turns to the content.

I don't know if rust doc suffers the same issues, but the tooling you are mentioning just seems to add an extra step (depending on how you count steps I suppose, you could perhaps say it is the same number of steps...) and provide no obvious benefit to me (and it does provide the obvious downside that it is harder to edit documentation when you are reading it in the form you are suggesting).

But with all these things, different projects and teams and problem domains will probably tend towards having things that work better or worse.

metadat

> well written text file

The problem with this is no one agrees on the definition of "well-written", so consistency is a constant battle and struggle. Language tooling is a better answer for quality of life.

Yoric

Have you looked at how OCaml does it?

The historical way is to have a .ml file and a .mli file. The .ml file contains the implementation. Any documentation in that file is considered an implementation detail and will not be published by ocamldoc. The .mli file contains everything users need to know, including documentation, function signatures, etc.

Interestingly, the .mli and the .ml signatures do not necessarily need to agree. For instance, a global variable in the .ml does not need to be published in the .mli. More interestingly, a generic function in the .ml does not need to be exposed as generic in the .mli, or can have more restrictions.

You could easily emulate this in Rust, but it's not the standard.

thayne

> and that the reformatting done by these tools tends to lead towards worse documentation with the people I have worked with ("Oh, I need to make sure every function argument has documentation, so I will just reword the name of the argument")

That seems like an orthogonal issue to me. I've seen places where documentation is only in the source code, no generated web pages, but there is a policy or even just a soft expectation to document every parameter, even if it doesn't add anything. And I've also seen places that make heavy use of these tools and don't have any such expectation.

kouteiheika

> All the features you mentioned "nice interface, fully searchable API interface, whole public API" are exactly what you get if you open a well written header file in any old text editor.

No, you can't, and it's not even close.

You have a header file that's 2000 lines of code, and you have a function which uses type X. You want to see the definition of type X. How do you quickly jump to its definition with your "any old text editor"? You try to grep for it in the header? What if that identifier is used 30 times in that file? Now you have to go through all of the other 29 uses and hunt for the definition. What if it's from another header file? What if the type X is from another library altogether? Now you need to manually grep through a bunch of other header files and potentially other libraries, and due to C's include system you often can't even be sure where on the filesystem you need to grep.

Anyway, take a look at the docs for one of the most popular Rust crates:

https://docs.rs/regex/1.11.1/regex/struct.Regex.html

The experience going through these docs (once you get used to it) is night and day compared to just reading header files. Everything is cross linked so you can easily cross-reference types. You can easily hide the docs if you just want to see the prototypes (click on the "Summary" button). You can easily see the implementation of a given function (click on "source" next to the prototype). You can search through the whole public API. If you click on a type from another library it will automatically show you docs for that library. You have usage examples (*which are automatically unit tested so they're guaranteed to be correct*!). You can find non-obvious relationships between types that you wouldn't get just by reading the source code where the thing is defined (e.g. all implementations of a given trait are listed, which are usually scattered across the codebase).

> I don't know if rust doc suffers the same issues, but the tooling you are mentioning just seems to add an extra step (depending on how you count steps I suppose, you could perhaps say it is the same number of steps...) and provide no obvious benefit to me (and it does provide the obvious downside that it is harder to edit documentation when you are reading it in the form you are suggesting).

Why would I want to edit the documentation of an external library I'm consuming when I'm reading it? And even if I do then the effort to make a PR changing those docs pales in comparison to the effort it takes to open the original source code with the docs and edit it.

Or did you mean editing the docs for my code? In that case I can also easily do it, because docs are part of my source files and are maintained alongside the implementation. If I change the implementation I have docs right there in the same file and I can easily edit them. Having to open the header file and hunt for the declaration to edit the docs "just seems to add an extra step" and "and provide no obvious benefit to me", if I may use your words. (:

panic

As someone who likes C header files, I enjoy manually maintaining them. Designing the interface separately from the implementation feels good to me, and a well-structured .h file is nicer to read than any auto-generated docs I've encountered.

chii

> Designing the interface separately from the implementation feels good to me

would you make the same argument for java then?

xigoi

The problem is that you have to write the interface not only separately from the implementation, but together with the implementation as well, which leads to duplication of information.

nine_k

The include file mechanism is a hack that was acceptable at the time when machines were extremely underpowered, so only the simplest solutions had a chance to be implemented within a reasonable time frame.

By now, of course, precompiled headers exist, but their interplay with #define allows for fun inconsistencies.
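A minimal sketch of the kind of #define interplay meant here (file and macro names invented for illustration): the same header means different things in different translation units, so a single precompiled copy of it cannot be correct for both.

    /* config.h */
    #ifdef SMALL_BUFFERS
    enum { BUF_SIZE = 256 };
    #else
    enum { BUF_SIZE = 4096 };
    #endif

    /* a.c */
    #define SMALL_BUFFERS
    #include "config.h"   /* sees BUF_SIZE == 256 */

    /* b.c */
    #include "config.h"   /* sees BUF_SIZE == 4096 */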

And, of course, they leak implementation as much as the author wants, all the way to .h-only single-file libraries.

If you want an example of a sane approach to separate interface and implementation files from last century, take a look e.g. at Modula-2 with its .int and .mod files.

alextingle

Precompiled headers are a terrible misfeature. I ban them in any code base I am responsible for.

They encourage the use of large header files that group unrelated concerns. In turn that makes small changes in header files produce massive, unnecessary rebuilds of zillions of object files.

The clean practice is to push down #includes into .c files, and to ruthlessly eliminate them if at all possible. That speeds up partial rebuilds enormously. And once you adopt that clean practice, pre-compiled headers yield no benefit anyway.
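One common way to do this (illustrative names, not taken from the comment) is to forward-declare types in the header and keep the heavy #includes in the .c file, so that touching engine.h no longer rebuilds every user of widget.h:

    /* widget.h -- no #include "engine.h"; a forward declaration is enough
       because the interface only passes pointers around */
    struct engine;                       /* forward declaration */

    typedef struct widget widget;

    widget *widget_create(struct engine *eng);
    void    widget_destroy(widget *w);

    /* widget.c -- the only place that needs the full definition */
    /* #include "engine.h" goes here, not in the header */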

pjmlp

Modules already existed in programming languages outside Bell Labs in the same decade, like the Modula-2 you quote.

salgernon

I’m with parent - what if you don’t have the tool? What if there’s a syntax error in some implementation or dependency such that the tool chokes early?

Human readable headers are accessible out of context of the implementation. They also help provide a clear abstraction: this is the contract. This is what I support as of this version. (And hopefully with appropriate annotations across versions.)

kouteiheika

> I’m with parent - what if you don’t have the tool?

The "what if you don't have the tool" situation never happens in case of Rust. If you have the compiler you have the tool, because it's always included with the compiler. This isn't some third party tool that you install manually; it's arguably part of the language.

> What if there’s a syntax error in some implementation or dependency such that the tool chokes early?

In C I can see how this can happen with its mess of build systems; in Rust this doesn't happen (in my 10+ years of Rust I've never seen it), because people don't publish libraries with syntax errors (duh!).

gary_0

The "what if you don't have the software" argument doesn't hold water for me. What if you don't have git? What if you don't have a text editor? What if you don't have a filesystem?

Most programming language communities are okay with expecting a certain amount of (modern) tooling, and C can't rely on legacy to remain relevant forever...

nox101

Sounds like COM/DCOM from ~1995. Every API had a public interface including a description. You could open the DCOM Inspector, browse all the APIs, and see the type signature of every function and its docs.

pjmlp

It still is COM in 2025, given its relevance on Windows, even more so since Vista, as all the Longhorn ideas were remade in COM.

However, the tooling experience is pretty much still ~1995, with the difference that IDL is at version 3.0.

Brian_K_White

headers perform the same job for all code, not just code that's in some library.

Frankly your description of what you just called easy sounds terrible and pointlessly extra, indirection that doesn't pay for itself in the form of some overwhelming huge win somewhere else. It's easy only if the alternative was getting it by fax or something.

saghm

Having to make an entire separate file to mark something as public, rather than just having a keyword in the language, sounds to me "terrible and pointlessly extra". It's not like you can't put all your public stuff in its own file in Rust rather than putting private stuff in it as well; empirically though, people don't do this because it's just not worth the effort.

kevin_thibedeau

Header files are really a weak hack to deal with resource-constrained platforms from the 70s. They only work if you stick to a convention, and they pale in comparison to languages like Ada, with a well-architected specification of interface and implementation that never needs to be reparsed over and over again.

I do enjoy using C but that is one area where it should have been better designed.

m463

I agree with you, but I don't.

The way C handles header files is sort of "seems-to-work" by just blindly including the text inline.

I know this is not a much-used language, but in comparison, Ada did a pretty nice thing. They have the concept of packages and package bodies. The package is equivalent to the header file, and the package body is the implementation of the package.

I remember (long ago when I used ada) that everyone could compile against the package without having the package body implementation ready so the interfaces could all work before the implementation was ready.

And in another direction, I like how Python does "header files" with "import". It maps easily to the filesystem without having to deal with separate header files and C's include semantics.

jrmg

I think there may be a difference in thinking that underlies the difference in opinion here.

In my experience, having a header file nudges you to think about interface being a _different thing_ to implementation - something that (because you need to) you think about as more fundamentally separate from the implementation.

Folks who think this way bristle at the idea that interface be generated using tooling. The interface is not an artifact of the implementation - it’s a separate, deliberate, and for some even more important thing. It makes no sense to them that it be generated from the implementation source - that’s an obvious inversion of priority.

Of course, the reverse is also true - for folks used to auto-generated docs, they bristle at the idea that the interface is not generated from the one true source of truth - the implementation source. To them it’s just a reflection of the implementation and it makes no sense to do ‘duplicate’ work to maintain it.

Working in languages with or without separate interface files nudges people into either camp over time, and they forget what it’s like to think in the other way.

estebank

This thread feels weird to me because when I write code I do think about my public API, have even sketched it out separately looking at the desired usage pattern, but never felt the need to save that sketch as anything other than as part of the documentation. Which lives next to the code that implements that API.

I think it is telling that the handful of languages that still have something akin to .h files use them purely to define cross-language APIs.

juped

I would generate implementations from interfaces were it possible, but I never want to generate interfaces from implementations.

kode-tar-gz

Why not?

pjmlp

Separate interface and implementation files are available in most languages with compiled modules: Modula-2, Modula-3, Ada, Standard ML, Caml Light, OCaml, F#, D.

Or the interface can be generated, either as text or via graphical tooling: Object Pascal, D, Haskell, Java, C#, F#, Swift, Go, Rust.

All with stronger typing, faster compilation (Rust and Swift toolchain still need some work), proper namespacing.

Unfortunately C tooling has always been more primitive than what was happening outside Bell Labs, and had AT&T been allowed to take commercial advantage, history would be much different; instead we got free lemons instead of nice juicy oranges.

At least they did come up with TypeScript for C, and it nowadays supports proper modules, alongside bounds checked collection types.

thayne

I find it pretty frustrating to have the documentation in a different file from the source code.

When maintaining the code that means I have to go to a separate file to read what a function is supposed to do, or update the documentation.

And when reading the documentation, if the documentation is unclear, I have to go to a separate file to see what the function actually does.

Granted, the implementation can get in the way if you are just reading the documentation, but if you aren't concerned about the implementation, then as others have said, you can use generated documentation.

wruza

I used to think like this, but then I discovered generating (prj_root)/types.d.ts. It doesn’t do anything technical because types are in src/**/*, but I do that to generate a quick overview for a project I’m returning to after a month.

Maintaining header files is tedious and I often resorted to a kind of “OBHF.h” for common types, if you know what I mean. Otherwise it’s too much cross-tangling and forwards. Even in ts I do type-only src/types.ts for types likely common to everything, mostly because I don’t want pages of picky this-from-there this-from-there imports in every module.

As for public/private and sharing "friends" across implementation modules, we didn't invent anything good anyway. I just name my public-but-private symbols impl_foo and that tells me and everyone what it is.

That said, I wouldn’t want to make html out of it like these *-doc tools do. Using another program to navigate what is basically code feels like their editor sucks. My position on in-code documentation is that it should be navigatable the same way you write it. External tools and build steps kill “immersion”.

legobmw99

Some other languages have equivalents (OCaml comes to mind), but usually they’re less necessary

chacham15

The author has WAY more knowledge/experience than me on this, so I wonder how he would solve the following issues:

Evaluating Constant Expressions

- This seems really complicated... if you're working within a translation unit, that's much simplified, but then you're much more limited in what you can do without repeating a lot of code. I wonder how the author solves this.

Compile Time Unit Tests

- This is already somewhat possible if you can express your test as a macro, and if you add in the first point, this becomes trivial.

Forward Referencing of Declarations

- I think there may be a lot of backlash to this one. The main argument against this is that it changes the compiler from a one-pass to a two-pass compiler, which has its own performance implications. Given the number of people who are trying to compile massive codebases and go as far as parallelizing compilation of translation units, this may be a tough pill for them to swallow. (Evaluating constant expressions probably comes with a similar or worse performance-hit caveat, depending on how it's done.)

Importing Declarations

- This is a breaking change...one of the ways I have kind of implemented templating in C is by defining a variable and importing a c file, changing the variable, and then reimporting the same c file. Another thing I've done is define a bunch of things and then import the SQLite C Amalgamation and then add another function (I do this to expose a SQLite internal which isn't exposed via its headers). All of these use cases would break with this change.
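A rough sketch of the pattern being described, with made-up file and macro names: a small include file acts as a template and is expanded once per configuration.

    /* vec_template.inc -- expects T and VEC to be #defined by the includer */
    typedef struct { T *data; size_t len; } VEC;

    /* main.c */
    #include <stddef.h>

    #define T int
    #define VEC int_vec
    #include "vec_template.inc"
    #undef T
    #undef VEC

    #define T double
    #define VEC double_vec
    #include "vec_template.inc"
    #undef T
    #undef VEC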

Are there any thoughts about these issues? Any ways to solve them perhaps?

WalterBright

> if you're working within a translation unit, that's much simplified, but then you're much more limited in what you can do without repeating a lot of code. I wonder how the author solves this.

You are correct in that the source code to the function being evaluated must be available to the compiler. This can be done with #include. I do it in D with importing the modules with the needed code.

> This is already somewhat possible if you can express your test as a macro, which if you add in the first point, then this becomes trivial.

Expressing the test as a macro doesn't work when you want to test the function. The example I gave was trivial to make it easy to understand. Actual use can be far more complex.

> Performance

D is faster at compiling than C compilers, mainly because:

1. the C preprocessor is a hopeless pig with its required multiple passes. I know, I implemented it from scratch multiple times. The C preprocessor was an excellent design choice when it was invented. Today it is a fossil. I'm still in awe of why C++ has never gotten around to deprecating it.

2. D uses import rather than #include. This is just way, way faster, as the .h files don't need to be compiled over and over and over and over and over ...

D's strategy is to separate the parse from the semantic analysis. I suppose it is a hair slower, but it also doesn't have to recompile the duplicate declarations and fold them into one.

Compile time function execution can be a bottleneck, sure, but that (of course) depends on how heavily it is used. I tend to use it with a light touch and the performance is fine. If you implement a compiler using it (as people have done!) it can be slow.

> one of the ways I have kind of implemented templating in C is by defining a variable and importing a c file, changing the variable, and then reimporting the same c file. Another thing I've done is define a bunch of things and then import the SQLite C Amalgamation and then add another function (I do this to expose a SQLite internal which isn't exposed via its headers). All of these use cases would break with this change.

I am not suggesting removing #include for C. The import thing would be additive.

> Are there any thoughts about these issues?

If you're using hacks to do templating in C, you've outgrown the language and need a more powerful one. D has top shelf metaprogramming - and as usual, other template languages are following in D's path.

chacham15

Thanks for taking the time to respond! I have a few follow-up questions if that's ok:

> You are correct in that the source code to the function being evaluated must be available to the compiler. This can be done with #include. I do it in D with importing the modules with the needed code.

> D's strategy is to separate the parse from the semantic analysis. I suppose it is a hair slower, but it also doesn't have to recompile the duplicate declarations and fold them into one.

I don't quite follow all the implications of these statements. Does the compiler have a different way of handling a translation unit?

- Is a translation unit the same as in C, but since you're #including the file you would expect multiple compilations of a re-included C file? Wouldn't this bloat the resulting executable (/ bundle in case of a library)?

- Are multiple translation units compiled at a time? Wouldn't this mean that the entire translation dependency graph would need to be simultaneously recompiled? Wouldn't this inhibit parallelization? How would it handle recompilation? What happens if a dependency is already compiled? Would it recompile it?

> Performance

I think a lot of this is tied to my question about compilation/translation units above, but from my past experience we have "header hygiene" rules which force us to use headers in a specific way, and if we follow them we actually get really good preprocessor performance (a simple example being: don't use #include in a header). How would you compare performance in these kinds of situations vs a compiler without headers (i.e. one that either recompiles a full source file or looks up definitions from a compiled source)?

> If you're using hacks to do templating in C, you've outgrown the language and need a more powerful one. D has top shelf metaprogramming - and as usual, other template languages are following in D's path.

Yes, as also demonstrated by the performance question, we do a lot to work within the confines of what we have when other tools would handle a lot more of the lifting for us, and this is a fair criticism. But on the flip side, I don't have the power to make large decisions on an existing codebase like "let's switch languages" (even if just for a source file or two... I've tried), as much as I wish I could, so I have to work with what I have.

WalterBright

> I don't have the power to make large decisions on an existing codebase like "let's switch languages"

We struggled with that for a long time with D. And finally found a solution. D can compile Standard C source files and make all the C declarations available to the D code. When I proposed it, there was a lot of skepticism that this could ever work. But when it was implemented and debugged, it's been a huge win for D.

> Performance

With D you can put all your source files on one command-line invocation. That means each import is only read once, no matter how many times it is imported. This works so well that D users have generally abandoned the C approach of compiling each file individually and then linking them together. A vast amount of time is lost in C/C++ compilation simply reading the .h files thousands of times.

Modules/imports are a gigantic productivity booster. They're not hard to implement, either. Except for the way C++ did it.

> Are multiple translation units compiled at a time? Wouldn't this mean that the entire translation dependency graph would need to be simultaneously recompiled? Wouldn't this inhibit parallelization? How would it handle recompilation? What happens if a dependency is already compiled? Would it recompile it?

Yes, yes, yes, yes. And yet, it still compiles faster! See what I wrote above about not needing to read the .h files thousands of times. Oh, and building one large object file is faster than building a hundred and having to link them together.

thayne

> Is a translation unit the same as in C, but since you're #including the file you would expect multiple compilations of a re-included C file? Wouldn't this bloat the resulting executable (/ bundle in case of a library)?

I think the idea is that compiling a translation unit produces two outputs, the object code (as it currently does), and an intermediate representation of the exported declarations, that could be basically a generated .h file, but it would probably be more efficient to use a different format. Then dependent translation units use those declaration files.

With this, you can still compile in parallel. You are constrained by the order of dependencies, but that is already kind of the case.

One complication is that ideally, if the signature doesn't change, but the implementation does, you don't need to re-compile dependent translation units. This is trivial if your build system detects changes based on content (like, say, bazel), but if it uses timestamps (like make) then the compiler needs to ensure the timestamp isn't updated when the declarations don't change.

But this really isn't a new concept. Basically every modern compiled language works fine without needing separate header files.

baranul

Nice explanation. Modules are the way forward. Looks to always have been. Not understanding the resistance, when the advantages are clear.

WalterBright

I do understand the resistance. C is a simple, comfortable language, and its adherents want it to stay that way, warts and all.

But in the context of that, what baffles me is the additions to the C Standard, such as useless (but complicated!) things like normalized Unicode identifiers, things with very marginal utility like generic functions, etc. Why those and not forward declarations?

daymanstep

Can't you use precompiled headers?

WalterBright

Interesting you brought that up. I implemented them for Symantec C and C++ back in the 90s.

I never want to do that again!

They are brittle and a maintenance nightmare. They did speed up compilations, though, but did not provide any semantic advantage.

With D I focused on fast compilation so much that precompiled headers didn't offer enough speedup to make them worth the agony.

ndesaulniers

I had an intern try to use precompiled headers for the Linux kernel. The roadblock they found was that the command-line parameters used to compile the header must exactly match for all translation units in which it is used. This is not the case for the Linux kernel. We could compile the header multiple times, but the build complexity was not something we could overcome during the course of one internship.

xigoi

I personally don’t like forward referencing because it makes code harder to read. You can no longer rely on the dependency graph being in topological order.

WalterBright

As the article writes, that forces the private leaf functions to be at the top, with the public interface at the end of the file. The normal way is the public interface at the top, and the implementation "below the fold", so to speak.

> topological order

You are correct. But it's the reverse topological order, which is not the most readable ordering. One doesn't read a newspaper article starting at the bottom.

xigoi

Maybe it’s because I’m primarily a mathematician, but I like building complex stuff up from primitives and having the most important results at the end.

chikere232

People learn the ordering. If that is their biggest hurdle learning C they have a blessed life

billfruit

Every other language seems to not require header files/forward declarations. I don't understand the backlash against that.

Are modern C compilers actually still single pass?

WalterBright

> Are modern C compilers actually still single pass?

All except ImportC, which effortlessly handles forward references. (Mainly because ImportC hijacks the D front end to do the semantics.)

UncleEntity

A bit of an aside but I was poking around in the SPIR-V spec yesterday and they can do forward references because the call site contains all the information to determine the function parameter types. Just thought it was interesting and not really something I had thought about before.

kreco

> Evaluating Constant Expressions

The examples are quite simple in the article but I believe more complex cases would significantly degrade the compiler speed (and probably the memory footprint as well) and would require a VM to leverage this.

Which is probably assumed "too complex" to go into the standard. I'm not saying it's impossible, but I kind of understand why this would not go into any kind of standard.

> Importing Declarations

I wish C++ (or even C) had gone in this direction instead of the weird mess of what is defined for C++20.

Additionally you might import a module into some symbol, like:

  #import "string.c" as str
and every non-static symbol from the file can be accessed like:

  str.trim(" Hello World ");
> __import dex;

This is totally tangential but I don't like when file paths are not explicit. In this specific case I don't know if I'm importing dex.d or dex.c.

WalterBright

> I kind of understand why this would not go into any kind of standard.

Other popular languages can do it. That aside, it is an immensely popular and useful feature in D.

And yes, as one would expect, the more it is used, the more compile-time speed and memory consumption is required. As for a VM, the constant folder is already a VM; this just extends it to be able to handle function calls. C has simple semantics, so it's not that bad.

> Additionally

Great minds think alike! Your suggestions are just what D imports do. https://dlang.org/spec/module.html#import-declaration

> In this specific case I don't know if I'm importing dex.d or dex.c

This issue does come up. The answer is setting up your import path. It's analogous to the C compiler include path.

thayne

> I believe more complex cases would significantly degrade the compiler speed (and probably the memory footprint as well) and would require a VM to leverage this.

I'm pretty sure most production-grade C compilers already do some level of compile-time evaluation for optimization. And C already has constant expressions.

I think a bigger hurdle would be that the compiler needs access to the source code of the function, so it would probably be restricted to functions in the same translation unit.

And then there is the possibly even bigger people problem of getting a committee with representatives from multiple compilers to agree on the semantics of such constant evaluation.
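As a minimal illustration of the gap between the constant expressions C accepts today and the compile-time function evaluation the article asks for (the function name here is invented for illustration):

    enum { N = 3 * 4 };                    /* fine: integer constant expression */
    _Static_assert(N == 12, "ok");

    static int square(int x) { return x * x; }

    /* Rejected by standard C: a call to square() is not a constant expression,
       which is exactly the restriction the article wants lifted. */
    /* _Static_assert(square(3) == 9, "needs compile-time function evaluation"); */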

loeg

> The examples are quite simple in the article but I believe more complex cases would significantly degrade the compiler speed (and probably the memory footprint as well) and would require a VM to leverage this.

> Which is probably assumed "too complex" to go into the standard. I'm not saying it's impossible, but I kind of understand why this would not go into any kind of standard.

I mean, it's basically 1:1 with the constexpr feature in C++. Almost every C compiler is already a C++ compiler, supporting constexpr functions and evaluation in C can't be that bad, can it?

bjourne

I write unit tests for my C code all the time. It's not difficult if you use a good build system and if you are willing to stomach some boilerplate. Here is one test from my "test suite" for my npy library:

    void
    test_load_uint8() {
        npy_arr *arr = npy_load("tests/npy/uint8.npy");
        assert(arr->n_dims == 1);
        assert(arr->dims[0] == 100);
        assert(arr->type == 'u');
        npy_free(arr);
    }
    int
    main(int argc, char *argv[]) {
        PRINT_RUN(test_load_uint8);
        ...
    }
I know I could have some pre-processor generate parts of the tests, but I prefer to KISS.
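The PRINT_RUN macro isn't shown; a plausible (purely illustrative) definition is just a name-printing wrapper:

    #include <stdio.h>

    /* hypothetical PRINT_RUN: print the test's name, then call it */
    #define PRINT_RUN(test) do { printf("running %s\n", #test); test(); } while (0)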

WalterBright

Your function looks like it's doing I/O, which won't work in a compile-time test. Here's an example of a unittest for the ImportC compiler:

    struct S22079
    {
        int a, b, c;
    };

    _Static_assert(sizeof(struct S22079){1,2,3} == sizeof(int)*3, "ok");
    _Static_assert(sizeof(struct S22079){1,2,3}.a == sizeof(int), "ok");
The semantics are checked at compile time, so no need to link & run. With the large volume of tests, this speeds things up considerably. The faster the test suite runs, the more productive I am.

brabel

Hey Walter, importC is great but on Mac it doesn't work right now because Apple seems to have added the type Float16 to math.h (probably due to this: https://developer.apple.com/documentation/swift/float16) and DMD breaks on that.

Could you have a look at fixing that?

WalterBright

Aargh. Those sorts of extensions should not be in the system .h file.

uecker

It is difficult to imagine that compile-time interpretation of tests is faster than compiling and running them for anything more complex. And for trivial stuff it should not matter. Not being able to do I/O is a limitation, not a feature.

WalterBright

Linkers are slow and clunky. Yes, there is a crossover point where executable tests are faster.

chikere232

This works in regular C as sizeof() is a constant expression, but perhaps that was your point?

TheNewAndy

You will be pleased to know that you are not the only one who does this.

I previously went down the rabbit hole of fancy unit test frameworks, and after a while I realised that they didn't really win much and settled on something almost identical to what you have (my PRINT_RUN macro has a different name, and requires the () to be passed in - and I only ever write it if the time to run all the tests is more than a second or so, just to make it really convenient to point the finger of blame).

The things that I do which are potentially looked upon poorly by other people are:

1) I will happily #include a .c file that is being unit tested so I can call static functions in it (I will only #include a single .c file)

2) I do a tiny preprocessor dance before I #include <assert.h> to make sure NDEBUG is not defined (in case someone builds in a "release mode" which disables asserts)
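The "preprocessor dance" presumably looks something like this (the included file name is made up): undefining NDEBUG before pulling in assert.h keeps the asserts live even in a release build.

    /* keep asserts enabled in the test build, even under -DNDEBUG */
    #ifdef NDEBUG
    #undef NDEBUG
    #endif
    #include <assert.h>

    #include "widget.c"   /* include the unit under test to reach its static functions */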

wruza

This test/src separation always felt like html/css to me. When still using C, I wrote tests right after a function, as a static function with "test_" in the name, and one big run-all-tests function at the end. All you have to do then is include the c file and call this function. Why I would ever want to separate a test from its subject is a puzzling thought. It would be nice to have "testing {}" sections in other languages too, but in C you can get away with DCE, worst case #ifdef TESTING.
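A minimal sketch of that layout (function and test names invented for illustration):

    static int clamp(int x, int lo, int hi) {
        return x < lo ? lo : (x > hi ? hi : x);
    }

    #ifdef TESTING
    #include <assert.h>

    /* test lives right after the function it exercises */
    static void test_clamp(void) {
        assert(clamp(5, 0, 10) == 5);
        assert(clamp(-1, 0, 10) == 0);
        assert(clamp(99, 0, 10) == 10);
    }

    /* one big run-all-tests function at the end of the file */
    void run_all_tests(void) {
        test_clamp();
    }
    #endif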

bjourne

Because tests also serve as api validations. If you can't write a test for functionality without fiddling with internal details the api is probably flawed. Separation forces access via the api.

PhilipRoman

I agree with all of the above. The only fancy thing which I added is a work queue with multiple threads. There really isn't any pressing need for it since natively compiled tests are very fast anyway, but I'm addicted to optimizing build times.

samiv

I really agree, I think that making the tests as easy as possible to get going goes a long way towards a code base that actually has tests.

I have something very similar.

https://github.com/ensisoft/detonator/blob/master/base/test_...

Borrowed heavily from boost.test.minimal; it used to be a single header, but over the years I've had to add a single translation unit!

My takeaway is that if you keep your code base in a condition where tests are always passing, you need far fewer complications in your testing tools and their error reporting and fault tolerance etc.!

kstenerud

Compile time unit tests are as bad of an idea as "unused import/variable/result" errors (rather than warnings). They're "nanny features" that take control away from the developer and inevitably cause you to jump through bureaucratic hoops just to get your work done.

These kinds of build-failing tests are great for your "I think I'm finished now" build, but not for your "I'm in the middle of something" builds (which are what 99% of your builds are).

It's like saying "You can't use the table saw until you put the drill away!"

chii

> These kinds of build-failing tests are great for your "I think I'm finished now" build, but not for your "I'm in the middle of something" builds

i tend to disagree.

If you tried to express some thought but the compile-time tests tell you you're wrong, you might actually just have an incomplete thought, or might not have thought through all of the consequences of said expression.

It's basically what type-checking is in haskell - you cannot compile a program that does not type-check correctly. This forces you as a programmer, to always, and only, express complete thoughts. Incomplete, or contradictory thoughts cannot be expressed.

This should, in theory, lead to programs that are more well thought out. It also makes the program harder to write, because it forces the programmer to discover corners of their program which they "know" aren't valid but don't care about.

kstenerud

> it forces the programmer to discover corners of their program which they "know" aren't valid but don't care about.

And this is precisely why I disagree with forcing it upon the developer at every stage of development. Generally, while in the thick of things, I just want to get things working with one part, not worry about what other parts this breaks (yet). But the pedantic "you have to fix this first" enforcement breaks my concentration because now I have to split my attention to things I don't want to even be bothered with yet. I'll get to it, but I sure as hell don't want you telling me WHEN I should.

fishstock25

> because now I have to split my attention to things I don't want to even be bothered with yet.

One of the reasons could be that you realize you don't need those parts, so it would have been a waste of time to write tests for them.

Is that the same as saying I don't want to have to write types either? Maybe. Types are like lightweight incomplete specs.

marcosdumay

> I don't want to even be bothered with yet

Why did you go out of your way to write tests for something that you don't want to be bothered about?

d0mine

Test-driven development has its uses. But it is wrong to make it mandatory. I myself run static checks/unit tests almost all of the time. Still, it is useful to skip them from time to time and just run the code to see the results (make it work before you make it "right" according to some linter's rules).

omoikane

Maybe these compile time tests are more like `static_assert`, which is valuable for catching incompatible uses of library functions. Pretty good idea in my opinion.
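For example (the specific check here is only illustrative), a build-time assertion about a library/platform assumption:

    #include <assert.h>   /* C11 defines static_assert here; C23 makes it a keyword */
    #include <time.h>

    /* fail the build early if an assumption about the platform doesn't hold */
    static_assert(sizeof(time_t) >= 8, "this code assumes a 64-bit time_t");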

kstenerud

Sure, enforcing invariants is a good thing to do right off the bat. But not "does this code do what it says on the tin?" kinds of tests. Those are better run gradually, at the (current) developer's behest (and most certainly, not blocking compilation).

tczMUFlmoNk

The article is literally talking about `_Static_assert`, yes. It's used in the code examples and described in the text.

steinuil

> The leaf functions come first, and the global interface functions are last.

To me that is backwards. I prefer code written in a topological order for a number of reasons:

- It mirrors how you write code within a function.

- It's obvious where you should put that function in the module.

- Most importantly, it makes circular dependencies between pieces of code in a module really obvious.

I'm generally not a fan of circular dependencies, because they make codebases much more entangled and prevent you from being able to understand a module as a contained unit. In Python they can even lead to problems you won't see until you run the code[0], but circular imports are probably so common that current type checkers disable that diagnostic by default[1].

I think languages that don't support forward references (C, but also OCaml and SML) let me apply the "principle of least surprise" to circular dependencies. OCaml even disallows recursive dependencies between functions unless you declare the functions with "let rec fn1 = .. and fn2 = ..", which may be a bit annoying while you're writing the code but it's important information when you're reading it.

[0]: https://gist.github.com/Mark24Code/2073470277437f2241033c200...

[1]: https://microsoft.github.io/pyright/#/configuration?id=type-... (see reportImportCycles)

pansa2

> the compiler only knows about what lexically precedes it. […] it drives programmers to lay out the declarations backwards. The leaf functions come first, and the global interface functions are last. It’s like reading a newspaper article from the bottom up. It makes no sense.

Defining functions in a “bottom-up” order like this is common even in languages like Python which allow forward references. [0]

Is that just a holdover from languages which don’t allow such references? Or does it actually make more sense for certain types of code?

[0] https://stackoverflow.com/a/73131538

almostgotcaught

You're confused. There are no forward references in Python. It's simply that identifiers in the body of a function aren't resolved until the function itself is executed (if ever), and at that point everything in the module scope has been defined. You can test this yourself by just putting any name into a body and loading the module.

pansa2

> identifiers in the body of a function aren't resolved until the function itself is executed (if ever), and at that point everything in the module scope has been defined

Yes, so within a function you can refer to things that are defined later in the module. Isn't that a "forward reference", even if the details are slightly different from how they work in D?

throwuxiytayq

In my code, the public interface always tends to bubble upwards, and implementation details go at the end of the file. I don’t even have a strict rule for this, it just reads more cleanly and seems to make sense. Especially if the implementation is large - you don’t want to scroll through all that before you get to the basics of what it all does. I’m curious how everyone else does things.

thayne

Some of my "obvious things c should do" for me would include things like

- add support for a slice type that encodes a pointer and length

- make re-entrant and ideally threadsafe APIs for things that currently use global state (including environment variables).

- standardize something like defer in Go and Zig, or GCC's cleanup attribute (a sketch of the latter is below)

- Maybe some portable support for unicode and utf-8.
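A minimal sketch of the GCC/Clang cleanup attribute mentioned in the third item (names are illustrative); it runs a handler when the variable goes out of scope, much like defer:

    #include <stdio.h>
    #include <stdlib.h>

    /* the handler receives a pointer to the annotated variable */
    static void free_ptr(void *p) {
        free(*(void **)p);
    }

    static void demo(void) {
        /* buf is freed automatically on every exit path from this scope */
        __attribute__((cleanup(free_ptr))) char *buf = malloc(64);
        if (!buf)
            return;
        snprintf(buf, 64, "hello");
        puts(buf);
    }   /* free_ptr(&buf) runs here */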

zffr

Aren’t most of these things you want in the standard library, and not things that the language itself should do?

thayne

Only half of them. The first and third are language things.

The first could almost be done with macros, except that separate declarations of an equivalent struct are considered different types, so the best you can do is a macro that lets you define your own typedef for a specific slice type. It could be done in the library if C supported something like structs with structural instead of nominal typing.
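Something like the macro being described (names invented here); each expansion produces a distinct nominal type, which is exactly the limitation mentioned:

    #include <stddef.h>

    /* each expansion defines a separately named slice type; two expansions with
       the same element type are still incompatible struct types */
    #define DEFINE_SLICE(T, name) typedef struct { T *ptr; size_t len; } name

    DEFINE_SLICE(char, char_slice);
    DEFINE_SLICE(int, int_slice);

    static int sum(int_slice s) {
        int total = 0;
        for (size_t i = 0; i < s.len; i++)
            total += s.ptr[i];
        return total;
    }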

drpixie

> Everywhere a C constant-expression appears in the C grammar the compiler should be able to execute functions at compile time, too, as long as the functions do not do things like I/O, access mutable global variables, make system calls, etc.

That one is easily broken. Pick a function that runs for a lloooonngg time...

  int busybeaver(int n) {...}    // pure function returns max lifetime of n state busy beaver machine

  int x = busybeaver(99);

pajko

C23 has constexpr, but it cannot yet be applied to functions; there's a proposal: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2976.pdf

James_K

I feel that much of the point of C is that it's easy to implement. Substantially increasing its scope doesn't seem like the best idea. Perhaps they could do something akin to Scheme and have a "small" and "large" version of the specification.

pjmlp

That is long gone, when looking at C23 and the myriad of compiler extensions.

oplaadpunt

They have that, to some degree. The standard library is mostly optional. Also, a lot of things are 'implementation defined', so you could just not implement those. That leaves quite a small language core.

uecker

I think the real question is why everybody hasn't already moved to D, if it is so much better and can do all these great things. The answer is that all these things have trade-offs, including implementation effort, changes in tooling, required training, backwards compatibility, etc. Some of the features are also not universally seen as better (e.g. IMHO a language which requires forward declarations is better; I also like how headers work).

nine_k

You mean, of course, Zig and Rust, right?

Because D has a sizable runtime library and GC, which can be opted out of, but with very significant limitations, AFAICT.