Reducing Cargo target directory size with -Zno-embed-metadata
14 comments
June 2, 2025 · KolmogorovComp
wyldfire
Hyrum's law:
> With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.
ronsor
This is why you should randomize all behaviors which should not be depended on. Change things quickly and often if you're not making any promises.
drdaeman
While I can imagine some edge cases where this approach can be meaningful, isn't that generally counterproductive?
Not only does one have to be actively aware of all the behaviors they don't document (surely not an easy task for any large project), they also have to spend a non-negligible amount of time adding randomness in a way that still allows all the internal use cases to work cohesively. This means you spend less time on doing something actually useful.
Instead of randomizing, it should be sufficient to settle on clear semantics for communicating which APIs are public and stable, and which are internal and subject to change at a whim. And maybe slap on a big fat warning: "if something is not documented, it's internal, and $deity help you if you depend on it, for we make no guarantees except that it'll break on some fine day and that day won't be so fine anymore". Then it's not your problem.
madars
TLS does this with GREASE (Generate Random Extensions And Sustain Extensibility) - https://www.rfc-editor.org/rfc/rfc8701.html . HN discussion: https://news.ycombinator.com/item?id=39416277 (19 points, 8 comments)
Go's implementation of JSON format for protobufs also does this: https://protobuf.dev/reference/go/faq/#unstable-json
> To avoid giving the illusion that the output is stable, we deliberately introduce minor differences so that byte-for-byte comparisons are likely to fail.
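A minimal sketch of that idea in Rust, using only the standard library. The function names (`per_run_seed`, `render`, `render_unstable`, `normalize`) are hypothetical, and this is not how GREASE or the Go protobuf library actually do it; it only illustrates output that keeps its meaning while its exact bytes may differ from run to run.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Returns a value that is expected to differ between runs.
/// `RandomState` is randomly keyed, so even an empty hash depends on the key.
fn per_run_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Renders key/value pairs as a JSON-like object (no escaping; illustration only).
/// When `jitter` is true, an extra space is emitted after each colon.
fn render(fields: &[(&str, &str)], jitter: bool) -> String {
    let sep = if jitter { ":  " } else { ": " };
    let body: Vec<String> = fields
        .iter()
        .map(|(k, v)| format!("\"{k}\"{sep}\"{v}\""))
        .collect();
    format!("{{{}}}", body.join(", "))
}

/// Picks one of the two whitespace variants based on the per-run seed,
/// so callers cannot rely on a byte-for-byte stable result.
fn render_unstable(fields: &[(&str, &str)]) -> String {
    render(fields, per_run_seed() & 1 == 1)
}

/// Crude comparison helper: strips all whitespace, including inside values.
/// A real consumer would parse the output instead of comparing text.
fn normalize(s: &str) -> String {
    s.chars().filter(|c| !c.is_whitespace()).collect()
}
```

Calling `render_unstable` returns one of two spellings of the same data, so a test that compares raw bytes will fail sooner or later, while one that compares `normalize`d or parsed output keeps passing.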
keybored
Certain discussions on HN are just diagrams thanks to Laws(tm) and various one-liner tier references.
- Hyrum’s Law (85%)
- Emacs spacebar overheating (15%)
The only way to prevent the decision diagram is to anticipate them and spell them out in the last paragraph. But on the other than that doesn’t very fun right.
keybored
> But on the other than that doesn’t very fun right.
When you write something an hour after your bedtime.
epage
> It seems wild to consider such intermediate files as part of public API. Someone relying on it does not automatically make it a breaking change if it’s not documented.
To find what is considered an intermediate vs a final artifact from cargo, you need to check out https://doc.rust-lang.org/cargo/reference/build-cache.html
We are working on making this clearer with https://github.com/rust-lang/cargo/issues/14125 where there will be `build.build-dir` (intermediate files) and `build.target-dir` (final artifacts).
When you do a `cargo build` inside of a library, like `clap`, you will get an rlib copied into `build.target-dir` (final artifacts). This is intended for integration with other build systems. There are holes in this workflow, but identifying all of the relevant cases for what might be a "safe" breakage is difficult.
saghm
This metadata has been around for years, and Rust releases new versions every six weeks. Whether or not it's technically a "breaking change", it's not unreasonable to spend a little time figuring out whether removing it will break something for someone; it's only another month and a half at most before the next chance to stabilize it comes.
At a higher level, as much as it's easier to pretend that "breaking" or "non-breaking" changes are a binary, the terms are only useful in how they describe the murkier reality of how people actually use something. The point of having those distinctions is in how they communicate things to users; developers are promising not to break certain things so that users can rely on them to remain working. That doesn't mean that other changes won't have any impact on users though, and there's nothing wrong with developers taking that into account.
As an analogy, imagine if I promise to mow your lawn every week, and then I mow your neighbor's lawn as well without making them the same promise. I notice that my old mower takes a long time to finish your lawn, and I realize that a newer electric mower with a higher power usage would help me do it faster. I need to make sure that higher power usage is safe for me to use on your property, but I'm not breaking my promise to you if I delay my purchase to check with your neighbor about whether it would be safe for theirs as well and take that into account in my decision. That doesn't mean I'm committing to only buying it if it's safe for their lawn, but it's information that still has some value for me to know in advance, and if it means that your lawn will continue to get cut with the old mower while I figure that out, it doesn't mean that I'm somehow elevating the concern of their lawn to the same level as yours. You might not choose to care about the neighbor's lawn in my position, but I don't think it's particularly "wild" that some people might think it's worthwhile to take it into consideration.
merb
I mean yeah, some things are awkward. But well, some people rely on things. And I mean, it’s still possible to make the new behavior the default and add a switch to not have the metadata.
> Currently, it seems like it might be considered to be a backwards compatibility break though, as the Cargo team is unsure if some people weren’t relying on the metadata being present in the .rlib files
It seems wild to consider such intermediate files as part of public API. Someone relying on it does not automatically make it a breaking change if it’s not documented.