If all the world were a monorepo

derefr

CRAN’s approach here sounds like it has all the disadvantages of a monorepo without any of the advantages.

In a true monorepo (the one for the FreeBSD base system, say), if you make a PR that updates some low-level code, then the expectation is that you 1. compile the tree and run all the tests (so far so good), 2. update the high-level code so the tests pass (hmm), and 3. include those updates in your PR. In a true centralized monorepo, a single atomic commit can effect a vertical-slice change through a dependency and all of its transitive dependents.

I don’t know what the equivalent would be in distributed “meta-monorepo” development à la CRAN, but it’s not what they’re currently doing.

(One hypothetical approach I could imagine, is that a dependency major-version release of a package can ship with AST-rewriting-algorithm code migrations, which automatically push both “dependency-computed” PRs to the dependents’ repos, while also pushing those same patches as temporary forced overlays onto releases of dependent packages until such time as the related PRs get merged. So your dependents’ tests still have to pass before you can release your package — but you can iteratively update things on your end until those tests do pass, and then trigger a simultaneous release of your package and your dependent packages. It’s then in your dependents’ court to modify + merge your PR to undo the forced overlay, asynchronously, as they wish.)
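To make the "AST-rewriting code migration" idea concrete, here is a minimal sketch in Python rather than R, assuming the breaking change is nothing more than a renamed function. The names `RenameCall`, `migrate`, `old_fn` and `new_fn` are hypothetical, and a real migration would also have to handle imports, attribute access, and changed argument lists.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to a renamed function so they use the new name."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            replacement = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(replacement, node.func)
        return node


def migrate(source: str, old_name: str, new_name: str) -> str:
    """Return `source` with every direct call to old_name renamed (hypothetical helper)."""
    tree = RenameCall(old_name, new_name).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    # ast.unparse (Python 3.9+) regenerates code; comments and formatting are lost.
    return ast.unparse(tree)


if __name__ == "__main__":
    print(migrate("y = old_fn(old_fn(1), k=2)", "old_fn", "new_fn"))
```

The output of `migrate` is what would be sent to each dependent as a patch. Since `ast.unparse` discards comments and layout, a tool meant for real PRs would probably want a formatting-preserving rewriter instead.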

chii

> In a true monorepo ...

ideally yes. However, such a monorepo can become increasingly complex as the software being maintained becomes larger and larger (and/or more and more people work on it).

You end up with massive changes - which might eventually become something that a single person cannot realistically contain within their brain. Not to mention clashes - you will have people making contradictory/conflicting changes, and there will have to be some sort of resolution mechanism outside (or the "default" one, which is first come first served).

Of course, you could "manage" this complexity by defining API boundaries/layers and declaring that those APIs should not change too often. But that simply means you're a monorepo in name only - not too different from having separate repos with versioned artefacts and a defined API boundary.

joek1301

> One hypothetical approach I could imagine, is that a dependency major-version release of a package can ship with AST-rewriting-algorithm code migrations

Jane Street has something similar called a "tree smash" [1]. When someone makes a breaking change to their internal dialect of OCaml, they also push a commit updating the entire company monorepo.

It's not explicitly stated whether such migrations happen via AST rewrites, but one can imagine leveraging the existing compiler infrastructure to do that.

[1]: https://signalsandthreads.com/future-of-programming/#3535

ants_everywhere

I genuinely enjoy R. I use it for calculations daily. In comparison using Python feels tedious and clunky even though I know it better.

> CRAN had also rerun the tests for all packages that depend on mine, even if they don’t belong to me!

Another way to frame this is these are the customers of your package's API. If you broke them you are required to ship a fix.

I see why this isn't the default (e.g. on GitHub you have no idea how many people depend on you). But the developer experience is much nicer like this. Google, for example, makes this promise with some of their public tools.

Outside the world of professional software developers, R is used by many academics in statistics, economics, social sciences etc. This rule makes it less likely that their research breaks because of some obscure dependency they don't understand.

kazinator

> But… CRAN had also rerun the tests for all packages that depend on mine, even if they don’t belong to me!

When you propose a change to something that other things depend on, it makes sense to test those dependents for a regression; this is not earth shattering.
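Finding which dependents to retest is just a graph walk. A rough sketch, assuming you have a map from each package to its direct dependencies (the `depends_on` structure and the package names are made up, and this is not a description of how CRAN actually does it):

```python
from collections import deque


def transitive_dependents(depends_on: dict[str, set[str]], changed: str) -> set[str]:
    """Return every package that directly or indirectly depends on `changed`."""
    # Invert "package -> its dependencies" into "package -> its direct dependents".
    dependents: dict[str, set[str]] = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(pkg)

    # Breadth-first walk outward from the changed package.
    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for pkg in dependents.get(current, ()):
            if pkg not in seen:
                seen.add(pkg)
                queue.append(pkg)
    return seen


if __name__ == "__main__":
    graph = {"app": {"mid"}, "mid": {"base"}, "other": {"unrelated"}}
    print(sorted(transitive_dependents(graph, "base")))
```

Every package in the returned set would have its test suite rerun against the proposed change.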

If you want to change something which breaks them, you have to then do it in a different way. First provide a new way of doing something. Then get all the dependencies that use the old way to migrate to the new way. Then when the dependents are no longer relying on the old way, you can push out a change which removes it.
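In code, those three steps look roughly like this. It's a Python sketch with invented function names (`fit`, `fit_model`) and a placeholder computation; the point is only the shape of the migration.

```python
import warnings


def fit_model(data, *, penalty=0.0):
    """Step 1: the new way, added alongside the old one (placeholder logic)."""
    return sum(data) - penalty


def fit(data):
    """Step 2: the old way, kept as a thin shim while dependents migrate."""
    warnings.warn(
        "fit() is deprecated; call fit_model() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this shim
    )
    return fit_model(data)


# Step 3, in a later release: delete fit() once no dependent still calls it.

if __name__ == "__main__":
    warnings.simplefilter("always")
    print(fit([1, 2, 3]))
```

The reverse-dependency check is what tells you when step 3 is safe: once the dependents' tests pass without the shim, it can go.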

esafak

> In what other ecosystem would a top package introduce itself using an eight-variable equation?

That's the objective function of Hastie et al's GLM. I had a good chuckle when I realized the author's last name is Tibshirani. If you know you know.

david_draco

And if I don't know, can I know?

esafak

Hastie and Tibshirani wrote a famous book on ML (https://hastie.su.domains/ElemStatLearn/), and extended GLMs into GAMs: https://en.wikipedia.org/wiki/Generalized_additive_model

bryanrasmussen

Robert Tibshirani has a daughter named Julie.

haberman

This was an interesting article, but it made me even more interested in the author's larger take on R as a language:

> In the years since, my discomfort has given away to fascination. I’ve come to respect R’s bold choices, its clarity of focus, and the R community’s continued confidence to ‘do their own thing’.

I would love to see a follow-up article about the key insights that the author took away from diving more deeply into R.

maxbond

> When declaring dependencies, most packages don’t specify any version requirements, and if they do, it’s usually just a lower bound like ‘grf >= 1.0’.

I like the perspective presented in this article, I think CRAN is taking an interesting approach. But this is nuts and bolts. Explicitly saying you're compatible with any future breaking changes!? You can't possibly know that!

I get that a lot of R programmers might be data scientists first and programmers second, so many of them probably don't know semver, but I feel like the language should guide them to a safe choice here. If CRAN is going to email you about reverse dependencies, maybe publishing a package with a crazy semver expression should also trigger an email.
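To spell out the difference, here is a small Python sketch that compares a bare lower bound with a "same major version" rule. It assumes purely numeric dotted versions and semver-style meaning for the major number, neither of which CRAN guarantees; the function names are made up.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a dotted numeric version such as '1.4.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies_lower_bound(version: str, minimum: str) -> bool:
    """The 'grf >= 1.0' style: anything at or above the minimum is accepted."""
    return parse(version) >= parse(minimum)


def satisfies_same_major(version: str, minimum: str) -> bool:
    """A stricter rule: at or above the minimum, but within the same major version."""
    v, m = parse(version), parse(minimum)
    return v >= m and v[0] == m[0]


if __name__ == "__main__":
    for candidate in ("1.4.2", "2.0.0"):
        print(
            candidate,
            satisfies_lower_bound(candidate, "1.0"),
            satisfies_same_major(candidate, "1.0"),
        )
```

With only the lower bound, a hypothetical 2.0.0 release is accepted just like 1.4.2, which is exactly the "compatible with any future breaking change" claim.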

jiggawatts

This (with some tweaks) is what I envision the future of NPM, Cargo, and NuGet should look like.

Automated tests, compilation by the package publisher, and enforcement of portability flags and SemVer semantics.
