Be Aware of the Makefile Effect
346 comments
· January 11, 2025 · myhf
jfengel
It's why I'm always very skeptical of new languages and frameworks. They often look great on a PowerPoint slide, but it's not clear how they'll look on something complex and long-lasting.
They usually pick up warts added for some special case, and that's a sign that there will be infinitely many more.
There's a fine line between "applying experience" and "designing a whole new system around one pet peeve". But it's a crucial distinction.
wellbehaved
With that attitude how would the presently accepted languages/frameworks have come about?
oblio
Probably slower and with more respect for existing tech.
But hey, now we have npm, so who cares anymore? :-)
notjoemama
> "designing a whole new system around one pet peeve"
BAHHAHAH! So…you mean React. If I hear the word "hook" one more time, as if it alone can solve complexity in web dev, I'll…eh, I'll do nothing actually. But my point still stands. React solves asynchronous event-driven behavior well, but that's all. Everything else in React projects is, well, everything else.
compumetrika
Great quote, commenting just to 'bookmark' it.
mianos
I have an alternate theory: about 10% of developers can actually start something from scratch because they truly understand how things work (not that they always do it, but they could if needed). Another 40% can get the daily job done by copying and pasting code from local sources, Stack Overflow, GitHub, or an LLM—while kinda knowing what’s going on. That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.
Given that distribution, I’d guess that well over 50% of Makefiles are just random chunks of copied and pasted code that kinda work. If they’re lifted from something that already works, job done—next ticket.
I’m not blaming the tools themselves. Makefiles are well-known and not too verbose for smaller projects. They can be a bad choice for a 10,000-file monster—though I’ve seen some cleanly written Makefiles even for huge projects. Personally, it wouldn’t be my first choice. That said, I like Makefiles and have been using them on and off for at least 30 years.
huijzer
> That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.
Small nuance: I think people often don’t know because they don’t have the time to figure it out. There are only so many battles you can fight during a day. For example, if I’m a C++ programmer working on a ticket, how many layers of the stack should I know? Should I know what the CPU registers are called? And what should an AI researcher who works always in Jupyter know? I completely encourage anyone to learn as much about the tools and stack as possible, but there is only so much time.
kragen
If you spend 80% of your time (and mental energy) applying the knowledge you already have and 20% learning new things, you will very quickly be able to win more battles per day than someone who spends 1% of their time learning new things.
Specifically for the examples at hand:
- at 20%, you will be able to write a Makefile from scratch within the first day of picking up the manual, rather than two or three weeks if you only invest 1%.
- if you don't know what the CPU registers are, the debugger won't be able to tell you why your C++ program dumped core; when it can, you'll typically resolve the ticket in a few minutes (because most segfaults are stupid problems that are easy to fix once you see what the problem is, though the memorable ones are much hairier). Without knowing how to use the disassembly in the debugger, you're often stuck debugging by printf or even binary search, incrementally tweaking the program until it stops crashing, incurring a dog-slow C++ build after every tweak. As often as not, a fix thus empirically derived merely conceals the symptom of the bug, so you end up fixing it two or three times, taking several hours each time.
Sometimes the source-level debugger works well enough that you can just print out C++-level variable values, but often it doesn't, especially in release builds. And for performance regression tickets, reading disassembly is even more valuable.
(In C#, managed C++, or Python, the story is of course different. Until the Python interpreter is segfaulting.)
How long does it take to learn enough assembly to use the debugger effectively on C and C++ programs? Tens of hours, I think, not hundreds. At 20% you get there after a few dozen day-long debugging sessions, maybe a month or two. At 1% you may take years.
What's disturbing is how many programmers never get there. What's wrong with them? I don't understand it.
remus
You make it sound easy, but I think it's hard to know where to invest your learning time. For example, I could put some energy into getting better at shell scripting but realistically I don't write enough of it that it'll stick so for me I don't think it'd be a good use of time.
Perhaps in learning more shell scripting I have a breakthrough and realise I can do lots of things I couldn't before and overnight can do 10% more, but again it's not obvious in advance that this will happen.
icameron
That’s an insightful comment, but there is a whole universe of programmers who never have to work directly in C/C++ and are productive in safe languages that usually can’t segfault. Admittedly we are a little jealous of those elite bitcrashers who unlock the unbridled power of the computer with C++… but yeah, a lot of day jobs pay the bills with C#, JavaScript, or Python, and those folks are considered programmers by the rest of the industry.
ajross
> I completely encourage anyone to learn as much about the tools and stack as possible, but there is only so much time.
That seems like a weird way to think about this. I mean, sure, there's no time today to learn make to complete your C++ ticket or whatever. But yesterday? Last month? Last job?
Basically, I think this matches the upthread contention perfectly. If you're a working C++ programmer who's failed to learn the Normal Stable of Related Tools (make, bash, python, yada yada) across a ~decade of education and experience, you probably never will. You're in that 50% of developers who can't start stuff from scratch. It's not a problem of time, but of curiosity.
Joker_vD
> I mean, sure, there's no time today to learn make to complete your C++ ticket or whatever. But yesterday? Last month? Last job?
That seems like a weird way to think about this. Of course there was no time in the past to learn this stuff, if you still haven't learned it by the present moment. And even if there were, trying to figure out whether there perhaps was some free time in the past is largely pointless, as opposed to trying to schedule things in the future: you can't change the past anyhow, but the future is somewhat more malleable.
silveraxe93
This is the 40% that OP mentioned. But there's a proportion of people/engineers who are just clueless and incapable of understanding code. I don't know the proportion, so I can't comment on the 50% number, but they definitely exist.
If you never worked with them, you should count yourself lucky.
silver_silver
We can’t really call the field engineering if this is the standard. A fundamental understanding of what one’s code actually makes the machine do is necessary to write quality code regardless of how high up the abstraction stack it is
kragen
Steam engines predate the understanding of not just the crystalline structure of steel but even the basics of thermodynamics by quite a few decades.
retros3x
The problem is that software is much more forgiving than real-life engineering projects. You can't build a skyscraper with duct tape. With software, especially the simple webapps most devs work on, you don't NEED good engineering skills to get it running. It will suck, of course, but it will not fall apart immediately. So of course most "engineers" will take the path of least resistance and never leave the higher abstractions to dive deep into concrete fundamentals.
cudgy
Sure, if you are doing embedded programming in C. But how does one do this in web development, where there are hundreds of dependencies that get updated monthly, while still adding functionality and keeping one's job?
godelski
Funny enough I'm an A̶I̶ML researcher and started in HPC (High Performance Computing).
> if I’m a C++ programmer ... should I know what the CPU registers are called?
Probably. Especially with "low level"[0] languages, knowing some basics about CPU operations goes a long way. You can definitely get away without knowing these things, but this knowledge will reap rewards. This is true for a lot of system-level information too. You should definitely know about things like SIMD, MIMD, etc., because if you're writing anything in C/C++ these days it should be because you care a lot about performance. There's a lot of code that should be parallelized but isn't, even code that could be trivially parallelized with OpenMP.
> what should an AI researcher working always in Jupyter know?
Depends on what they're researching. But I do wish a lot more knew some OS basics. I see lots of things in papers where they claim "we got 10x performance" on some speed measurement but didn't actually measure it correctly (e.g. you can't use time.time and be accurate, because there are lots of asynchronous operations). There are lots of easy pitfalls here that are not at all obvious and will look like they are working correctly. There are things about GPUs that should be known. Things about math and statistics. Things about networking. But this is a broad field, so there are of course lots of answers here. I'd at least say anyone working on AI should read some text on cognitive science and neuroscience, because that's a super common pitfall too.
I think it is easy not to recognize that information is helpful until after you have it, so it becomes easy to put off as unimportant. You are right that it is really difficult to balance everything, but I'm not convinced that is the problem with those in that category of programmers. There are quite a number of people who insist that they "don't need to" learn things, or that certain knowledge isn't useful, based on their "success."
IMO the key point is that you should always be improving. Easier said than done, but it should be the goal. At worst, I think we should push back on anyone insisting that we shouldn't be (I do not think you're suggesting this).
[0] Quotes because depends who you talk to. C++ historically was considered a high level language but then what is Python, Lua, etc?
oweiler
That's why suggestions like RTFM! are stupid. I just don't have time to read the reference documentation of every tool I use.
aulin
you don't have the time because you spend it bruteforcing solutions by trial and error instead of reading the manual and doing them right the first time
kccqzy
I feel like your working environment might be to blame: maybe your boss is too deadline-driven so that you have no time to learn; or maybe there is too much pressure to fix a certain number of tickets. I encourage you to find a better workplace that doesn't punish people who take the time to learn and improve themselves. This also keeps your skills up to date and is helpful in times of layoffs like right now.
prerok
Seriously? Yes, you should read the docs of every API you use and every tool you use.
I mean, it's sort of ok if you read somewhere how to use it and you use it in the same way, but I, for one, always check the docs and more often even the implementation to see what I can expect.
adrian_b
Actually it is trivial to write a very simple Makefile for a 10,000 file project, despite the fact that almost all Makefiles that I have ever seen in open-source projects are ridiculously complicated, far more complicated than a good Makefile would be.
In my opinion, it is a mistake almost always when you see in a Makefile an individual rule for making a single file.
Normally, there should be only generic building rules that should be used for building any file of a given type.
A Makefile should almost never contain lists of source files or of their dependencies. It should contain only a list with the directories where the source files are located.
Make should search the source directories, find the source files, classify them by type, create their dependency lists and invoke appropriate building rules. At least with GNU make, this is very simple and described in its user manual.
If you write a Makefile like this, it does not matter whether a project has 1 file or 10,000 files, the effort in creating or modifying the Makefile is equally negligible. Moreover, there is no need to update the Makefile whenever source files are created, renamed, moved or deleted.
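A minimal sketch of that style in GNU make, assuming plain C sources (the directory names and the `app` target are placeholders; a real version would also have the compiler generate dependency lists, as discussed downthread):

    # Only this list is project-specific.
    SRCDIRS := src platform/posix
    vpath %.c $(SRCDIRS)

    # Find and classify the sources; derive the objects.
    SRCS := $(notdir $(wildcard $(addsuffix /*.c,$(SRCDIRS))))
    OBJS := $(SRCS:.c=.o)

    app: $(OBJS)
    	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)

    # One generic rule builds every object; no per-file rules needed.
    %.o: %.c
    	$(CC) $(CFLAGS) -c -o $@ $<

Adding, renaming, or removing a source file requires no Makefile change at all.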
mianos
If everything in your tree is similar, yes. I agree that's going to be a very small Makefile.
While this is true, for much larger projects, that have lived for a long time, you will have many parts, all with slight differences. For example, over time the language flavour of the day comes and goes. Structure changes in new code. Often different subtrees are there for different platforms or environments.
The Linux kernel is a good, maybe extreme, but clear example. There are hundreds of Makefiles.
adrian_b
Different platforms and environments are handled easily by Make "variables" (actually constants), which have platform-specific definitions, and which are sequestered into a platform-specific Makefile that contains only definitions.
Then the Makefiles that build a target file, e.g. executable or library, include the appropriate platform-specific Makefile, to get all the platform-specific definitions.
Most of my work is directed towards embedded computers with various architectures and operating systems, so multi-platform projects are the norm, not the exception.
A Makefile contains 3 kinds of lines: definitions, rules and targets (typical targets may be "all", "release", "debug", "clean" and so on).
I prefer to keep these in separate files. If you parametrize your rules and targets with enough Make variables to allow their customization for any environment and project, you must almost never touch the Makefiles with rules and targets. For each platform/environment, you write a file with appropriate definitions, like the names of the tools and their command-line options.
The simplest way to build a complex project is to build it in a directory with a subdirectory for each file that must be created. In the parent directory you put a Makefile that is the same for all projects, which just invokes all the Makefiles from the subdirectories that it finds below, passing any CLI options.
In the subdirectory for each generated file, you put a minimal Makefile of only a few lines. It includes the Makefiles with generic rules and targets and the Makefile with platform-specific definitions, and adds only the information that is special to the generated file: what kind of file it is (e.g. executable, static library, dynamic library), a list of directories where to search for source files, and the strings that should be passed to compilers for their include-directory list and preprocessor-definition list. Optionally and infrequently, you may override some Make definitions, e.g. to provide special tool options when you generate multiple object files from a single source.
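Sketched concretely (every file and variable name here is hypothetical, not part of any standard), a leaf Makefile in this scheme can be as small as:

    TARGET      := app            # the file this directory produces
    TARGET_TYPE := executable     # or static_lib, dynamic_lib, ...
    SRCDIRS     := ../../src/app ../../src/common
    INCDIRS     := -I../../include
    DEFINES     := -DUSE_FEATURE_X

    include ../mk/defs-$(PLATFORM).mk   # platform-specific definitions
    include ../mk/rules.mk              # generic rules and targets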
ori_b
Engineering is deciding that everything in your source tree will be designed to be similar, so you can have a small makefile.
imtringued
Sure, but this requires you to know how to tell the compiler to generate your Makefile header dependencies, and if you make a mistake, it fails silently.
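For what it's worth, the usual pattern with GCC or Clang looks something like the sketch below (object names are examples): -MMD makes the compiler emit a .d dependency fragment per object, and -MP adds phony targets for headers so deleting one doesn't break the build.

    CFLAGS += -MMD -MP

    app: main.o util.o
    	$(CC) -o $@ $^

    # "-include" ignores missing .d files; convenient, but it's also
    # one way mistakes in this setup can fail silently.
    -include $(wildcard *.d)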
Loic
I like Makefiles, but just for me. Each time I create a new personal project, I add a Makefile at the root, even if the only target is the most basic of the corresponding language. This is because I can't remember all the variations of all the languages and frameworks build "sequences". But "$ make" is easy.
1aqp
I'd say: you are absolutely using the right tool. :-)
choeger
You're probably using the wrong tool and should consider a simple plain shell script (or a handful of them) for your tasks. test.sh, build.sh, etc.
poincaredisk
I disagree. Make is, at its simplest, exactly a "simple plain shell script" for your tasks, with some very nice bonus features like dependency resolution.
Not the parent, but I usually start with a two-line Makefile and add new commands/variables/rules when necessary.
nrclark
(not the parent)
Make is - at its core - a tool for expressing and running short shell-scripts ("recipes", in Make parlance) with optional dependency relationships between each other.
Why would I want to spread out my build logic across a bunch of shell scripts that I have to stitch together, when Make is a nicely integrated solution to this exact problem?
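A toy example of that framing (all script and file names invented): each recipe is a short shell script, and the prerequisites wire them together.

    out/report.txt: data/input.csv
    	mkdir -p out
    	./summarize.sh data/input.csv > $@

    .PHONY: publish
    publish: out/report.txt
    	./upload.sh out/report.txt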
f1shy
I would just change the percentages, but it's about as true as it gets.
mianos
I’d be curious to hear your ratio. It really varies. In some small teams with talented people, there are hardly any “fake” developers. But in larger companies, they can make up a huge chunk.
Where I am now, it’s easily over 50%, and most of the real developers have already left.
PS: The fakes aren’t always juniors. Sometimes you have junior folks who are actually really good—they just haven’t had time yet to discover what they don’t know. It’s often absolutely clear that certain juniors will be very good just from a small contribution.
f1shy
My personal experience:
- 5% geniuses. These are people who are passionate about what they do; they are always up to date. Typically humble, not loud people.
- 15% good; can do it properly. Not passionate, but they at least have a strong sense of responsibility. Want to do “the right thing” or do it right. Sometimes average intelligence, but really committed.
- 80% I would not hire. People who talk a lot and know very little. Probably do the work just because they need the money.
That applies for doctors, contractors, developers, taxi drivers, just about anything and everything. Those felt percentages had been consistent across 5 countries, 3 continents and 1/2 a century of life
PS: results are corrected for seniority. Even in the apprentice level I could tell who was in each category.
afiori
Being able to set up things and truly understanding how they work are quite different imo.
I agree with the idea that a lot of productive app developers would not be able to set up a new project ex novo. But often it is not about true understanding; rather, it's about knowing the correct set of magic rules and incantations to make many tools work well together.
znpy
> They can be a bad choice for a 10,000-file monster
Whether they are a bad choice really depends on what the alternatives are, though.
sebazzz
> That leaves 50% who don’t really know much beyond a few LeetCode puzzles and have no real grasp of what they’re copying and pasting.
Who likely wouldn't have a job if it weren't for LLMs.
mcdeltat
At my work I've noticed another contributing factor: tools/systems that devs need to interact with at some point, but otherwise provide little perceived value to learn day-to-day.
An example is build system and CI configuration. We absolutely need these, but devs don't think they should be expected to deal with them day to day. CI is perceived as a system that should be "set and forget": yeah, we need it, but do I really have to learn all this just to build the app? Devs expect it to "just work", and if there are complexities then another team (AKA my role) deals with them. As a result, any time devs interact with the system, there's a high motivation to copy from the last working setup and move on with their day to the "real" work.
The best solution I see is to meet the devs halfway. Provide them with tooling that is appropriately simple or complex for the task, provide documentation, and minimise belief in "magic". Tools like Make kinda fail here because they are too complex and black-box-like.
nicoburns
For me the big problems with CI setups tend to be:
- They're often slow
- They're often proprietary
- They're often dealing with secrets which limits who can work on them
- You generally can't run them locally
So the feedback cycle for working on them is incredibly long. And working on them is therefore a massive pain.
sharkjacobs
> You generally can't run them locally
I recognize this; it's such a disincentive for me to take the initiative to fiddle with and learn about anything like this.
vintermann
Same goes for anything "enterprisey". Last time I set up a big project, I made a commitment that "we should be able to check out and build this whole thing, for as long as humanly possible".
pipes
The local part is my big problem too. I use Azure DevOps at work. I find clicking through the UI to be a miserable experience; I'd love to have it running locally so I could view inputs and outputs on the file system. Also, YAML is an awful choice; no one I know enjoys working with it. The whitespace issues just get worse and worse the longer your files get.
tempodox
> You generally can't run them locally
GitLab CI gives you local runners. You can completely self-host CI.
Plasmoid
It's not about self-hosting. It's: can I run the build from my local command line and get the same results?
Joker_vD
Well, yes, but don't those runners have a different configuration than the runners that are actually deployed and used by your company's CI/CD?
john-tells-all
Strong agree. The best workflow I've seen uses CICD as a very thin wrapper around in-tree scripts or make files.
If a Dev can run some/all of the "cicd" stuff locally, they can see, control, and understand it. It helps tremendously to have a sense of ownership and calm, vs "cicd is something else, la la la".
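One hypothetical shape for this (target and script names are arbitrary): keep the logic in a Makefile, and the CI config shrinks to little more than a job that runs `make ci`.

    .PHONY: ci lint test build
    ci: lint test build

    lint:
    	./scripts/lint.sh      # in-tree scripts, runnable locally too
    test:
    	./scripts/test.sh
    build:
    	./scripts/build.sh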
(This doesn't always work. We had a team of two devs, who had thin-wrapper CICD, who pretended it was an alien process and refused to touch it. Weird.)
peterldowns
+1. The only CI tool that I've seen really organize around this principle is Buildkite, which I've used and enjoyed. I'm currently using Github Actions and it's fine but Buildkite is literally sooooo good for the reasons you've mentioned.
exitb
The office coffee machine is not "set and forget" either, but you wouldn't expect the entire responsibility for its maintenance to be evenly distributed between all the people who use it. Similarly, CI needs ownership, and having it fall on the last developer who attempted to use it is not an efficient way of working.
IgorPartola
Make is one of the simplest build tools out there. Compared to something like Grunt, Webpack, etc. it’s a hammer compared to a mining drill.
The solution is to not use tools used by large corporations because they are used by large corporations. My unpopular opinion is that CI/CD is not needed in most places where it’s used. Figure out how to do your builds and deploys with the absolute fewest moving pieces even if it involves some extra steps. Then carefully consider the cost of streamlining any part of it. Buying into a large system just to do a simple thing is often times not worth it in the long run.
If you really do need CI/CD you will know because you will have a pain point. If that system is causing your developers pain, it isn’t the right fit.
tempodox
If you think `make` is “too complex and black-box-like” then you haven't seen `cmake`.
vintermann
If you think cmake is a good example of more complex than make, then you haven't seen automake/autoconf, the first thing I thought of. You can find tons and tons of configure scripts that check whether you're running ancient versions of Unix, check that a byte is 8 bits wide, and perform a ton of other pointless checks. They don't do anything with all that information; don't think for a moment that you can actually build the app on Irix. But the checks have been passed along for decades like junk DNA.
tempodox
Firstly, automake/autoconf is not `make`, but a different piece of software, and secondly, that you know all those details about it is because it is not black-box-like.
pwdisswordfishz
I have seen both, and I consider them roughly similar.
internet_points
Yeah, I think this is the real issue. Too many different tool types that need to interact, so you don't get a chance to get deep knowledge in any of them. If only every piece of software/CI/build/webapp/phone-app/OS was fully implemented in GNU make ;-) There's a tension between using the best tool for the job vs adding yet another tool/dependency.
IgorPartola
Make and Makefiles are incredibly simple when they are not autogenerated by autoconf. If they are generated by autoconf, don’t modify them, they are a build artifact. But also, ditch autoconf if you can.
In the broader sense: yes this effect is very real. You can fall to it or you can exploit it. How I exploit it: write a bit of code (or copy/paste it from somewhere). Use it in a project. Refine as needed. When starting the next project, copy that bit of code in. Modify for the second project. See if changes can be backported to the original project. Once both are running and are in sync, extract the bit of code and make it into a library. Sometimes this takes more projects to distill the thing into what a library should be. In the best case, open source the library so others can use it.
stouset
They are also extremely limited. Timestamp-based freshness is often broken by modern VCSes. Git doesn’t record timestamps internally, so files can (and often do) have their mtime updated even when their contents are the same, causing unnecessary rebuilds.
They also are utterly unable to handle many modern tools whose inputs and/or outputs are entire directories or whose output names are not knowable in advance of running the tool.
I love make. I have put it to good use in spite of its shortcomings and know all the workarounds for them, and the workarounds for the workarounds, and the workarounds for those workarounds. Making a correct Makefile when you end up with tools that don’t perfectly fit into its expectations escalates rapidly in difficulty and complexity.
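For directory outputs, one common workaround is a stamp file. A sketch, with a made-up `generate-site` tool standing in for any directory-producing command:

    # The stamp's mtime stands in for the directory's.
    site/.stamp: $(wildcard content/*.md)
    	generate-site content/ site/
    	touch $@

    .PHONY: site
    site: site/.stamp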
chuckadams
I started using ccache to speed up Make, but soon found that allowed me to replace Make entirely with a bash script using a few functions.
Quekid5
They are simple but very often wrong. It's surprisingly hard to write Makefiles that will actually do the right thing under anything other than "build from scratch" scenarios. (No, I'm not joking. The very existence of the idea of "make clean" is the smoking gun.)
jandrese
I disagree, but I think once a project gets beyond a certain level of complexity you may need to move beyond make. For simple projects though I usually do something like:
    CC=clang
    MODULES=gtk+-3.0 json-glib-1.0
    CFLAGS=-Wall -pedantic --std=gnu17 `pkg-config --cflags $(MODULES)`
    LDLIBS=`pkg-config --libs $(MODULES)`
    HEADERS=*.h
    EXE=app

    ALL: $(EXE)

    $(EXE): application.o jsonstuff.o otherstuff.o

    application.o: application.c $(HEADERS)
    jsonstuff.o: jsonstuff.c $(HEADERS)
    otherstuff.o: otherstuff.c $(HEADERS)

    clean:
    	rm -f $(EXE) *.o

This isn't perfect, as it causes a full project rebuild whenever a header is updated, but I've found it's easier to do this than to try to track header usage in files. Also, failing to rebuild something when a header updates is a quick way to drive yourself crazy in C; it's better to be conservative. It's easy enough that you can write it from memory in a minute or two, and it's pretty flexible. There are no unit tests, no downloading and building of external resources, or anything fancy like that. Just basic make. It does parallelize if you pass -j to make.
Quekid5
That's effectively "make clean" just slightly more automated. Btw, what happens if you change CFLAGS? Does anything get compiled if no files have changed?
imtringued
Makefile Effect in action...
RHSeeger
I use makefiles all the time for my projects; projects that are actually built with something else (ex, gradle, maven, whatever). My makefiles have targets for build, clean, dependencies, and a variety of other things. And they also have inputs (like "NOTEST=true") for altering how they run. And then I use make to actually build the project; so I don't need to remember how the specific build tool for _this_ project (or the one of many build tools in a project) happens to work. It works pretty well.
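A sketch of such a wrapper for a Gradle project (the target names and the NOTEST convention are just examples of the idea, not a standard):

    .PHONY: build clean deps
    build:
    	./gradlew build $(if $(NOTEST),-x test,)
    clean:
    	./gradlew clean
    deps:
    	./gradlew dependencies

Then `make build NOTEST=true` skips tests, and the same `make build` works regardless of which build tool sits underneath.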
mauvehaus
The idea that git offers a 'clean' command was revelatory to me. Your build system probably shouldn't need to know how to restore your environment to a clean state because your source control should already know what a clean state is.
That's sort of essential to serving its purpose, after all.
I haven't yet run into a scenario where there was a clean task that couldn't be accomplished by using flags to git clean, usually -dfx[0]. If someone has an example of something complex enough to require a separate target in the build system, I'm all ears.
[0] git is my Makefile-effect program. I do not know it well, and have not invested the time to learn it. This says something about me, git, or both.
withinboredom
The problem with `git clean` is -X vs -x. -x (lowercase) removes EVERYTHING including .env files and other untracked files. -X (uppercase) removes only ignored files, but not untracked files.
If there is a Makefile with a clean target, usually the first thing I do when I start is make it an alias for `git clean -X`.
Usually, you want to keep your untracked files (they are usually experiments, debugging hooks, or whatever).
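As a sketch, that alias target might look like this (note that git also wants -f before it will delete anything, unless clean.requireForce is disabled):

    .PHONY: clean
    clean:
    	git clean -fX   # ignored files only; untracked files survive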
rixed
make clean is supposed to clean the intermediary files only but keep the actual build targets (typically what you want to install)
IgorPartola
That’s why I usually write them from scratch and don’t let them get over 100 lines long at most. Usually they are around 30 with white space.
make clean makes lots of sense but is not even strictly necessary. In a world where all it does is find all the *.o files and delete them, it's not a bad thing at all.
mattbillenstein
I think Makefile is maybe the wrong analogy - the problem with most people and makefiles is they write so few of them, the general idea of what make does is at hand, but the muscle memory of how to do it from scratch is not.
But, point taken - I've seen so much code copy-pasta'd from the web, there will be like a bunch of dead stuff in it that's actually not used. A good practice here is to keep deleting stuff until you break it, then put whatever that was back... And delete as much as possible - certainly everything you're not using at the moment.
sumanthvepa
This is exactly the problem I face with many tools, Makefiles, KVM setups, docker configurations, CI/CD pipelines. My solution so far has been to create a separate repository with all my notes, shell script example programs etc, for these tool, libraries or frameworks. Every time I have to use these tools, I refer to my notes to refresh my memory, and if I learn something new in the process, I update the notes. I can even point an LLM at it now and ask it questions.
The repository is personal, and contains info on tools that are publicly available.
I keep organisation specific knowledge in a similar but separate repo, which I discard when my tenure with a client or employer ends.
parasti
What if your client comes back?
On a more practical note, what structure, formats and tools do you use that enable you to feed it to an LLM?
sumanthvepa
I'm usually contractually obligated to destroy all client IP that I may possess at the end of an engagement. My contracts usually specify that I will retain engagement-specific information for a period of six months beyond the end of the contract. If they come back within that time, then I'll have prior context. Otherwise it's gone. Occasionally a client does come back after a year or two, but most of the knowledge would have been obsolete and outdated anyway.
As for LLMs. I have a couple of python scripts that concatenate files in the repo into a context that I pass to Google's Gemini API or Google AI studio, mostly the latter. It can get expensive in some situations. I don't usually load the whole repository. And I keep the chat context around so I can keep asking question around the same topic.
Scubabear68
The best term for this is Cargo Cult Development. Cargo Cults arose in the Pacific during World War II, where native islanders would see miraculous planes bringing food, alcohol and goods to the islands and then vanishing into the blue. The islanders copied what they saw the soldiers doing, praying that their bamboo planes and coconut gadgets would impress the gods and restart the flow of cargo to the area.
The issue of course is that the islanders did not understand the science behind planes, walkie-talkies, guns, etc.
Likewise, cargo cult devs see what is possible, but do not understand first principles, so they mimic what they see their high priests of technology doing, hoping they can copy their success.
Hence the practice of copying, pasting, trying, fiddling, googling, tugging, pulling and tweaking hoping that this time it will be just right enough to kind of work. Badly, and only with certain data on a Tuesday evening.
lolinder
I don't think of this as being cargo cult development. Cargo culting has more to do with mimicking practices that have worked before without understanding that they only worked within a broader context that is now missing. It's about going through motions or rituals that are actually ineffective on their own in the hopes that you'll get the results that other companies got who also happened to perform those same motions or rituals.
What OP is describing isn't like this because the thing being copied—the code—actually is effectual in its own right. You can test it and decide whether it works or not.
The distinction matters because the symptoms of what OP calls the Makefile effect are different than the symptoms of cargo culting, so treating them as the same thing will make diagnosis harder. With cargo culting you're wasting time doing things that actually don't work out of superstition. With the Makefile effect things will work, provably so, but the code will become gradually harder and harder to maintain as vestigial bits get copied.
themanmaran
I would almost call this the "boilerplate effect".
Where people copy the giant boilerplate projects for React, K8s, Terraform, etc. and go from there. Those boilerplates are ideal for mid-to-large-scale projects, and it's likely you'll need them someday. But in the early stages of development they impart a lot of architecture decisions that really aren't necessary.
peterldowns
That's a great phrase. A perfect example of what you're talking about is actually built-in to the `helm` tool for creating re-usable kubernetes configs: `helm create myapp` creates the most convoluted junk I've ever seen in my life. As a new `helm` user I was extremely confused, as I had been expecting a minimal example that I could start from. Instead, I got a maximal example. Thankfully a more experienced engineer on the infra team confirmed that it was mostly unnecessary boilerplate and I could remove it.
Something to consider for anyone else building tools — boilerplate has costs!
NomDePlum
Seeing this exact effect where I am currently working. The main available CI/CD tool is a customised and centrally managed Jenkins fleet. It's pretty much impossible to avoid using, and it seldom needs changing, until it does. Some attempts have been made at centralised libraries and patterns, but even that requires knowledge and study that most won't know is available or won't be given time to acquire.
So when the inevitable tweak or change is made, it's made in the easiest, cheapest way, which is usually copying an existing example, which itself was copied from somewhere else.
I see exactly the same in other teams' repositories. The easiest path is taken to patch what already exists, as the cost/benefit just isn't perceived to be worth prioritising.
godelski
> only worked within a broader context that is now missing
> because the thing being copied—the code—actually is effectual in its own right.
I don't understand how the second disproves the first. In fact, a cargo cult works because there's the appearance of a causal linkage: it appears that things work. But as we know with code, just because it compiles and runs doesn't mean "it works"; it's not a binary thing. Personally, I find that belief is at the root of a lot of cargo cult development, where many programmers glue things together and say "it works" because they passed some test cases, but in reality code shouldn't be a Lovecraftian monster made of spaghetti and duct tape. Just because your wooden plane glides doesn't mean it's an actual plane.
scubbo
> a cargo cult works
But...it doesn't? That's the whole definitional point of it. If action A _does_ lead to outcome B, then "if we do A, then B will happen" is not a cargo cult perspective, it's just fact.
lolinder
> Just because your wooden plane glides doesn't mean it's an actual plane
But if your wooden plane can somehow make it to Europe, collect cargo, and bring it back to your island, what you're doing is definitely not cargo culting.
It might not be actual engineering, maybe you don't understand aerodynamics or how the engine works, and maybe the plane falls apart when it hits the runway on the return flight, but if you got the cargo back you are doing something very different from cargo culting.
That's why copypasta doesn't count as cargo culting. It accomplishes the same task once copied as it did before. It may do so less reliably and less legibly, but it does do what it used to do in its original context.
woodruffw
(Author of the post.)
This is mentioned in footnote 1. Concretely, I don’t think this is exactly the same thing as cargo culting, because cargo culting implies a lack of understanding. It’s possible to understand a system well and still largely subsist on copy-pasting, because that’s what the system’s innate complexity incentivizes. That was the underlying point of the post.
cle
For me, there are many cases where I copy-paste stuff I've written in the past b/c some tool is a pain-in-the-ass and I can't afford the mental context switch. I usually do understand what's happening under the hood, but it's still cognitively heavy to switch into that "mode" so I avoid it when possible.
Tools that fall into this category are usually ops-y things with enormous complexity but are not "core" to the problem I'm solving, like CI/CD, k8s, Docker, etc. For Make specifically, I usually just avoid it at this point b/c I find it hard to avoid the context switch.
It has nothing to do with miraculous incantations--I know the tradeoff I'm making. But it still runs the risk of turning into the Makefile Effect.
sgarland
It’s always hoped (but rarely shown to be true) that by making templates, teams will put thought into their K8s deployments etc. instead of just copy/pasting. Alas, no – even when the only things the devs have to do is add resource requests and limits, those are invariably copy/pasted. If the app gets OOMkilled, they bump up memory limit until it doesn’t. If it’s never OOMkilled, it’s probably never touched, even if it’s heavily over-provisioned (though that would only matter for the request, of course).
This has spawned a cottage industry of right-sizing tooling, which does what a dev team could and should have done to begin with: profiling their code to see resource requirements.
At this point, I feel like continuing to make things easier is detrimental. I certainly don’t think devs need to know how to administer K8s, but I do firmly believe one should know how to profile one’s code, and to make reasonable decisions based on that effort.
cle
I do know how to profile my code, and I'll also continue to bias towards not doing it now. Even if it could mean more pain later.
I think part of the problem is that the pain it causes is quite visceral, but the opportunity cost is pretty abstract, so it's a lot easier to just focus on the pain and forget about what you're gaining.
remus
I agree, and I think the key distinction is in understanding. In a cargo cult there's a lack of understanding, whereas I'll often copy and paste code/config I understand to get something done. Usually this is for something I don't do very often (configuring nginx, writing some slightly complicated shell script, etc.). I could spend an hour reading docs and writing the thing from scratch, but that's likely gonna be wasted time because there's a good chance I'm not going to look at that thing again for a few years.
raverbashing
Pretty much this
And of course every one of those tools has to have their own special language/syntax that makes sense nowhere else (think of all the tools beyond make, like autotools, etc)
I don't care about make. I don't care learning about make beyond what's needed for my job
Sure, it's a great tool, but I literally have 10 other things that deserve more of my attention than having my makefile work as needed
So yeah I'll copy/paste and be done with it
erosivesoul
Honestly is this not how it should be done? There's always going to be a more elegant approach for sure. But in general, we don't want developers to keep rewriting the same code again and again. Avoiding that is part of entire design paradigms. I'd like to talk to the dev who doesn't copy-paste and writes everything from scratch.
qwery
The article does kind of mention this in footnote '1', for what it's worth:
> The Makefile effect resembles other phenomena, like cargo culting, normalization of deviance, “write-only language,” &c. I’ll argue in this post that it’s a little different from each of these, insofar as it’s not inherently ineffective or bad and concerns the outcome of specific designs.
registeredcorn
I think I fall very much into the "beginner of beginner stages" of understanding programming. It sounds like, then, that if I want to avoid that "cargo cult" mindset, a structured flow of:
education -> learning -> doing -> failing -> (repeat)
would be needed, right?
Does this then mean that, if someone truly wants to "escape the island, and fly the plane" as it were, it comes down to "university is the 'truest' way"?
Note: Yes, I realize it's hard to speak in absolutes, that there are plenty of exceptions to generalities, and that all people have various degrees of justifications of I-can't-do-that-itus; I'm talking more in terms of optimal theory. That, the optimal route to avoid cult-like behavior is to understand the whole thing, and that "the whole thing" comes from higher education, right?
Logically at least, it would seem that even diligent studying with books as a means to meet/surpass the "completeness" of university would still be... inadequate in some regard when compared to in-class time with learned educators. (Again, supposing that the same person worked just as hard doing either option, etc.)
magic_smoke_ee
An engineer should learn first principles and master the tool rather than dancing around it or reaching immediately for a replacement. This is why the "replacement" tool `just` is fundamentally terrible: it doesn't do dependency checking and optimizes for the wrong things, an instant loss of efficiency that throws away the power and simplicity of makefiles (GNU extensions often needed).
Instead, (GNU or vanilla) makefiles are ideal for very simple, portable projects. Make is everywhere.
For anything complicated, use a proper build system that isn't autotools, like cmake or bazel.
getnormality
Another factor is frequency of use. I use LaTeX to do big write-ups on the order of once per year or less. LaTeX at the level I use it is not a hard tool, but I generally start a new document by copy-pasting a previous document because there is a lot of detail about how to use it that I'm never going to remember given that I only use it for a few weeks once a year.
Linux-Fan
I usually try to avoid the "makefile effect" by learning the technology I use reasonably frequently (e.g. Makefiles, shell scripts, ...).
However, despite the fact that I used to use LaTeX very much, I always copy-pasted from a template. It is even worse with beamer presentations and TikZ pictures where I would copy-paste from a previous presentation or picture rather than a template.
For TikZ I am pretty sure that the tool is inherently complex and I just haven't spent enough time to learn it properly.
For LaTeX I have certainly spent enough time on learning it so I wonder whether it might be something different.
In my opinion it could very well be a matter of “(in)sane defaults”. Good tools should come with good defaults. However, LaTeX is not a good tool wrt. this metric, because basically all my documents start something like
~~~
\documentclass[paper=a4, DIV9, 12pt, abstracton, headings=normal,
               captions=tableheading]{scrartcl}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[english,german,ngerman]{babel}
\usepackage[english,german]{fancyref}
% ...
\usepackage{microtype}
\usepackage{hyperref}
~~~
Most of this is to get some basic non-ASCII support that is needed for my native tongue or enable some sane defaults (A4 paper, microtype...) which in a modern tool like e.g. pandoc/markdown may not be needed...
Hence the purpose of copy-pasting the stuff around is often to get good defaults, which a better tool might give you right out of the box (without the copy/paste).
kccqzy
Copy-pasting itself is not bad per se. What's bad is copy-pasting without understanding the why and how.
For LaTeX I also copy-paste a whole lot from older files, but I don't feel bad because (a) I wrote these files before, (b) I know exactly what each line is doing, (c) I understand why each line is needed in the new doc.
I wrote a relatively large amount of TikZ code earlier in my life (basically used it as a substitute for Illustrator) and for this library in particular, I think it just has so much syntax to remember that I cannot keep it all in my brain for ever. So I gladly copy from my old TikZ code.
fph
\usepackage[utf8]{inputenc} now is the default, at least; you don't need to include it anymore. And diacritics work out of the box, no need to write weird incantations like G\"{o}del anymore.
folmar
I use it more often and also start with a copy-pasted header, which includes:
* all packages needed for my language (fontenc, babel, local typography package)
* typical graphicx/fancyhdr/hyperref/geometry packages that are almost always needed
* a set of useful symbol and name definitions for my field
If you are writing anything other than math or pure English text, LaTeX is batteries-not-included.
yuppiepuppie
This is “Copy-Pasta Driven Development” [0] and it’s not even related to makefiles. It’s related to the entire industry copying code from here to there without even knowing what they are copying.
TBH I think Copilot has made this even worse, as we are blindly accepting chunks of code into our code bases.
[0] https://andrew.grahamyooll.com/blog/copy-pasta-driven-develo...
pdimitar
Blame the business people. I tried becoming an expert in `make` probably at least 7 times in a row, but was never given time to work with it daily until I had fully memorized it.
At one point I simply gave up; you can never build the muscle memory and it becomes a cryptic arcane knowledge you have to relearn from scratch every time you need it. So I moved to simpler tools.
The loss of deep work is not the good programmers' fault. It's the fault of the business people.
skydhash
I wouldn't say so. Make is very simple and you can grasp the basics within an hour or so if you're familiar with shell scripting (it's basically a superset of shell scripts, with a dependency graph on top). Then all you have to do is just-in-time learning, which is mostly searching for a simpler pattern than what you're currently doing.
pdimitar
Hm, I'm not sure that's the case. Some of make's weirdness comes from bash-isms, not shell scripting in particular.
markfsharp
Could not agree more!
donatj
If I had a nickel for every time I have seen a Makefile straight up copied from other projects and modified to "work" while leaving completely unrelated unnecessary build steps and targets in place.
It's a major pet peeve of mine.
bboygravity
How do you know what is and isn't related if nothing is documented?
Trial and error?
Well have fun with that :p
marcosdumay
Hum...
You know, a makefile is documentation. That's why you should probably never copy one (except for a single line here or there). There's space for commenting a few things, but your target names and variables should explain most of what is going on there.
Anyway, the article and most people here seem to be talking about those autotools-generated files, or hand-built ones that look the same way. Either way, it's a bad solution caused by forcing a problem onto a tool that wasn't aimed at solving it. We have some older languages without the concept of a "project" that need a lot of hand-holding for compiling, but despite make being intentionally created for that hand-holding, it's clearly not the best tool for the task.
donatj
Exactly. Bonus points if the person who started the project moved on and you have to be the one to build and maintain it.
rmgk
You find the first part in your stack that is documented (e.g., make is documented, even if your makefile is not) and use that documentation to understand the undocumented part. You then write down your findings for the next person.
If you don’t have enough time, write down whatever pieces you understood, and write down what parts “seem to work, but you don’t understand“ to help make progress towards better documentation.
If you put the documentation as comments into the file, this can make copy&pasting working examples into a reasonably solid process.
layer8
There are certainly a lot of tools that are more complicated than necessary, but Make as a tool isn’t a good example of that, IMO. With modern tooling, more often than not the complexity problem is compounded by insufficient documentation, the existing documentation being predominantly cookbook-style and not explaining the conceptual models needed to reason about how the tool works, nor providing a detailed and precise enough specification of the tool. That isn’t the case for Make, which is well-documented and not difficult to get a good grasp on, if one only takes the time to actually read the documentation.
The cookbook orientation mentioned above in turn leads to a culture that underemphasizes the importance of learning and understanding the tools that one is using, and of having thorough documentation that facilitates that. Or maybe the direction of causation is the other way around. In any case, I see the problem more in too little time being spent in creating comprehensive and up-to-date documentation on tooling (and designing the tooling to be amenable to that in the first place), and in too little resources being allocated to teaching and learning the necessary tooling.
jcarrano
I wouldn't say this is necessarily a bad thing. I wrote my first version of a Makefile with automatic dependencies and out-of-tree builds 10+ years ago and I have been copying and improving it since. I do try to remove unneeded stuff when possible.
The advantage is that one can go in and modify any aspect of build process easily, provided one takes care to remove cruft so that the Makefile does not become huge. This is very important for embedded projects. For me, the advantages have surpassed the drawbacks (which I admit are quite a few).
You could, in theory, abstract much of this common functionality away in a library (whether for Make or any other software), however properly encapsulating the functionality is additional work, and Make does not have great built-in support for modularization.
In this sense I would not say Make is overly complex but rather the opposite, too simple. Imagine how it would be if in C global variables were visible across translation units. So, in a way, the "Makefile effect" is in part due to the nature of the problem being solved and part due to limitations in Make.
rini17
Can you imagine the Makefile was made by someone else, and you are now suddenly confronted with the result of 10 years of tuning?
jcarrano
I am that someone else because I seldom edit the makefiles and I forget things. That's why I try to trim unused targets and recipes and I try to keep it documented.
In the end it is no different from any code that's suffered from 10 years of tuning and it can get ugly. Maybe Make is even somewhat worse in this respect, but then again it does not need to be changed often.
imtringued
I would let the expert do his job and I do mine.
mongol
Isn't this a very generic phenomenon? I would argue it applies broadly. For example budgeting: you usually start from last year's budget and tweak it, rather than starting from scratch. Or when you write an application letter, or a ServiceNow ticket, or whatever. Now I regret bringing ServiceNow into the discussion, it kills the good mood....
Over2Chars
There's also "zero based budgeting" (ZBB) that starts from zero and says "justify everything".
KineticLensman
Which in my experience sometimes involves copying last years justifications
Over2Chars
It might!
But as I understand it, and I am not an accountant (IANAA?), for non-ZBB budgets last year's budget is usually used as a starting point and increases are justified.
"Here's why I need more money to do the same things as last year, plus more money if you want me to do anything extra".
I'd be curious what our man Le Cost Cutter Elon Musk does for budgeting?
"A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system."
– John Gall (1975) Systemantics: How Systems Really Work and How They Fail
https://en.wikipedia.org/wiki/John_Gall_(author)#Gall's_law