Bloat is still software's biggest vulnerability (2024)

GuB-42

I am beginning to think that the terrible situation with dependency management in traditional C and C++ is a good thing.

Now, with systems like npm, maven or cargo, all you need to do to get a package is to add a line in a configuration file, and it fetches all the dependencies you need automatically from a central repository. Very convenient. However, you can quickly find yourself with 100+ packages from who knows where and 100s of MB of code.

In C, traditionally, every library you include requires some consideration. There is no auto-download, and the library the user has may be a different version from the one you worked with, and you have to accommodate it, and so does the library publisher. Or you may have to ship it with your own code. Anyways, it is so messy that the simplest solution is often not to use a library at all and write the thing yourself, or even better, realize that you don't need the feature you would have used that library for.

Bad reason, and reinventing the wheel comes with its own set of problems, but at least, the resulting code is of a manageable size.

otikik

I thought about this several years ago and I think I hit the right balance with these 2 rules of thumb:

* The closer something is to your core business, the less you externalize.

* You always externalize security (unless security is your exclusive core business)

Say you are building a tax calculation web app. You use dependencies for things like the css generation or database access. You do not rely on an external library for tax calculation. You maintain your own code. You might use an external library for handling currencies properly, because it's a tricky math problem. But you may want to use your own fork instead, as it is close to your core business.

On the security side, unless that's your speciality, there's guys out there smarter than you and/or who have dedicated more time and resources than you to figure that stuff out. If you are programming a tax calculation web app you shouldn't be implementing your own authentication algorithm, even if having your tax information secure is one of your core needs. The exception to this is that your core business is literally implementing authentication and nothing else.

j_w

I feel like "shouldn't be implementing your own authentication" is overblown. Don't write the crypto algorithms. But how hard is it to write your own auth? If you are pulling in a third party dependency for that you still would need to audit it, and if you can audit authentication software why can't you implement it?

Just follow OWASP recommendations. A while back this was posted to HN and it also provides great recommendations: https://thecopenhagenbook.com/ .

bluefirebrand

The main challenge isn't necessarily implementing the algorithms, it is keeping up with the security space

Do you expect your team to be keeping up with new exploits in hardware and networking that might compromise your auth? That takes a lot of expertise and time, which they could instead be spending building features that add business value

It sounds cynical, and it kind of is, but offloading this onto external experts makes way more business sense and probably is what allows you to deliver at all. Security is just too big a space for every software company to have experts on staff to handle

pphysch

There have been major F-ups in recent history with Okta, CrowdStrike, and so on. Keycloak had some major long-standing vulnerabilities. I've had PRs accepted in popular open-source IAM libraries a bit too easily.

Yeah, we shouldn't roll our own cryptography, but security isn't as clean cut as this comment implies. It also frequently bleeds into your business logic.

Don't confuse externalizing security with externalizing liability.

ablob

As far as I know tacking on security after the fact usually leads to issues. It should be a primary concern from the beginning. Even if you don't do it 100% right, you'd be surprised how many issues you can avoid by thinking about this during (and not after) development.

Dropping your rights to open files as soon as possible, for example, or thinking about what information would be available to an attacker should they get RCE on the process. Shoehorning in solutions to these things after the fact tends to be so difficult that it's a rare sight.

I have been recommended to think of security as a process rather than an achievable state and have become quite fond of that perspective.

Extasia785

You are describing domain-driven design. Outsource generic subdomains, focus your expertise on the core subdomains.

https://blog.jonathanoliver.com/ddd-strategic-design-core-su...

cogman10

I think this helps, but I also think the default for any dev (particularly library authors) should be to minimize dependencies as much as possible. Dependencies have both a maintenance and a security cost. Bad libraries have deep and sprawling trees.

I've seen devs pull in frameworks just to get access to single simple to write functions.

casey2

Even if you make the obviously wrong assumption that every library is more secure than the one you would write (which will do less the vast majority of the time), we still end up in an eggs-in-one-basket situation.

You haven't thought through any cyber security games or you are funded to post this bad argument over and over again by state agencies with large 0-day stockpiles.

the__alchemist

I would like to dig into point 2 a bit. Do you think this is a matter of degree, or of kind? Does security, in this, imply a network connection, or some other way that exposes your application to vulnerabilities, or is it something else? Are there any other categories that you would treat in a similar way as security, but to a lesser degree, or that almost meet that threshold for a special category, but don't?

SkiFire13

How many vulnerabilities were due to badly reinventing the wheel in C/C++ though?

Also, people often complain about "bloat", but don't realize that C/C++ are often the most bloated ones precisely because importing libraries is a pain, so they try to include everything in a single library, even though you only need to use less than 10% of it. Look for example at Qt, it is supposed to be a UI framework but it ends up implementing vectors, strings, json parser and who knows how much more stuff. But it's just 1 dependency so it's fine, right?

phkahler

>> Look for example at Qt, it is supposed to be a UI framework but it ends up implementing vectors, strings, json parser and who knows how much more stuff. But it's just 1 dependency so it's fine, right?

Qt is an application development framework, not a GUI toolkit. This is one reason I prefer GTK (there are things I dislike about it too).

r0ze-at-hn

I remember that discussion back in the early 2000s, but now, with the tonnage that systems like npm can pull in, I laugh that we ever thought it wouldn't get worse.

GuB-42

There is still an advantage to using Qt over dozens of libraries that offer the same functionality.

Qt is backed by a single company, so all you have to watch out for is that company. Also, Qt is generally high quality, I have worked with it, read the source code, etc... and I generally liked what I saw. So I can reasonably assume that quality is consistent overall. When you have many libraries from many independent developers, it doesn't work. The JSON parser may be good, but it doesn't tell me anything about the library that deals with internationalization for instance, and if I wanted to keep track of everything, that's several times the work compared to a single vendor.

I agree that Qt is bloated though, but multiplatform UI frameworks are hard to keep light. There is a lot going on in desktop UIs that people only notice when it isn't there. I tend to treat them like I treat the standard libraries, the OS, and for web apps, the browser. Big components, but you reasonably can't do without.

reaperducer

> How many vulnerabilities were due to badly reinventing the wheel in C/C++ though?

I don't know. Suppose you tell us.

ChrisSD

In my experience every developer, company, team, sub-team, etc has their own "library" of random functions, utilities, classes, etc that just end up being included into new projects sooner or later (and everyone and their dog has their own bespoke string handling libraries). Copy/pasting large chunks of code from elsewhere is also rampant.

I'm not so sure C/C++ solves the actual problem. Only sweeps it under a carpet so it's much less visible.

achierius

It definitely does solve one problem. Like it or not, you can't be hit by supply chain attacks if you don't have a supply chain.

dgfitz

I mirror all deps locally and only build from the mirror. It isn’t an issue. C/C++ is my day job.

Frieren

> every developer, company, team, sub-team, etc has their own "library" of random functions, utilities, classes, etc

You are right. But my conclusion is different.

If it is stable and people have been there for a while, then developers know that code as well as the rest. So, when something fails, they know how to fix it.

Bringing generic libraries may create long callstacks of very generic code (usually templates) that is very difficult to debug while adding a lot of functionality that is never used.

Bringing a new library into the code base needs to be a tough decision.

ryandrake

> In my experience every developer, company, team, sub-team, etc has their own "library" of random functions, utilities, classes, etc that just end up being included into new projects sooner or later

Same here. And a lot of those homegrown functions, utilities and classes are actually already available, and better implemented, in the C++ Standard Library. Every C++ place I've worked had its own homegrown String class, and it was always, ALWAYS worse in all ways than std::string. Maddening. And you could never make a good business case to switch over to sanity. The homegrown functions had tendrils everywhere and many homegrown classes relied on each other, so your refactor would end up touching every file in the source tree. Nobody is going to approve that risky project. Once you start down the path of rolling your own standard library stuff, the cancer spreads through your whole codebase and becomes permanent.

rileymat2

Although I like std::string, for some things it becomes a little tricky with cross-platform work that involves both Linux and Windows. It can also be tricky with Unicode and lengths.

grg0

This is something that I think about constantly and I have come to the same conclusion. While the idea of being able to trivially share code worldwide is appealing, so far it seems to encourage shittier software more than anything else, and the benefit of sharing trivially seems to be defeated by the downsides that bloat and bad software bring with it. Adding friction to code re-use (by means of having to manually download shit from a website and compile it yourself like it's 1995) seems to be a good thing for now until a better package management system is figured out. The friction forces you to think seriously about whether you actually need that shit or can write the subset of the functionality you need yourself. To be clear, I also think C++ projects suffer a lot from re-inventing the wheel, particularly in the gamedev world, but that seems to be less bad than, e.g., initializing some nodejs framework project and starting with 100+ dependencies when you haven't even started to write shit.

pixl97

When doing SBOM/SCA we see apps with 1000+ deps. It's insane. It's so often we see large packages pulled in because a single function/behavior is needed and ends up massively increasing the risk profile.

1over137

Holy cow. What domain is this? Web-based probably?

rglullis

Cathedrals vs Bazaars.

Cathedrals are conservative. Reactionary, even. You can measure the rate of change by generations.

Bazaars are accessible and universal. The whole system is chaotic. Changes happen every day. No single agent is in control.

We need both to make meaningful progress, and it's the job of engineers to take any given problem and see where to look for the solution.

staunton

> While the idea of being able to trivially share code worldwide is appealing, so far it seems to encourage shittier software more than anything else, and the benefit of sharing trivially seems to be defeated by the downsides that bloat and bad software bring with it.

A lot of projects would simply not exist without it. Linux comes to mind. I guess one might take the position that "Windows is fine", but would there ever have been even competition for Windows?

Another example, everyone would be rolling their own crypto without openssl, and that would mean software that's yet a lot more insecure than what we have. Writing software with any cryptography functionality in mind would be the privilege of giant companies only (and still suck a lot more than what we have).

There are a lot more things. The internet and software in general would be set back ~20 years. Even with all the nostalgia I can muster, that seems like a much worse situation than today.

grg0

All those projects existed long before package managers in programming languages were a thing (although you could consider the distro's package manager to fulfill that purpose, I guess), so I don't think your point really takes away from mine. And for sure, there are critical dependencies like openssl that better be a shared endeavour. But whether you pull those dependencies in manually or through a package manager is somewhat tangential.

rgavuliak

I agree fully, most users care about making their lives easier, not about development purity. If you can't do both, the puritanistic approach loses.

crabbone

This is all heuristic (read "guessing") and not a real solution to the problem.

The ground truth is that software bloat isn't bad enough of a problem for software developers to try and fight it. We already know how to prevent this, if we really want to. And if the problem was really hurting so much, we'd have automated ways of slimming down the executables / libraries.

In my role in creating CI for Python libraries, I did more hands-on dependency management. My approach was to first install libraries with pip, see what was installed, research why particular dependencies have been pulled in, then, if necessary, modify the packages in such a way that unnecessary dependencies would've been removed, and "vendor" the third party code (i.e. store it in my repository, at the version I need). This, obviously, works better for programs, where you typically end up distributing the program with its dependencies anyways. Less so for libraries, but in the context of CI this saved some long minutes of reinstalling dependencies afresh for every CI run.
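The "research why particular dependencies have been pulled in" step could be approached along these lines. This is only a rough sketch using the standard library's `importlib.metadata`; the helper names (`requirement_name`, `installed_requirements`, `pulled_in_by`) are made up for illustration and are not part of pip or of the workflow described above. Note that declared requirements include optional, extras-only ones, so the result is an upper bound.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        raise ValueError(f"unparseable requirement: {requirement!r}")
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def installed_requirements() -> dict[str, set[str]]:
    """Map each installed distribution to the names it declares as requirements.

    Declared requirements include optional (extras-only) ones, so this
    over-approximates what actually had to be installed.
    """
    result: dict[str, set[str]] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip broken or partial installs
        result[requirement_name(name)] = {
            requirement_name(req) for req in (dist.requires or [])
        }
    return result


def pulled_in_by(target: str, requirements: dict[str, set[str]]) -> list[str]:
    """Return the distributions that directly declare `target` as a requirement."""
    wanted = requirement_name(target)
    return sorted(name for name, deps in requirements.items() if wanted in deps)


if __name__ == "__main__":
    # Hypothetical query: which installed packages ask for urllib3?
    print(pulled_in_by("urllib3", installed_requirements()))
```

Running it after a fresh `pip install` into a clean virtual environment gives a quick first answer to "who asked for this?" before deciding what to strip or vendor.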

In the end, it was a much better experience than what you usually get with CI targeting Python. But, in the end, nobody really cared. If CI took less than a minute to complete instead of twenty minutes, very little was actually gained. The project didn't have enough CI traffic for this to have any actual effect. So, it was a nice proof of concept, but ended up being not all that useful.

ryandrake

The reason bloat doesn't get fixed is that it's a problem that doesn't really harm software developers. It is a negative externality whose pain is spread uniformly across users. Every little dependency developers add to make their work more convenient might increase the download size over the user's network by 100MB, or use another 0.5% of the user's CPU, or another 50MB of the user's RAM. The user gets hit, ever so slightly, but the developer sees only upside.

HPsquared

The phrase "cheap and nasty" comes to mind. Over time, some markets tend towards the cheap and nasty.

TeMPOraL

Some? Almost all. That's the default end state if there's actual competition on the market.

socalgal2

100, ha! The official Rust docs, built in Rust, use ~750 dependencies - cue the apologists

matheusmoreira

> There is no auto-download

There is. Linux distributions have package managers whose entire purpose is to distribute and manage applications and their dependencies.

The key difference between Linux distribution package managers and programming language package managers is the presence of maintainers. Any random person can push packages to the likes of npm or PyPI. To push packages to Debian or Arch Linux, you must be known and trusted.

Programming language package managers are made for developers who love the convenience of pushing their projects to the world whenever they want. Linux distribution package managers are made for users who prefer to trust the maintainers not to let malware into the repositories.

Some measured amount of elitism can be a force for good.

ozim

Writing everything from scratch by hand is an insane take. It is not just reinventing the wheel but there are whole frameworks one should use because writing that thing on your own will take you a lifetime.

Yes, you should not just pull in as a dependency something that a kid in his parents' basement wrote for fun or to get "OSS maintainer" on his CV.

But there are tons of legitimate libraries and frameworks from people who are better than you at that specific domain.

barrkel

That's not how it works.

Here's a scenario. You pull in some library - maybe it resizes images or something. It in turn pulls in image decoders and encoders that you may or may not need. They in turn pull in metadata readers, and those pull in XML libraries to parse metadata, and before you know it a fairly simple resize is costing you 10s of MB.

Worse, you pull in different libraries and they all pull in different versions of their own dependencies, with lots of duplication of similar but slightly different code. Node_modules usually ends up like this.

The point is not writing the resize code yourself. It's the cultural effect of friction. If pulling in the resize library means you need to chase down the dependencies yourself, first, you're more aware of the cost, and second, the library author will probably give you knobs to eliminate dependencies. Perhaps you only pull in a JPEG decoder because that's all you need, and you exclude the metadata functionality.

It's an example, but can you see how adding friction to pulling in every extra transitive dependency would have the effect of library authors giving engineers options to prune the dependency tree? The easier a library is to use, the more popular it will be, and a library that has you chasing dependencies won't be easy to use.
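The resize scenario can be made concrete with a toy dependency graph. Everything here is hypothetical: the package names, the `GRAPH` layout and the `closure` helper are invented for illustration and do not describe any real package manager. The sketch only shows how the transitive closure shrinks once the top-level library exposes "knobs" (optional feature groups).

```python
from collections import deque

# Hypothetical graph: each package maps feature-group names to dependencies.
# The "core" group is always pulled in; other groups are opt-in knobs.
GRAPH = {
    "resize": {
        "core": ["pixel-buffer"],
        "jpeg": ["jpeg-codec"],
        "png": ["png-codec"],
        "metadata": ["metadata-reader"],
    },
    "jpeg-codec": {"core": ["pixel-buffer"]},
    "png-codec": {"core": ["pixel-buffer", "zlib"]},
    "metadata-reader": {"core": ["xml-parser"]},
    "xml-parser": {"core": ["unicode-tables"]},
    "pixel-buffer": {},
    "zlib": {},
    "unicode-tables": {},
}


def closure(root: str, features: set[str]) -> set[str]:
    """Return every package pulled in by `root` with the given features enabled."""
    seen: set[str] = set()
    queue = deque([root])
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        for group, deps in GRAPH.get(pkg, {}).items():
            # Feature knobs apply to the root only; transitive packages
            # contribute just their always-on "core" dependencies.
            if group == "core" or (pkg == root and group in features):
                queue.extend(deps)
    seen.discard(root)
    return seen


if __name__ == "__main__":
    everything = closure("resize", {"jpeg", "png", "metadata"})
    jpeg_only = closure("resize", {"jpeg"})
    print(len(everything), sorted(everything))
    print(len(jpeg_only), sorted(jpeg_only))
```

With all features on, the toy graph pulls in seven packages; with only the JPEG knob it pulls in two, which is the pruning effect the comment argues friction would encourage.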

lmm

> You pull in some library - maybe it resizes images or something. It in turn pulls in image decoders and encoders that you may or may not need. They in turn pull in metadata readers, and those pull in XML libraries to parse metadata, and before you know it a fairly simple resize is costing you 10s of MB.

This is more likely to happen in C++, where any library that isn't header-only is forced to be an all encompassing framework, precisely because of all that packaging friction. In an ecosystem with decent package management your image resizing library will have a core library and then extensions for each image format, and you can pull in only the ones you actually need, because it didn't cost them anything to split up their library into 30 tiny pieces.

MonkeyClub

> The easier a library is to use, the more popular it will be

You're thinking correctly in principle, but I think this is also the cause of the issue: it's too easy to pull in a Node dependency even thoughtlessly, so it's become popular.

It would require adding friction to move back from that and render it less easy, which would probably give rise to a new, easy and frictionless solution that ends up in the same place.

procaryote

There's a difference between "I need to connect to the database and I need to parse json, so I need two commonly used libs for those two things" and whatever npm is doing, and to some extent cargo or popular java frameworks are doing.

Building everything from scratch is insane, but so's uncritically growing a dependency jungle

actionfromafar

I feel you are arguing a bit of a strawman. The take is much more nuanced than write everything from scratch.

ozim

> ... simplest solution is often not to use a library at all and write the thing yourself, or even better, realize that you don't need the feature you would have used that library for ... the resulting code is of a manageable size.

I don't see the nuance there. That is my reading of the comment: those are pretty much the strongest statements, and the points about using libraries are minimal.

That is why I added mine, strongly pointing out that real-world systems are not going to be of "manageable size" unless they are really small or a single person is working on them.

BrouteMinou

When you "Reinvent the wheel", you implement only what you need in an optimized way.

This gives a couple of advantages: you own your code, no bloat, usually simpler due to not having all the bells and whistles, less abstraction, so faster because there is no free lunch, minimize the attack surface for supply chain attacks...

For fun, the next time you are tempted to install a BlaZiNg FaSt MaDe in RuSt software: get the source, install cargo audit and run the cargo audit on that project.

See how many vulnerabilities there are. So far, in my experience, all the software I checked comes with its own list of vulnerabilities from transitive dependencies.

I don't know about npm, I only know by reputation and it's enough for me to avoid.

nebula8804

That wheel is only as good as your skill in making it. For many people (the majority, I'd guess) someone else making that wheel will have a better end result.

doublerabbit

The skill is produced by carving the wheel. You've got to start somewhere. Whether a mess or not the returned product is a product of your own. By relying on dependencies you're forever reaching for a goal you'll never achieve.

dvh

People often think "speed" when they read "bloat". But bloat often means layers upon layers of indirection. You want to change the color of the button in one dialog. You find the dialog code, change the color and nothing. You dig deeper and find that some modules use different colors for common buttons, so you find the module setting, change the color and nothing. You dig deeper and find that global themes can change colors. You find the global theme, change the color and nothing. You start searching the entire codebase and find that over 17 files change the color of that particular button, and one of those files does it in a timer loop because your predecessor couldn't find out why the button color changed 16 times on startup, so he just constantly changes it to brown once a second. That is bloat. A trivial change will take you half a day. And the PM is breathing down your neck asking why changing a button color takes so long.

alganet

No. What you described is known as technical debt.

Bloat affects the end user, and it's a loose definition. Anything that was planned, went wrong, and affects user experience could be defined as bloat (many toolbars like Office had, many purposes like iTunes had, etc).

Bloat and technical debt are related, but not the same. There is a lot of software that has a very clean codebase and bloated experience, and vice-versa.

Speed is an ambiguous term. It is often better to think in terms of real performance and user-perceived performance.

For example, many Apple UX choices prioritize user perceived performance instead of real performance. Smooth animations to cover up loading times, things such as that. Their own users don't even know why, they often cannot explain why it feels smooth, even experienced tech people.

Things that are not performant but appear to be fast are good examples of good user-perceived performance.

Things that are performant but appear to be slow exist as well (fast backend lacking proper cache layer, fast responses but throttled by concurrent requests, etc).

FirmwareBurner

>many Apple UX choices prioritize user perceived performance instead of real performance.

Then why does Apple still ship 60Hz displays in 2025? The perceived performance on scrolling a web page on 60Hz is jarring no matter how performant your SoC is.

jsheard

Apple backed themselves into a corner with desktop monitors by setting the bar for Retina pixel density so high, display manufacturers still aren't able to provide panels which are that large and very dense and very fast. Nobody makes 5K 27" 120hz+ monitors because the panels just don't exist, not to mention that DisplayPort couldn't carry that much data losslessly until quite recently.

There's no excuse for 60hz iPhones though, that's just to upsell you to more expensive models.

os2warpman

> Then why does Apple still ship 60Hz displays in 2025?

To push people who want faster displays to their more expensive offerings.

60Hz: $1000

120Hz: $1600

That's one reason, among many, why Apple has a $3 trillion market cap.

For a site with so many people slavishly obsessed with startups and venture capital, there seems to be a profound lack of understanding of what the function of a business is. (mr_krabs_saying_the_word_money.avi)

alganet

I don't know why.

I said many choices are focused on user-perceived performance, not all of them.

Refresh rate only really makes a case for performance in games. In everyday tasks, like scrolling, it's more about aesthetics and comfort.

Also, their scrolling on 60Hz looks better than scrolling on Android at 60Hz. They know this. Why they didn't prioritize using 120Hz screens is beyond my knowledge.

Also, you lack attention. These were merely examples to expand on the idea of bloat versus technical debt.

I am answering out of kindness and in the spirit of sharing my perspective to point the thread in a more positive discussion.

BobbyTables2

At the library level, I dislike how coarse-grained most things are. Sadly, it becomes easier to reimplement things to avoid huge dependency chains.

Want a simple web server ? Well, you’re going to get something with a JSON parser, PAM authentication, SSL, QUIC, websockets, an async framework, database for https auth, etc.

Ever look at “curl”? The number of protocols is dizzying; one could easily think that HTTP is only a minor feature.

At the distro level, it is ridiculous that so long after Alpine Linux, the chasm between them and Debian/RHEL remains. A minimal Linux install shouldn’t be 1GB…

We used to boot Linux from a 1.44MB floppy disk. A modern Grub installation would require a sizable stack of floppies! (Grub and Windows 3.0 are similar in size!)

procaryote

> Want a simple web server ? Well, you’re going to get something with a JSON parser, PAM authentication, SSL, QUIC, websockets, an async framework, database for https auth, etc.

Simple means different things for different people it seems. For a simple web server you need a tcp socket.
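To illustrate how little the "just a tcp socket" reading of "simple" involves, here is a minimal sketch in Python using only the standard `socket` and `threading` modules. It is not a usable web server: it answers a single request with a fixed body and ignores the request line, methods, paths, errors, and concurrency. The function names are made up for this example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read the request head, and send a fixed reply."""
    conn, _addr = listener.accept()
    with conn:
        request = b""
        # Read until the blank line that ends the HTTP request headers.
        while b"\r\n\r\n" not in request:
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        body = b"hello\n"
        head = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)


def fetch(port: int) -> bytes:
    """Tiny client to exercise the server: send a GET and read the full reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        reply = b""
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk
        return reply


def demo() -> bytes:
    """Serve exactly one request on a loopback port and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        reply = fetch(port)
        server.join()
        return reply


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

Everything a "full featured high performance" server adds (TLS, keep-alive, routing, back-pressure, and so on) sits on top of this loop, which is where the simplicity goes away.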

If you want a full featured high performance web server, it's not gonna be simple.

udev4096

Alpine's biggest hurdle is musl. Most software still relies on glibc. You should look into unikernels [0]; it's the most slimmed-down version of Linux that you can ship. I am not sure how different a unikernel is from a distroless image tho

[0] - https://unikraft.org/

anacrolix

Alpine is not as good as it seems. It's mostly broken; it just works when you ask it to run a handful of common tools. Everything out of view is completely broken.

actionfromafar

I think we lost something with static linking when going from C to Dotnet. (And I guess Java.) Many C (and C++, especially "header only") libraries when statically linked are pretty good at filtering out unused code.

Bundling stuff in Dotnet is done much more at "runtime", often both by design of the library (it uses introspection¹) and the tools².

1: Simplified argument - one can use introspection and not expect all of the library to be there, but it's trickier.

2: Even when generating a self contained EXE, the standard toolchain performs no end-linking of the program, it just bundles everything up in one file.

anacrolix

I disagree. Most people here myself included aren't using Java or .NET. You are in a microcosm in this audience.

neonsunset

> I think we lost something with static linking when going from C to Dotnet. (And I guess Java.) Many C (and C++, especially "header only") libraries when statically linked are pretty good at filtering out unused code.

This is an interesting statement because, for example, in the C version of Mimalloc you end up paying for opt-in assertions because they still exist in the code unless you compile a different version that strips them away. In the C# port, you can set the same assertions/checks early with an AppContext switch, and then the values will be cached in static readonly fields. Then, when the JIT recompiles the code to a more optimized version, these values will become JIT constants, leading to all the unreachable code being optimized away completely (and to much better inlining of the now streamlined methods).

> Even when generating a self contained EXE, the standard toolchain performs no end-linking of the program, it just bundles everything up in one file.

  /p:PublishTrimmed=true

or even

  /p:PublishAot=true # please note it's better to set it as a project property, but either way it requires non-optional linking

Lastly, consider that JITing the bytecode essentially acts as if everything is a single, statically-linked compilation unit, since it's not subject to the inconvenient compilation unit restrictions even Rust is subject to, the problems of which need to be cleaned up with link-time optimization.

kant2002

I think you overestimate the ability of Dotnet to trim unused things. As a person who has spent a lot of time wandering across the ecosystem and measuring what can be done, I would say we have very bulky and complicated libraries in .NET.

Just bringing in HttpClient (without SSL support) adds 6 MB of generated code.

Minimal API gets you an additional 21 MB. And we are not even talking about desktop applications here.

Reflection is at the very core of the .NET ecosystem, and you cannot reliably trim with how we currently use it.

_fat_santa

> At the distro level, it is ridiculous that so long after Alpine Linux, the chasm between them and Debian/RHEL remains. A minimal Linux install shouldn’t be 1GB…

I would say this is a feature and not a bug. Alpine Linux is largely designed to be run in containerized environments so you can have an extremely small footprint cause you don't have to ship stuff like a desktop or really anything beyond the very very basics.

Compare that to Ubuntu, where the 5GB download is the "Desktop" variant that comes with much more software.

michaelmrose

>A minimal Linux install shouldn’t be 1GB

Why not? This seems pretty arbitrary. Seemingly developer time or functionality would suffer to achieve this goal. To what end?

Who cares how many floppies Grub would require when it's actually running on a 2TB SSD? The actually simpler thing, instead of duplicating effort, is to boot into Linux and use Linux to show the boot menu, then kexec into the actual kernel or set it to boot next. See zfsbootmenu and "no more boot loader": this is simpler and less bloated, but it doesn't use less space.

spacerzasp

There is more to size than storage space. Larger applications take more memory, more cpu caches; things spill over to normal memory, latencies grow and everything runs much slower

michaelmrose

For practical purposes, given more than enough RAM and fast storage, there is no meaningful user-discernible performance difference between a 500MB OS and a 30GB OS.

Whereas very small Linux distros are useful in several areas like containers and limited hardware, running such on the desktop is an objectively worse experience and is more a minimalism fetish than a useful strategy.

jongjong

In my last job, just to run the software on my local machine, I had to launch 6 different microservices running in a containerized, Linux virtualized environment on Windows and had to launch them in a particular order and had to keep each one in a separate console for debugging purposes. It took about 20 minutes to launch the software to be able to test it locally. The launch couldn't be automated easily because each service was using a mix of containers and plain Node.js servers with different versions and it was Windows so I would probably have to write some unfamiliar code for Windows to automate opening all the necessary git bash tabs...

The services usually persisted except for automatic updates so I only had to restart all the services a few times per week so it didn't make sense to invest time to automate.

n_ary

At the risk of sounding very naïve and making huge guesses, what you describe seems to be what docker-compose solves: a special order of services, launching several containers at once. However, I have seen my fair share of oddities in the trenches, where containers are an evolution of virtual machines (Vagrant) running everything in one VM, now split out into containers without adapting to how containers work, because the new tech lead thought VMs were uncool and everything must be Docker now.
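For anyone wondering what "special order of services" boils down to: it is a topological sort over a dependency map, which is roughly the information Compose takes through `depends_on`. A minimal sketch, assuming six made-up services (the names and their dependencies are hypothetical, not the setup described above):

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical service names and dependencies, for illustration only.
# Each key maps to the services that must already be running before it starts.
DEPENDS_ON = {
    "database": [],
    "auth": ["database"],
    "api": ["database", "auth"],
    "worker": ["api"],
    "gateway": ["api", "auth"],
    "frontend": ["gateway"],
}


def launch_order(depends_on):
    """Return a start order in which every service follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for position, name in enumerate(launch_order(DEPENDS_ON), start=1):
        print(f"{position}. start {name}")
```

A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is arguably the more useful result: it tells you the "particular order" never really existed.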

jongjong

We do use docker compose (thank god) but I also need to run a server from source for most of the microservices in order to modify and debug the code. There are around 20 something containers in practice, 6 pods/services. All interdependent and necessary to run the product (it's a legacy codebase 10+ years old, I joined less than 1 year ago and had nothing to do with architecture decisions). Most features touch on at least 3 to 4 repos/microservices all impossible to decouple. The problem is really opening and launching code across 6 bash consoles some of which require an additional manual authentication step with various cloud providers. I need the ability to restart some independently after making code changes. It's just a very complicated system.

I'm sure the launch can be fully automated but it's kind of at the edge of not worth automating because of how relatively infrequently I need to restart everything... Also the CEO doesn't like to make time for work which doesn't yield visible features for end users.

I actually handed my resignation a month ago, without another job lined up. It became too much haha. Good practice though. Very stressful/annoying.

branko_d

I remember, at the turn of the century (was it 2001?), when Microsoft was touting "weak coupling" achievable through "web services" and demoing the support for SOAP in Visual Studio.

To me, that was the strangest idea - how could you "decouple" one service from another if it needs to know what to call, and what information to pass and in what format? Distributing the computing - for performance, or redundancy or security or organizational reasons - that I can understand - but "weak coupling" just never made sense to me.

codr7

Yep, one of the minor details the micro service fan club don't talk about much.

Firing up the whole mess and debugging one or two of them locally is always a major pain, and god help you if you have no idea which services to stub and which to debug.

auszeph

Something I've felt is missing is a developer orchestration layer that makes it really easy to define the set of services like a docker-compose but just as easy to switch implementations between container, source, or remote.

Sometimes you need them all from source to debug across the stack, when you don't you might need a local container to avoid pollution from a test env, sometimes it is just fine to port-forward to a test env and save yourself the local resources.
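A rough sketch of what I mean, assuming the source services are Node.js apps started with `npm start` and the test env is a Kubernetes cluster reachable through `kubectl` (both are assumptions, and the service name, image, and paths are made up). It only builds the command for the chosen mode; running it is left to the caller:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """One service and the three places it can run from (all values hypothetical)."""
    name: str
    port: int
    image: str       # container image, used in "container" mode
    source_dir: str  # local checkout, used in "source" mode
    remote: str      # kubectl resource in the test env, used in "remote" mode


def start_command(service, mode):
    """Build the command that makes the service reachable on localhost:<port>."""
    port_map = f"{service.port}:{service.port}"
    if mode == "container":
        return ["docker", "run", "--rm", "-p", port_map, service.image]
    if mode == "source":
        # Assumes a Node.js service with a "start" script in package.json.
        return ["npm", "--prefix", service.source_dir, "start"]
    if mode == "remote":
        return ["kubectl", "port-forward", service.remote, port_map]
    raise ValueError(f"unknown mode for {service.name}: {mode}")


if __name__ == "__main__":
    billing = Service("billing", 8080, "example/billing:latest",
                      "./services/billing", "svc/billing")
    for mode in ("container", "source", "remote"):
        print(mode, "->", " ".join(start_command(billing, mode)))
```

The point is that the mode is one field per service, so switching billing from remote to source while everything else stays in containers is a one-line change instead of a new set of consoles.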

vjvjvjvjghv

I had a discussion with team members and we agreed that we will make our next systems fully deployable with one script or installer. It requires a little more thought and discipline but will result in much cleaner architecture and will also document itself this way.

jongjong

Completely worth it IMO. My philosophy nowadays (on my side projects) is to make every software feel like a complete product that you can run out of the box, batteries included... I also try to support older engine versions to avoid setup issues.

If you take care of the developer, the project looks after itself.

bee_rider

I like that they are containerized microservices, but you have to launch them in a particular order. Hahaha. What a nightmare. Congrats on it being a former job. Move on to better things? Well, unemployment would be preferable.

liendolucas

Try CUDA in a Docker environment. Yesterday it took all day long to download an Ubuntu image (5.27GB) and its Python dependencies (another few GB) to install PyTorch. I've probably wasted 10GB of bandwidth just to have the environment up and running. Fortunately, in the meantime I wrote 90% of what I needed to do. Oh, I forgot that I still need to download a couple of Hugging Face models. Nice.

zelphirkalt

Was Windows a requirement or your own choice? Asking because I have seen people unwilling to switch to a GNU/Linux VM or boot into GNU/Linux and then forever struggling with their setup, while other people on the team used GNU/Linux or macOS and didn't have nearly as many problems.

jongjong

Requirement. Had to use Azure too. I use Linux at home.

ronbenton

>Even companies with near-infinite resources (like Apple and Google) made trivial “worst practice” security mistakes that put their customers in danger. Yet we continue to rely on all these products.

I am at a big tech company and have seen some wildly insecure code make it into the codebase. I will forever maintain that we should consider checking if candidates actually understand software engineering rather than spending 4 or 5 hours seeing if they can solve brainteasers.

spooky_action

How do you propose we do this?

udev4096

Look at their code, from projects or any open source contributions. Ask how they intend to write secure code, rather than asking a bunch of useless algorithmic problems

shakna

When tech reports a library as insecure, but it takes a year to approve removal, much of the difficulty doesn't lie at the coder level of the corporation's infrastructure.

bob1029

When it comes to building software for money, I prefer to put all of my eggs into one really big basket.

The fewer 3rd parties you involve in your product, the more likely you will encounter a comprehensive resolution to whatever vulnerability as soon as a response is mounted. If it takes 40+ vendors to get pixels to your customers eyeballs, the chances of a comprehensive resolution rocket toward zero.

If every component is essential, does it matter that we have diversified the vendor base? Break one thing and nothing works. There is no gradient or portfolio of options. It is crystalline in every instance I've ever encountered.

joseda-hg

So Microsoft's everything and the kitchen sink approach?

boznz

Yet if you deliver a system without a modern bloated framework or a massive cloud stack and you are "old fashioned" and "out of touch" - been there done that, got the tee-shirt.

al_borland

Being mandated to throw away simple and stable code in favor of the “new platform” that changes every 18 months has been one of the most frustrating experiences of my working life and turned me into a bit of a nihilist (in a work context).

PaulHoule

Personally I see Docker as a problem more than a solution.

Back when I had slow ADSL (like 2 Mbps) I couldn't use Docker at all at home because the repository server had low timeouts. I was downloading 20GB games with Steam not to mention Freebase data dumps and other things that large because I had reliable tools to do the downloads, which Docker didn't use so downloading 5GB of images was not "wait for it" but rather "you can't do it."
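To put numbers on that, here is the back-of-the-envelope arithmetic, assuming the line actually sustains its nominal 2 Mbps and ignoring protocol overhead (so the real figures would be worse):

```python
def download_hours(size_gb, link_mbps):
    """Hours needed to move size_gb gigabytes over a link_mbps link, no overhead."""
    megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = megabits / link_mbps
    return seconds / 3600


if __name__ == "__main__":
    print(round(download_hours(5, 2), 1))   # 5 GB of images  -> 5.6
    print(round(download_hours(20, 2), 1))  # a 20 GB game    -> 22.2
```

A transfer that runs for five and a half hours has to survive stalls and resume where it left off; with a short server-side timeout and no resume, it never finishes.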

By accelerating the rate at which you can attach random dependencies you can run into problems because you are using 6 different versions of libc for Christ's sake. Rather than getting Python from some reputable source like conda or deadsnakes, Docker gives data scientists superpowers to get Pythons with random strange build options and character encodings. A 20 megabyte patch requires 2 GB of disk IO once it goes through the Docker IO multiplier. A 5 minute build becomes a 20 minute build. Docker is fast from the viewpoint of "ops" but is slow from the viewpoint of "dev"; where people use Docker they are always taking forever to do the simplest things and facing extreme burnout.

moralestapia

Docker sandboxes execution so it kind of helps as well?

PaulHoule

Back in 2004 I was regularly setting up Apache and IIS servers with 80 or more applications running on them simply by being systematic about how they were configured. In 2014 somebody wants to sell it back to me with 10x the disk I/O and a lot more that can go wrong, no thanks!

There are some places where people really want to run 8 versions of Java and 3 versions of PHP and think it's going to make them productive that they can write 15 microservices in 15 different languages... It's a delusion. If you get the purposeless variation in your system under control, you are in control and have a huge competitive advantage over 10x larger teams who use tools that let them barrel on without being in control.

dang

Discussed at the time:

A 2024 plea for lean software - https://news.ycombinator.com/item?id=39315585 - Feb 2024 (240 comments)

al_borland

A big issue is the speed at which teams are expected to deliver. If every sprint is expected to deliver value to the user, there isn’t enough slack in the system to go back and prune the code to remove cruft. People end up cutting corners to meet deadlines set by management. The corners that get cut are the things that are invisible in the demo. Security, documentation, and all the chewing gum holding it all together.

BLKNSLVR

And once a level of "story points" is achieved within a Sprint you can't go backwards and you can't deliver less value to the Customer. There is no room for re-evaluation. Forwards, moar!

As per Tame Impala's Elephant:

He pulled the mirrors off his Cadillac

Because he doesn't like it looking like he looks back

Looking back gives the impression of missteps or regret. We have no such thing!

JackSlateur

This is why cruft removal is linked to the value delivered to the user

You do not say: "there are two tasks: add some feature, takes 1 day, and delete some cruft, takes 1 day".

You say: "Yes, that feature. That's one task. It will take 2 days."

chading

Scrum points are about engineering controllability, rather than performance. But that's a complexity most don't get.

JackSlateur

Exactly

And because it is based on nothing, you can just lie about it

ahmedaley

So we have been working on a solution to this problem for the past 5 years at university. We have just released one tool for containers (not the full thing for now) and we are about to release our tools for removing bloat in shared libraries. Our paper describing one of these tools won the best paper award at MLSys yesterday! https://mlsys.org/virtual/2025/poster/3238

If there are any adopters or anyone who would like to try our tools, please reach out! We would love to support you!

jmclnx

No argument from me, I also believe bloat is a very large problem.

A get off my lawn section :)

I remember when GUIs started becoming a thing, I dreaded the move from Text to GUIs due to complexity. I also remember most programs I wrote when I started on minis were 64k code and 64k text. They were rather powerful even by today's standards, they did one thing and people had to learn which one to use to perform a task.

Now we have all in one where in some cases you need to page through endless menus or buttons to find an obscure function. In some cases you just give up looking and move on. Progress I guess.

zelphirkalt

There is still a fundamental difference between the move from a text interface to a GUI on one hand and the bloat so many people add these days on the other. A GUI is an entirely different paradigm of usage, while the bloat of today can often be replaced with a little code while retaining the same functionality.

rjsw

My first GUI applications used GEM, they were compiled to 8086 small model so the same 64k code and 64k data, didn't get close to running out of address space.