Why Flatpak apps use so much disk space on Linux

loloquwowndueo

“Storage is cheap” goes the saying. Other people’s storage has a cost of zero, so why not just fill it up with 100 copies of the same dependency?

These package formats (I’m looking at you snap as well) are disrespectful of users’ computers to the point of creating a problem where due to size, things take so long and bog the computer down so much, that the resource being used is no longer storage, but time (system and human time). And THAT is not cheap at all.

Don’t believe me, install a few dozen snaps, turn the computer off for a week, and watch in amazement as you turn it back on and see it brought to its knees as your computer and network are taxed to the max downloading and applying updates.

wtarreau

Not to mention the catastrophic security that comes with these systems. On a local ubuntu, I've had exactly 4 different versions of the sudo binary. One in the host OS and 3 in different snaps (some were the same but there were a total of 4 different). If they had a reason to be different, it's likely for bug fixes, but not all of them were updated, meaning that even after my main OS was updated, there were still 3 bogus binaries exposed to users and waiting for an exploit to happen. I find that this is the most shocking aspect of these systems (and I'm really not happy with the disrespect of my storage, like you mention).

yjftsjthsd-h

Why do snaps have sudo at all?

brlin2021

The sudo binaries in the snaps are likely to have their SUID bit stripped, so they won't cause any trouble even if they have known vulnerabilities.
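
One way to check this on your own machine is to look at the mode bits of each copy. A minimal Python sketch, assuming snaps are mounted under `/snap` (common on Ubuntu, but treat the path as an assumption); `has_setuid` and `find_named` are illustrative helper names, not part of any snap tooling:

```python
import os
import stat


def has_setuid(path):
    """Return True if the file at `path` has the set-user-ID bit set."""
    return bool(os.stat(path, follow_symlinks=False).st_mode & stat.S_ISUID)


def find_named(root, name):
    """Yield every regular (non-symlink) file called `name` below `root`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            candidate = os.path.join(dirpath, name)
            if os.path.isfile(candidate) and not os.path.islink(candidate):
                yield candidate


if __name__ == "__main__":
    # "/snap" is an assumed mount point; adjust for your system.
    for path in find_named("/snap", "sudo"):
        print(path, "setuid" if has_setuid(path) else "no setuid")
```

The bit on disk is only part of the story: a filesystem mounted with `nosuid` ignores it, so the mount options are worth checking too.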

codedokode

Snap/flatpak style application packaging is absolutely necessary. Please let me explain this.

Every platform needs a way to distribute and install third-party applications. It is unlikely that the authors of the platform wrote all existing software so you need some way to install applications not included into an OS.

On Windows, there is such a mechanism - pack your application into a large installer exe. It is awful and not secure but at least it has been working for ages. For comparison, on Linux there is no such mechanism at all. Typical Linux distribution traditionally has only two supported ways to install software:

a) install more components from OS repository

b) write and compile the software yourself

While there is a lot of third-party software for Linux, there is no universal distribution mechanism. Typically a software developer only supports the release they are using. There may be ports - by other developers or by the distribution maintainers - but they are often broken and don't even start (for example, the port of waydroid in Fedora).

Some port installation methods are outright dangerous, for example: running a curled script through sudo or adding a third-party repository to your package manager. That is a sure method to get a broken system, and it will be totally your fault because your distribution doesn't support such installation method so you got the result that was expected.

In my experience, Fedora, Ubuntu and Debian are especially bad for supporting applications other than Gimp or Libreoffice: the ports for these distributions are often broken, buggy, crash or even do not exist. For example: waydroid, carla, ardour.

The reason is that there are hundreds of releases of different distributions and nobody is going to port, test and maintain all applications for all of them. So the obvious solution is to choose a standard platform and write applications against this platform. I see no other solution. If I am writing some GUI application I am not interested in learning how Linux distributions are different from each other. I would rather ship an Alpine virtual machine image and not waste my time.

Flatpak/snap might become such a platform, but why are they so buggy? As I remember, Steam in flatpak is broken and cannot properly use the GPU, VS Code is broken, etc.

loloquwowndueo

No, this is an orthogonal concern. Bundling all the dependencies and killing users hard disks is not a solution to the “Every platform needs a way to distribute and install third-party applications” problem. If anything the proliferation of packaging formats and useless quibbles between snap and flatpak move us farther from the solution as there are now several competing dependency-bundling installers which require installing a package handler anyway.

In that respect flatpak is better because it’s more universally available than the Ubuntu-specific snaps (attempts at making snap cross-distro are laughable and failed) but it still introduces a bunch of tangential issues which are too much of a bother for the average user versus just doing “apt install” and moving on with their lives. PPAs cover this acceptably if suboptimally for Deb packages.

m4rtink

Snaps do zero deduplication and bundle everything AFAIK - flatpak at least does some deduplication on file level and has shared runtimes.

brlin2021

This statement is false as snaps also have shared runtimes known as "content snaps".

A common example is the ones with the gnome- prefix and the ones that end with -themes suffix.

loloquwowndueo

Wherein snaps found themselves reinventing shared libraries - at which point, what’s really the point.

neuroelectron

For a long time, storage was getting cheaper all the time but we've hit scaling walls in both CPUs and drives. I remember when I was a kid and bought Mechwarrior 2 a game that could use up to 500mb! The guy working the video game locker warned me "are you sure you have enough hard drive space?" after having just bought a 2gb drive for like $60, or something, I don't remember exactly. A concern that would have been valid maybe a year earlier.

dheera

To be fair, shared libraries have been problematic since the beginning of time.

In the Python world, something wants numpy>=2.5.13, another wants numpy<=2.5.12, yet Python has still not come up with a way to just do "import numpy==2.5.13" and have it pluck exactly that version and import it.

In the C++ world, I've seen code that spits out syntax errors if you use a newer version of gcc, others that spit out syntax errors if you use an older version of gcc, apt-get overwrites the shared library you depended on with a newer version, lots of other issues. Install CUDA 11.2, it tries to uninstall CUDA 11.1, never mind that you had something linked to it, and that everything else in that ecosystem disobeys semantics and doesn't work with later minor revisions.

It's such a shitshow that it fully makes sense to bundle all your dependencies if you want to ship something that "just works".

For your customer, storage is cheaper than employee time wasted getting something to work.

loloquwowndueo

Right but snaps don’t solve dependency hell (see content snaps which are shared library bundles).

o11c

That's what everybody uses `venv` for. Or `virtualenv` if you're stuck on old Python.

But as a rule, `<=` dependencies mean there's either a disastrous fault with the library, or else the caller is blatantly passing all the "do not enter" signs. `!=` dependencies by contrast are meaningful just to avoid a particular bug.

dheera

Venvs also suck because the user has to create and activate them.

It would be nice if a foo.py file could just deal with all of that.

A while ago I created this thing as a thought experiment. It likely doesn't work with more recent versions of Python though: https://news.ycombinator.com/item?id=24735303

It was partly in jest but I was not entirely un-serious; I think "dealing with dependencies" is quite possibly the biggest reason why Python isn't used to distribute end-user applications. It's a wonderful language but a really shitty import system. There are various attempts to make the experience better (pipx, poetry) but you still can't ship a .py file to someone, have them double click and run it. They still have to conda something, venv something, pipx something.

int_19h

Virtual environments don't solve the problem of two dependencies that you need having conflicting requirements.
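
To make that concrete, here is a toy sketch using the made-up numpy version numbers from earlier in the thread. It is not how pip's resolver works internally; `parse` and `satisfying` are hypothetical helpers that only handle plain dotted versions:

```python
def parse(version):
    """Turn '2.5.13' into (2, 5, 13) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfying(candidates, minimum=None, maximum=None):
    """Return the candidates within the inclusive [minimum, maximum] range."""
    low = parse(minimum) if minimum else None
    high = parse(maximum) if maximum else None
    return [
        v for v in candidates
        if (low is None or parse(v) >= low) and (high is None or parse(v) <= high)
    ]


# Hypothetical releases, reusing the version numbers quoted in this thread.
releases = ["2.5.11", "2.5.12", "2.5.13", "2.5.14"]

wants_new = set(satisfying(releases, minimum="2.5.13"))  # package A: >=2.5.13
wants_old = set(satisfying(releases, maximum="2.5.12"))  # package B: <=2.5.12

# One environment holds one copy of the library, so it needs a version in both sets.
print(sorted(wants_new & wants_old))  # prints []
```

A venv isolates one project from another, but inside a single venv there is still only one slot for each package, so an empty intersection cannot be installed.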

api

There are things like content defined chunking and content based lookup. Evidently that’s too hard.

XorNot

The problem on Linux is that hard links are exactly what you don't want.

If hard links from the get go were copy on write, then I suspect content defined storage would've become the standard because it would be easy.

Instead we have this construct which makes it hard and dangerous (hard links hide data dependencies) on most Linux filesystems and no good answers (even ZFS can't easily handle a cp --reflink operation, and the problem is it's not the default anyway).

api

Use either symbolic links (which is fine in most cases) or overlay filesystems. I agree that hard links are usually not what you want.

peter-m80

Flatpak deduplicates dependencies and shares runtimes between apps.
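
Flatpak builds on OSTree, which keeps files in a content-addressed object store. The idea can be sketched in a few lines of Python. This is a simplified illustration rather than OSTree's actual on-disk format (as far as I understand, the real checksums also cover file metadata); `file_digest` and `add_tree` are made-up names:

```python
import hashlib
import os
import shutil


def file_digest(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def add_tree(store, tree):
    """Copy each file in `tree` into `store` once per unique content.

    Returns a manifest mapping relative path -> digest.
    """
    os.makedirs(store, exist_ok=True)
    manifest = {}
    for dirpath, _dirs, files in os.walk(tree):
        for name in files:
            src = os.path.join(dirpath, name)
            digest = file_digest(src)
            obj = os.path.join(store, digest)
            if not os.path.exists(obj):  # identical content is stored once
                shutil.copyfile(src, obj)
            manifest[os.path.relpath(src, tree)] = digest
    return manifest
```

Two apps that ship a byte-identical library end up pointing at the same object, so the store only grows when the content actually differs. A library that differs by even one byte (a different build, a different version) is a separate object, which is where the extra disk usage comes from.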

musicnarcoman

"Storage is cheap" if you do not have to pay for it. It is not so cheap when you are the one paying for the organizations storage.

CoolCold

You have savings from not using Windows in such an org - likely your Linux will be free or a cheaper one.

INTPenis

I use flatpaks daily but not many apps. Because I've been on Atomic Linux for a couple years now flatpak has become part of my daily life.

On this work laptop I have three flatpaks, Signal, Chromium and Firefox. They all take 1.6GiB in total.

On my gaming PC I have Signal, Flatseal, Firefox, PrismLauncher, Fedora MediaWriter and Steam, and obviously they take over 700G because of the games in Steam, but if I count just the other flatpaks they're 2.2GiB.

So yeah, not great, but on the other hand I don't care because I love the packaging of cgroups based software and I don't need many of them. I mean my container images take up a lot more space than my flatpaks.

qbane

I hope articles like this can at least provide some hints when the size of a flatpak store grows without bound. It is definitely more involved than "it bundles everything like a node_modules directory hence..."

[Bug]: /var/lib/flatpak/repo/objects/ taking up 295GB of space: https://github.com/flatpak/flatpak/issues/5904

Why flatpak apps are so huge in size: https://forums.linuxmint.com/viewtopic.php?t=275123

Flatpak using much more storage space than installed packages: https://discussion.fedoraproject.org/t/flatpak-using-much-mo...

catlikesshrimp

Your comment probably took more effort for this article than the prompter of the AI that produced said article.

Conclusion: Thank you for the links

massysett

“The name "Flatpak" is even a nod to IKEA's flatpacking,”

Which is hilarious: an IKEA flat pack takes up less space than the finished product. Linux Flatpak is the exact opposite.

jasonpeacock

The article mentions that Flatpak is not suitable for servers because it uses desktop features.

Does anyone know what those features are or have more details?

Linux generally draws a thin line between server and desktop; having “desktop only” dependencies is unusual, unless it’s something like needing the KDE or GNOME GUI libraries?

mananaysiempre

This may refer to xdg-desktop-portal[1], which is usable without Flatpak, but Flatpak forces you to go through it to access anything outside the app’s private sandbox. In particular, access to user files is mediated through a powerbox (trusted file dialog) [2] provided by the desktop environment. In a sense, Flatpak apps are normal Linux apps to about the same extent that WinRT/UWP apps are normal Windows apps—close, but more limited, and you’re going to need significant porting in either direction.

(This has also made an otherwise nice music player[3] unusable to me other than by dragging and dropping individual files from the file manager, as all of my music lives in git-annex, and accesses through git-annex symlinks are indistinguishable from sandbox escape attempts. On one hand, understandable; on the other, again, the software is effectively useless because of this.)

[1] https://wiki.archlinux.org/title/XDG_Desktop_Portal

[2] https://wiki.c2.com/?PowerBox

[3] https://apps.gnome.org/Amberol

circularfoyers

> On one hand, understandable; on the other, again, the software is effectively useless because of this.

Just in case you didn't already know, you can use Flatseal[1] to add the symlinked paths outside of those in the default whitelisted paths.

I think it's a good thing Flatpak have followed a security permissions system similar to Android, as I think it's great for security, but I definitely think they need to make this process more integrated and user friendly.

[1] https://flathub.org/apps/com.github.tchx84.Flatseal

Vilian

I can change those permissions directly in the KDE settings, without the need to download Flatseal; other DEs need to implement their own.

Vilian

You can allow an application complete access to a folder or your home directory, use flatseal for that

LtWorf

AFAIK it cannot do CLI applications at all.

jeroenhd

It can, but because the Flatpak system depends on APIs like D-Bus, getting those to work in headless environments (SSH, framebuffer console, raw TTY) is a pain.

Flatpak will even helpfully link binaries you install to a directory you can add to your $PATH to make command line invocation easy.

ponorin

It assumes that you have a DE running and depends on features like D-Bus. So it's not designed to run headless except for building flatpak packages.

butz

If you are space-conscious, you should try to select Flatpak apps that use the same runtime (Freedesktop, GNOME or KDE), and make sure all of them use exactly the same version of that runtime. Correct me if I'm wrong, but only two versions of Flatpak runtimes are supported at a time - current and previous. So during the transition to a newer runtime, application upgrades do not all happen at once, and the user ends up with more than one (and sometimes more than two) runtimes installed. In addition to higher disk space usage, one must account for the usual updates too: the more programs and runtimes you have, the more updates there are to download. The good thing is that at least updates are partial.
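
To see which of your apps share a runtime, you can group them by runtime ref. A rough Python sketch: I believe the input can be produced with something like `flatpak list --app --columns=application,runtime`, but the tab-separated layout assumed here and the sample data below are assumptions for illustration, not output captured from a real system:

```python
from collections import defaultdict


def group_by_runtime(listing):
    """Map runtime ref -> sorted list of application IDs.

    `listing` is tab-separated text, one "application<TAB>runtime" per line.
    """
    groups = defaultdict(list)
    for line in listing.splitlines():
        if not line.strip():
            continue
        app, _, runtime = line.partition("\t")
        groups[runtime.strip()].append(app.strip())
    return {rt: sorted(apps) for rt, apps in groups.items()}


# Made-up sample data with hypothetical application IDs.
sample = (
    "org.example.Chat\torg.freedesktop.Platform/x86_64/23.08\n"
    "org.example.Browser\torg.freedesktop.Platform/x86_64/24.08\n"
    "org.example.Notes\torg.freedesktop.Platform/x86_64/24.08\n"
)

for runtime, apps in sorted(group_by_runtime(sample).items()):
    print(runtime, "->", ", ".join(apps))
```

A runtime with a single app hanging off it is the expensive case: that one app is keeping a whole extra runtime version installed.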

account-5

I can't really comment about snap since I don't use Ubuntu but I thought flatpaks would work similar to how portable apps on windows do. Clearly I'm wrong, but how is it that windows can have portable apps of a similar size to their installable versions and Linux cannot? I know I'm missing something fundamental here, like how people blame Linux for lack of hardware support without acknowledging that hardware vendors do the work for windows to work correctly.

Either way disk space is cheap and abundant now. If I need the latest version of something I will use flatpaks.

blahaj

Just a guess, but Windows executables probably depend on a bunch of Windows APIs that are guaranteed to be there, while Linux systems are much more modular and do not have a common, let alone stable, ABI in userspace. You can probably get small graphically capable binaries if you depend on Qt and just assume it to be present, but Flatpak precisely does not do that and bundles all the dependencies to be independent from shared dependencies of the OS outside of its control. The article also mentions that AppImages can be smaller, probably because they assume some common dependencies to be present.

And of course there are also tons of huge Windows software that come with all sorts of their own dependencies.

Edit: I think I somewhat misread your comment and progval is more spot on. On Linux you usually install software with a package manager that resolves dependencies and only installs the unsatisfied dependencies resulting in small install size for many cases while on Windows that is not really a thing and installers just package all the dependencies they cannot expect to be present and the portable version just does the same.

badsectoracula

The equivalent of "Windows portable apps" on Linux isn't flatpaks (these add a bunch of extra stuff and need some sort of support from the OS) but AppImages[0]. AppImages are still not 100% the same (and can never be as Windows applications can rely on A LOT more stuff to be there than Linux desktop apps) but functionally/UX-wise they're the closest: you download some program, chmod +x it and run it like any other binary you'd have on your PC.

Personally i vastly prefer AppImages to flatpaks (in fact i do not use flatpaks at all, i'd rather build the program from source - or not use it if the build process is too convoluted - instead).

[0] https://appimage.org/

codedokode

Looking at their architecture they seem to be a pain to run safely (sandboxed). For example, you cannot take away access to mount syscall due to them mounting themselves using FUSE.

Also are they easy to debug? Do they ship with debugging symbols? Googling around shows that it might be tricky.

kmeisthax

It's a matter of standardization and ABI stability. Linux itself promises an eternally stable syscall ABI, but everything else around it changes constantly. Windows is basically the opposite: no public syscall ABI, but you can always get a window on screen by linking user32.dll and poking it with the correct structures. As a result, Windows apps can assume more, while desktop Linux apps have to ship more.

codedokode

Linux is moving to Windows model, by shipping userspace libraries. For example, ALSA has a library, DRM has a library.

dismalaf

"Portable" apps on Windows just don't write into the registry or save state in a system directory. They can still assume every Windows DLL since the beginning of time will be there.

Versus Linux where you have Gnome vs. KDE vs. Other and there's less emphasis on backwards compatibility and more on minimalism, so they need to package a lot more dependencies (potentially).

If you only install Gnome Flatpaks they end up smaller since they can share a bunch of components.

progval

Installable versions of Windows apps still bundle most of the libraries like portable apps do, because Windows does not have a package manager to install them.

maccard

Windows does have a package manager and has for the last 5 years.

kbolino

Apart from the Microsoft Visual C++ Runtime, there's not much in the way of third-party dependencies that you as a developer would want to pull in from there. Winget is great for installing lots of self-contained software that you as an end user want to keep up to date. But it doesn't really provide a curated ecosystem of compatible dependencies in the way that the usual Linux distribution does.

keyringlight

Assuming you're talking about winget, that seems to operate either as an alternative CLI interface to the MS store with a separate database developers would need to add their manifests to, or to download and run normal installers in silent mode. For example if you do winget show "adobe acrobat reader (64-bit)" you can see what it will grab. It's a far cry from how most Linux package managers operate.

mjevans

Windows 2020 - Welcome to Linux 1999 where the distro has a package manager that has just about everything most users will ever need as options to install from the web.

wmf

Unfortunately a lot of Windows devs are targeting 10 year old versions.

int_19h

If you're targeting Windows, you can assume the following things to be present:

- the entirety of Win32 API

- all the Windows Runtime APIs

- .NET Framework 4.7+

This is a lot of functionality. For example, the list above includes four different widget toolkits alone (Win32, WinForms, WPF, WinRT XAML), several libraries to handle networking (including HTTP), USB, 2D and 3D graphics including text rendering, HTML renderer etc.

And all of this has a highly stable ABI, so long as you do everything by the book. COM/WinRT and .NET provide a stable ABI for high-level object-oriented APIs above and beyond what the basic C ABI can offer.

surajrmal

The real problem in Linux is the lack of a stable ABI for anything other than the kernel's UAPI. Perhaps one day we will standardize enough of the lower layers of Linux and provide real stability. There is definitely a line between packaging all of your dependencies and packaging none of them. We probably want something in the middle, or else perhaps more functionality which is currently provided via shared libraries should be moved behind stable IPC instead.

johnny22

> Clearly I'm wrong, but how is it that windows can have portable apps of a similar size to their installable versions and Linux cann

They can't depend on many APIs existing or being at the right version. Linux distros are made from a collection of various third party projects and distros just integrate those. Each of these third party projects has its own development speed and ABI and API stability policies.

Each distro also has its own development speed and release policy, which means they might have things that could either be too new or too old. Most distros try to avoid packaging multiple versions of the same project when they can avoid it to ease maintenance as well.

Heck, you can't even guarantee that you have the exact same libc. Most distros use glibc, but there are plenty of systems that use musl.

account-5

I'm replying to myself in reply to everyone who replied to me.

Thanks all for the explanations, much appreciated, I thought I was missing something. I really should have known though, I've been using portable apps for over 20 years on Windows and remember .NET apps not being considered portable way back when, which are now considered portable since the runtime is on all modern Windows.

wltr

That was so useless and the style was so bad, I’m pretty sure it was written with (if not by) LLMs. Not even sure if I’m disappointed finding this low effort content here, or rather not surprised at all. I wish the content here would be more interesting, but maybe I’d want to find some other community for that.

I mean, the comments are much more interesting than this piece of content, but the content itself is almost offending. At least the discussion is much more valuable than what I’ve just read by following that link.

ReptileMan

Why does it seem that we try to both avoid and poorly reinvent the static linker with every new technology and generation? Windows has been fighting with DLL hell for 30 years now. Linux seems unable to produce an alternative to DLL hell. Not sure how the OS X world is.

mcv

I don't know much about package managers, but instead of demanding every app uses the same version of every library, or including all libraries in every app, wouldn't it make more sense to allow different versions of libraries to exist next to each other, and every app simply picks the most up-to-date version that it supports?

I think that's how npm does it. Not that npm doesn't have its own dependency hell, but that's because different dependencies within the same application can end up requiring different versions of the same sub-dependency. But that's a problem only for developers.
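
The selection rule described here can be sketched in a few lines of Python. This is a toy model with invented version numbers, not how npm or any real package manager resolves ranges; `parse` and `pick` are hypothetical helpers:

```python
def parse(version):
    """Turn '1.9.2' into (1, 9, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick(installed, minimum, below):
    """Newest installed version v with minimum <= v < below, or None."""
    low, high = parse(minimum), parse(below)
    ok = [v for v in installed if low <= parse(v) < high]
    return max(ok, key=parse) if ok else None


# Versions of one library that sit side by side on the system (made up).
installed = ["1.4.0", "1.9.2", "2.0.1", "2.3.0"]

print(pick(installed, "1.0.0", "2.0.0"))  # an app that supports 1.x
print(pick(installed, "2.0.0", "3.0.0"))  # an app that supports 2.x
print(pick(installed, "3.0.0", "4.0.0"))  # nothing installed fits
```

The cost of this scheme is visible in `installed`: every version that some app still needs has to stay on disk.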

Vilian

...that's literally what flatpak does, and it deduplicates everything too

mcv

Is it? But if its only duplication is different versions of the same dependency, and it never has real duplication, then what exactly is the problem?

As I said, I don't know much about these things. I thought flatpak had the app and all its dependencies together in a single package.

compsciphd

I'll repeat myself, but this is because Docker (and all its descendants) didn't understand my work (or at least took the easy way out).

https://www.usenix.org/conference/usenix-atc-10/apiary-easy-...

I argued (and built years prior to docker), a container oriented file system infrastructure that combined the best of linux style package management and union file systems. Where instead of "packages", you had "layers" (analogous to packages) and an "image" that was just a set of layers. I imagined, instead of a linux distribution having an archive of installable packages, it would provide a mirror of usable layers (and PoCd this by converting a large enough set of Debian packages into layers to cover the applications needed for my PoC).

In such a world, you don't waste (directly at least) any additional space, as you are sharing the packages directly (and therefore the underlying files, which can also have memory benefits in terms of easier sharing of ro code pages, due to being the same page on disk).

You do still have a concept of version sprawl, as different images can be using different versions of the same package, but it's not very visible. Each image enumerates directly what "shared" components it's using. One could argue that just like upgrading a regular deb/rpm environment is relatively straightforward, "upgrading" (or in reality, creating a new image version from an existing version) in such a world is also easy. Just upgrade the shared layer versions in the image manifest/definition.

I was trying to create a world where you could upgrade the container easily (ex: move the running container's private RW layer to a new container on upgrade, or in a sense resolve the container's layers from version A to version B by swapping around the layers that have changed), but one might argue that today that isn't viewed as valuable, and I might agree. I was trying to demonstrate a system that supported what I called persistent and ephemeral containers, with persistent containers being what became called pets and ephemeral containers being what became called cattle and the world today wants everything to be cattle.
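
A toy model of the layers-as-packages idea as described in this comment (not the implementation from the linked paper) might look like the Python below. Layer names and sizes are invented for illustration:

```python
# Hypothetical layer store: layer name -> size in MB.
LAYER_SIZES = {
    "libc-2.31": 12,
    "libssl-1.1": 4,
    "libssl-3.0": 5,
    "python-3.9": 50,
    "webapp-1.0": 8,
    "worker-1.0": 6,
}

# An image is only a list of layer names; the layers themselves are shared.
images = {
    "webapp": ["libc-2.31", "libssl-1.1", "python-3.9", "webapp-1.0"],
    "worker": ["libc-2.31", "libssl-1.1", "python-3.9", "worker-1.0"],
}


def disk_usage(images, sizes):
    """Total size of the distinct layers referenced by any image."""
    used = set()
    for layers in images.values():
        used.update(layers)
    return sum(sizes[name] for name in used)


def upgrade(layers, old, new):
    """Return a new manifest with one shared layer swapped for another version."""
    return [new if name == old else name for name in layers]


print(disk_usage(images, LAYER_SIZES))  # shared layers are counted once

# Upgrading one image is just a manifest edit.
images["webapp"] = upgrade(images["webapp"], "libssl-1.1", "libssl-3.0")
print(disk_usage(images, LAYER_SIZES))  # grows only by the new libssl layer
```

With these numbers, bundling everything per image would cost 74 + 72 = 146 MB, while sharing layers costs 80 MB. After only `webapp` is upgraded, both libssl versions stay on disk (85 MB), which is the version sprawl mentioned above.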