A deep dive into Debian 13 /tmp: What's new, and what to do if you don't like it
135 comments
August 29, 2025
fh973
Swap on servers somewhat defeats the purpose of ECC memory: your program state is now subject to a complex I/O path that is not end-to-end checksum protected. You also get unpredictable performance.
So typically: swap off on servers. Do they have a server story?
abrookewood
That's a really good point that had never occurred to me.
Edit: I think that using ZFS for your /tmp would solve this. You get error-corrected memory writing to a checksummed file system.
yjftsjthsd-h
ZFS /tmp is probably fine, but swap on ZFS on Linux is dicey AIUI; there's an unfortunate possibility of deadlock https://github.com/openzfs/zfs/issues/7734
abrookewood
Ah, thanks for pointing that out - wasn't aware.
cromka
So maybe another filesystem with heavy checksums could be used? Btrfs or dm-crypt with integrity over ext4?
m463
I can see it now: pro ECC SATA and M.2 SSDs
justsomehnguy
Well, SATA does have a basic CRC, and you would see an increase in CRC transfer errors in SMART if the path (usually the cables) isn't good.
blueflow
First, having no swap means anonymous pages cannot be evicted, so named pages must be evicted instead.
Second, the binaries of your processes are mapped in as named pages (because they come from the ELF file).
Named pages are generally not counted as "used" memory because they can be evicted and reclaimed, but if you have a service with a 150MB binary running, those 150MB of seemingly "free" memory are absolutely crucial for performance.
Running out of this 150MB of disk cache will result in the machine using up all its I/O capacity re-fetching the ELF from disk, and it will likely become unresponsive. Having swap significantly delays this lock-up by allowing anonymous pages to be evicted, so the same memory pressure causes fewer stalls.
So until the OOM management on Linux gets fixed, you need swap.
Scaevolus
Swapping anonymous pages can bring the system to a crawl too. High memory pressure makes things very slow with swap, while with swap off, high memory pressure is likely to invoke the OOM killer and let the system violently repair itself.
blueflow
The "bug" with the OOM killer that i implied is that what you describe does not happen. Which is not surprising because disk cache thrashing is normal mode of operation for serving big files to the network. An OOM killer acting on that alone would be problematic, but without swap, that's where the slowdown will happen for other workloads, too.
Its less a bug but an understood problem, and there aren't any good solutions around yet.
goodpoint
That's not how swap is meant to be used on servers.
dooglius
The purpose of ECC has nothing to do with being "end-to-end". A typical CPU path to/from DRAM will not be end-to-end either, since caches will use different encodings. This is generally considered fine since each I/O segment has error detection in one form or another, both in the CPU-to-memory case and the memory-to-disk case. ECC in general is not like cryptographic authentication where it protects against any possible alteration; it's probabilistic in nature against the most common failure modes.
nrdvana
The third mitigating feature the article forgot to mention is that tmpfs can get paged out to the swap partition. If you drop a large file there and forget it, it will all end up in the swap partition if applications are demanding more memory.
m463
what swap partition?
I meant this sort of jokingly. I think I have a few Linux systems that were never configured with swap partitions or swap files.
edoceo
I'm with you. I don't swap. Processes die. OOM. Linux can recover and not lose data. Just unavailable for a moment.
marginalia_nu
The Linux OOM killer is kinda sketchy to rely on. It likes to freeze up your system for long periods of time as it works out how to resolve the issue. Then it starts killing random PIDs to try to reclaim RAM, like system-wide Russian roulette.
It's especially janky when you don't have swap. I've found that adding a small swap file of ~500 MB makes it work much better; even for systems with half a terabyte of RAM this helps reduce the freezing issues.
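If anyone wants to try it, a minimal sketch of adding a small swap file (size and path are just examples):

```
# Create and enable a 512 MB swap file.
fallocate -l 512M /swapfile     # on btrfs or older kernels use dd instead:
                                #   dd if=/dev/zero of=/swapfile bs=1M count=512
chmod 600 /swapfile             # swap must not be world-readable
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab   # make it persistent
```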
Balinares
Swapping still occurs regardless. If there is no swap space, the kernel evicts code pages instead. So, running programs. Those code pages then need to be read back from disk when the corresponding process is next scheduled and needs them.
This is not very efficient, and is why a bit of actual swap space is generally recommended.
TiredOfLife
Using Desktop Mode on the Steam Deck before they increased the swap was fun. Launch a game, everything freezes, go for an hour-long walk, see that the game has finally been killed, make and drink coffee while the system becomes usable again.
guappa
Fedora did this long before Debian. I remember wget-ing an .iso file into /tmp and my entire Wayland session being killed by the OOM killer.
I still think it's a terrible idea.
nolist_policy
Use `/var/tmp` if you want a disk-backed tmp.
1718627440
I thought /var/tmp is for applications while /tmp is for the user.
buckle8017
Which is a great reason to have a big swap file now.
gnyman
Note though that if you don't have swap now, and enable it, you introduce the risk of thrashing [1]
If you have swap already it doesn't matter, but I've encountered enough thrashing that I now disable swap on almost all servers I work with.
It's rare, but when it happens the server usually becomes completely unresponsive, so you have to hard reset it. I'd rather have the application that's trying to use too much memory be killed by the OOM killer so I can SSH in and fix it.
[1] https://docs.redhat.com/en/documentation/red_hat_enterprise_...
mnw21cam
That's not true. Without swap, you already have the risk of thrashing. This is because Linux views all segments of code which your processes are running as clean and evictable from the cache, and therefore basically equivalent to swap, even when you have no swap. Under low-memory conditions, Linux will happily evict all clean pages, including the ones that the next process to be scheduled needs to execute from, causing thrashing. You can still get an unresponsive server under low memory conditions due to thrashing with no swap.
Setting swappiness to zero doesn't fix this. Disabling swap doesn't fix this. Disabling overcommit does fix this, but that might have unacceptable disadvantages if some of the processes you are running allocate much more RAM than they use. Installing earlyoom to prevent real low memory conditions does fix this, and is probably the best solution.
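For reference, setting up earlyoom on a Debian-family box is a one-liner or two; the tuning flags below are just an example, and I believe Debian keeps them in /etc/default/earlyoom:

```
apt install earlyoom
systemctl enable --now earlyoom
# Optional tuning, e.g. start killing when available RAM and free swap
# both drop below 5%:
#   EARLYOOM_ARGS="-m 5 -s 5"
```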
k_bx
Disabling swap on servers is the de facto standard for serious deployments.
The swap story needs a serious upgrade. I think /tmp in memory is a great idea, but I also think that particular /tmp needs swap support (ideally with compression, i.e. zswap), but not the main system.
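For reference, zswap can be tried without redesigning the swap layout; a sketch (parameter values are just examples):

```
# Enable zswap at runtime (it still needs a regular swap device behind it;
# zstd only works if your kernel was built with that compressor).
echo 1    > /sys/module/zswap/parameters/enabled
echo zstd > /sys/module/zswap/parameters/compressor
echo 20   > /sys/module/zswap/parameters/max_pool_percent
# Or persistently, via the kernel command line:
#   zswap.enabled=1 zswap.compressor=zstd zswap.max_pool_percent=20
```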
baq
This is why I’m running with overcommit 2 and a different ratio per server purpose.
…though I’m not sure why we have to think about this in 2025 at all.
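For reference, the knobs in question (the ratio is a per-workload example, not a recommendation):

```
# Mode 2 = no overcommit: the commit limit becomes swap + overcommit_ratio% of RAM.
sysctl -w vm.overcommit_memory=2
sysctl -w vm.overcommit_ratio=80
# Persist in /etc/sysctl.d/90-overcommit.conf:
#   vm.overcommit_memory = 2
#   vm.overcommit_ratio = 80
```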
computatrum
The mentioned periodic cleanup of tmp files is not enabled out of the box in the case of an upgrade from a previous Debian version, see https://www.debian.org/releases/trixie/release-notes/issues.... .
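The mechanism behind the cleanup is systemd-tmpfiles; the upstream rules look roughly like this (check the release notes for the exact opt-in steps Debian wants, this is only a sketch of the format):

```
# tmpfiles.d format: Type Path Mode UID GID Age
# "Age" means entries not used for that long are removed by the cleanup timer.
q /tmp      1777 root root 10d
q /var/tmp  1777 root root 30d
```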
GCUMstlyHarmls
Actually quite handy and practical to know about, specifically in the context of a "low end box" where I personally would prefer that RAM exist for my applications and am totally fine with `/tmp` tasks being a bit slow (let's be real, the whole box is "slow" anyway, and slow here is some factor of "VM block device on an SSD" rather than 1990s spinning rust).
greatgib
I'm surprised to discover that tmpfs wasn't already being used for /tmp a long time ago; that change is nice.
But the auto-cleanup feature looks awful to me. Be it desktops or servers, machines with uptimes of more than a year, I never saw the case of /tmp being filled just by forgotten garbage. Only sometimes filled by unzipping a too-big file or something like that, but that is noticed on the spot.
It used to be the place where you could store a cache or other things like that that would hold until the next reboot. It seems so arbitrary, and a source of random unexpected bugs, to have files there automatically deleted after some arbitrary time.
I don't know where this feature comes from, but when stupid risky things like this are coming, I would easily bet that it is again a systemd "I know best what is good for you" broken feature shoved down our throats...
And if it's coming from systemd, expect that one day it will accidentally delete important files from you, something like following symlinks to your home dir or your NVMe EFI partition...
mrweasel
> I never saw the case of tmp being filled just by forgotten garbage.
It might have more to do with the type of developers I've worked with, but it happens all the time. Monitoring complains, you go in to check, and there it is: gigabytes of junk dumped there by shitty software or scripts that can't clean up after themselves.
The issue is that you don't always know what's safe to delete if you're the operations person and not the developer. Periodically auto-cleaning /tmp is going to break stuff, and it will be easier to demand that the operations team disable auto-cleanup than to get the issue fixed in the developers' next sprint.
snmx999
Autocleaning: get the last accessed time from a file and only auto-clean files not accessed in the last n hours, e.g. 24 hours? Should be reasonably safe.
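Something close to that can be done by hand with find (illustrative only; note that atime behaviour depends on mount options like relatime/noatime):

```
# Delete regular files under /tmp not accessed for more than 24 hours.
# (-xdev stays on one filesystem; -amin counts minutes since last access.)
find /tmp -xdev -type f -amin +1440 -delete
```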
brainzap
devs throw everything into tmp, so it also accumulates a lot of privacy-sensitive data
loa_in_
I tried out variations on this on my daily driver setups. The design choices here were likely threefold:
Store /tmp in memory (tmpfs): volatile, but limited to free RAM or swap, and swap writes to disk anyway.
Store /tmp on a dedicated volume: since we're going to write to disk anyway, make it a lightweight special-purpose file system that's committed to disk.
On-disk /tmp but cleaned up periodically: additional settings for the cleanup: how often, what should stay, should file lifetime be tied to machine reboot? The answers to these questions vary more between applications than between filesystems, therefore it's more flexible to leave cleanup to userspace.
In the end my main concern turned out to be that I lost files that I didn't want to lose, either to reboot cleanup, on timer cleanup, etc. I opted to clean up my temp files manually as needed.
hnlmorg
If you've got swap set up then stale files will get written out to disk, so at least they're not occupying RAM indefinitely just because they're stored on tmpfs.
It’s still not an ideal solution though.
PhilipRoman
I agree about the auto cleanup. I discovered it a few days after starting to use /tmp as a ramdisk for a Yocto build. Lost a few patches, but nothing significant.
palmfacehn
If I am satisfied with my disk speed, why would I want to use system memory? What are the specific use cases where this is warranted?
margalabargala
Computers like a Raspberry Pi, where the OS is on an SD card, will hugely benefit.
jauntywundrkind
Yup. There's lots of advice out there about how to reduce write cycles and increase the lifetime of SD cards. This post has a bunch of ideas, and tmpfs is definitely on the list. https://raspberrypi.stackexchange.com/a/186/32611
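A typical fstab line for that, with a size cap so one runaway download can't eat all the Pi's RAM (the size is just an example):

```
# /etc/fstab
tmpfs  /tmp  tmpfs  defaults,noatime,nosuid,nodev,size=256M,mode=1777  0  0
```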
techjamie
Technically it'll have some impact on the number of write cycles your disk goes through, and marginally reduce the level of wear.
Most disks have enough write cycles available that you'll be fine anyway, but it's a tiny benefit.
1vuio0pswjnm7
I haven't used a non-tmpfs (disk-based) /tmp in over 15 years.
Didn't need it on NetBSD; memory could go to zero and the system would (thrash but) not crash. When I switched to Linux the OOM issue was a shock at first, but I learned to avoid it.
I use small-form-factor computers, with the userland mounted in and running from memory, no swap; I only use long-term storage for non-temporary data.
https://www.kingston.com/unitedkingdom/en/blog/pc-performanc...
worthless-trash
I'm still a fan of polyinstantiated /tmp and PrivateTmp (systemd). This may confuse/annoy admins who are not aware of namespaces, but I know that it definitely closes off the attack vector of /tmp abuse by bad actors.
https://www.redhat.com/en/blog/polyinstantiating-tmp-and-var...
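For anyone who hasn't tried it, enabling it for a single service is just a drop-in (the service name here is made up):

```
# systemctl edit example.service  ->  /etc/systemd/system/example.service.d/override.conf
[Service]
PrivateTmp=yes
# The service now gets its own private /tmp and /var/tmp,
# invisible to other processes and cleaned up when it stops.
```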
ars
Files in tmpfs will swap out if your system is under memory pressure.
If that happens, reading the file back is DRAMATICALLY slower than if you had just stored the file on disk in the first place.
This change is not going to speed things up for most users, it will slow them down. Instead of caching important files, you waste memory on useless temporary files. Then the system swaps them out to get the cache back, and then they're really slow to read back.
This change is a mistake.
saurik
Why would reading the data back from swap be slower at all -- much less "DRAMATICALLY" so -- than saving the data to disk and reading it back?
cwillu
Because swapping back in happens 4 kB at a time.
mnw21cam
It's also because a filesystem is much more likely to have consecutive parts of a file stored consecutively on disc, whereas swap is going to just randomly scatter 4kB blocks everywhere, so you'll be dealing with random access read speed instead of throughput read speed.
cycomanic
Why?
marginalia_nu
This doesn't really make sense. If /tmp was an on-disk directory the same memory pressure that caused swapping would just evict the file from the page cache instead, again leading to a cache miss and a dramatically slower read.
ars
Reading it back from a filesystem is much much faster than reading it back from swap.
imp0cat
Most systems probably aren't having problems with insufficient RAM nowadays though, are they? And this will reduce wear on your SSD.
Also, you can easily disable it: https://www.debian.org/releases/trixie/release-notes/issues....
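If I remember the release notes right, it boils down to something like this (double-check against the link above):

```
# Keep /tmp on disk instead of tmpfs:
systemctl mask tmp.mount
# Opt out of the periodic cleanup by overriding the packaged tmpfiles rule
# (a file of the same name in /etc/tmpfiles.d takes precedence over /usr/lib/tmpfiles.d):
ln -s /dev/null /etc/tmpfiles.d/tmp.conf
```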
magicalhippo
If you're running it in a VM you might not have all that luxurious RAM.
When my Linux VM starts swapping I have to either wait an hour or more to regain control, or just hard restart the VM.
imp0cat
Right, but if it's a VM, it's probably provisioned by something like Ansible/Terraform? If so, it's quite easy to add an init script that disables this feature, and you never have to worry about it again.
mhitza
What distro are you running? systemd-oomd kills processes a bit quicker than what came before (a couple minutes of a slow, stuttery system). Still too slow for a server you'd want to have back online as quickly as possible.
At least now when I run out of memory it kills processes that consume the most memory. A few years back it used to kill my desktop session instead!
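For reference, getting systemd-oomd going is quick (on Debian it ships as a separate package, if I remember right):

```
apt install systemd-oomd
systemctl enable --now systemd-oomd
oomctl dump    # show the memory-pressure state of the cgroups it watches
```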
ars
It's not about insufficient RAM, it's about reserving RAM for much more important things: cache.
This change puts the least important data in RAM (temp files) while evicting much more important cache data.
pixelesque
On small VPS systems with 512 MB or 1 GB you're more likely to notice (if /tmp is actually used by what's running on the system).
renewiltord
Why is there no write-through unionfs in Linux? Feels like a very useful tool to have. Does no one else need this? I have half a mind to write one with an NFS interface.
EDIT: Thank you, jaunty. But all of these are device-level. Even bcachefs was block-device level. It doesn't allow a union over a FUSE FS etc. It seems strange not to have it at the filesystem level.
jona-f
Do you mean that you can mark files for which the underlying filesystem is still used? As far as I remember there were experiments with that about 20 years ago, but it was decided that the added complexity wasn't worth it. The implementation that replaced all of that has been very stable (unlike the ones before) and I'm using it heavily, so I think they had a point. Some write-through behavior can be scripted on top of it.
EDIT: So, Wikipedia lists overlayfs and aufs as active projects, and unionfs predates both. Maybe unionfs v2 is what replaced all that? Maybe I'm hallucinating...
renewiltord
Overlayfs doesn't write through, and I believe unionfs and aufs no longer support write-through.
What I want is pretty much like how a write-through cache would work.
1. Write to the top-level FS? The write cascades down, but reads are fast immediately.
2. Data not available in the top-level FS? The read goes down to the bottom level and is then pulled up to the top, so future reads are fast.
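For contrast, a plain overlayfs mount (paths made up) only ever copies up on write; nothing is pushed back down to the lower layer:

```
# Writes land in upperdir (copy-up); lowerdir stays read-only from overlay's point of view.
mount -t overlay overlay \
    -o lowerdir=/mnt/base,upperdir=/mnt/upper,workdir=/mnt/work \
    /mnt/merged
```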
ComputerGuru
I feel like this is mixing agendas. Is the goal freeing up /tmp more regularly (so you don't inadvertently rely on it, to save space, etc.) or is the goal performance? I feel like with modern NVMe (or just SSD) the argument for tmpfs out of the box is a hard one to make, and if you're under special circumstances where it matters (e.g. you actually need RAM speeds or are running on an SD card or eMMC) then you would know to use a tmpfs yourself.
(Also, sorry but this article absolutely does not constitute a “deep dive” into anything.)
probably_wrong
The part that's more likely to bite people here and that's easily overlooked is that files in /var/tmp will survive a reboot but they'll still be automatically deleted after 30 days.