BCacheFS is being disabled in the openSUSE kernels 6.17+

qalmakka

RIP BCacheFS. I was hopeful I could finally have a modern filesystem mainlined in Linux (I don't trust Btrfs anymore), but I guess I'll have to keep installing ZFS for the foreseeable future.

As I predicted, out of tree bcachefs is basically dead on arrival - everybody interested is already on ZFS, btrfs is still around only because ZFS can't be mainlined basically

StopDisinfo910

> btrfs is still around only because ZFS can't be mainlined basically

ZFS is extremely annoying in the way it handles expansion and in the fact that you can’t mix drive sizes. It’s not a panacea. There clearly is space for an improved design.
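To make the drive-size complaint concrete, here is a rough sketch of the arithmetic. It assumes a single RAID-Z vdev in which every member is treated as if it were the size of the smallest drive, and it ignores metadata, padding and reserved space. The function names are made up for illustration and are not part of any ZFS tooling.

```python
def raidz_usable_tb(drive_sizes_tb, parity=1):
    """Approximate usable capacity of one RAID-Z vdev (illustrative only).

    Assumption: each member contributes only as much space as the
    smallest drive, and `parity` drives' worth of space is redundancy.
    Filesystem overhead is ignored.
    """
    if len(drive_sizes_tb) <= parity:
        raise ValueError("need more drives than parity devices")
    smallest = min(drive_sizes_tb)
    return (len(drive_sizes_tb) - parity) * smallest


def stranded_tb(drive_sizes_tb):
    """Raw capacity left unused because of the smallest-drive limit."""
    smallest = min(drive_sizes_tb)
    return sum(size - smallest for size in drive_sizes_tb)


mixed = [4, 4, 8]  # drive sizes in TB
print(raidz_usable_tb(mixed))  # usable space under the assumption above
print(stranded_tb(mixed))      # raw space on the larger drive that goes unused
```

Under these assumptions, the extra capacity of the larger drive only becomes useful once every other drive in the vdev has been replaced with a drive at least as large.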

cyphar

This is being worked on (they call it AnyRaid), the work is being sponsored by HexOS[1].

[1]: https://hexos.com/blog/introducing-zfs-anyraid-sponsored-by-...

nubinetwork

Underprovision your disks, then you don't have to worry about those edge cases...

StopDisinfo910

If you need to consider how to buy your drives so you can use a filesystem, that’s a flaw of said filesystem not an edge case.

It clearly is an acceptable one for a lot of people but it does leave space for alternative designs.

sureglymop

I've never had any issues with either ZFS or Btrfs after 2020. I wonder what you all are doing to have such issues with them.

pantalaimon

Did you try RAID5?

pizza234

Just a few days ago I had a checksum mismatch on a RAID-1 setup, in the metadata on both devices, which is very confusing.

Over the last one or two years I've twice experienced a checksum mismatch on the file storing the memory of a VMware Workstation virtual machine.

Both are very likely bugs in Btrfs, and it's very unlikely that they were caused by the user (me).

In the relatively distant past (around 5 years ago), the system (with root on Btrfs) became unbootable for no obvious reason a couple of times.

jamesnorden

Ah yes, the famous "holding it wrong".

happymellon

I've also not had issues with BTRFS.

The question was about usage: without knowing people's use cases and configurations, nobody can work out why it is unusable for you while working fine for others.

metadat

I've experienced unrecoverable corruption with btrfs within the past 2 years.

motorest

> Ah yes, the famous "holding it wrong".

Is it wrong to ask how to reproduce an issue?

Ygg2

Wait. You don't trust Btrfs, but you would trust BCacheFS, which is obviously very experimental?

phire

Btrfs claims to be stable. IMO, it's not.

It's generally fine if you stay on the happy path. It will work for 99% of people. But if you fall off that happy path, bad things might happen and nobody is surprised. In my personal experience, nobody associated with the project seems to trust a btrfs filesystem that fell off the happy path, and they strongly recommend you delete it and start from scratch. I was horrified to discover that they don't trust fsck to actually fix a btrfs filesystem into a canonical state.

BCacheFS had the massive advantage that it knew it was experimental and embraced it. It took measures to keep data integrity despite the chaos, generally seems to be a better design and has a more trustworthy fsck.

It's not that I'd trust BCacheFS, it's still not quite there (even ignoring project management issues). But my trust for Btrfs is just so much lower.

ahartmetz

btrfs seems to be a wonky, ill-considered design with ten years of hotfixes. bcachefs seems to be a solid design that is (or has been, it's mostly done) regularly improved where trouble was found. Now it's just fixing basically little coding oversights. In two years, I will trust bcachefs to be a much more reliable filesystem than btrfs.

rurban

Still more stable than btrfs. btrfs is also dead slow

Iridiumkoivu

I agree with this sentiment.

Btrfs has destroyed itself on my testing/lab machines three times during the last two years, up to the point where recovery wasn’t possible. Metadata corruption was the main issue (or that’s how it looks to me at least).

As of now I trust BCacheFS way more. I’ve given it roughly the same time to prove itself as Btrfs too. BCacheFS has issues but so far I’ve managed to resolve them without major data loss.

Please note that I currently use ext4 in all ”really important” desktop/laptop installations and OpenZFS in my server. Performance being the main concern for desktop and reliability for server.

kiney

btrfs has many technical advantages over zfs

debazel

Yes, like destroying itself and losing all data.

natebc

ZFS is perfectly capable of this too.

source: worked as a support engineer for a block storage company, witnessed hundreds of customers blowing one or both of their feet off with ZFS.

crest

Can you give an example? To me it has always looked like an NIH copycat filesystem.

the_duke

This is a tragedy, bcachefs has so many great features...

motorest

Ultimately that's the right call, and the inevitable one as well.

jtickle

All of the "btrfs eats your data" bugs have been fixed, and the people who constantly repeat them are people who relied on an experimental filesystem for files they didn't want to lose. FUD all around. I have a btrfs on my home file server that's been running just fine for almost 10 years now and has survived the mechanical death of the initial underlying hard drives. Since then I have used it in plenty of production environments.

Don't do RAID 5. Just don't. That's not just a btrfs shortcoming. I lost a hardware RAID 5 due to "puncture", which would have been fascinating to learn about if it hadn't happened to a production database. It's an academically interesting concept, but it is too dangerous, especially with how large drives are now. If you're buying three, buy four instead. RAID 10 is much safer, especially for software RAID.

Stop parroting lies about btrfs. Since it became marked stable, it has been a reliable, trustworthy, performant filesystem.

But as much as I trust it, I also have backups, because if you love your data, it's your own fault if you don't back it up and regularly verify the backups.

plqbfbv

> All of the "btrfs eats your data" bugs have been fixed ... I have a btrfs on my home file server that's been running just fine for almost 10 years now and has survived the initial underlying hard drives mechanical death

In the last 10 years, btrfs:

1. Blew up three times on two unrelated systems due to internal bugs (one a desktop, one a server). Very few people were/are aware of the remount-only-once-in-degraded "FEATURE": if a filesystem crashed, you could mount with -odegraded exactly once, and after that the superblock would completely prevent mounting (error: invalid superblock). I'm not sure whether that's still the case or whether it got fixed (I hope so). By the way, these were RAID1 arrays with 2 identical disks with metadata=dup and data=dup, so the filesystem was definitely mountable and usable. It basically killed the use case of RAID1 for availability. ZFS has allowed me to perform live data migrations while missing one or two disks across many reboots.

2. Developers merged patches to mainline, later released to stable, that completely broke discard=async (or something similar), which was a supported mount option according to the manpages. My desktop SSD basically ate itself and I had to restore from backups. IIRC the bug/mailing list discussions I found later were along the lines of "nobody should be using it", so no impact.

3. Had (maybe still has - haven't checked) a bug where if you fill the whole disk, and then remove data, you can't rebalance, because the filesystem sees it has no more space available (all chunks are allocated). The trick I figured out was to shrink the filesystem to force data relocation, then re-expand it, then balance. It was ~5 years ago and I even wrote a blog post about it.

4. Quota tracking when using docker subvolumes is basically unusable due to the btrfs-cleaner "background" task (imagine VSCode + DevContainers taking 3m on a modern SSD to clean up 1 big docker container). This is on 6.16.

5. Hit a random bug just 3 days ago on 6.16, where I was doing periodic rebalancing and removing a docker subvolume. 200+ lines of logs in dmesg, filesystem "corrupted" and remounted read-only. I was already sweating, not wanting to spend hours restoring from backups, but unexpectedly the filesystem mounted correctly after reboot. (first pleasant experience in years)

ZFS in 10y+ has basically only failed me when I had bad non-ECC RAM, period. Unfortunately I want the latest features for graphics etc on my desktop and ZFS being out of tree is a no-go. I also like to keep the same filesystem on desktop and server, so I can troubleshoot locally if required. So now I'm still on btrfs, but I was really banking on bcachefs.

Oh well, at least I won't have to wait >4 weeks for a version that I can compile with the latest stable kernel.

The only stable implementation is Synology's. The rest, even mainline stable, have failed on me at least once in the last 10 years.
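For readers wondering what the shrink / re-expand / balance trick from item 3 looks like in practice, here is a minimal sketch that only builds and prints the commands rather than running them. The mount point, the shrink amount and the balance usage filter are placeholder values chosen for illustration; the comment above does not give the exact values that were used, and the helper name is hypothetical.

```python
import shlex


def full_disk_recovery_commands(mountpoint, shrink_by="1G", usage_pct=50):
    """Build the three commands of the shrink / re-expand / balance workaround.

    Nothing is executed here. `shrink_by` and `usage_pct` are assumptions
    for illustration, not values taken from the original comment.
    """
    return [
        # 1. Shrink the filesystem slightly to force some data relocation.
        ["btrfs", "filesystem", "resize", f"-{shrink_by}", mountpoint],
        # 2. Grow it back to the full size of the device.
        ["btrfs", "filesystem", "resize", "max", mountpoint],
        # 3. Rebalance data chunks that are at most usage_pct percent full.
        ["btrfs", "balance", "start", f"-dusage={usage_pct}", mountpoint],
    ]


# Dry run: print the commands so they can be reviewed before running as root.
for cmd in full_disk_recovery_commands("/mnt/data"):
    print(shlex.join(cmd))
```

Printing instead of executing is deliberate: resizing a nearly full filesystem is the kind of operation worth reading twice, and with a current backup, before running it.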

arccy

"performant", it's still slow if you actually use any of the advanced features like copy on write.

FirmwareBurner

Every CoW filesystem is just as slow. There's no magic pill to fix performance; it's a known tradeoff.

betaby

> FUD all around

????

> Don't do RAID 5.

Ah, OK, so not FUD

> Stop parroting lies about btrfs.

I seee

M95D

I'm still waiting for an overlayfs that does read caching on the overlay without the need to format the backing storage.

rekoil

Sounds like what bcache does? https://bcache.evilpiepirate.org/

This is what bcachefs is based on.

lupusreal

The way the BCacheFS situation has been playing out is a tragedy. I had very high hopes for it.

johnisgood

Same. I liked many of its features (actually, all features, see https://bcachefs.org) and I was waiting for it to become usable, but I guess that day will never come now?

So, the alternative is ZFS only, maybe HAMMER2. HAMMER2 does not look too bad either, except you need DragonflyBSD for that.

ahartmetz

What I expect to happen is that bcachefs stabilizes outside of mainline, and after that, it can be merged back because no large patches = not much drama potential.

ThatPlayer

It's not unusable, I use it on a spare computer for fun, cuz I want tiering of SSD + HDDs. And this doesn't mean development has stopped, just not done in the kernel.

InsideOutSanta

Yeah, this all seems so unnecessary. I hope Kent can either figure out how to work in the context of a larger team or find somebody who can do it on his behalf.

johnisgood

> Once the BCacheFS maintainer behaves [...]

So, there are still behavioral issues here, I take it? That is a bummer. This is not news to me, but I thought the situation had changed since then.

bgwalter

[deleted wrongthink]

graemep

There is an apology for that comment and a rewording further down the thread. It was evidently made by a non-native speaker who did not realise how it came across.

teekert

Good addition, thanks.

I've been in a similar situation, letting everyone know I was fired. Apparently in the US this has a negative connotation, and they use "being let go" (or something as confusing as "handing in/being handed your 2 weeks notice", a concept completely unknown here). Here we only have one word for "your company terminating your employment", and there is no negative connotation associated with it. This can be difficult for non-natives. We can come across as very weird or less intelligent.

T3OU-736

In the US, the terminology tends to split into "fired" (implies "for valid reasons") vs "laid off" (implies "position was terminated, this was not about the employee or their qualities and performance").

dbdr

Funnily enough the apology ends with:

> If the above offended anyone, I sincerely apology them.

Unless this was tongue-in-cheek, this kind of proves the point that language was the cause. The apology is a good move in any case.

t51923712

Why would the "behave" comment mean anything different in Czech than in English?

The revised version, "Once the bcachefs maintainer conforms to the agreed process and the code is maintained upstream again" is still lecturing and piling on, as the LWN comments say:

https://lwn.net/Articles/1037496/

It is the classic case of CoC people and their inner circle denouncing someone, and subsequently the entire Internet keeps piling on the target.

hebocon

"behave" in this context can refer to simply respecting existing norms about RC code freezing.

rurban

> Once the BCacheFS maintainer behaves and the code is maintained upstream again, we will re-enable... (As IMO, it is a useful feature.)

How cynical. It's the kernel maintainer, not the bcachefs maintainer, who does not behave and has a huge history of unprofessional behavior for decades.

happymellon

The bcachefs maintainer has added new features during bugfix windows, and lied about it.

pantalaimon

It's still an experimental module, the feature was about gathering more debug information.

StopDisinfo910

So?

Bug fix windows are for bug fixes. If it’s not a bug fix, it goes in the next version. That’s how the kernel release cycle works. It’s not very complicated.

If it’s so unstable that it urgently needs new features shipped regularly, I think it’s entirely legitimate that it has to live out of tree until it’s actually stable enough.

nicman23

How cynical. It's the bcachefs maintainer, not the kernel maintainer, who does not behave and has a huge history of unprofessional behavior for decades.

it is not like he was not explicitly warned.

fj23Z741GAh

There was a time when Linus encouraged critics of "unprofessional behavior" to snap back at him:

https://lkml.org/lkml/2013/7/15/374

That is a reasonable compromise. Except when someone actually snaps back at him.

boricj

The original author later sent an apology email explaining that it sounded too harsh in English and it wasn't meant to be offensive:

https://lwn.net/ml/all/bece61a0-b818-4d59-b340-860e94080f0d@...

self_awareness

The reason is that people like Linus, because he's entertaining. And people don't like Kent, because he opposed Linus, who is liked. That's all there is to it. Like in some high school.