Allocator Hints for Btrfs
18 comments
February 7, 2025 · jeroenhd
ThatPlayer
RAID1 has been fine for years, but hardly anyone is running that and RAID5/6 are still considered unstable even by the devs: https://btrfs.readthedocs.io/en/latest/Status.html#block-gro...
I ran into Btrfs RAID6 corruption myself about ten years ago. Though to btrfs's credit, I didn't lose any data: the filesystem only locked up read-only on me. But this time I'll take their word on unstable.
nisa
> RAID1 has been fine for years
It has never been real RAID1: the oddness of the PID decides which disk to read from.
forza_user
There is a patch for this. Hopefully it will be available in mainline kernels soon. https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux....
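The PID-based read selection mentioned above can be sketched roughly as follows. This is a toy model, assuming the mirror is chosen as the reader's PID modulo the number of copies; the function name `pick_mirror` is made up for illustration, and the real kernel logic also accounts for things like missing or replacing devices.

```python
def pick_mirror(pid: int, num_copies: int = 2) -> int:
    """Return the index of the copy a process would read from.

    Toy model: with two copies, even PIDs read copy 0 and odd PIDs read copy 1.
    """
    return pid % num_copies


# A single long-running process always hits the same disk, so one reader
# never gets the combined throughput of both mirrors.
for pid in (1000, 1001, 1002):
    print(pid, "-> copy", pick_mirror(pid))
```

The consequence is that read load only spreads across disks when several processes with different PIDs are reading at once.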
bjoli
Btrfs raid1 works great. I have three 10 TB drives in a raid1 (two-copy) setup for a total of 15 TB of usable space. Btrfs raid1 lets you use any combination of drives and sizes. If I had one 12 TB drive in place of one of the 10 TB drives, I would have 16 TB of usable space.
For raid5/6 I don't use the built-in RAID, but mdraid. I stopped waiting for btrfs raid56 to stabilise.
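The raid1 capacity figures above follow from a simple rule of thumb, sketched here under the assumption that every chunk is stored on exactly two different devices and that metadata overhead is ignored. The helper name `raid1_usable_tb` is hypothetical.

```python
def raid1_usable_tb(sizes_tb):
    """Estimate usable space for two-copy raid1 across mixed-size drives.

    Each chunk needs two distinct devices, so capacity is limited either by
    half the total space or by how much the other drives can mirror of the
    largest one, whichever is smaller.
    """
    if len(sizes_tb) < 2:
        return 0
    total = sum(sizes_tb)
    return min(total / 2, total - max(sizes_tb))


print(raid1_usable_tb([10, 10, 10]))  # three equal drives
print(raid1_usable_tb([10, 10, 12]))  # one drive swapped for a larger one
print(raid1_usable_tb([4, 16]))       # the small drive caps the mirror
```

The second limit only kicks in when one drive is larger than all the others combined, as in the last case.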
fenazego
I have been using btrfs in raid1 setups over the past years, and have not had any issues. Quite happy with the features (snapshotting, send/receive, adding drives to grow an fs)
forza_user
The allocator hints only make sense with at least two devices. Btrfs is metadata heavy, so offloading metadata to faster devices can make a big performance difference. For example, with an HDD combined with a small SSD or NVMe disk, you would set "data preferred" on the HDD and "metadata preferred" on the SSD/NVMe disk.
One advantage of these patches compared to using dm-cache, lvmcache or bcache is that everything is native Btrfs and fully compatible with non-patched kernels (though on those kernels the hints are not applied to new writes). Btrfs allows adding devices to existing filesystems, so no mkfs or data shuffling is needed. If you do not like the result, simply set the default hint (type 0) again. You can then "btrfs device remove" your previously added SSD/NVMe devices.
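As a rough illustration of the "preferred" idea described above, here is a toy model of how a hint-aware allocator might rank devices. It is only a sketch and not the patch set's actual algorithm: the `Device` class, the hint labels and `pick_devices` are all invented for this example, and it assumes a preferred hint is a soft priority that falls back to other devices when the preferred ones are full.

```python
from dataclasses import dataclass

# Hypothetical labels for this sketch; the comment above only names
# "data preferred" and "metadata preferred".
DATA_PREFERRED = "data_preferred"
METADATA_PREFERRED = "metadata_preferred"


@dataclass
class Device:
    name: str
    free_gib: int
    hint: str


def pick_devices(devices, chunk_type, copies=1):
    """Return device names for a new chunk, matching hints first.

    Devices whose hint matches the chunk type sort ahead of the rest;
    ties are broken by most free space. Full devices are skipped.
    """
    wanted = METADATA_PREFERRED if chunk_type == "metadata" else DATA_PREFERRED
    candidates = [d for d in devices if d.free_gib > 0]
    ranked = sorted(candidates, key=lambda d: (d.hint != wanted, -d.free_gib))
    return [d.name for d in ranked[:copies]]


devices = [
    Device("hdd", free_gib=4000, hint=DATA_PREFERRED),
    Device("nvme", free_gib=200, hint=METADATA_PREFERRED),
]
print(pick_devices(devices, "metadata"))  # metadata lands on the NVMe device
print(pick_devices(devices, "data"))      # data lands on the HDD
```

In this model, once the NVMe device has no free space left, new metadata chunks fall back to the HDD instead of failing, which is what "preferred" rather than "only" would imply.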
gmokki
Didn't ZFS also have disk corruption on Linux last year?
And a few years before that Ubuntu backports had Ubuntu specific ZFS corruption.
Basically the old truth still holds with checksummed filesystems too: raid is not backup
ThatPlayer
The article mentions bcache, but not bcachefs, which can do this too. I also like bcachefs more for mixed-drive filesystems because it has the bcache feature of using the SSDs as a write cache for data that gets moved to the HDDs in the background. It can use them as a read cache as well.
This btrfs feature only supports having the data chunks prioritize the HDDs.
Whether or not that matters depends on your use case, of course: it's no worse than a RAID setup of only HDDs. I'm using bcachefs on budget gaming PCs built from old spare parts, so having a small SSD and a bunch of HDDs for games is great.
raj_hun
I like the idea behind bcache, sadly it was never mainlined in the upstream kernel, because the maintainer is generally uncooperative with the community and is building his own walled garden because btrfs was "Not Invented Here".
homebrewer
I'm pretty sure Kent explained how btrfs's problems lie in its design and are fundamentally unfixable. It's got nothing to do with NIH.
kjs3
No. Kent explained, in his transparently biased opinion, why btrfs is fundamentally unfixable. That includes such gems as "they wrote it too fast to get it right" (opinion much?) and "it has more code than XFS" (an FS that tries to do more has more code, and that is bad?). Many people, including people who just might know a few things about filesystems, have addressed his claims, both in detail and in the large. Now, I am in absolutely no position to objectively evaluate the detailed claims of dueling filesystem gurus, but having read quite a bit, I do feel pretty confident in thinking "it's complicated" and "btrfs for all its issues certainly isn't the flaming shitshow Kent really, really wants everyone to believe it is".
koverstreet
bcache and bcachefs are both mainlined - please don't spread this sort of FUD.
I'm taking the experimental label off in 6.16, so it'd be nice to avoid the drama and have a smooth release :)
(also, fun to see btrfs copying bcachefs)
Cumpiler69
[dead]
Are these tweaks exclusively beneficial to mixed-media RAID setups, or do they also pose an advantage to single-device BTRFS setups?
I know BTRFS is the default FS for a bunch of distros now, but I haven't heard much about it in the RAID space. Most people seem to use ZFS, despite the massive memory pressure it adds, because of fear of bugs and reports of filesystem corruption years ago. Perhaps someone who uses BTRFS for RAID can comment on how well BTRFS RAID works these days?