TMSU: Command-line tool for applying tags and viewing virtual tagged filesystem
49 comments
·January 23, 2025teach
bityard
My version of that is, "metadata curation is a fun hobby but search is king."
walterbell
Search and LLMs can make good use of accurately labelled data.
londons_explore
> people keep thinking that doing a bunch of manual taxonomy work is going to help them find files faster but eventually search catches up
This. I spent many days cataloging, tagging, deduping and organising my photo and data files, programs, bookmarks, etc.
And I've barely used any of those photos or data files since. The time invested totally wasn't well spent, and I should have just left everything called "DSC0000565.jpg".
sunrunner
> retroactively apply tags to terabytes of existing files
This has always felt like one of the primary issues with tag-based lookup over hierarchical. By the time you're knee deep with enough stuff that you realise tags would help, you've already accumulated too much to practically deal with.
That and figuring out what the tags should be upfront and hoping you don't realise you need additional or different tags later on.
walterbell
CLI tools (find, grep, locate, tmsu) enable bulk changes to tags.
casey2
Only to the files you can already easily find with find, grep, plocate, etc.
That's because they are already tagged via path, or in the file. I'm just going to wait for multimodal LLM tagging solutions to catch up, rather than just try to hack it with current models/tech.
buildbot
Something like this just for photos has been in the back of my mind forever - it would be really nice to have a virtual folder built from images where the exif data says you used X camera or took the photos on X date. This would be useful for editing applications that are not catalog based, just point them at the virtual folder and query the images you want to edit, and there they are.
Edit - Someone mentioned befs but deleted their comment, it seems like it might sorta be supported in modern linux, possibly just read only though:
IncreasePosts
I'm sure one could whip together a FUSE filesystem like this very quickly. Here's something similar from 12 years ago: http://pisarenko.net/blog/2013/06/02/introducing-photofs-fus...
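Short of a real FUSE filesystem, a symlink farm gets part of the way there. The sketch below is not FUSE and not photofs: it assumes the EXIF data has already been extracted into a plain dict (extraction is not shown), and `build_view`, the `camera` key, and the camera names are hypothetical names made up for illustration.

```python
import tempfile
from pathlib import Path


def build_view(view_root: Path, photos: dict[Path, dict[str, str]], key: str) -> None:
    """Create view_root/<key>/<value>/<filename> symlinks for every photo.

    `photos` maps each image path to metadata that was extracted elsewhere.
    Photos with the same file name and the same value would collide; the
    first one wins in this sketch.
    """
    for photo, meta in photos.items():
        # Slashes in a metadata value would otherwise create nested folders.
        value = meta.get(key, "unknown").replace("/", "_")
        bucket = view_root / key / value
        bucket.mkdir(parents=True, exist_ok=True)
        link = bucket / photo.name
        if not link.is_symlink():
            link.symlink_to(photo.resolve())


def demo() -> list[str]:
    """Build a throwaway view in a temp directory and list the links made."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        a, b = root / "a.jpg", root / "b.jpg"
        a.write_bytes(b"")
        b.write_bytes(b"")
        photos = {a: {"camera": "X100"}, b: {"camera": "D850"}}
        view = root / "view"
        build_view(view, photos, "camera")
        return sorted(str(p.relative_to(view)) for p in view.rglob("*.jpg"))
```

An editor that is not catalog based could then be pointed at `view/camera/X100`. Unlike a FUSE mount, the view goes stale until it is rebuilt.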
buildbot
For sure, just need the time and motivation :)
EvanAnderson
I'm the guilty party who deleted the comment re: BeFS. I thought my analysis of the project was a little biting and, aside from mentioning BeFS, I didn't think my comment was adding much.
I thought about photos and EXIF tags, too. Duplicating the data from the EXIF into another repository strikes me as a bad idea. That's why I was pining for BeFS.
(I have a lot of crazy ideas about filesystems (arguably more like digital asset management systems) and data ingestion and export. Ideas kind of like the failed WinFS. Nothing will ever come of it because I don't have the skills or the time, but sometimes in fever dreams I imagine this stuff.)
sillywalk
With regard to BeFS or BFS, the native BeOS (and Haiku) filesystem:
The TMSU examples for mp3 files + VFS are similar to BeOS.
One of the BeOS advocates - Scott Hacker - created a bash script for ripping CDs into MP3s called RipEnc. It would query the CDDB to get the metadata (track names, artists, etc.), so the files would be renamed from TRACK1 to e.g. "Dead Milkmen - Punk Rock Girl" for the CD. It would then convert the CD tracks to MP3 files. The metadata would be added both to the MP3 ID3 fields and to the extended attributes of the files in BFS, and it would organize the music in folders by Artist or Album or something.
You could then have a query - a virtual folder/directory that lists files based on extended attributes - all mp3 files by ARTIST foo, and from ALBUM bar, that would stay updated if the file metadata changed. I can't remember if this virtual directory was available at the command line - or if it was only available in Tracker (the native BeOS/Haiku file manager).
The problem with this, and it's not just a BFS problem, is that the metadata in the file and the metadata about the file can get out of sync, either when updating it or when transferring it to another system that doesn't support the extended attributes.
groby_b
If you mostly want to query and can live without the VFS, dogsheep[1] is your friend. It's a general tool to import lots of different data types into a personal sqlite instance, and dogsheep-photos[2] both extracts image metadata and uploads all the pics to S3 if you'd like.
On my to-try list, there's also supertag[3], a tag-based filesystem that's mounted via FUSE
[1] https://dogsheep.github.io/ [2] https://github.com/dogsheep/dogsheep-photos [3] https://amoffat.github.io/supertag/
walterbell
Apple/Android devices could assist with offline image analysis and metadata generation, https://github.com/mazzzystar/Queryable
cbull
I've used exif-database for something similar but it doesn't build the folder, it just lets you query the sqlite database to find what you are looking for.
jrgaston
Check out Lightroom.
buildbot
I have 41k photos in my Lightroom catalog, I’ve checked it out.
That doesn’t work when I want to use Capture One, Lightroom does not apply Phase One calibration profiles which makes it useless for them, or my own raw processor for Sinar digital backs.
Recommending the most common digital photo DAM/editor is not really a helpful comment either. The number of people who know what exif is and don’t know about Lightroom has to be…small.
pvg
Discussions in 2014 and 2016
bityard
I'll admit when I first saw this, I was put off by the idea of having a separate tool to do something that _should_ be baked into the filesystem. But honestly, this is pretty close to a very Unixy way of solving the problem. Have a separate (and importantly: optional) tool that does the job and does it well.
Additionally, Linux _does_ support tagging files right in the filesystem via the user.xdg.tags xattr. Although it looks like Dolphin is one of the few userspace tools that knows about it.
layer8
This reminds me a bit of the DESCRIBE command in 4DOS ca. 35 years ago [0]. It was only a single text entry per file, but supported by many tools [1], including file managers. There was a proposal to extend the format to XMP properties [2].
[0] https://archive.org/details/bitsavers_jpsoftware_65101374/pa...
[1] https://4dos.info/4tools.htm#02
[2] http://www.optimasc.com/products/fileid/4dos-descext.pdf
Imustaskforhelp
Yoo, this is such a great idea. I once saw a video of a YouTuber creating open source tag software, and I realized I had the same frustration. I needed something like this once and ended up installing this tag CLI tool,
but this seems even better. This is why I am on Hacker News.
lemonwaterlime
See also:
- "Designing better file organization around tags not hierarchies" [1]
- `tag` - a macOS version of `tmsu` that uses the system tags (xattr-based if I recall) [2]
[1] https://www.nayuki.io/page/designing-better-file-organizatio...
ecmm
I also inadvertently implemented something similar as a zsh script [1] and as a simple rust CLI [2] a couple years ago.
NotPractical
xattrs are great, and would be the obvious solution for tags/metadata on Linux too, if the syscall API didn't delete them at every opportunity; programs must be explicitly told not to do that.
https://wiki.archlinux.org/title/Extended_attributes
That being said, it'd be cool to see a port of that CLI to Linux using user.xdg.tags. You can avoid deleting them if you're careful.
thebeardisred
I was wondering why this wasn't using xattrs.
Imustaskforhelp
yea when I was looking for linux version I was finding tag again and again lol
felindev
There are multiple projects that attempt a similar thing with SQLite, the most recent one being Tag Studio. It seems like tag-based file organization is the better solution, but the mental cost and effort of upkeep is what keeps it from gaining any traction in the long run.
https://github.com/TagStudioDev/TagStudio/ https://www.youtube.com/watch?v=wTQeMkYRMcw
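For anyone wondering what the SQLite side of such a tool might look like, here is a minimal sketch. The schema and the function names (`open_db`, `tag_file`, `files_with_all`) are assumptions made up for illustration; this is not the actual schema of TMSU or TagStudio.

```python
import sqlite3

# Hypothetical three-table layout: files, tags, and a join table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
CREATE TABLE IF NOT EXISTS tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE IF NOT EXISTS file_tag (
    file_id INTEGER NOT NULL REFERENCES file(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (file_id, tag_id)
);
"""


def open_db(location: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the tag database and make sure the tables exist."""
    conn = sqlite3.connect(location)
    conn.executescript(SCHEMA)
    return conn


def tag_file(conn: sqlite3.Connection, path: str, *tags: str) -> None:
    """Attach tags to a path; repeating a tag is harmless."""
    conn.execute("INSERT OR IGNORE INTO file (path) VALUES (?)", (path,))
    for name in tags:
        conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT OR IGNORE INTO file_tag (file_id, tag_id) "
            "SELECT f.id, t.id FROM file f, tag t WHERE f.path = ? AND t.name = ?",
            (path, name),
        )


def files_with_all(conn: sqlite3.Connection, *tags: str) -> list[str]:
    """Return paths that carry every one of the given tags, sorted by path."""
    if not tags:
        return []
    marks = ",".join("?" * len(tags))
    rows = conn.execute(
        "SELECT f.path FROM file f "
        "JOIN file_tag ft ON ft.file_id = f.id "
        "JOIN tag t ON t.id = ft.tag_id "
        f"WHERE t.name IN ({marks}) "
        "GROUP BY f.id HAVING COUNT(DISTINCT t.id) = ? "
        "ORDER BY f.path",
        (*tags, len(set(tags))),
    )
    return [row[0] for row in rows]
```

The query side is the easy part. The upkeep cost mentioned above is all in calling something like `tag_file` consistently, and in noticing when a file is moved or renamed behind the database's back.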
keernan
User notes and other file metadata have been supported since Linux kernel 2.6. See man xattr.
https://man7.org/linux/man-pages/man7/xattr.7.html
Since it is baked into the file system, it is pretty easy to create bash scripts to add keyword tags by parsing the directory tree (e.g. batch add tags to books, movies, videos, etc stored in hierarchical category directories).
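A rough sketch of that kind of batch tagging, written in Python rather than bash. It assumes Linux, a filesystem with user xattrs enabled, and the `user.xdg.tags` convention of a comma-separated list; the function names and the idea of lowercasing directory names into tags are illustrative choices, not anything from an existing tool.

```python
import os
from pathlib import Path

# Assumption: tags live in user.xdg.tags as a comma-separated string.
XATTR_NAME = "user.xdg.tags"


def tags_from_path(root: Path, file_path: Path) -> list[str]:
    """Turn the directories between root and the file into lowercase tags."""
    parts = file_path.relative_to(root).parent.parts
    return [part.lower() for part in parts]


def merge_tags(existing: str, new_tags: list[str]) -> str:
    """Merge new tags into an existing comma-separated value, keeping order."""
    merged = [tag for tag in existing.split(",") if tag]
    for tag in new_tags:
        if tag not in merged:
            merged.append(tag)
    return ",".join(merged)


def tag_tree(root: Path) -> None:
    """Walk root and add path-derived tags to every file (Linux only)."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            try:
                current = os.getxattr(path, XATTR_NAME).decode()
            except OSError:
                current = ""  # attribute not set yet, or xattrs unsupported
            value = merge_tags(current, tags_from_path(root, path))
            if value:
                # Raises OSError if the filesystem rejects user xattrs.
                os.setxattr(path, XATTR_NAME, value.encode())
```

With this sketch, a file at `Books/SciFi/dune.epub` under the root would end up with the tags `books,scifi`, and running `tag_tree` twice leaves the value unchanged.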
walterbell
Preservation of xattrs is app-dependent? https://news.ycombinator.com/item?id=42807500
keernan
Most apps have flags to be used to preserve xattrs. Others were written as though they never knew xattr existed and therefore fail to use the kernel flags to preserve them. But it is not that difficult to preserve them once you are using them.
I use rsync to back up my book, music, and video collections (which is where I use them) and the metadata is backed up with them, so if something ever happens I can always restore them. The xattr commands also have a built-in backup and restore for just the xattrs.
walterbell
Could a custom Linux kernel force xattr preservation flags on even for xattr-unaware apps?
somat
I still maintain that all it would take to add tags to the unix filesystem is to start calling "files" "tags"
then use ln to add tags.
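A minimal sketch of the idea, using Python's `os.link` in place of `ln`. The layout (one directory per tag inside a `tags` folder) and the helper names are assumptions for illustration. Hard links only work within a single filesystem, and looking up a file's tags means scanning every tag directory.

```python
import os
import tempfile
from pathlib import Path


def add_tag(store: Path, tag: str, target: Path) -> Path:
    """Tag target by hard-linking it into store/<tag>/ under the same name."""
    tag_dir = store / tag
    tag_dir.mkdir(parents=True, exist_ok=True)
    link = tag_dir / target.name
    os.link(target, link)  # second name for the same inode, like `ln`
    return link


def tags_of(store: Path, target: Path) -> list[str]:
    """List the tags whose directory holds a link to the same inode as target."""
    found = []
    for tag_dir in sorted(p for p in store.iterdir() if p.is_dir()):
        if any(os.path.samefile(entry, target) for entry in tag_dir.iterdir()):
            found.append(tag_dir.name)
    return found


def demo() -> list[str]:
    """Tag one empty file twice in a temp directory and read the tags back."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        song = root / "love_me_do.mp3"
        song.write_bytes(b"")
        store = root / "tags"
        add_tag(store, "beatles", song)
        add_tag(store, "1963", song)
        return tags_of(store, song)
```

Finding files by tag is just a directory listing; the reverse lookup in `tags_of` is the slow direction, since it has to compare inodes across every tag directory.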
dotancohen
Wouldn't really help for the use cases that I can think up.
$ ls -l # How to grep for files with tag "foo"?
$ find . -tag foo
somat
The hardest(most inefficient) part is finding out what tags a file has.
find . -samefile "group/beatles/love me do"
album/please please me/side 2/1
vocals/paul mccartney/love me do
vocals/john lennon/love me do
year/1963/love me do
group/beatles/love me do
You have to love ontology to go down this route. I do, and did this once... It is possible, but does not really provide any meaningful advantage.
dddw
Makes me think of tagspaces https://www.tagspaces.org/
exe34
I really wanted this to exist about 15 years ago, but at the time didn't have the skills to make it happen. Nowadays though I'd want something that can sync between Linux and Android at the very least. But I've gotten too comfortable with find and grep to bother.
I messed with TMSU back during one of the previous times it was posted. It's very cool and works well but I just couldn't make myself go retroactively apply tags to terabytes of existing files.
It almost feels like a personal categorization version of the "AI Bitter Lesson": people keep thinking that doing a bunch of manual taxonomy work is going to help them find files faster, but eventually search catches up.