TMSU: Command-line tool for applying tags and viewing virtual tagged filesystem

teach

I messed with TMSU back one of the previous times it was posted. It's very cool and works well but I just couldn't make myself go retroactively apply tags to terabytes of existing files.

It almost feels like a personal categorization version of the "AI Bitter Lesson": people keep thinking that doing a bunch of manual taxonomy work is going to help them find files faster but eventually search catches up

bityard

My version of that is, "metadata curation is a fun hobby but search is king."

walterbell

Search and LLMs can make good use of accurately labelled data.

londons_explore

> people keep thinking that doing a bunch of manual taxonomy work is going to help them find files faster but eventually search catches up

This. I spent many days cataloging, tagging, deduping and organising my photo and data files, programs, bookmarks, etc.

And I've barely used any of those photos or data files since. The time invested totally wasn't well spent, and I should have just left everything called "DSC0000565.jpg".

sunrunner

> retroactively apply tags to terabytes of existing files

This has always felt like one of the primary issues with tag-based lookup over hierarchical. By the time you're knee deep with enough stuff that you realise tags would help, you've already accumulated too much to practically deal with.

That and figuring out what the tags should be upfront and hoping you don't realise you need additional or different tags later on.

walterbell

CLI tools (find, grep, locate, tmsu) enable bulk changes to tags.

casey2

Only to the files you can already easily find with find, grep, plocate, etc.

That's because they are already tagged via path, or in the file. I'm just going to wait for multimodal LLM tagging solutions to catch up, rather than just try to hack it with current models/tech.

buildbot

Something like this just for photos has been in the back of my mind forever - it would be really nice to have a virtual folder built from images where the exif data says you used X camera or took the photos on X date. This would be useful for editing applications that are not catalog based, just point them at the virtual folder and query the images you want to edit, and there they are.

Edit - Someone mentioned BeFS but deleted their comment. It seems like it might sort of be supported in modern Linux, possibly just read-only though:

https://github.com/torvalds/linux/tree/master/fs/befs

IncreasePosts

I'm sure one could whip together a FUSE filesystem like this very quickly. Here's something similar from 12 years ago: http://pisarenko.net/blog/2013/06/02/introducing-photofs-fus...

buildbot

For sure, just need the time and motivation :)

EvanAnderson

I'm the guilty party who deleted the comment re: BeFS. I thought my analysis of the project was a little biting and, aside from mentioning BeFS, I didn't think my comment was adding much.

I thought about photos and EXIF tags, too. Duplicating the data from the EXIF into another repository strikes me as a bad idea. That's why I was pining for BeFS.

(I have a lot of crazy ideas about filesystems (arguably more like digital asset management systems) and data ingestion and export. Ideas kind of like the failed WinFS. Nothing will ever come of it because I don't have the skills or the time, but sometimes in fever dreams I imagine this stuff.)

sillywalk

With regard to BeFS (or BFS), the native BeOS (and Haiku) filesystem: the TMSU examples for MP3 files + VFS are similar to BeOS.

One of the BeOS advocates - Scott Hacker - created a bash script for ripping CDs into MP3s called RipEnc. It would query the CDDB to get the metadata (track names, artists, etc.), so the files would be renamed from TRACK1 to e.g. "Dead Milkmen - Punk Rock Girl" for the CD. It would then convert the CD tracks to MP3 files. The metadata would be added both to the MP3 ID3 fields and to the extended attributes of the files in BFS, and it would organize the music in folders by Artist or Album or something.

You could then have a query - a virtual folder/directory that lists files based on extended attributes - all mp3 files by ARTIST foo, and from ALBUM bar, that would stay updated if the file metadata changed. I can't remember if this virtual directory was available at the command line - or if it was only available in Tracker (the native BeOS/Haiku file manager).

The problem with this, and it's not just a BFS problem, is that the metadata in the file and the metadata about the file can get out of sync, either when updating it or when transferring it to another system that doesn't support the extended attributes.

groby_b

If you mostly want to query and can live without the VFS, dogsheep[1] is your friend. It's a general tool to import lots of different data types into a personal sqlite instance, and dogsheep-photos[2] both extracts image metadata and uploads all the pics to S3 if you'd like.

On my to-try list, there's also supertag[3], a tag-based filesystem that's mounted via FUSE

[1] https://dogsheep.github.io/ [2] https://github.com/dogsheep/dogsheep-photos [3] https://amoffat.github.io/supertag/

walterbell

Apple/Android devices could assist with offline image analysis and metadata generation, https://github.com/mazzzystar/Queryable

cbull

I've used exif-database for something similar but it doesn't build the folder, it just lets you query the sqlite database to find what you are looking for.

https://github.com/perk11/exif-database

jrgaston

Check out Lightroom.

buildbot

I have 41k photos in my Lightroom catalog, I’ve checked it out.

That doesn’t work when I want to use Capture One (Lightroom does not apply Phase One calibration profiles, which makes it useless for them) or my own raw processor for Sinar digital backs.

Recommending the most common digital photo DAM/editor is not really a helpful comment either. The number of people who know what exif is and don’t know about Lightroom has to be…small.

bityard

I'll admit when I first saw this, I was put off by the idea of having a separate tool to do something that _should_ be baked into the filesystem. But honestly, this is pretty close to a very Unixy way of solving the problem. Have a separate (and importantly: optional) tool that does the job and does it well.

Additionally, Linux _does_ support tagging files right in the filesystem via the user.xdg.tags xattr. Although it looks like Dolphin is one of the few userspace tools that knows about it.
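
A rough sketch of what querying that attribute could look like from a script, assuming the `getfattr` tool from the attr package is installed and the filesystem has user xattrs enabled. The function names `tag_in_list` and `file_has_tag` are made up for this example; `user.xdg.tags` holds a comma-separated list of tags.

```shell
#!/usr/bin/env bash
# Sketch: query the user.xdg.tags xattr. Function names are hypothetical.

# tag_in_list LIST TAG: succeed if TAG is one entry of the comma-separated LIST.
tag_in_list() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# file_has_tag FILE TAG: read the xattr with getfattr, then check the list.
# Fails if the file has no user.xdg.tags attribute or getfattr is missing.
file_has_tag() {
    local tags
    tags=$(getfattr --only-values -n user.xdg.tags "$1" 2>/dev/null) || return 1
    tag_in_list "$tags" "$2"
}
```

Something like `file_has_tag report.pdf work && echo tagged` would then work in a loop over `find` results. Wrapping the list in commas before matching keeps `photo` from matching `photos`.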

layer8

This reminds me a bit of the DESCRIBE command in 4DOS ca. 35 years ago [0]. It was only a single text entry per file, but supported by many tools [1], including file managers. There was a proposal to extend the format to XMP properties [2].

[0] https://archive.org/details/bitsavers_jpsoftware_65101374/pa...

[1] https://4dos.info/4tools.htm#02

[2] http://www.optimasc.com/products/fileid/4dos-descext.pdf

Imustaskforhelp

Yoo, this is such a great idea. I once saw a video of a YouTuber creating open source tag software, and it made me realize a frustration of my own. I needed something like this once, and I was installing this tag CLI tool,

but this seems even better, this is why I am on hackernews

lemonwaterlime

See also:

- "Designing better file organization around tags not hierarchies" [1]

- `tag` - a macOS version of `tmsu` that uses the system tags (xattr-based if I recall) [2]

[1] https://www.nayuki.io/page/designing-better-file-organizatio...

[2] https://github.com/jdberry/tag

ecmm

I also inadvertently implemented something similar as a zsh script [1] and as a simple rust CLI [2] a couple years ago.

[1] https://github.com/xdoardo/zshelf

[2] https://github.com/xdoardo/shelf

NotPractical

xattrs are great, and would be the obvious solution for tags/metadata on Linux too, if the syscall API didn't delete them at every opportunity; programs must be explicitly told not to do that.

https://wiki.archlinux.org/title/Extended_attributes

That being said, it'd be cool to see a port of that CLI to Linux using user.xdg.tags. You can avoid deleting them if you're careful.

thebeardisred

I was wondering why this wasn't using xattrs.

Imustaskforhelp

Yeah, when I was looking for a Linux version I kept finding tag again and again lol

felindev

There are multiple projects that attempt a similar thing with SQLite, the most recent one being TagStudio. It seems like tag-based file organization is the better solution, but the mental cost and effort of upkeep is what keeps it from gaining any traction in the long run.

https://github.com/TagStudioDev/TagStudio/ https://www.youtube.com/watch?v=wTQeMkYRMcw

keernan

User notes and other file metadata have been supported since Linux kernel 2.6. See man xattr:

https://man7.org/linux/man-pages/man7/xattr.7.html

Since it is baked into the file system, it is pretty easy to create bash scripts to add keyword tags by parsing the directory tree (e.g. batch add tags to books, movies, videos, etc stored in hierarchical category directories).
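
A minimal sketch of such a script, assuming `setfattr` from the attr package is available and the collection is laid out as category directories (e.g. `books/scifi/dune.epub`). The function names `tags_for_path` and `tag_tree` and the example layout are hypothetical. Tags go into `user.xdg.tags` as a comma-separated list, so directory names containing commas would need extra handling.

```shell
#!/usr/bin/env bash
# Sketch: derive tags from a file's parent directories and store them in
# the user.xdg.tags xattr. Function names are made up for illustration.

# tags_for_path PATH: print the directory components of a relative path
# as a comma-separated list, e.g. books/scifi/dune.epub -> books,scifi
tags_for_path() {
    local dir
    dir=$(dirname "$1")
    dir=${dir#./}                  # drop a leading "./" left by find
    [ "$dir" = "." ] && return 0   # file sits at the top level: no tags
    printf '%s\n' "$dir" | tr '/' ','
}

# tag_tree ROOT: walk ROOT and tag every regular file below it.
tag_tree() {
    local root=$1 file tags
    ( cd "$root" || exit 1
      find . -type f -print0 | while IFS= read -r -d '' file; do
          tags=$(tags_for_path "$file")
          if [ -n "$tags" ]; then
              setfattr -n user.xdg.tags -v "$tags" "$file"
          fi
      done )
}
```

Running `tag_tree ~/library` would overwrite any existing `user.xdg.tags` value on each file, so merging with existing tags is left out of this sketch.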

walterbell

Preservation of xattrs is app-dependent? https://news.ycombinator.com/item?id=42807500

keernan

Most apps have flags to be used to preserve xattrs. Others were written as though they never knew xattr existed and therefore fail to use the kernel flags to preserve them. But it is not that difficult to preserve them once you are using them.

I use rsync to back up my book, music, and video collections (which is where I use them) and the metadata is backed up with them, so if something ever happens I can always restore them. The xattr commands also have a built-in backup and restore for just the xattrs.

walterbell

Could a custom Linux kernel force xattr preservation flags on even for xattr-unaware apps?

somat

I still maintain that all it would take to add tags to the unix filesystem is to start calling "files" "tags"

then use ln to add tags.

dotancohen

Wouldn't really help for the use cases that I can think up.

  $ ls -l # How to grep for files with tag "foo"?

  $ find . -tag foo

somat

The hardest (most inefficient) part is finding out what tags a file has.

    find . -samefile "group/beatles/love me do"

    album/please please me/side 2/1
    vocals/paul mccartney/love me do
    vocals/john lennon/love me do
    year/1963/love me do
    group/beatles/love me do

You have to love ontology to go down this route. I do, and did this once... It is possible, but does not really provide any meaningful advantage.
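
For reference, a rough sketch of the hard-link scheme described above, assuming everything lives on one filesystem (hard links cannot cross filesystems) and all tag directories sit under a single root. The `TAG_ROOT` variable and the helper names `tag_file`, `files_with_tag`, and `tags_of` are invented for this example. Finding files by tag is a plain directory listing; the reverse lookup is the `find -samefile` scan, which is the slow part.

```shell
#!/usr/bin/env bash
# Sketch of "tags are just hard links". Helper names are hypothetical.
# TAG_ROOT holds one directory per tag, e.g. $TAG_ROOT/year/1963/
# (assumed here to have no trailing slash).
TAG_ROOT=${TAG_ROOT:-$HOME/tags}

# tag_file FILE TAG: add FILE to the tag directory as another hard link.
tag_file() {
    mkdir -p "$TAG_ROOT/$2" && ln "$1" "$TAG_ROOT/$2/"
}

# files_with_tag TAG: the files carrying a tag are a directory listing.
files_with_tag() {
    ls -1 "$TAG_ROOT/$1"
}

# tags_of FILE: reverse lookup; scans the whole tag tree for links that
# share FILE's inode and prints the tag directory of each match.
tags_of() {
    find "$TAG_ROOT" -samefile "$1" | while IFS= read -r path; do
        dir=${path%/*}                        # drop the file name
        printf '%s\n' "${dir#"$TAG_ROOT"/}"   # drop the root prefix
    done | sort
}
```

With this, `tag_file song.mp3 year/1963` adds a tag and `files_with_tag year/1963` answers the "files with tag foo" question, while `tags_of song.mp3` has to walk every tag directory.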

dddw

Makes me think of tagspaces https://www.tagspaces.org/

exe34

I really wanted this to exist about 15 years ago, but at the time didn't have the skills to make it happen. Nowadays though I'd want something that can sync between Linux and Android at the very least. But I've gotten too comfortable with find and grep to bother.