Spotify reportedly investigating Anna's Archive's scraping of their library
41 comments
December 22, 2025 · ipsum2
Aurornis
> I can't load a single GitHub pull request without being accused of botting.
The only time I encountered this was after a power outage when my ISP's DHCP server handed me a new IP that was tainted. It felt like every major website was suddenly full of captchas.
Eventually I had to unplug the router for 24 hours until the ISP let go of my DHCP reservation. When I reconnected it gave me a new IP and the problems went away.
int32_64
Convenience won.
How many people are actually going to download a torrent client, navigate through some massive torrent file collection to check the files of the artists they want to download so they can upload mp3s to their phone over a USB cable like it's 2004 again, just so they can avoid paying Spotify?
cakealert
A sufficiently seeded torrent is a high latency static CDN.
You just need a client that can make use of it.
I'm not sure anyone will be interested in making one, though. You can already get a patched Spotify APK from the usual mobile piracy spaces, and that's good enough.
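A minimal sketch of what such a client would need to compute, assuming the BitTorrent v1 layout in which all files of a torrent are concatenated into one byte stream and cut into fixed-size pieces. The function name and the example sizes are hypothetical and not taken from any existing client.

```python
def pieces_for_file(file_offset: int, file_length: int, piece_length: int) -> range:
    """Return the piece indices that cover one file inside a multi-file torrent.

    file_offset is the file's starting byte within the concatenated stream.
    """
    if file_length <= 0:
        return range(0)  # an empty file needs no pieces
    first = file_offset // piece_length
    last = (file_offset + file_length - 1) // piece_length
    return range(first, last + 1)


# Hypothetical example: a 5 MiB track that starts 10.5 MiB into the torrent,
# with 1 MiB pieces.
MIB = 1024 * 1024
wanted = pieces_for_file(10 * MIB + MIB // 2, 5 * MIB, MIB)
print(list(wanted))  # pieces 10 through 15
```

A client that requests only those pieces can play one track without fetching the rest of the collection, at the cost of downloading partial neighbours at both ends.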
Raed667
Wasn't Popcorn Time basically video streaming backed by torrents? Why can't it be the same for audio?
The metadata is 200 GB, which can easily be indexed and made searchable. Then you download only what you need.
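A rough sketch of that idea using SQLite from the Python standard library. The table layout, torrent names, paths, and rows are all made up for illustration; the actual format of the metadata dump is not described here.

```python
import sqlite3

# Hypothetical schema: one row per track, pointing at the torrent and file that hold it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (artist TEXT, title TEXT, torrent TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?)",
    [
        ("Example Artist A", "Second Song", "torrent_0001", "a/second.ogg"),
        ("Example Artist A", "First Song", "torrent_0001", "a/first.ogg"),
        ("Example Artist B", "Other Song", "torrent_0002", "b/other.ogg"),
    ],
)
# A case-insensitive index keeps artist lookups fast even on a very large table.
conn.execute("CREATE INDEX idx_artist ON tracks (artist COLLATE NOCASE)")


def find_tracks(db: sqlite3.Connection, artist: str) -> list:
    """Return (title, torrent, path) for every track by the given artist."""
    cursor = db.execute(
        "SELECT title, torrent, path FROM tracks "
        "WHERE artist = ? COLLATE NOCASE ORDER BY title",
        (artist,),
    )
    return cursor.fetchall()


for title, torrent, path in find_tracks(conn, "example artist a"):
    print(f"{title}: fetch {path} from {torrent}")
```

The lookup returns only the torrent and file path for each match, so a client would know exactly which files to request instead of pulling the whole archive.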
madduci
And specifically, not everybody owns a NAS with 300 TB of capacity. With a 30 TB drive costing almost 1,000€, we are talking about 10,000-15,000€.
As mentioned in other stories, this is really welcomed by other big corporations or LLM-related companies.
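A quick sketch of that arithmetic. The lower bound is bare capacity with no redundancy; the upper bound assumes five extra drives for parity or spares, which is a guess at where the 15,000€ figure comes from rather than something stated above.

```python
import math


def drive_cost(capacity_tb: float, drive_tb: float, price_eur: float, extra_drives: int = 0):
    """Return (number of drives, total price in EUR) for a target capacity."""
    drives = math.ceil(capacity_tb / drive_tb) + extra_drives
    return drives, drives * price_eur


print(drive_cost(300, 30, 1000))     # bare capacity: (10, 10000)
print(drive_cost(300, 30, 1000, 5))  # assumed 5 extra drives: (15, 15000)
```

This counts drives only; the enclosure, controller, and power are not included.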
realusername
Great, so the copyright conglomerates have nothing to complain about if it's useless then.
breppp
Probably a net positive for future open source music generation LLM models
aiisimmoral
Which means a net negative for humanity.
breppp
Depends. The camera killed painting, and that was a positive for art, in my opinion.
It's not obvious that LLM generation won't create more interesting music experiences (for lack of non-marketing speak for self curated music)
_fzslm
Arguably, the camera evolved painting because it expanded the idea of what it could be – that it could be more than the illustration of/"illusion" of reality.
I think and have always thought the exact same thing will happen with generative AI.
galleywest200
The camera did not kill painting. There are still tons and tons of painters. Lots of them use digital tools like a tablet these days, but painting absolutely still exists.
bopbopbop7
The camera did not kill painting. And how does comparing a camera to an LLM even make sense?
firloop
I wish Spotify welcomed or collaborated with these archival initiatives. Anna's Archive does not compete with Spotify in any way.
thenthenthen
I am flabbergasted by the comments here. Spotify started with pirated music and now invests in the military.
https://torrentfreak.com/how-the-pirate-bay-helped-spotify-b...
And
https://djmag.com/news/spotifys-daniel-ek-leads-eu600-millio...
Aurornis
> I wish Spotify welcomed or collaborated with these archival initiatives.
Spotify licenses the music in their library under specific terms. They don't own it. They can't just decide to give it out freely on their own terms.
> Anna's Archive does not compete with Spotify in any way.
I think HN often underestimates the breadth of casual piracy among the general public who want to avoid paying $10/month for a service. There are already numerous tools to stream TV shows and movies from torrents on demand. I have no doubt the same will appear for a giant archive of Spotify music. A lot of people will jump at any chance to cancel their Spotify subscription if they can get close to the same access for free.
twostorytower
Anna's Archive offers to share their data for AI training (in exchange for donations), so that's certainly something the record labels want control of. https://annas-archive.org/llm
maxloh
I don't think music producers would agree to that. Spotify would likely lose contracts even if they simply stayed silent.
rendaw
It's probably up to the publishers, not them.
I buy my music, but at the same time I respect that Spotify is a bit more unified than any of the 100 video streaming services that don't have the one thing I want to watch.
o_____________o
Simply sharing metadata, related artists, genres, etc would create a pretty interesting ecosystem[1].
piva00
Every Noise was created by a former Spotify employee.
vintermann
He's a former Spotify employee now, but he was a Spotify employee when he made it. I think it hasn't been updated since he lost his data access.
I have a lot of respect for Glenn McDonald for spam fighting all these years on Spotify, but we can do better than PCA for mapping music these days. Any neural embedding model is going to produce more meaningful axes. In fact Spotify had an intern who did just that, just before the launch of Discover Weekly: Sander Dieleman. Along with Aäron van den Oord he was snapped up by DeepMind after their Spotify internship. Those two guys were (and are) wildly good at what they do.
tene80i
Not even in the “providing a way to get music” way?
nemomarx
A big database that contains every song is pretty different from a recommendation system, web streaming, playlists, etc. Someone could use the dump to create something like that ofc, but the database itself isn't really the interesting thing Spotify offers.
unethical_ban
Spotify's (and the other huge streamers') main selling points are its catalogue and its recommendations/auto playlists. Other features like streaming quality, UI, and network effects are also at play.
Even the metadata is a huge proprietary data dump. Not sure how you think Apple, Google, Amazon or an upstart budget streaming service couldn't use this to better compete against Spotify.
Raed667
I'm hoping that this metadata leak can revive projects like https://everynoise.com
Spotify (and Netflix, etc.) have become very hostile to exposing their catalogue over an API, so I'm glad it has been open sourced :)
udave
Didn't Spotify start out as a collection of pirated songs? Some things come full circle, I guess.
glitcher
And with Spotify also being the successor to Napster, the irony is thick in this quote:
"Since day one, we have stood with the artist community against piracy"
Funny thing, I've met a lot of independent artists who don't care about piracy one bit. I have a feeling it's the record labels and large corporations, not the artists, making the biggest fuss over piracy.
sosborn
For an independent artist, exposure matters more than album sales as it leads to ticket sales.
For large labels, exposure is a solved problem and album sales are all that matter.
They are all trying to maximize revenue; they just have different ways of going about it.
ikamm
Previously - https://news.ycombinator.com/item?id=46338339
mystraline
Oooh, scary. "Investigations!"
This is an archivist institution that actively ignores "copyright" to further the art and science of our shared media legacy.
And frankly, public libraries would absolutely be deemed illegal if they had been invented 10 years ago. (And they only came about because rich people like Rockefeller wanted to wash their actual history with a social-happy persona.)
Anti-scraping measures are making it more difficult to use the web. I can't load a single GitHub pull request without being accused of botting.