This website is for humans
366 comments
August 13, 2025
reactordev
I’m in love with the theme switcher. This is how a personal blog should be. Great content. Fun site to be on.
My issue is that crawlers aren’t respecting robots.txt: they can operate captchas and human-verification checkboxes, and they can extract all your content and information as a tree in a matter of minutes.
Throttling doesn’t help when you have to load a bunch of assets with your page. IP range blocking doesn’t work because they’re essentially lambdas. Their user-agent info looks like someone on Chrome browsing your site.
We can’t even render everything to a canvas to stop it.
The only remaining tactic is verification through authorization. Sad.
heikkilevanto
I have been speculating about adding a tar pit to my personal web site: a script that produces a page of random nonsense and random-looking links back to the same script. The page wouldn't be linked from anywhere, but would be explicitly forbidden in robots.txt. If the crawlers start on it, let them get lost. A bit of rate limiting should keep my server safe and slow down the crawlers. Maybe I should add some confusing prompts on the page as well... I'll probably never get around to it, but the idea sounds tempting.
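A minimal sketch of that idea (Flask, the /maze/ route, and the gibberish vocabulary are all assumptions here, not an existing implementation):

    import random
    import string

    from flask import Flask

    app = Flask(__name__)

    # A small vocabulary of gibberish "words", generated once at startup.
    WORDS = ["".join(random.choices(string.ascii_lowercase, k=random.randint(3, 9)))
             for _ in range(500)]

    @app.route("/maze/<token>")
    def maze(token):
        # A paragraph of nonsense plus ten unique links deeper into the maze.
        paragraph = " ".join(random.choices(WORDS, k=120))
        links = " ".join(
            f'<a href="/maze/{random.getrandbits(64):x}">{random.choice(WORDS)}</a>'
            for _ in range(10)
        )
        return f"<html><body><p>{paragraph}</p><p>{links}</p></body></html>"

Disallow /maze/ in robots.txt so only crawlers that ignore it ever wander in, and leave the rate limiting to the web server.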
shakna
I have a single <a> element in my website's head, pointing to a route banned in robots.txt; the page is also marked noindex via meta tags and HTTP headers.
When something grabs it anyway, which AI crawlers regularly do, it feeds them the text of 1984 at about a sentence per minute. Most crawlers stay on the line for about four hours.
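The shape of such a trap is roughly this (a sketch assuming Flask and a local copy of the text; the actual implementation will differ):

    import time

    from flask import Flask, Response

    app = Flask(__name__)

    with open("1984.txt") as f:  # assumes a local copy of the text
        SENTENCES = f.read().split(". ")

    @app.route("/trap")
    def trap():
        def drip():
            # One sentence per minute: almost no bandwidth spent, while a
            # patient crawler holds the connection open for hours.
            for sentence in SENTENCES:
                yield sentence + ". "
                time.sleep(60)
        return Response(drip(), mimetype="text/plain")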
dbalatero
That's hilarious, can I steal the source for my own site?
phyzome
Should be possible to do this with a static site, even.
Here's what I've been doing so far: https://www.brainonfire.net/blog/2024/09/19/poisoning-ai-scr... (serving scrambled versions of my posts to LLM scrapers)
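The scrambling step itself can be tiny; a sketch of the general idea (not necessarily the linked post's exact method):

    import random

    def scramble(text: str) -> str:
        # Shuffle the words of a post: statistically similar to the
        # original, useless as training data, cheap to run at build time.
        words = text.split()
        random.shuffle(words)
        return " ".join(words)

Run it once per post at deploy time and serve the scrambled copies to known scraper user agents, the originals to everyone else.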
reactordev
I did something similar. In a normal browser it just displays the Matrix rain effect. For a bot, it's a page of links upon links to pages that link to each other, using a clever PHP script and some .htaccess fun. The fun part is watching the logs to see how long they get stuck for: each link is unique, and the crawl can build a tree structure several GB deep on my server.
I did this once before with an ssh honey pot on my Mesos cluster in 2017.
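The .htaccess side of a maze like that might look roughly like this (the user-agent list and the maze.php name are illustrative assumptions, not the actual setup):

    RewriteEngine On
    # Self-identified AI crawlers get routed into the link maze;
    # everyone else sees the normal site.
    RewriteCond %{HTTP_USER_AGENT} (GPTBot|CCBot|ClaudeBot|Bytespider) [NC]
    RewriteRule ^ /maze.php [L]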
gleenn
Check out doing a compression bomb too: you can host a file that's tiny for you but decompresses into something massive for crawlers, and hopefully runs them out of RAM so they die. Someone posted about it on HN recently, even, but I can't immediately find the link.
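Building one is a few lines, assuming the crawler advertises gzip support and inflates responses naively:

    import gzip

    # ~10 GiB of zeros compresses to roughly 10 MB on disk. A naive
    # client that trusts "Content-Encoding: gzip" inflates the whole
    # thing in memory.
    with open("bomb.gz", "wb") as f:
        with gzip.GzipFile(fileobj=f, mode="wb", compresslevel=9) as gz:
            chunk = b"\0" * (1024 * 1024)   # 1 MiB of zeros
            for _ in range(10 * 1024):      # 10 GiB total
                gz.write(chunk)

Then serve the pre-built file, only on the bot routes, with a Content-Encoding: gzip header and the small compressed Content-Length.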
extraduder_ire
It's either this one https://news.ycombinator.com/item?id=44670319 or the comments from this one https://news.ycombinator.com/item?id=44651536
I also recall reading it. I think wasting their time is more effective than making them crash and give up in this instance though.
J_McQuade
I loved reading about something similar that popped up on HN a wee while back: https://zadzmo.org/code/nepenthes/
fbunnies
I loved reading about something dissimilar that did not pop up on HN yet: https://apnews.com/article/rabbits-with-horns-virus-colorado...
xyzal
Or, serve "Emergent Misalignment" dataset.
https://github.com/emergent-misalignment/emergent-misalignme...
Karawebnetwork
Reminds me of CSS Zen Garden and its 221 themes: https://csszengarden.com/
e.g. https://csszengarden.com/221/ https://csszengarden.com/214/ https://csszengarden.com/123/
cxr
Only somewhat related and unfortunately misses the point.
CSS Zen Garden was powered by style sheets as they were designed to be used. Want to offer a different look? Write an alternative style sheet. This site doesn't do that. It compiles everything to a big CSS blob and then uses JS (which for some reason is also compiled to a blob, despite consisting of a grand total of 325 SLOC before being fed into bundler) to insert/remove stuff from the page and fiddle with a "data-theme" attribute on the html element.
Kind of a bummer since clicking through to the author's Mastodon profile shows a bunch of love for stuff like a talk about "Un-Sass'ing my CSS" and people advocating others "remove JS by pointing them to a modern CSS solution". (For comparison: Firefox's page style switcher and the DOM APIs it depends on[1] are older than Firefox itself. The spec[1] was made a recommendation in November 2000.)
1. <https://www.w3.org/TR/DOM-Level-2-HTML/html.html#ID-87355129>
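For reference, the mechanism being described is plain markup; titled alternative style sheets show up in Firefox's View > Page Style menu (the file names here are invented):

    <link rel="stylesheet" href="default.css" title="Default">
    <link rel="alternate stylesheet" href="dark.css" title="Dark">
    <link rel="alternate stylesheet" href="high-contrast.css" title="High contrast">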
reactordev
I fault her static site builder and not the author for that. It’s just how her bundler bundles.
extraduder_ire
I'm disappointed no browsers other than Firefox support it anymore.[0] Chrome dropped support in version 47.
It's very rare to see it used in the wild too, probably because it's not "sticky" across page loads.
0: https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...
martin-t
This shouldn't be enforced through technology, but through the law.
LLM and other "genAI" (really "generative machine statistics") algorithms just take other people's work, mix it so that any individual training input is unrecognizable and resell it back to them. If there is any benefit to society from LLM and other A"I" algorithms, then most of the work _by orders of magnitude_ was done by the people whose data is being stolen and trained on.
If you train on copyrighted data, the model and its output should be copyrighted under the same license. It's plagiarism and it should be copyright infringement.
stahorn
It's like the world turned upside down in the last 20 years. I used to pirate everything as a teenager, and I found it silly that copyright would follow along no matter how anything was encoded. If I XORed copyrighted material A with open source material B, I would get a strange file C that, together with B, I could use to get material A back. Why would it be illegal for me to send anybody B and C, where the strange file C might just as well be thought of as containing the open source material B?!
Now that I've grown up, started paying for what I want, and come to see the need for some way for content creators to get paid for their work, these AI companies pop up. They encode content in a completely new way, and somehow we're supposed to just accept that it's fine this time.
This page was posted here on Hacker News a few months ago, and it really shows that this is just what's going on:
https://theaiunderwriter.substack.com/p/an-image-of-an-arche...
Maybe another 10 years and we'll be in the spot when these things are considered illegal again?
martin-t
I went through exactly this process.
Then I discovered (A)GPL and realized that the system makes sense to protect user rights.
And as I started making my own money, I started paying instead of pirating, though I sometimes wonder how much of my money goes to the actual artists and creators and how much goes to zero-sum occupations like marketing and management.
---
It comes down to understanding power differentials - we need laws so large numbers of individuals each with little power can defend themselves against a small number of individuals with large amounts of power.
(Well, we can defend ourselves anyway but it would be illegal and many would see it as an overreaction - as long as they steal only a little from each of us, we're each supposed to only be a little angry.)
---
> Maybe another 10 years and we'll be in the spot when these things are considered illegal again?
That's my hope too. But it requires many people to understand that they're being stolen from, and my fear is that way too few produce "content"[0] and that the majority will feel like they benefit from being able to imitate us with little effort. There's also the angle that the US needs to beat China (even though two nuclear superpowers both lose in an open conflict), and that because China has been stealing everything for decades, we (the West) need to start stealing too, to keep up.
[0]: https://eev.ee/blog/2025/07/03/the-rise-of-whatever/#:~:text...
lawlessone
just pirate again. It's the only way to ensure a game or movie can't be recalled by publishers the next time they want everyone to buy the sequel.
thewebguyd
> and resell it back to them.
This is the part of this tech I take issue with the most. Outside of open-weight models (and even then, it's not fully open source - the training data is not available, so we cannot reproduce the model ourselves), all the LLM companies are doing is stealing and selling our (humanity's, collectively) knowledge back to us. It's yet another large-scale, massive transfer of wealth.
These aren't being made for the good of humanity, to be given freely; they are being made for profit, treating human knowledge as raw material to be mined and resold at massive scale.
martin-t
And that's just one part of it.
Part 2 is all the copyleft code powering the world. Now it can be effortlessly laundered. The freedom to inspect and modify? Gone.
Part 3 is what happens if actual AI is created. Rich people (who usually perform zero- or negative-sum work, if any) need the masses (who perform positive-sum work) for a technological civilization to actually function. So we have a lot of bargaining power.
Then an ultra rich narcissistic billionaire comes along and wants to replace everyone with robots. We're still far off from that even if actual AI is achieved but the result is not that everyone can live a happy post-scarcity life with equality, blackjack and hookers. The result is that we all become beggars dependent on what those benevolent owners of AI and robots hand out to us because we will no longer have anything valuable to provide (besides our bodies I guess).
jasonvorhe
Which law? Which jurisdiction? From the same class of people who have been writing laws in their favor for a few centuries already? Pass. Let them consume it all. I'd rather take the gwern approach and write stuff that's unlikely to get filtered out of upcoming models during training. Anubis treats me like a machine, just like Cloudflare, but open source and erroneously in good spirit.
riazrizvi
Laws have to be enforceable. When a technology comes along that breaks enforceability, the law/society changes. See also prohibition vs expansion of homebrewing 20’s/30’s, censorship vs expansion of media production 60’s/70’s, encryption bans vs open source movement 90’s, music sampling markets vs music electronics 80’s/90’s…
throw10920
> Laws have to be enforceable.
This is a good point. In this case, it does seem pretty easy to enforce, though - just require anyone hosting an LLM for others to use to have full provenance of all of the data that they trained that LLM on. Wouldn't that solve the problem fairly easily? It's not like LLM training can be done in your garage (at which point this requirement would kill off hundreds/thousands of small LLM-training businesses that would hypothetically otherwise exist).
martin-t
In most of those cases, it was because too many people broke the laws, regardless of what companies did. It was too distributed.
But to train a model, you need a huge amount of compute, centralized and owned by a large corporation. Cut the problem at the root.
visarga
> algorithms just take other people's work, mix it so that any individual training input is unrecognizable and resell it back to them
LLMs are huge and need special hardware to run. Cloud providers underprice even local hosting. Many providers offer free access.
But why are you not talking about what the LLM user brings? They bring a unique task or problem to solve. They guide the model and channel it towards the goal. In the end they take the risk of using anything from the LLM. The context is what they bring, and they are the sink for the consequences.
martin-t
Quantity matters.
Imagine it took 10^12 hours to produce the training data, 10^6 hours to produce the training algorithm and 10^0 hours to write a bunch of prompts to get the model to generate a useful output.
How should the reward be distributed among the people who performed the work?
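Taking the comment's own stylized numbers at face value, the proportions are easy to check (a throwaway calculation, nothing more):

    hours = {"training data": 10**12, "training code": 10**6, "prompting": 10**0}
    total = sum(hours.values())
    for role, h in hours.items():
        print(f"{role}: {100 * h / total:.6f}%")
    # training data: 99.999900%
    # training code: 0.000100%
    # prompting: 0.000000%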
lawlessone
>But why are you not talking about what the LLM user brings? They bring a unique task or problem to solve. They guide the model and channel it towards the goal. In the end they take the risk of using anything from the LLM.
I must remember, next time I'm shopping, to demand that the staff thank me when I ask them where the eggs are.
jasonvorhe
These themes are really nice. They even work well on quirky displays. Stuff like this is what makes me enjoy the internet, regardless of its slide toward the gutter.
Scrounger
> My issue is that crawlers aren’t respecting robots.txt
Cloudflare has a toggle switch to automatically block LLMs, scrapers, etc.:
https://blog.cloudflare.com/declaring-your-aindependence-blo...
oooyay
The theme also changes the background of her profile picture. The attention to detail is commendable.
jacobyoder
Hovering over the netscape link renders it slowly, line by line, like images used to come down...
oooyay
hah, that's amazing
clbn
Not just the background, the Netscape one is a different photo!
pas
PoW might not work for long, but Anubis is very nice: https://anubis.techaro.lol/
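The proof-of-work idea behind gates like Anubis is roughly this (a generic sketch; the details of Anubis's actual challenge protocol will differ):

    import hashlib
    import itertools

    def solve(challenge: str, difficulty: int) -> int:
        # The visitor's machine burns CPU until a nonce hashes to
        # `difficulty` leading zero hex digits...
        for nonce in itertools.count():
            digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
            if digest.startswith("0" * difficulty):
                return nonce

    def verify(challenge: str, nonce: int, difficulty: int) -> bool:
        # ...while the server checks the answer with a single hash.
        # Issuing challenges is cheap; solving them at crawler scale isn't.
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        return digest.startswith("0" * difficulty)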
That said ... putting part of your soul into machine format so you can put it on on the big shared machine using your personal machine and expecting that only other really truly quintessentially proper personal machines receive it and those soulless other machines don't ... is strange.
...
If people want a walled garden (and yeah, sure, I sometimes want one too) then let's do that! Since it must allow authors to set certain conditions, and require users to pay into the maintenance costs (to understand that they are not the product) it should be called OpenFreeBook just to match the current post-truth vibe.
workethics
> That said ... putting part of your soul into machine format so you can put it on on the big shared machine using your personal machine and expecting that only other really truly quintessentially proper personal machines receive it and those soulless other machines don't ... is strange.
That's a mischaracterization of what most people want. When I put out a bowl of candy for Halloween, I'm fine with EVERYONE taking some candy. But these companies are the equivalent of the asshole that dumps the whole bowl into their bag.
horsawlarway
I really don't think this holds.
It's vanishingly rare to end up in a spot where your site is getting enough LLM driven traffic for you to really notice (and I'm not talking out my ass - I host several sites from personal hardware running in my basement).
Bots are a thing. Bots have been a thing and will continue to be a thing.
They mostly aren't worth worrying about, and at least for now you can throw PoW in front of your site if you are suddenly getting enough traffic from them to care.
In the meantime...
Your bowl of candy is still there. Still full of your candy for real people to read.
That's the fun of digital goods... They aren't "exhaustible" like your candy bowl. No LLM is dumping your whole bowl (they can't). At most - they're just making the line to access it longer.
lblume
> these companies are the equivalent of the asshole that dumps the whole bowl into their bag
In most cases, they aren't? You can still access a website that is being crawled for the purpose of training LLMs. Sure, DoS exists, but it doesn't seem to be enough of a problem to cause widespread outages of websites.
reactordev
More like when the project kids show up in the millionaire neighborhood because they know they’ll get full size candy bars.
It's not that there's none for the others. It's that there was this unspoken agreement, reinforced by the last 20 years, that website content is protected speech and protected intellectual property, with copyright belonging to its owner/author. Now that trust and good faith is broken.
pyrale
I’m not sure that the issue is just a technical distinction between humans and bots.
Rather it’s about promoting a web serving human-human interactions, rather than one that exists only to be harvested, and where humans mostly speak to bots.
It is also about not wanting a future where the bot owners get extreme influence and power. Especially the ones with mid-century middle-europe political opinions.
reactordev
Security through obscurity is no security at all…
ryao
If you want a good example of a site with a theme switcher:
rikafurude21
The author seems to be very idealistic, and I appreciate that he cares about the quality of the content he provides for free. Personal experience, however, shows me that when I look at a recipe site I first have to skip through the entire backstory to the recipe and then try to parse it in between annoying ads on a bloated WordPress page. I can't blame anyone who prefers to simply prompt a chatbot for exactly what he's looking for.
sodimel
> Personal experience, however, shows me that when I look at a recipe site I first have to skip through the entire backstory to the recipe and then try to parse it in between annoying ads on a bloated WordPress page
That's when money comes into view. People were putting in time and effort to offer something for free; then some companies told them they could actually earn money from their content. So they put up ads, because who doesn't like some money for already-done work?
Then the same companies told them that they would make less money, and that if they wanted to keep earning the same amount as before, they would need to put up more ads and get more visits (so: invest heavily in SEO).
Those people had already organized themselves (or stopped updating their websites), and had created companies to handle the money generated from their websites. In order to keep those companies sustainable, they needed to add more ads to the websites.
Then some people thought that maybe they could buy the companies making the recipe websites, and put a bunch more ads on them to earn even more money.
I think you're thinking about those websites owned by big companies whose only goal is to make money, but the author is writing about real websites made by real people who don't show ads on the websites they made, because they care about their visitors, not about making money.
packetlost
Semi related, but a decent search engine like Kagi has been a dramatically better experience than "searching" with an LLM. The web is full of corporate interests now, but you can filter that out and still get a pretty good experience.
martin-t
It always starts with people doing real positive-sum work and then grifters and parasites come along and ruin it.
We could make advertising illegal: https://simone.org/advertising/
jama211
The thing is you can’t regulate word of mouth. It just pushes the money underground, where it can’t be taxed. People will still be paid to promote things, they’ll just pass it off as their own opinion, and it’ll be more insidious. Like it or not, at least advertising now often is clearly advertising. Not always, but often.
keysdev
Some organizations prohibit advertising during their elections. Best idea ever. The USA should try it. It would save a lot of money and a lot of annoying ads.
pas
Or just let this LLM mania run to its conclusion, and we'll end up with two webs, one for profit for AI by AI and one where people put their shit for themselves (and don't really care what others think about it, or if they remix it, or ...).
yztest
Sounds like that could be a fun idea for a new search engine, or a search-engine feature: only show results from websites without ads and/or paywalls. Sounds like a really fun way to experience the passion part of the internet. It could be hard to implement, though; I'd guess that with any level of popularity it would quickly end up with people trying to turn such sites into sales funnels.
swiftcoder
The unfortunate truth here is that the big recipe blogs are all written for robots. Not for LLMs, because those are a fairly recent evolution - but for the mostly-opaque-but-still-gameable google ranking algorithm that has ruled the web for the last ~15 years.
philipwhiuk
Why are you needlessly gendering your post (especially as it's wrong)?
skrebbel
I agree with you but I don’t think your confrontational tone is helpful. I think this comment does roughly the same thing, better: https://news.ycombinator.com/item?id=44890782
philipwhiuk
I considered the blunt approach but some people find that ruder.
fknorangesite
[flagged]
cnst
Reading between the lines — what has necessitated AI summaries is the endless SEO, the endless ad rolls, the endless page-element reloads to refresh the ads, the endless scrolling, and the endless JavaScript frameworks with endless special effects that no one wants to waste their time on.
How can the publishers and the website owners fault the visitors for not wanting to waste their time on all of that?
Even before the influx of AI, there were already entire websites of artificial "review" content that does nothing more than rehash existing content without adding anything of value.
axus
I don't use an ad-blocker, so I definitely noticed that the website has no ads and stores no cookies or other data, besides the theme you can select by clicking at the top right.
The concept of independent creative careers seems to be ending, and people are very unhappy about that. All that's left may be hobbyists who can live with intellectual parasites.
thrance
Click on the recipe sites she linked. They're actually really good. Loading fast, easy to navigate and with concise recipes.
rikafurude21
Yes, but I am talking about results that you would get through googling.
xrisk
That is, undoubtedly, a problem created by Google itself. See for example: Kagi’s small web (https://blog.kagi.com/small-web)
dyarosla
Arbitrage opportunity to make a search engine that bubbles up non ad infested websites!
atx2bos
Paprika or one of the other ones?
drivers99
There are more than two options. Actual paper cookbooks are good for that: no ads, no per-recipe backstory, and many other positive characteristics.
danielbln
Also no search (usually just an index and/or ToC), no dynamic changes ("I don't have this ingredient at home, can I substitute it?"), etc. Don't get me wrong, I love me a good cookbook, but being able to dynamically create a recipe based on what I have, how much time I have, my own skill level, that's really cool when it works.
jen729w
I would have linked you to Eat Your Books, a website that lets you search the cook books that you own.
But Cloudflare/they have inexplicably blocked me, some guy on his iPhone in a hotel in Vietnam. So, screw them, particularly on this thread about the open web.
thrown-0825
Most of the cookbooks I've seen are just as bad when it comes to having too much exposition and not enough recipe.
account42
Also no search though and limited bookmarking and editing ability.
stronglikedan
I don't think they're very idealistic at all. They give two examples of the types of recipe sites they enjoy, and neither matches your description of recipe sites. Sure, there are ads, but they're unobtrusive and don't block the content. And the actual recipes are just below the fold. Maybe you just need better recipe sites in your collection.
Notatheist
On the first site, I clicked on a focaccia recipe and had to skip to the bottom of the page, past 7 paragraphs, 10 images and a video, to find the actual list of ingredients. On the second, a pop-up from the Guardian begging me to subscribe covered literally half the screen and popped back up with every page load.
account42
And the first fast food restaurant that I ran into didn't serve me quality food either. Shocking!
nicbou
If they did it any other way, no one would ever have found that website. Don't hate the players...
coffeecat
"80% as good as the real thing, at 20% of the cost" has always been a defining characteristic of progress.
I think the key insight is that only a small fraction of people who read recipes online actually care which particular version of the recipe they're getting. Most people just want to see a working recipe as quickly as possible. What they want is a meal - the recipe is just an intermediate step toward what they really care about.
There are still people who make fine wood furniture by hand. But most people just want a table or a chair - they couldn't care less about the species of wood or the type of joint used - and particle board is 80% as good as wood at a fraction of the cost! Most people couldn't even tell the difference. Generative AI is to real writing as particle board is to wood.
ggoo
Particle board:
- degrades faster, necessitating replacement
- makes the average quality of all wood furniture notably worse
- arguably made real wood furniture more expensive, since fewer people can make a living off it.
Not to say the tradeoffs are or are not worth it, but "80% of the real thing" does not exist in a vacuum, it kinda lowers the quality on the whole imo.
pixl97
How about
- There are 8 billion people on the planet now, and there isn't enough high-quality, furniture-grade wood to make stuff for all of them.
Up until the time of industrialization there just wasn't that much furniture per person in comparison to what we have now.
The reason 'real' wood furniture is more expensive is not a lack of demand or of artisans creating it; there are likely more of them than ever. Go buy hardwood without knots and see how much the materials alone set you back.
The trade off isn't 'really good furniture' vs 'kinda suck furniture'. It's 'really good furniture' vs 'no furniture at all'.
jcgl
Knotty softwoods can make perfectly suitable furniture. They can be (and are) grown at scale.
I'm sympathetic to the viewpoint that the supply of particleboard furniture has suffocated the marketplaces for mid- and low-end wooden furniture. Such pieces definitely exist affordably (I've bought them at places like Marshall's, for instance), but they seem comparatively underrepresented in the market.
Maybe a consumer preference for flatpack furniture is enough to explain this? But then again, wooden furniture can be flatpacked too; IKEA has plenty of it.
phyzome
If you make better furniture, it will last longer, and you don't need as much wood to serve the same number of people.
It will cost more, sure, but that keeps people from just throwing it out; they sell it instead. The amortized cost is probably similar or even better, and it's less wasteful.
pluto_modadic
(Per capita:) buy one cabinet every time you move (they break if you try to move them), or buy one quality piece of wood furniture and resell it when you don't want it.
It's the disposable-plates-vs-dishwasher-plates tradeoff, but with particle board vs actual furniture.
ggoo
You did not read my comment very well. I was not commenting on the particle board tradeoff, or even the AI tradeoff we find ourselves in now. I was saying that reduction to a lower common denominator (80%), even though it seems innocuous, actually does have broader effects that aren't usually considered.
andrewla
> it kinda lowers the quality
That's why it's "80% of the real thing" and not "100% of the real thing".
doug_durham
Who said anything about particle board? There is factory-made furniture that uses long-lasting, high-quality wood. It will last generations and is still less expensive than handcrafted furniture.
nicbou
The main issue with AI is that it still relies on the industries it destroys for its training data. You still need people to cover the news, to review products, to write about their feelings and to find new things to talk about. AI just denies these people an audience or attribution.
This is where the particle board analogy falls apart. IKEA creates its own goods. AI relies on the work of the industry it's destroying.
stuartjohnson12
> Generative AI is to real writing as particle board is to wood.
Incredible analogy. Saving this one to my brain's rhetorical archives.
martin-t
One law I would like to see is mandatory expected durability. Food has an expiry date and an ingredient list; something similar should accompany all products, so consumers can make an educated choice about how long it's gonna last and what's gonna break.
"Nice metal <thing> you have there, would be a shame if one of the critical moving parts inside was actually plastic."
jayd16
Sure it's awful but look how much you get.
jmull
> If the AI search result tells you everything you need, why would you ever visit the actual website?
AI has this problem in reverse: If search gets me what I need, why would I use an AI middleman?
When it works, it successfully regurgitates the information contained in the source pages, with enough completeness, correctness, and context to be useful for my purposes… and when it doesn’t, it doesn’t.
At best it works about as well as regular search, and you don’t always get the best.
(just note: everything in AI is in the “attract users” phase. The “degrade” phase, where they switch to profits is inevitable — the valuations of AI companies make this a certainty. That is, AI search will get worse — a lot worse — as it is changed to focus on influencing how users spend their money and vote, to benefit the people controlling the AI, rather than help the users.)
AI summaries are pretty useful (at least for now), and that’s part of AI search. But you want to choose the content it summarizes.
jjice
> But you want to choose the content it summarizes.
Absolutely. The problem is that I think 95% of users will not do that unfortunately. I've helped many a dev with some code that was just complete nonsense that was seemingly written in confidence. Turns out it was a blind LLM copy-paste. Just as empty as the old Stack Overflow version. At least LLM code has gotten higher quality. We will absolutely end up with tons of "seems okay" copy-pasted code from LLMs and I'm not sure how well that turns out long term. Maybe fine (especially if LLMs can edit later).
jmull
The AIs at the forefront of the current AI boom work by expressing the patterns that exist in their training data.
Just avoid trying to do anything novel and they'll do just fine for you.
nicbou
As someone who is currently threatened by the Google Zero, thank you.
This applies to recipes, but also to everything else that requires humans to experience life and feel things. Someone needs to find the best cafes in Berlin and document their fix for a 2007 Renault Kangoo fuel pump. Someone needs to try the gadget and feel the carefully designed clicking of the volume wheel. Someone has to get their heart broken in a specific way and someone has to write some kind words for them. Someone has to be disappointed in the customer service and warn others who come after them.
If you destroy the economics of sharing with other people, of getting reader mail and building communities of practice, you will kill all the things that made the internet great, and the livelihoods of those who built them.
And that is a damn shame.
Terretta
> If you destroy the economics of sharing with other people
OK...
Someone needs to find the best cafes in Berlin and document their fix for a 2007 Renault Kangoo fuel pump. Someone needs to try the gadget and feel the carefully designed clicking of the volume wheel. Someone has to get their heart broken in a specific way and someone has to write some kind words for them. Someone has to be disappointed in the customer service and warn others who come after them.
None of those people get paid; three decades ago most of them* shared just fine on BBSs and Usenet, while paying to do so, not to mention GeoCities, Tumblr, or wherever, happily paying to share. For a long time, your dialup connection even came with FTP space on which you could host static web pages from e.g. FrontPage or any number of Windows and Mac tools. Not to mention LiveJournal and then Blogger, followed by Movable Type and WordPress...
People were happy to pay to share instead of get paid, before ads.
You cannot really destroy the economics of sharing that way; it remains too cheap and easy. Unless you were to, say, invent a giant middleman replacing these yahoos, one that prioritized "content" that works well to collect and send clicks when ads are wrapped around it, then ensure whatever anyone shares disappears unless they play the game, so more ads can be sold both on the middleman and on the content.
At that point, your sharing becomes gamified, and you're soon sharing not to share something important, but for the points....
Oh.
> the livelihoods of those who built them
But it was never supposed to be about a new class of livelihood. Imagine, if you will, some kind of Whole Earth Catalog hand-curated by a bunch of Yahoos...
https://en.wikipedia.org/wiki/Information_wants_to_be_free
---
* Those who had anything useful they felt compelled to share for the good of others, not as scaffolding content for ads to surround. Getting paid to say any of those things tends to be negatively correlated with the quality of what's being said. When people share just because "you need to know this", there tends to be something to what they put out there.
nicbou
People didn't get paid, but they got rewarded in other ways: attribution, gratitude, community. If I tell an immigrant what I do, there's a pretty good chance that their face will light up because they've used my website. It makes me giddy with pride.
I don't think most people will bother writing anything without an audience, nor will they carefully choose their words if they're fed into a machine.
Yes, the internet had ads, but it had scores of excellent free content, a lot of it crafted with love. God forbid some people find a way to live from making free useful things.
boogieknite
ive been having a difficult time putting this into words but i find anti-ai sentiment much more interesting than pro-ai
almost every pro-ai conversation ive been a part of feels like a waste of time and makes me think we'd be better off reading sci fi books on the subject
every anti-ai conversation, even if i disagree, is much more interesting and feels more meaningful, thoughtful, and earnest. its difficult to describe but maybe its the passion of anti-ai vs the boring speculation of pro-ai
im expecting and hoping to see new punk come from anti-ai. im sure its already formed and significant, but im out of the loop
personally: i use ai for work and personal projects. im not anti-ai. but i think my opinion is incredibly dull
AuthAuth
Anti-AI conversation forces us to think about what we actually value and WHY. It's a nice mix of real-life factors and philosophy, and I also find it enjoyable to read.
I've typed out so many comments but deleted them, because I find it's so hard to find the words that convey what I feel is right but also don't contradict.
johnfn
I couldn't disagree more. Every anti-AI argument I read has the same tired elements - that AI produces slop (is it?) that is soulless (really?). That the human element is lost (are you sure?). As most arguments of the form "hey everyone else, stop being excited about something" typically go, I find these to be dispassionate -- not passionate. What is there to get excited about when your true goal is to quash everyone else's excitement?
Whereas I find pro-AI arguments to be finding some new and exciting use case for AI. Novelty and exploration tend to be exciting, passion-inducing topics. It's why people like writing about learning Rust, or traveling.
At least that's my experience.
martin-t
You really did not run into a single argument against A"I" because of plagiarism, copyright infringement, LLM-induced mental illness, destruction of critical thinking skills, academic cheating, abuse of power / surveillance, profiling, censorship, LLM-powered harassment/stalking/abuse, industrialized lying, etc?
jama211
There are many good points here, but destruction of critical thinking skills is some serious citation needed stuff. They said the same thing about calculators, computers, hell even light novels back in the day.
johnfn
Ah yes, sorry I elided the rest of the list. I think you could roll all these up into "doomerism" though.
boogieknite
llm tool show-and-tell is great. i seek it out and participate. there's not much to discuss
i also think learning rust and traveling is fun to do, but boring to discuss with people who weren't there. these topics fall under the category of describing a dream: they're only compelling to the person (or people, if pair programming) who experienced it. could be a "me" thing
did Brian Eno make art with his doc's application of ai? or is Eno in the artistic out-group now? im not cool enough to keep up with this stuff. citing Eno is probably proof of my lack-of-cool. this topic is more interesting than talking about Ghidra MCP, which is the most novel application of an LLM ive experienced. i want to read the argument against Eno's application of AI as art
pluto_modadic
managers who don't understand the technicalities of what their engineers are doing only need a status update or strategy to /sound/ smart: they judge by smell. everything under the surface veneer is bullshit.
it's smart mobile text prediction. nothing more. slop is if you asked it to write the same, identical essay, and it came out with no personality, just the same bullet points, the same voicing... everything unique about the creator, everything correct about the profession, are lost. it's a cheap mcdonalds burger.
jennyholzer
lmao ai generated response
johnfn
Believe it or not, every character was typed with my fingers. I'll take this as a compliment :P
Terretta
AIs don't type --, we type —.
crazygringo
> ...some of my favourites like Smitten Kitchen and Meera Sodha because I know they’re going to be excellent. I trust that the recipe is tried and tested, and the result will be delicious. ChatGPT will give you an approximation of a recipe made up from the average of lots of recipes, but they lack the personality of each individual recipe, which will be slightly different to reflect the experiences and tastes of the author.
It's funny, I want the ChatGPT "approximation". As someone who does a lot of cooking, when I want to learn a new dish, the last thing I want is the "personality" and "tastes" of some author, which are generally expressed by including bizarre ingredient choices, or bizarrely low or high levels of fat, sugar, and salt.
I used to have to read through 15 different "idiosyncratic" versions of a recipe because every single blogger seems to want to put their own "twist" on a recipe, and then I had to figure out the commonalities across them, and then make that. It took forever.
Now I can just ask ChatGPT and get something like the "Platonic ideal" of a particular recipe, which is great to start with. And then I can ask it for suggestions of variations, which will generally be well-chosen and "standard" as opposed to idiosyncratic "individuality".
Because let's face it: individuality is great in art, whether it's fiction or music. I love individuality there. But not in everyday cooking. Usually, you just want a fairly standard version of something that tastes good. Obviously if you go to high-end dining you're looking for something more like individual art. But not for regular recipes to make at home, usually.
escapedmoose
You’ve captured an elusive sentiment so well! AI is (often, not always) good at generating the “Platonic ideal” of something. It falls apart ime when you’re confronting a specific problem that requires more nuance.
Character/personality in a creative work imo comes largely from an element of surprise. If there’s nothing counterintuitive about a work, it’s not very memorable/enticing. Maybe that’s why AI art/text feels so bland. The Platonic ideal is bland. But also, if you’re looking for function rather than art, AI can dish it out.
cindyllm
[dead]
Anamon
Maybe you're just using the wrong sources? I guess it's possible that websites don't exist for that, although I somehow doubt it. When I want the basic version of a traditional recipe, I use a cook book. Like the one I still have from high school cooking class. No experiments or fancy additions, just the basic stuff clearly put together.
There are still a lot of advantages for such a source compared to an LLM average. Cue all the many usual reasons why an LLM response might not be what you were looking for, even if you don't notice it.
AuthAuth
> when I want to learn a new dish, the last thing I want is the "personality" and "tastes" of some author
Bro, what do you think cooking is? Every dish is a generalized description of people's personal ways of making that thing, passed down through generations. There is no single authoritative way of doing it.
crazygringo
You're making my point for me. It's precisely that "generalized description" I'm looking for. Not personal idiosyncrasies laid on top.
account42
There is no such thing as a "platonic ideal" of a recipe. Picking the "most average" recipe is just as arbitrary as picking a random one.
crazygringo
Quite the opposite. You're looking for a harmonious blend of flavors and textures, which does indeed tend to be the ~average, precisely because outliers are less likely to be harmonious. The average is not arbitrary at all. It's more like the "wisdom of the crowds".
logicprog
I think the fundamental problem here is that there are two uses for the internet: as a source of on-demand information to learn a specific thing or solve a specific problem, and as a sort of proto-social network, to build human connections. For most people looking things up on the internet, the primary purpose is the former, whereas for most people posting things to the internet, the primary purpose is more the latter.
With traditional search, the two desires were integrated, because people who wanted information had to go directly to sources of information that were oriented towards human connection, and could then maybe be enramped onto the human-connection part. But it was also frustrating for that same reason, from the perspective of people that just wanted information — a lot of the time the information you were trying to gather was buried in stuff that focused too much on the personal, on the context and storytelling, when that wasn't wanted, or wasn't quite what you were looking for and so you had to read several sources and synthesize them together.
The introduction of AI has sort of totally split those two worlds. Now people who just want straight-to-the-point information targeted at specifically what they want will use an AI with web search enabled, whereas people who want to make connections will use RSS, explore other pages on blogs, and use Marginalia and Wiby to find blogs in the first place.
I'm not even really sure that this separation is ultimately a bad thing, since one would hope that its long-term effect would be to filter the visitors who show up on your blog down to those who are actually looking for precisely what you're offering.
AuthAuth
>from the perspective of people that just wanted information — a lot of the time the information you were trying to gather was buried in stuff that focused too much on the personal, on the context and storytelling, when that wasn't wanted, or wasn't quite what you were looking for and so you had to read several sources and synthesize them together.
When looking for information, it's critically important to have the story and the context included alongside the information. The context is what makes a technical blog post more reliable than an old forum post. When an AI looks at both and takes the answer, the AI user no longer knows where that answer came from, and therefore can't make an informed decision about how to interpret the information.
logicprog
That's a fair point. But it can cite the original context in case the human user decides they need it, which might be the best of both worlds? I'm not sure. Also, long-form posts may be more useful in certain cases than forum posts, but technical forums didn't pop up out of nowhere; people created and went to them precisely because they were useful even when blog posts already existed, so there's clearly a space for both. There's overlap, for sure, though.
mxuribe
I don't recall who said it (unfortunately), but back when I first heard of Gemini (the protocol and related sites, not the AI), I read a similar (though not exact) comparison... and that was their justification for why something like Gemini sites might eventually thrive. I agreed with that assessment then, and I agree with your opinions now! My question is: as this splintering gets more and more pronounced, will each separate "world" be named something like the "infonet" (for the AI/get-quick-answers world) and the "socialNet" (for the fun, meandering digital gardens)? Hmmm...
logicprog
That's sort of my ideal, to be honest — why I'm less hostile to AI agent browsers. A semantic wikipedia like internet designed for AI agents as well as more traditional org-mode like hypertext database and lookup systems to crawl and correlate for users, and a neocities or gemini-like place full of digital gardens and personal posts and stories. I don't think they'd have to be totally separate — I'm not a huge fan of splitting onto a different protocol, for instance — though; I more imagine them as sort of parallel universes living interlaced through the same internet. I like infonet as a name, but maybe something like personanet would be better for the other?
mxuribe
> ...I more imagine them as sort of parallel universes living interlaced through the same internet...
Yep, i love the approach too!
> ...maybe something like personanet would be better for the other?
100% fully agreed, personanet (or even personet or some similar alternative) is a better, more humanistic name!!!
accrual
This is a really wonderful blog. Well written, to the point, and has its own personality. I'm taking some notes for my own future blog and enjoyed meeting Penny the dog (virtually):
https://localghost.dev/blog/touching-grass-and-shrubs-and-fl...
Dotnaught
https://localghost.dev/robots.txt
    User-Agent: *
    Allow: /
charles_f
I contacted the author; she said that because no one respects it, she hasn't even tried.
thrance
Not like anyone respects that anyways.
a3w
Also, I wanted tldrbot to summarize this page. /s
criddell
That's a good point. It's not a black and white issue.
I personally see a bot working on behalf of an end user differently than OpenAI hoovering up every bit of text they can find to build something they can sell. I'd guess the owner of localghost.dev doesn't have a problem with somebody using a screen reader because although it's a machine pulling the content, it's for a specific person and is being pulled because they requested it.
If the people making LLM's were more ethical, they would respect a Creative Commons-type license that could specify these nuances.
luckys
This might be one of the best website designs I've ever experienced.
I agree with the content of the post, but I have no idea how it could even be enforced. The data is out there, and it is doubtful that laws will be passed to protect content from use by LLMs. Is there even a license that could be placed on a website barring machines from reading it? And if so, would it be enforceable in court?
Anamon
The No-Derivatives clause of Creative Commons is supposed to prohibit ML training. That's also nice because it doesn't prohibit other, human-serving purposes of using the data with "robots". Although I believe an official analysis by CC is still upcoming.
As for enforceability... I wonder about that, too. I added ND to all of my new content preemptively; it's not like it costs me much to do.
pessimizer
This website could have been written by an LLM. Real life is for humans, because you can verify that people you have shaken hands with are not AI. Even if people you've shaken hands with are AI-assisted, they're the editor/director/auteur, nothing gets out without their approval, so it's their speech. If I know you're real, I know you're real. I can read your blog and know I'm interacting with a person.
This will change when the AIs (or rather their owners, although it will be left to an agent) start employing gig workers to pretend to be them in public.
edit: the (for now) problem is that the longer they write, the more likely they are to make an inhuman mistake. This will not last. Did the "Voight-Kampff" test in Blade Runner accidentally predict something? It's not that they don't get anxiety, though; it's that they answer like they've never seen (or, perhaps more relevantly, never related to) a dying animal.
johnpaulkiser
Soon it will take hardly any help at all for static sites like this. I had ChatGPT "recreate" the background image from a screenshot of the site using its image generator, then had "agent mode" create a linktree-style "version" of the site and publish it, all without assistance.
AuthAuth
That has no content though. It's just a badly written blurb and then 4 links. If you continued this experiment and generated a blog full of content with ChatGPT, it would have the same problem: the content would be boring and painful to read, unlike the OP's blog.
a3w
It never said "this website stems from a human".
mockingloris
@a3w I suggest starting from "Real life is for humans..."
│
└── Dey well; Be well
Terretta
Having grown up in Cameroon, I get that you're excited to let everyone know you're in Nigeria. But I'm not sure the multi-line signature in all your comments is additive.
PS. Your personal site rocks and I'd be interested to help with your aim in whatever occasional way I can while I {{dayjob}}.
mockingloris
> This website could have been written by an LLM. Real life is for humans, because you can verify that people you have shaken hands with are not AI. Even if people you've shaken hands with are AI-assisted, they're the editor/director/auteur, nothing gets out without their approval, so it's their speech.
100% Agree.
│
└── Dey well; Be well