
Bot or human? Creating an invisible Turing test for the internet

imiric

I applaud the effort. We need human-friendly CAPTCHAs, as much as they're generally disliked. They're the only solution to the growing spam and abuse problem on the web.

Proof-of-work CAPTCHAs work well for making bots expensive to run at scale, but they still rely on accurate bot detection. Avoiding both false positives and negatives is crucial, yet no existing approach is reliable enough.

One comment re:

> While AI agents can theoretically simulate these patterns, the effort likely outweighs other alternatives.

For now. Behavioral and cognitive signals seem to work against the current generation of bots, but they will likely also be defeated as AI tools become cheaper and more accessible. It's only a matter of time until attackers can train a model on real human input and run inference cheaply enough. Or until the benefit of using a bot on a specific target outweighs the costs.

So I think we will need a different detection mechanism. Maybe something from the real world, some type of ID, or even micropayments. I'm not sure, but it's clear that bot detection is on the opposite, and currently losing, side of the AI race.

dataviz1000

1. Create a website with a series of tasks to capture this data.

2. Send link to coworkers via Slack so they can spend five minutes doing the tasks.

3. Capture that data and create thousands of slight variations saved to a DB as profiles.

4. Bypass bot protections.

There is nothing anyone can do to prevent bots.

ATechGuy

> There is nothing anyone can do to prevent bots.

Are you sure about this?

JimDabell

> So I think we will need a different detection mechanism. Maybe something from the real world, some type of ID, or even micropayments. I'm not sure, but it's clear that bot detection is on the opposite, and currently losing, side of the AI race.

I think the most likely long-term solution is something like DIDs.

https://en.wikipedia.org/wiki/Decentralized_identifier

A small number of trusted authorities (e.g. governments) issue IDs. Users can identify themselves to third-parties without disclosing their real-world identity to the third-party and without disclosing their interaction with the third-party to the issuing body.

The key part of this is that the identity is persistent. A website might not know who you are, but they know when it’s you returning. So if you get banned, you can’t just register a new account to evade the ban. You’d need to do the equivalent of getting a new passport from your government.
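A minimal sketch of that persistent-but-pairwise property, assuming (hypothetically) that the issuer hands each user a single master secret; real DID methods use per-relationship key derivation and selective-disclosure proofs rather than a bare HMAC, so treat this as an illustration of the linkability properties only:

    import hashlib, hmac

    def site_pseudonym(master_secret: bytes, site: str) -> str:
        # Same user + same site -> the same ID on every visit, so bans
        # stick; IDs for different sites don't correlate; and the issuer
        # never learns which sites the IDs are presented to.
        return hmac.new(master_secret, site.encode(), hashlib.sha256).hexdigest()

    secret = b"issued-once-by-a-trusted-authority"   # hypothetical
    print(site_pseudonym(secret, "example.com"))      # stable across visits
    print(site_pseudonym(secret, "othersite.org"))    # unlinkable to the above

The toy obviously doesn't stop anyone from minting extra secrets; making the secret scarce (one per passport, say) is exactly the part that needs a trusted issuer.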

johnisgood

I had not heard about DIDs at all before. How do they really work? They are government-issued? I am not sure I would trust that, though.

thatnerd

julkali

That is the Silicon Valley cryptoscam version.

This concept has already been studied extensively, e.g. [1] (in 2000!) by people like Rivest and Chaum, who have actual decades-old competence in that field.

[1] https://people.csail.mit.edu/rivest/pubs/pubs/LRSW99.pdf

timshell

Yup, Worldcoin has been one of the efforts in this space. We're trying to build a frictionless method that is less privacy-invasive than biometric scanning.

freeone3000

It also allows automated software to act on behalf of a person, which is excellent for assistive technologies and something most current bot detection leaves behind.

imiric

On the one hand, yes, this might work, but I'm concerned that it will inevitably require loss of anonymity and be abused by companies for user tracking. I suppose any type of user identification or fingerprinting is at the expense of user privacy, but I hope we can come up with solutions that don't have these drawbacks.

JimDabell

> I'm concerned that it will inevitably require loss of anonymity and be abused by companies for user tracking.

Are you sure you read my comment fully?

charcircuit

The benefit of drastically reducing fraud could create an ecosystem where the trade-off is worth it for users. For example, generous free plans or trials could exist without companies needing to invest so much in anti-fraud measures for them.

BiteCode_dev

But this means that a SaaS banning you from your account for spurious reasons now becomes a serious problem.

econ

You could roll a new ID to replace the previous one. Each user would still have only one at a time. If this isn't acceptable, a service may ask to have the feature disabled for clear mission-critical reasons and/or a fee.

msgodel

Everything on the web is a robot, every client is an agent for someone somewhere, some are just more automated.

Distinguishing them en masse seems like a waste to me. Deal with the actual problems, like resource abuse.

I think part of the issue is that a lot of people are lying to themselves that they "love the public" when in reality they don't and want nothing to do with them. They lack the introspection to untangle that, though, so it gets expressed through technical solutions instead.

bobbiechen

I do think the answer is two-pronged: roll out the red carpet for "good bots", add friction for "bad bots".

I work for Stytch and for us, that looks like:

1) make it easy to provide Connected Apps experiences, like OAuth-style consent screens "Do you want to grant MyAgent access to your Google Drive files?"

2) make it easy to detect all bots and shift them towards the happy path. For example, "Looks like you're scraping my website for AI training. If you want to see the content easily, just grab it all at /LLMs.txt instead."
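As a hedged sketch of that second point (the crawler names and the /llms.txt location are illustrative, not Stytch's actual product):

    # Known AI-crawler user-agent substrings; this list is illustrative.
    KNOWN_AI_CRAWLERS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")

    def route(user_agent: str, path: str) -> tuple[int, dict]:
        if path != "/llms.txt" and any(bot in user_agent for bot in KNOWN_AI_CRAWLERS):
            # Shift, don't block: point scrapers at a bulk-friendly endpoint.
            return 302, {"Location": "/llms.txt"}
        return 200, {}  # serve the page normally

    print(route("Mozilla/5.0 (compatible; GPTBot/1.0)", "/blog/post"))
    # -> (302, {'Location': '/llms.txt'})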

As other comments mention, bot traffic is overwhelmingly malicious. Being able to cheaply distinguish bots and add friction makes your life as a defending team much easier.

msgodel

IMO if it looks like a bot and doesn't follow robots.txt you should just start feeding it noise. Ignoring robots.txt makes you a bad netizen.

lucb1e

> but [PoWs] still rely on accurate bot detection.

No they don't, that's the point: you can serve everyone a PoW and don't have to discriminate and ban real people. This system you're enthusiastic about is what tries to do this "accurate bot detection" (scratch the first word)

vhcr

The default policy of Anubis tries to detect bots and changes the difficulty of the proof of work based on that.

https://github.com/TecharoHQ/anubis/blob/main/data/botPolici...

lucb1e

Oh... that I regularly see these pages working on a challenge probably says something about my humanness

chrismorgan

> We need human-friendly CAPTCHAs, as much as they're generally disliked. They're the only solution to the growing spam and abuse problem on the web.

This is wrong, badly wrong.

CAPTCHA stood for “Completely Automated Public Turing test to tell Computers and Humans Apart”. And that’s how people are using such things: to tell computers and humans apart. But that’s not the right problem.

Spam and abuse can come from computers, or from humans.

Productive use can come from humans, or from computers.

Abuse prevention should not be about distinguishing computers and humans: it should be about the actual usage behaviour.

CAPTCHAs are fundamentally solving the wrong problem. Twenty years ago, they were a tolerable proxy for the right problem: imperfect, but generally good enough. But they have become a worse proxy over time.

Also, “human-friendly CAPTCHAs” are just flat-out impossible in the long term. As you identify, it’s only a “for now” thing. Once it’s a target, it ceases to be effective. And the range in humans is so broad that it’s generally distressingly easy to make a bot exceed the lower reaches of human performance.

> Proof-of-work CAPTCHAs work well for making bots expensive to run at scale, but they still rely on accurate bot detection. Avoiding both false positives and negatives is crucial, yet no existing approach is reliable enough.

Proof-of-work is even more obviously a temporary solution, security by obscurity: it relies upon symmetry in computation power, which is just wildly incorrect. And all of the implementations I know of have made the bone-headed decision to start with SHA-256 hashing, which amplifies this asymmetry to a ludicrous degree (factors of tens of thousands with common hardware, to tens of millions with Bitcoin mining hardware). At that point, forget choosing different iteration counts based on bot detection; it doesn't even matter.
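To put rough, illustrative numbers on that asymmetry (ballpark figures, not from the comment): a browser hashing SHA-256 in JavaScript manages maybe 10^6 to 10^7 hashes per second, so a challenge tuned to cost a visitor a few seconds is on the order of 10^7 hashes. A commodity GPU does a few GH/s, roughly a thousand times faster, and a Bitcoin ASIC does on the order of 100 TH/s, clearing the same challenge in well under a microsecond. The per-request cost to a serious attacker rounds to zero.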

—⁂—

The inconvenient truth is: there is no Final Ultimate Solution to the Spam Problem (FUSSP).

imiric

> Spam and abuse can come from computers, or from humans.

> Productive use can come from humans, or from computers.

I agree in principle, but the reality is that 37% of all internet traffic originates from bots[1]. The overwhelming majority of that traffic (89%, according to Fastly) can be described as abusive. In turn, abusive traffic from humans likely pales in comparison. It's vastly cheaper to set up bot farms than mechanical turk farms, and it's only getting cheaper.

Identifying the source of the traffic, while difficult, is a generalizable problem, whereas tracking specific behavior depends on each site and will likely require a custom implementation for each type of service. Or it requires invasive tracking of users throughout their session, as many fraud prevention systems do.

Both approaches can be deployed at the same time. A CAPTCHA is not meant to be the only security solution anyway, but as a first layer of defense that is generally simple to deploy and maintain.

That said, I concede that the sentence "[CAPTCHAs] are the only solution" is wrong. :)

> Proof-of-work is even more obviously a temporary solution, security by obscurity

I disagree, and don't see how it's security by obscurity. It's simply a method of increasing the access cost for abusive traffic. The more signals are gathered that identify the user as abusive, the higher the "price" they're required to pay to access the service. Whether the user is a suspected bot or not could just be one type of signal. Behavioral and cognitive signals as mentioned in TFA can be others. Yes, these methods aren't perfect, and can mistakenly penalize human users and be spoofed by bots, but it's the best we currently have. This is what I'd like to see improved.
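To make that sliding price concrete, here's a sketch with made-up signal names and thresholds: fold whatever signals you trust into a risk score, and map the score to proof-of-work difficulty rather than to a hard block:

    def challenge_bits(signals: dict) -> int:
        # Illustrative signals; each contributes to a risk score in [0, 3].
        score = (
            1.0 * signals.get("datacenter_ip", False)
            + 1.0 * signals.get("headless_ua", False)
            + 0.5 * signals.get("no_mouse_entropy", False)
            + 0.5 * signals.get("bad_keystroke_cadence", False)
        )
        # ~16 bits (near-instant) for clean traffic, up to ~24 bits
        # (seconds of CPU per request) for traffic tripping every signal.
        return 16 + min(8, int(score * 3))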

Still, even with all their faults, I think PoW CAPTCHAs offer a much better UX than traditional CAPTCHAs ever did. Yes, telling humans apart from computers is getting more difficult, but it doesn't mean that the task is pointless.

[1]: https://learn.fastly.com/rs/025-XKO-469/images/Fastly-Threat...

nico

> Proof-of-work CAPTCHAs work well for making bots expensive to run at scale

“Expensive” depends on the value of what you do behind the captcha

There are human-solving captcha services that charge USD 1 for 1k captchas solved (0.1 cents per captcha)

So as long as you can charge more than what solving the captchas cost, you are good to go

Unfortunately, for a lot of tasks, humans are currently cheaper than AI

econ

There must be hilarious undiscovered Rube Goldberg machines out there where a human completes a captcha, then the host sells the solution to a broker, who passes it to the next user, who passes it to the next website, which sells it again, and so on.

msgodel

PoW captchas aren't actually captchas; they're just hashcash (i.e., make sure the person reading the content uses as much or more compute as you do serving it, so they can't DoS you on purpose or by accident). We stopped needing it for a while because compute and bandwidth grew really fast while server-side software mostly stayed the same.

johnisgood

Agreed, it indeed is hashcash.

https://en.bitcoin.it/wiki/Hashcash

https://en.wikipedia.org/wiki/Hashcash

A Factor (Forth-like language) implementation of it: https://github.com/factor/factor/blob/master/extra/hashcash/... (docs: https://github.com/factor/factor/blob/master/extra/hashcash/...)

From the docs:

> "E-mail senders attach hashcash stamps with the "X-Hashcash" header. Vendors and authors of anti-spam tools are encouraged to exempt e-mail sent with hashcash from their blacklists and content-based filtering rules."

turnsout

Exactly. If the financial incentive is there, they'll add sufficient jitter to trick the detector, and eventually train an ML model to make it even more realistic.

timshell

Yes and no. Traditional CAPTCHAs didn't cause bot farms to advance computer vision

turnsout

It's possible they didn't advance computer vision, but they certainly applied it.

lucb1e

I don't see how that contradicts the parent post. Computer vision wasn't as good when reCAPTCHA was still typing out books, but machine learning has (per my expectation, having worked with it since ~2015, but the proof would be in the pudding) likely been good enough for mimicking e.g. keystroke timings for decades. It hasn't been needed until now. That doesn't mean they won't use it now that it is needed. It's a different situation from one where the tech did not yet exist.

raincole

> Traditional CAPTCHAs didn't cause bot farms to advance computer vision

Are you sure? And how do you know?

There are a lot of CAPTCHA-cracking services. Given the price, they are hardly sustainable even at developing-country wage levels. I believe they actually solve the easy ones automatically, and humans are only involved for the harder ones.

mitthrowaway2

Weren't advancing computer vision (and digitizing books) among the goals of ReCAPTCHA? They seem to have been pretty successful with that.

Animats

Previous CAPTCHAs were based on tasks humans could do but machines could not. The machines caught up and passed humans on those tasks. These new tasks are based on the concept that humans are dumber than AI agents, making more mistakes and showing more randomness.

It might work for a while, but that's a losing battle.

timshell

> These new tasks are based on the concept that humans are dumber than AI agents, making more mistakes and showing more randomness.

Hi, this is incorrect. Different =/= dumber. The insight is that humans and computers have different constraints / algorithmic capabilities / objective functions / etc.

Animats

For a few more years, humans who haven't been laid off yet can believe that.

renegat0x0

So recently two things have happened: I was banned from the technology subreddit on Reddit, and warned on another subreddit that I behave like a bot.

Maybe it was my fault for advertising my own solution in comments.

That behavior, however, triggered bot detection; I might have behaved like an NPC. So currently a human can be identified as a bot, and banned on that premise. Crazy times.

Currently I feel I must act like a human.

JimDabell

This is interesting stuff, but I’d be seriously concerned about this accidentally catching people who have accessibility needs. How is it going to handle somebody using the keyboard to tab through controls instead of the mouse? Is a typing cadence detector going to flag people who use voice interfaces?

qoez

I totally assumed typing cadence and mouse behaviour had been incorporated into bot detection for years already. Interesting.

lq9AJ8yrfs

You are not wrong.

The article is more of an intro piece for newcomers and doesn't discuss the state of the art at all, or where the competition is; the high end of the market is pretty saturated already, but the low end is wide open.

There is a bit of a spread in the market, and the specific detection techniques are of course proprietary and dynamic. Until you have stewed on it quite a bit, it is reasonable to assume that everything you can think of (a) has been tried and (b) is either mainstream or doesn't work well, and (c) that what "working well" means is subtle.

Bots are adversarial, and nasty ones play the field. Sources of truth are scarce and expensive to consult, and the costs of false positives are felt acutely by the users and the buyers, whereas false negatives are more of a slow burn and a nagging suspicion.

hinkley

As I understand it detection software is also at great pains to make it difficult for bots to analyze the patterns of rejections to figure out what rule is catching them.

If they can narrow down the possibilities to quadratic space then you lose.

ipdashc

Yeah, I feel like I'm going crazy looking at that first example video. Was Google's CAPTCHA not supposed to analyze exactly that? Yet the mouse is insta-jumping to the input boxes, the input text is being pasted in instantaneously, and somehow it gets past? That seems utterly trivial to detect. Meanwhile us normal users are clicking on pictures of traffic lights all day?

mitchitized

That is because I do not think Google's aims for captcha are the same as ours.

I can tell you that as soon as you download Chrome and login to any Google account of yours, the captcha tests are suddenly and mysteriously gone.

Use Firefox in full-lockdown mode, and you will be clicking fire hydrants and crosswalks for the next several hours.

My crazy conspiracy theory is that Google is just using captcha as an opportunity to force everyone out of privacy mode, further empowering the surveillance capitalism engines. The intent is not to be effective, but inconvenient.

Animats

Yes. As someone who runs with Firefox in full lockdown mode, including Privacy Badger and total blocking of Google Tag Manager, I have to click on a lot of fire hydrants and crosswalks.

Very few sites are broken by blocking Google's features, incidentally. Even Privacy Badger warns that blocking Google Tag Manager may break sites. It doesn't break anything important.

timshell

me and you both

timshell

That's definitely been the marketing. The point of Section 1 is to refute it.

lucb1e

I had a security manager at a big bank (one of my first clients) tell me straight to my face that the website decides whether to let me in before I even start typing the password(-equivalent), and that the password is just a formality so as not to scare people. Near as I could tell, he believed it himself.

Marketing indeed. He had me doubting for a while what magic they weren't sharing with the rest of us to avoid countermeasures being developed, but I know better now (working in infosec, seeing what these systems catch, don't catch, and bycatch)

NoMoreNicksLeft

You can never go wrong betting on laziness and aversion to ambition for excellence.

bgwalter

chess.com had this a long time ago.

koalaman

I'm not sure reCAPTCHA is really trying to detect automated vs. human interaction with a browser. The primary use case is to detect abusive use. The distinction here is that if I automate my own browser to do things for me on sites using my personal account, that may not be a problem for site owners, while a spam or reselling operation that generates thousands of fake accounts using automation is a big problem they'd want to be able to block. I think reCAPTCHA is tailored towards the latter, and for it not to block the former might be more of a feature than a bug.

roguecoder

LinkedIn, for example, doesn't care if you as a human are manually looking at all your connections one-by-one or if you have automated a bot to do it: it will lock you out the same either way.

logsr

In a few more years there will probably be virtually no human users of web sites and apps. Everything will be through an AI agent mediation layer. Building better CAPTCHAs is interesting technically, but it is doubling down on a failed solution that nobody actually wants. What is needed is an authentication layer that allows agents to act on behalf of registered users with economic incentives to control usage. CAPTCHA has always been an economic bar only, since they are easy to farm out to human solvers, and it is a very low bar. Having an agent API with usage charges is a much better solution because it compensates operators instead of wasting the cost of solving CAPTCHAs. Maybe this will finally be the era of micro payments?

contagiousflow

> Building better CAPTCHAs is interesting technically, but it is doubling down on a failed solution that nobody actually wants

I want it. I don't want my message boards to be people's AI agents...

mdahardy

Co-founder of Roundtable here.

I agree that better authentication methods for AI agents are needed. But right now bots and malicious agents are a real problem for anyone running sites with significant traffic. In the long run I don’t think human traffic will go to zero even if its relative proportion is reduced.

bwfan123

We also need an inverse Turing test, i.e., one that detects humans pretending to be AI.

Like the recent case of Builder.ai, which had humans pretending to be AI.

Turing was a visionary, but even he could not imagine a time when humans would pretend to be bots.

jenadine

Yet humans pretending to be machines have existed for centuries: https://en.m.wikipedia.org/wiki/Mechanical_Turk

hobs

Not so far-fetched: the Mechanical Turk was created in the 1700s, so that already happened long before Turing was born.

hinkley

I’ve wanted to create a wiki for a hobby for a long time, but I don’t want to get stuck in spam and abuse reports, which just becomes more of a given with each passing year.

With a hobby wiki, eventual consistency is fine. I believe ghost bans and quarantine and some sort of invisible captcha would go a long way toward my goal, but it’s hard to find invisible captcha.

There was a research project long ago that used high-resolution data from keyboards to determine who was typing. The idea was not to use the typing pattern as a password, but to flag suspicious activity: to have someone walk past that desk to see if Sally hurt her arm playing tennis this weekend, or if Dave is fucking around on her computer while she's in a meeting.
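That kind of system usually comes down to keystroke dynamics: dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). A minimal sketch of the feature extraction, with illustrative names and no claim about what that project actually measured:

    from statistics import mean, stdev

    def keystroke_features(events):
        """events: list of (key, press_time, release_time), in seconds."""
        dwell = [release - press for _, press, release in events]
        flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
        return {
            "dwell_mean": mean(dwell), "dwell_std": stdev(dwell),
            "flight_mean": mean(flight), "flight_std": stdev(flight),
        }

    # Build a per-account profile by averaging these features over trusted
    # sessions, then flag (rather than hard-block) sessions whose features
    # land far outside the profile.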

That’s about the level I’m looking for. Assume everyone is a bot during a probationary period and put accounts into buckets of likely human, likely bot, and unknown.

What I'd have to work out, though, is temporary storage for candidate edits in a way that can't fill up my database: a way to throttle them and throw some away if they hit a limit. Otherwise it's still a DoS attack.
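One way to cap that, sketched with hypothetical names and limits: a per-account token bucket for submission rate, plus a hard cap on pending (unreviewed) edits, so quarantined accounts can never grow the queue without bound:

    import time

    class EditQuota:
        def __init__(self, per_hour: int = 6, max_pending: int = 20):
            self.rate = per_hour / 3600.0        # tokens refilled per second
            self.capacity = float(per_hour)
            self.tokens = float(per_hour)
            self.max_pending = max_pending       # hard cap on stored edits
            self.pending = 0
            self.last = time.monotonic()

        def try_submit(self) -> bool:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens < 1 or self.pending >= self.max_pending:
                return False   # throttled: drop or defer the candidate edit
            self.tokens -= 1
            self.pending += 1  # decrement when a reviewer accepts or rejects
            return True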

lucb1e

How does one graduate from probation while being hellbanned / having your contributions quarantined? Since I'm certainly not wasting my time on a second contribution as long as the first one isn't getting approved, it sounds like this would have to be a manual process, or you'd lose the new contributors who see their work go to /dev/null and never return.

hinkley

Do you believe what we are doing now is working? Because with the exception of places like this the internet sure looks pretty Dead to me.

You always have to show people their own edits. It's a common form of proofreading. But what's added and how often does matter. Misinformation is one thing. External links are potentially something much worse. I used to think SO had it figured out as far as mutual policing, but that's not working so well now either.

lucb1e

I'm not sure what e.g. showing people their own edits answers. Do you manually review submissions, or how does one get out of this initial "put everyone in quarantine" state?

I'm also not sure what "we" are doing now that makes the web look dead to you. I receive no more email spam than ten years ago, less if anything, and I haven't seen any spam on the places that I frequent like HN, stackexchange, wikipedia, mastodon, signal, github, etc.

timshell

Happy to help if I can :)

avoutos

Anyone know how this compares to Cloudflare Turnstile?

lucb1e

And so what am I supposed to do if a false positive happens?

I use keyboard navigation on many pages. Using the firefox setting "search when you start typing", I don't have to hit ctrl+f to search on the page, I just type what I want to click on and press enter or ctrl+enter for a new browser tab, or press (shift+)tab to go to the nearest (previous/next) input field. When I open HN, it's muscle memory: ctrl+t (new tab) new enter (autocompletes to the domain) thr enter (go to threads page) anything new? type first few chars of username, shift+tab+tab enter to upvote. Done? Backspace to go back. View comments of a link? Type last char of a word in the link, space, and first char of next word, that's almost always unique on the page, then escape, type men, enter, to almost always activate the comment link. Or shift+tab enter instead to upvote. On the comments page, reading top-level comments is either searching for [ and then enter+f3 when I want to collapse the next one, space for page down... Don't have to take my hands off the home row

etc., on lots of websites, including ones I've never visited before (it'll be slower and less habitual of course, but still: if there is text near where I want to go, I'm typing it). I use the mouse as well, but I find it harder to use than the keys, which are always in the same place and much easier to press.

So will it tell me that my mouse movements don't look human enough or will I see a "Sorry, something went wrong" http 403 error and have no clue if it's tracking cookies, my IP address, that I don't use Google Chrome®, that I went through pages too fast, that I didn't come past the expected page (where a cookie gets set) but clicked on a search result directly, that I have a bank in country A but residence in country B, that I now did too many tries in figuring out which of these factors is blocking me.... I can give examples of websites where I got blocked in the last ~2 months for each of these. It's such a minefield. The only thing that always passes is proof-of-work CPU challenges, but I dread to think what poor/eco people with slow/old computers are facing. Will this "invisible" captcha (yeah, invisible until you get banned) at least tell me how I'm supposed to give my money to whatever service or webshop will use this?

b0a04gl

assume this is basically nosedive but for presence on the internet. except you don't rate anyone. your device, motion, latency, and scroll inertia get rated by some pipeline you’ll never see. and that’s what decides what version of the site you get.

> what if the turing test already runs silently across every site you open. just passive gating based on scroll cadence, mouse entropy, input lag without captcha or prompt

> what if you already failed one today. maybe your browser fingerprint was too rare, maybe your keyboard rhythm matched a bot cluster from six months ago. so the UI throttled by 200ms. or the request just 403'd.

> what if the system doesn't need to prove you're a bot. it just needs a small enough doubt to skip serving you the real content.

> what if human is no longer biological but statistical. a moving average of behavior trained on telemetry from five metro cities. everyone outside that gets misclassified.

> what if you'll never know. your timeline just loads emptier than someone else's, with no explicit rejection of the content