Tell HN: Cloudflare is blocking Pale Moon and other non-mainstream browsers

553 comments

February 5, 2025

Hello.

Cloudflare's Browser Integrity Check/Verification/Challenge feature, used by many websites, is denying access to users of non-mainstream browsers like Pale Moon.

User reports began on January 31:

https://forum.palemoon.org/viewtopic.php?f=3&t=32045

This situation occurs at least once a year, and there is no easy way to contact Cloudflare. Their "Submit feedback" tool yields no results. A Cloudflare Community topic was flagged as "spam" by members of that community and was promptly locked with no real solution, and no official response from Cloudflare:

https://community.cloudflare.com/t/access-denied-to-pale-moo...

Partial list of other browsers that are being denied access:

Falkon, SeaMonkey, IceCat, Basilisk.

Hacker News 2022 post about the same issue, which brought attention and had Cloudflare quickly patching the issue:

https://news.ycombinator.com/item?id=31317886

A Cloudflare product manager declared back then: "...we do not want to be in the business of saying one browser is more legitimate than another."

As of now, there is no official response from Cloudflare. Internet access is still denied by their tool.


nikkwong

Yesterday I was attempting to buy a product on a small retailer's website—as soon as I hit the "add to cart" button I got a message from Cloudflare: "Sorry, you have been blocked". My only recourse was to message the owner of the domain asking them to unblock me. Of course, I didn't, and decided to buy the product elsewhere. I wasn't doing anything suspicious.. using Arc on a M1 MBP; normal browsing habits.

Not sure if this problem is common, but I would be pretty upset if I implemented Cloudflare and it started to inadvertently hurt my sales figures. I would hope the cost to retailers is trivial in this case; I guess the upside of blocking automated traffic can be quite great.

Just checked again and I'm still blocked on the website. Hopefully this kind of thing gets sorted out.

LeifCarrotson

> I would be pretty upset if I implemented Cloudflare and it started to inadvertently hurt my sales figures.

The problem is that all these Cloudflare forensics-based throttling and blocking efforts don't hurt sales figures.

The number of legitimate users running Arc is a rounding error. Arc browser users often come to Cloudflare without third-party tracking and without cookies, which is weird and therefore suspicious - you look an awful lot like a freshly instantiated headless browser, in contrast to the vast majority of legitimate users who are carrying around a ton of tracking data. And by blocking cookies and ads, you wouldn't even be attributable in most of the stats if they did let you in.

It would be like kicking anyone wearing dark sunglasses out of a physical store: sure, burglars are likely to want to hide their eyes. Retail shrink is something like 1.5% of inventory, while blind users are <0.5% of the population. It would violate the ADA (and basic ethics) to keep out all blind shoppers, so in the real world we've decided that it's not legal to discriminate on this basis even if it would be a net positive for your financials.

The web is a nearly unregulated open ocean, Cloudflare can effectively block anyone for any reason and they don't have much incentive to show compassion to legitimate users that end up as bycatch in their trawl nets.

azemetre

Something tells me that if you asked the store owner that the poster tried to give money to, they'd be furious at cloudflare for stopping the transaction.

Liskni_si

Yeah maybe if you somehow managed to email them without their email provider stopping that email from reaching them…

graemep

What about all false positives in aggregate?

The problem is site owners do not know - it just adds to the number of blocked threats in cloudflare's reassuring emails.

edelbitter

It is difficult to gauge the size of the Cloudflare effect.. if the usage statistics the site owner is collecting.. are also not collected for those undesirables.

TheRealPomax

The number of legitimate users on "not chrome, edge, safari, or firefox" is about 10% of the browser market. I don't know about you, but if I'm running a shop, and the whole point of my website is to make sales, but my front door is preventing 10% of those sales? That door is getting replaced.

lotsofpulp

You don't think the people actually running the shops, whose income depends on the shop, have thought of that and thus there exists a downside that more than offsets the upside?

Aldo_MX

Then you get burglars in your shop instead of legitimate customers.

User Agents look the way they do because this is a recurring issue.

A browser without network effects gets blocked, they look for a way to bypass the blocking, then they become mainstream and now the de-facto UA is larger than before.

supernovae

If you were running a shop, you would realize that nearly 100% of the fraud is "not chrome, edge, safari, or firefox"

It's unfortunate yes but that's what drives the threat signatures

agoodusername63

Why would you assume that the 10% of non standard browsers are going to buy anything?

Demographic is important here. If I was running a shop that sold software for Linux users, sure. If I'm running a store that sells pretty much anything else? I'm not caring.

NoMoreNicksLeft

>That door is getting replaced.

Sure. If there was another place to buy a better door at. But if that door manufacturer's the only one that makes doors, if the door installer and door technicians all tell you that they can't or won't make another door for you, then you just deal. Maybe crank up the prices a bit to try to mitigate your 10% shortfalls.

The place where a business looks at that problem and sees money being left on the table that it can't live without and that it has no other way of making up for... that is a very narrow stretch, and only very marginal businesses live there.

RobotToaster

I wonder if cloudflare blocks like these affect screen reader users, in which case they may violate the ADA.

dragontamer

And if they did violate the ADA, do you seriously expect this administration's anti-DEI Department of Justice to pursue legal action?

samspot

In my experience, screen reader users stick to the mainstream browsers to preserve compatibility. https://webaim.org/projects/screenreadersurvey10/

jen729w

Vendors who block iCloud Relay are the worst. I'm sure they don't even know they're doing it. But some significant percentage of Apple users -- and you'd have to think it's only gonna grow -- comes from those IP address ranges.

Bad business, guys. You gotta find another way. Blocking IP addresses is o-ver.

grayhatter

> Bad business, guys. You gotta find another way. Blocking IP addresses is o-ver.

no, it's still the front line. And likely always will be. It's the only client identifier bots can't lie about. (or nearly the only)

At $OLDJOB, ASN reputation was the single best predictor of traffic hostility. We were usually smart enough to know which we can, or can't block outright. But it's an insane take to say network based blocking is over... especially on a thread about some vendor blocking benign users because of the user-agent.


weare138

I don't use iCloud Relay but it seems Apple's ASN would be 'reputable'.

jidar

Blocking based on ASN has never and should never be the frontline. It's the illusion of increased security with little actual impact. The bad guys are everywhere and if blocking an ASN has an improvement on your actual breaches then your security is total crap and always will be until you start doing the right things.

cprecioso

This would be weird, esp. given that Cloudflare is one of the vendors who act as exit nodes for iCloud Relay.

latexr

I believe your parent comment means when the target website blocks, not Cloudflare.

YouTube is a perfect example. Using iCloud Private Relay can now frequently label you as a bot, which stops you from watching videos until you login.

jrootabega

I don't think that's weird. That's what I would want from an honest vendor who is involved in both services - block anonymization/obfuscation users if I'm paying you to block them. Apple/Cloudflare don't sell/support iCloud Relay as a service that is guaranteed to get you treated nicely by the parties on the other end, so they're not being deceptive with that part either.

What I'd worry about is Cloudflare using their knowledge of their VPN clients to allow services behind their attack protection to treat those clients better, because maybe they're leaking client info to the protected services.

Not that I think Cloudflare/Apple/etc. are supremely noble/honest/moral, or that it's good that semi-anonymous connections are treated so badly by default; this juxtaposition just doesn't seem like a problem to me.

EDIT: OK, I back off of this position somewhat. Apple's marketing of iCloud Relay might allow users to believe it's more prestigious and reputable than a VPN/Tor. They do have fine print explaining that you might be treated badly by the remote services, but it's, you know, fine print, and Apple knows that they have a reputation for class and legitimacy.

hedora

I’ve noticed wifi at coffee shops, etc have started blocking it too.

I need to disable it for one of my internal networks (because I have DNS overrides that go to 192.168.0.x), or I’d wish they’d just make it mandatory for iPhones and put an end to such shenanigans.

Apple could make it a bit more configurable for power users, and then flip the “always on” nuclear option switch.

Either that, or they could add a “workaround oppressive regimes” toggle that’d probably be disabled in China, but hey, I’m in the US, so whatever.

Edit: I also agree that blocking / geolocating IP addresses is a big anti-pattern these days. Many ISPs use CGNAT. For instance, all starlink traffic from the south half of the west coast appears to come from LA.

As a result, some apps have started hell-banning my phone every time I drive to work because they see me teleport hundreds of miles in 10 minutes every morning. (And both of my two IPs probably have 100’s of concurrent users at any given time. I’m sure some of them are doing something naughty).
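The "teleport" heuristic described here is usually called an impossible-travel check. A minimal sketch of how one might work, assuming the app geolocates each request's IP to a latitude/longitude and flags any hop faster than a threshold. The coordinates, the 1000 km/h threshold, and the function names are illustrative assumptions, not taken from any real app:

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(prev, curr, max_kmh=1000.0):
    """prev and curr are (lat, lon, unix_seconds); max_kmh is a made-up threshold."""
    distance = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = max((curr[2] - prev[2]) / 3600.0, 1e-9)  # avoid division by zero
    return distance / hours > max_kmh


# Two illustrative geolocation results a few hundred km apart, ten minutes apart:
# the device did not move, only the IP-to-location lookup changed.
home_wifi = (35.28, -120.66, 0)
cgnat_exit = (34.05, -118.24, 600)
print(is_impossible_travel(home_wifi, cgnat_exit))
```

The check cannot tell a stolen session from a phone that switched from home Wi-Fi to a CGNAT exit, which is why geolocation-based flags produce the false positives described above.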

rthomas6

Wait, this comment made me aware of the existence of iCloud Relay. Apple built their own Tor only for Apple users? Why would they do that? Why not use Tor???

dewey

You can use iCloud Relay without even noticing that you are using it, this is not true with Tor as you'll spend most of your time waiting for reconnecting circuits.

guipsp

Because it is 1. Not Tor and 2. Fast

echoangle

It’s more like a VPN instead of Tor

jillyboel

If you use a weird proxy you're gonna get blocked. Facts of life.

oremolten

Well, it's primarily because the security vendors for, say, WAFs and other tools list these IPs in the "Anonymizers" or "VPN" category, and these are typically blocked because you seldom see legitimate traffic to your storefront or account pages originating from them. Another vendor we use lists these under "hacking tools". So your option as a security professional is to express to your risk management team we allow "hacking tools" or lose iCloud Relay customers. Which way do you think they steer? In other cases a site may use a vendor for its cart/checkout page and not even have control over these blocks, as the vendor is also blocking "hacking tools" or "anonymizers" from hitting the checkout pages.

grayhatter

> So your option as a security professional is to express to your risk management team we allow "hacking tools" or lose iCloud Relay customers

a professional would explain how the vendor is being lazy and making a mistake there because they don't understand your business.

depending on the flavor of security professional (hacker) they might also subtly suggest that this vendor is dumb and should be embarrassed they've made this mistake, thus creating the implication that if you still want to block these users you would also have to be an idiot

under no circumstance would I ever allow anyone to get the mistaken impression that some vendor understands my job better than I do. As a "security professional" it's literally your job to identify hostile traffic, better than a vendor could.

Yeul

Oh I think we all know that the Endgame is only allowing the approved webbrowser from the approved hardware. And getting on those lists will be made very expensive indeed...

oremolten

Wait till you see how M365 does management around iCloud Relay; it makes troubleshooting suspicious login parameters real fun...

Xelbair

To access any site protected by cloudflare captcha i have to change browsers from firefox to chrome. and i have basically default suite of addons (ublock is the only one affecting the pages themselves).

VPN doesn't matter, i probably share IP with someone "flagged" via ISP.

Every site, that is, except their Cloudflare dashboard.

benhurmarcel

I have come across several websites on which Cloudflare blocks my devices, whatever I use. No Captcha, just blocked. I tried a stock iPhone (Safari, no blockers, no VPN, no iCloud relay, both on wifi or 4G), and a Windows PC with Firefox, Chrome, or Edge, no luck. That includes a website of a local business so that can't be the country either.

I have no idea why.

KomoD

Maybe you have anti-fingerprinting protection on? I've heard it can cause issues.

Xelbair

No, only thing i have is dns-over-https.

But i should turn that on.

justinpombrio

> Of course, I didn't, and decided to buy the product elsewhere

Consider messaging the owner to tell them you were trying to buy a product on their site and the site wouldn't let you. There's a chance that they'll care and be able to do something about it. But no chance if they don't know about the problem!

raxxorraxor

I think this is on Cloudflare. Perhaps there is demand for such a service, but it is another thing to implement it. And this is very bad for a free and therefore safe net.

I don't even know which attack vectors an integrity check for a browser could help against. Against infected clients? In any case, it is evidently not effective.

wvh

There is some political-philosophical irony that the Chinese prefer their government to do the blocking and take away their freedom, while the US prefers their monopolistic capitalistic corporate world to do it. A rose by any other name. Choose your friends carefully.

Ray20

To trivialize totalitarian regimes that carry out terror against their own citizens, that can outright kill you and your whole family, by comparing them to a capitalistic corporate world where, in the worst case, you can simply choose another, less fancy option, is the height of madness.

taurknaut

> using Arc on a M1 MBP; normal browsing habits.

Well i've certainly never heard of this browser before and it still seems pretty young. I'd guess it's the same issue.

yurishimo

Arc is almost 3 (4?) years old and was the darling child of dev influencers for the better part of 2 years. It's not a niche browser, especially amongst devs that are likely to work at Cloudflare.

littlestymaar

It's definitely a niche browser. I think I heard of it once on HN over the past few years, and I'd be surprised if there were actually more than a few thousand people using it.

bdhcuidbebe

It is a niche browser with no hype going for it.

chrisandchris

I'm still not sure how some random browser should result in a block by the provider. I don't think there's any security risk for the provider of the site by using an outdated browser. Blocking malicious IPs yes/maybe, blocking suspicious activity maybe. But because you have browser X - please not.

This is going to lead to a two-class internet where new technologies will not emerge and big players will win because the gate is so absurdly high and random that people stop inventing.

taurknaut

I presume this was not intentional.

tyzoid

It's a chromium derivative.

Elfener

I think it's also EOL/not getting updates now?

I mean I never used it, their only selling point seems to have been hype.

lijok

Definitely not EOL; https://resources.arc.net/hc/en-us/articles/20498293324823-A...

wraptile

Cloudflare doesn't report this to the site admins so they're just sitting there losing sales and thinking Cloudflare is doing a good job.

throitallaway

Same thing with Captchas. If I'm placing a food order or something and I'm presented with a Captcha 9 times out of 10 I just say "screw it."

tibbar

This echoes the user agent checking that was prevalent in past times. Websites would limit features and sometimes refuse to render for the "wrong" browser, even if that browser had the ability to display the website just fine. So browsers started pretending to be other browsers in their user agents. Case in point - my Chrome browser, running on an M3 mac, has the following user agent:

"'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36'"

That means my browser is pretending to be Firefox AND Safari on an Intel chip.

I don't know what features Cloudflare uses to determine what browser you're on, or if perhaps it's sophisticated enough to get past the user agent spoofing, but it's all rather funny and reminiscent just the same.
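To see what that string actually carries, it helps to split it into its `name/version` product tokens and its parenthesised comments. A minimal sketch using only the user agent quoted above; the helper names are made up for illustration:

```python
import re

UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36")


def product_tokens(ua):
    """Return (name, version) pairs, ignoring parenthesised comments."""
    without_comments = re.sub(r"\([^)]*\)", " ", ua)
    return re.findall(r"([A-Za-z]+)/([\d.]+)", without_comments)


def comment_blocks(ua):
    """Return the text inside each (...) comment."""
    return re.findall(r"\(([^)]*)\)", ua)


for name, version in product_tokens(UA):
    print(f"{name:12} {version}")
for block in comment_blocks(UA):
    print("comment:", block)
```

Only one of the four product tokens names the browser that is really running; the rest are compatibility tokens, and the platform comment reports an Intel Mac regardless of the actual chip.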

gloosx

I'm still using Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 on my desktop.

The internet is so much better like this! There is a 2010 lightweight mobile version of Google, and m.youtube with obviously cleaner and better UI and not a single ad (apparently it's not worth to show you ads if you still appear to be using iphone 6)

anticensor

> (apparently it's not worth to show you ads if you still appear to be using iphone 6)

Why not adwall the user instead, showing only ads until they upgrade the device or buy premium?

hexagonwin

This is iOS 6 and not iPhone 6, btw.

gloosx

Whoa, really. So it is even back to 3GS/4 days then.

leafmeal

I tried this just for fun and youtube said to update my browser :(

gloosx

When you click OK it lets you in regardless ;)

johnmaguire

As a counterpoint, I asked Claude to write a script to fetch Claude usage and expose it as a Prometheus metric. As no public API exists, Claude suggested I grab the request from the Network tab. I copied it as cURL, and attempted to run it, and was denied with a 403 from CF.

I forgot the script open, polling for about 20 minutes, and suddenly it started working.

So even sending all the same headers as Firefox, but with cURL, CF seemed to detect automated access, and then eventually allowed it through anyway after it saw I was only polling once a minute. I found this rather impressive. Are they using subtle timings? Does cURL have an easy-to-spot fingerprint outside of its headers?

Reminded me of this attack, where they can detect when a script is running under "curl | sh" and serve alternate code versus when it is read in the browser: https://news.ycombinator.com/item?id=17636032

schroeding

> Does cURL have an easy-to-spot fingerprint outside of its headers?

If it's a https URL: Yes, the TLS handshake. There are curl builds[1] which try (and succeed) to imitate the TLS handshake (and settings for HTTP/2) of a normal browser, though.

[1] https://github.com/lwthiker/curl-impersonate
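One publicly documented way to fingerprint a TLS handshake is JA3, which hashes fields of the ClientHello that the client cannot change by editing HTTP headers. The sketch below shows the idea only: the ClientHello values are made up, and nothing in this thread confirms which fingerprinting scheme Cloudflare actually uses.

```python
import hashlib

# GREASE values (RFC 8701) are randomised by some clients, so they are
# dropped before hashing to keep the fingerprint stable.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input: five comma-separated fields, values joined by '-'."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([str(version), join(ciphers), join(extensions),
                     join(curves), join(point_formats)])


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Two hypothetical clients sending identical HTTP headers but offering
# different cipher suites and extensions in their ClientHello.
browser_like = (771, [0x0A0A, 4865, 4866, 4867], [0, 23, 65281, 10, 11], [29, 23, 24], [0])
tool_like = (771, [4866, 4867, 4865, 49196], [0, 11, 10, 13], [29, 23], [0])

print(ja3_string(*browser_like))
print(ja3_hash(*browser_like))
print(ja3_hash(*tool_like))
```

Because the hash depends on the order and content of what the TLS library offers, copying a browser's headers into another client leaves the fingerprint unchanged, which is the gap that impersonation builds like the one linked above try to close.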

bennyg

To echo further, they may be leaning on something like the [ja4 fingerprint](https://www.google.com/url?sa=t&source=web&rct=j&opi=8997844...) (which you'd need to rebuild curl to emulate that chromium version to try and trick).

areyourllySorry

it's possible there was an attack that stopped which led to more lenient antibot

ZeWaka

> if perhaps it's sophisticated enough to get past the user agent spoofing

As a part of some browser fingerprinting I have access to at work, there's both commercial and free solutions to determine the actual browser being used.

It's quite easy even if you're just going off of the browser-exposed properties. You just check the values against a prepopulated table. You can see some of such values here: https://amiunique.org/fingerprint

Edit: To follow up, one of the leading fingerprinting libraries just ignores useragent and uses functionality testing as well: https://github.com/fingerprintjs/fingerprintjs/blob/master/s...
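The "prepopulated table" approach can be sketched roughly as follows. The table is a hypothetical, heavily simplified stand-in: real fingerprinting products use far more signals, and the three properties and their expected values here are illustrative assumptions rather than a verified signature set.

```python
# Hypothetical signature table: observed property -> expected value per engine family.
SIGNATURES = {
    "chromium": {"vendor": "Google Inc.", "productSub": "20030107", "has_window_chrome": True},
    "firefox": {"vendor": "", "productSub": "20100101", "has_window_chrome": False},
    "safari": {"vendor": "Apple Computer, Inc.", "productSub": "20030107", "has_window_chrome": False},
}


def best_match(observed):
    """Return (family, score) for the signature sharing the most properties."""
    scores = {
        family: sum(observed.get(key) == value for key, value in signature.items())
        for family, signature in SIGNATURES.items()
    }
    family = max(scores, key=scores.get)
    return family, scores[family]


# A client whose User-Agent header claims Firefox but whose script-visible
# properties look like a Chromium build.
claimed_family = "firefox"
observed = {"vendor": "Google Inc.", "productSub": "20030107", "has_window_chrome": True}

detected, score = best_match(observed)
print(detected, score, "mismatch" if detected != claimed_family else "consistent")
```

A mismatch between the claimed and detected family is only a signal; an independent browser built on a shared engine would look the same to a table like this.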

wongarsu

They are pretending to be an ancient Mozilla version from the time after Netscape but before Firefox, KHTML (which was forked to webkit), Firefox (Gecko engine), Chrome and Safari. The only piece of browser history it's missing is somehow pretending to be IE.

mh-

> The only piece of browser history it's missing is somehow pretending to be IE.

They're kinda covered because IE also sent Mozilla/5.0 (or 4.0, 2.0, [..]).

tibbar

Amusingly, I also just realized that even the operating system is spoofed here! I'm on macOS 14, yet the user agent claims "Mac OS X" 10.15. It's a pretty funny situation, and clearly for the sole benefit of very old websites and libraries performing dubious checks.

zerocrates

I don't know if they still do it, but the Apple Silicon Macs also lied about their architecture and said they're Intel. Truth is not the guiding principle of the User-Agent (or all the JS navigator properties, or anything else easy to use to check this kind of thing).

Avamander

> I don't know what features Cloudflare uses to determine what browser you're on, or if perhaps it's sophisticated enough to get past the user agent spoofing, but it's all rather funny and reminiscent just the same.

Yes, it is, both your TLS and TCP stacks are unique enough that such spoofing can be detected. But there are a lot of other things that can be fingerprinted as well.

createaccount99

> That means my browser is pretending to be Firefox AND Safari on an Intel chip.

That's not the case, that ua is Chrome on MacOS. The rest is backward compatibility garbage

tibbar

This is the user agent on Chrome, but the reason for all the references to other browsers (and an old OS and architecture), the backward compatibility garbage, is to pretend to be those browsers for the sake of old websites that are doing string matching on the user agents.
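The string matching in question tends to look like the first function below. It is a sketch of the kind of check old sites ran, not code from any particular site, and it shows why check order matters once every browser carries every other browser's tokens:

```python
CHROME_UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36")


def naive_sniff(ua):
    """Substring checks in an arbitrary order, as legacy sniffers often did."""
    if "Safari" in ua:
        return "Safari"
    if "Chrome" in ua:
        return "Chrome"
    return "unknown"


def ordered_sniff(ua):
    """Check the most specific token first, the most widely copied one last."""
    if "Chrome/" in ua:
        return "Chrome"
    if "Safari/" in ua:
        return "Safari"
    return "unknown"


print(naive_sniff(CHROME_UA))    # misidentifies the browser
print(ordered_sniff(CHROME_UA))
```

Even the ordered version breaks as soon as another browser adds its own token after `Chrome/`, which is how the strings keep growing.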

christophilus

Slack was doing this with their huddle feature for the longest time (still were last I checked). Drives me crazy.

6031769

Doesn't drive me crazy - gives me a "Get Out of Huddles Free" card.

ai-christianson

How many of you all are running bare metal hooked right up to the internet? Is DDoS or any of that actually a super common problem?

I know it happens, but also I've run plenty of servers hooked directly to the internet (with standard *nix security precautions and hosting provider DDoS protection) and haven't had it actually be an issue.

So why run absolutely everything through Cloudflare?

matt_heimer

Yes, [D]DoS is a problem. It's not uncommon for a single person with residential fiber to have more bandwidth than your small site hosted on a 1u box or VPS. Either your bandwidth is rate limited and they can denial of service your site or your bandwidth is greater but they can still cause you to go over your allocation and cause massive charges.

In the past you could ban IPs but that's not very useful anymore.

The distributed attacks tend to be AI companies that assume every site has infinite bandwidth and their crawlers tend to run out of different regions.

Even if you aren't dealing with attacks or outages, Cloudflare's caching features can save you a ton of money.

If you haven't used Cloudflare, most sites only need their free tier offering.

It's hard to say no to a free service that provides feature you need.

Source: I went over a decade hosting a site without a CDN before it became too difficult to deal with. Basically I spent 3 days straight banning ips at the hosting company level, tuning various rate limiting web server modules and even scaling the hardware to double the capacity. None of it could keep the site online 100% of the time. Within 30 mins of trying Cloudflare it was working perfectly.

johnmaguire

> It's hard to say no to a free service that provides feature you need.

Very true! Though you still see people who are surprised to learn that CF DDOS protection acts as a MITM proxy and can read your traffic plaintext. This is of course by design, to inspect the traffic. But admittedly, CF is not very clear about this in the Admin Panel or docs.

Places one might expect to learn this, but won't:

- https://developers.cloudflare.com/dns/manage-dns-records/ref...

- https://developers.cloudflare.com/fundamentals/concepts/how-...

- https://imgur.com/a/zGegZ00

sophacles

How would you do DDoS protection without having something in path?

Aachen

> not uncommon for a single person with residential fiber to have more bandwidth than your small site hosted on a 1u box or VPS.

Then self host from your connection at home, don't pay for the VPS :). That's what I've been doing for over a decade now and still never saw a (D)DoS attack

50 mbps has been enough to host various websites, including one site that allows several gigabytes of file upload unauthenticated for most of the time that I self host. Must say that 100 mbps is nicer though, even if not strictly necessary. Well, more is always nicer but returns really diminish after 100 (in 2025, for my use case). Probably it's different if you host videos, a Tor relay, etc. I'm just talking normal websites

lucumo

> 50 mbps has been enough to host various websites,

Bandwidth hasn't been a limiting factor for years for me.

But generating dynamic pages can bring just enough load for it to get painful. Just this week I had to blacklist Meta's ridiculously overactive bot sending me more requests per second than all my real users do in an hour. Meta and ClaudeBot have been causing intermittent overloads for weeks now.

They now get 403s because I'm done trying to slow them down.
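Returning a 403 to a self-identified crawler can be done in a few lines at the application layer. A minimal WSGI sketch, assuming the bot sends an honest User-Agent; `ClaudeBot` comes from the comment above, while `examplebot` is a placeholder rather than any real crawler's token:

```python
# Lowercase substrings to reject. "examplebot" is a hypothetical placeholder.
BLOCKED_UA_TOKENS = ("claudebot", "examplebot")


def block_bots(app):
    """Wrap a WSGI app and answer 403 to blocklisted user agents."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in BLOCKED_UA_TOKENS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real (expensive) dynamic page."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


application = block_bots(site)
```

The rejection happens before the expensive page is generated, but it only works against crawlers that announce themselves.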

johnklos

I've been hosting web sites on my own bare metal in colo for more than 25 years. In all that time I've dealt with one DDoS that was big enough to bring everything down, and that was because of a specific person being pissed at another specific person. The attacker did jail time for DDoS activities.

Every other attempt at DDoS has been ineffective, has been form abuse and credential stuffing, has been generally amateurish enough to not take anything down.

I host (web, email, shells) lots of people including kids (young adults) who're learning about the Internet, about security, et cetera, who do dumb things like talk shit on irc. You'd think I'd've had more DDoS attacks than that rather famous one.

So when people assert with confidence that the Internet would fall over if companies like Cloudflare weren't there to "protect" them, I have to wonder how Cloudflare marketed so well that these people believe this BS with no experience. Sure, it could be something else, like someone running Wordpress with a default admin URL left open who makes a huge deal about how they're getting "hacked", but that wouldn't explain all the Cloudflare apologists.

Cloudflare wants to be a monopoly. They've shown they have no care in the world for marginalized people, whether they're people who don't live in a western country or people who simply prefer to not run mainstream OSes and browsers. They protect scammers because they make money from scammers. So why would people want to use them? That's a very good question.

systems_glitch

Same basic experience. The colo ISP soaks up most actual DDoS. We had a couple mid-sized ones when we were hosting irc.binrev.net from salty b& users. No real effect other than the colo did let us know it was happening and that it was "not a significant amount of DDoS by our standards."

mvdtnz

I'm sorry but lumping in people who prefer to use a weird browser with "marginalised people" does not help your credibility.

Aachen

What bit do you mean specifically? As a fellow web hoster, who also hosted kids before (from a game making forum), I can fully corroborate what they're saying

johnklos

You're focusing on the wrong kind of pedantry.

"Marginalized" has a specific connotation, sure, but people can be marginalized for reasons other than, or in addition to, those that fit the connotation.

grishka

> How many of you all are running bare metal hooked right up to the internet?

I do. Many people I know do. In my risk model, DDoS is something purely theoretical. Yes it can happen, but you have to seriously upset someone for it to maybe happen.

maples37

From my experience, if you tick off the wrong person, the threshold for them starting a DDoS is surprisingly low.

A while ago, my company was hiring and conducting interviews, and after one candidate was rejected, one of our sites got hit by a DDoS. I wasn't in the room when people were dealing with it, but in the post-incident review, they said "we're 99% sure we know exactly who this came from".

Loughla

What the hell is wrong with people? Honestly the lack of substantive human interaction in a lot of folks' lives, except via the Internet, is a real problem.

Take that story for instance. Here's how that goes in the physical world, just to show how unbelievably ridiculous it is.

So you didn't get the job? What's your next step?

I'll stop by their office and keep people from entering the front doors by running around in front of them. That'll show those bastards.

professorsnep

I run a Mediawiki instance for an online community on a fairly cheap box (not a ton of traffic) but had a few instances of AI bots like Amazon's crawling a lot of expensive API pages thousands of times an hour (despite robots.txt preventing those). Turned on Cloudflare's bot blocking and 50% of total traffic instantly went away. Even now, blocked bot requests make up 25% of total requests to the site. Without blocking I would have needed to upgrade quite a bit or play a tiring game of whack a mole blocking any new IP ranges for the dozens of bots.
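robots.txt only works when the crawler chooses to consult it. Python's standard library shows what a well-behaved crawler does before fetching; the rule and URLs below are illustrative, not the actual configuration of the wiki described above:

```python
from urllib.robotparser import RobotFileParser

# Illustrative rule keeping crawlers away from an expensive API endpoint.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler asks first and skips anything disallowed.
print(parser.can_fetch("ExampleBot", "https://wiki.example/api.php"))
print(parser.can_fetch("ExampleBot", "https://wiki.example/wiki/Main_Page"))
```

Nothing on the server enforces the answer: a crawler that never calls something like `can_fetch` simply requests the page, which is why enforcement ends up at the proxy or web server.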

mrweasel

AI bots are a huge issue for a lot of sites. Just putting intentional DDoS attacks aside, AI scrapers can frequently tip over a site because many of them don't know how to back off. Google is an exception really, their experience with creating GoogleBot has ensured that they are never a problem.

Many of the AI scrapers don't identify themselves, they live on AWS, Azure, Alibaba Cloud, and Tencent Cloud, so you can't really block them, and rate limiting also has limited effect as they just jump to new IPs. As a site owner, you can't really contact AWS and ask them to terminate their customer's service in order for you to recover.

CGamesPlay

How do you feel, knowing that some portion of the 25% “detected bot traffic” are actually people in this comment thread?

account42

You don't need buttflare's mystery juice to rate-limit or block bad users.

motiejus

I've been running jakstys.lt (and subdomains like git.jakstys.lt) from my closet, a simple residential connection with a small monthly price for a static IP.

The only time I had a problem was when gitea started caching git bundles of my Linux kernel mirror, which bots kept downloading (things like a full targz of every commit since 2005). Server promptly went out of disk space. I fixed gitea settings to not cache those. That was it.

Not ever ddos. Or I (and uptimerobot) did not notice it. :)

nijave

Small/medium SaaS. Had ~8 hours of 100k reqs/sec last year when we usually see 100-150 reqs/sec. Moved everything behind a Cloudflare Enterprise setup and ditched AWS Client Access VPN (OpenVPN) for Cloudflare WARP

I've only been here 1.5 years but sounds like we usually see 1 decent sized DDoS a year plus a handful of other "DoS" usually AI crawler extensions or 3rd parties calling too aggressively

There are some extensions/products that create a "personal AI knowledge base" and they'll use the customers login credentials and scrape every link once an hour. Some links are really really resource intensive data or report requests that are very rare in real usage

gamegod

Did you put rate limiting rules on your webserver?

Why was that not enough to mitigate the DDoS?

danielheath

Not the same poster, but the first "D" in "DDoS" is why rate-limiting doesn't work - attackers these days usually have a _huge_ (tens of thousands) pool of residential ip4 addresses to work with.
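To make that point concrete, here is a minimal sketch of a per-IP fixed-window limiter and what happens when the same request volume is spread over a large address pool. The class name, the limit of 10 requests per 60 seconds, and the address pool are all made up for illustration; this is not how any particular webserver implements it.

```python
from collections import defaultdict


class PerIpRateLimiter:
    """Fixed-window limiter: at most `limit` requests per IP per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # Maps (ip, window index) -> requests seen in that window.
        # A real limiter would also prune old windows; this sketch does not.
        self.counts = defaultdict(int)

    def allow(self, ip: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        key = (ip, window)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


def count_allowed(limiter, ips, requests_total, now=0.0):
    """Send requests round-robin across `ips`; return how many get through."""
    allowed = 0
    for i in range(requests_total):
        if limiter.allow(ips[i % len(ips)], now):
            allowed += 1
    return allowed


# One noisy client: almost everything is rejected.
single_source = count_allowed(PerIpRateLimiter(10, 60), ["203.0.113.1"], 10_000)

# Same volume spread over 10,000 distinct addresses: nothing is rejected.
pool = [f"10.{i // 256}.{i % 256}.1" for i in range(10_000)]
distributed = count_allowed(PerIpRateLimiter(10, 60), pool, 10_000)

print(single_source, distributed)  # 10 10000
```

Each address stays far below the per-IP limit, so the limiter never fires even though the total load is identical.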

nijave

We had rate limiting with Istio/Envoy but Envoy was using 4-8x normal memory processing that much traffic and crashing.

The attacker was using residential proxies and making about 8 requests before cycling to a new IP.

Challenges work much better since they use cookies or other metadata to establish a client is trusted then let requests pass. This stops bad clients at the first request but you need something more sophisticated than a webserver with basic rate limiting.
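A rough sketch of the cookie idea, assuming the challenge itself (whatever it is) has already been passed: the server hands out a signed, expiring token and later requests only need a cheap signature check. The secret, the `client_id` value, and the 30-minute lifetime are placeholders I picked for illustration, and this is not a description of how Cloudflare actually does it.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice this would be random and private.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def issue_clearance(client_id: str, now: float, ttl: int = 1800) -> str:
    """Return a signed cookie value for a client that passed the challenge."""
    expires = int(now) + ttl
    payload = f"{client_id}|{expires}"
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def is_cleared(cookie: str, client_id: str, now: float) -> bool:
    """Check signature, owner, and expiry of a clearance cookie."""
    try:
        cid, expires, sig = cookie.rsplit("|", 2)
    except ValueError:
        return False  # Not in the expected three-part format.
    expected = hmac.new(
        SECRET_KEY, f"{cid}|{expires}".encode(), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison on bytes to avoid leaking signature prefixes.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    if cid != client_id:
        return False
    try:
        return int(expires) > now
    except ValueError:
        return False


cookie = issue_clearance("client-a", now=1000.0)
print(is_cleared(cookie, "client-a", now=1001.0))  # True
print(is_cleared(cookie, "client-b", now=1001.0))  # False
```

The point is that a client without a valid cookie gets stopped on its very first request, so cycling to a new IP after 8 requests buys the attacker nothing unless the cookie travels with it.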

hombre_fatal

That might have been good for preventing someone from spamming your HotScripts guestbook in 2005, but not much else.

rpgwaiter

It’s free unless you’re rolling in traffic, it’s extremely easy to set up, and CF can handle pretty much all of your infra with tools way better than AWS.

Also you can buy a cheaper ipv6 only VPS and run it thru free CF proxy to allow ipv4 traffic to your site

zelphirkalt

Easy to set up, easy to screw up user experience. Easy-peasy.

uniformlyrandom

Most exploits target the software, not the hardware. CF is a good reverse proxy.

windsignaling

As a website owner and VPN user I see both sides of this.

On one hand, I get the annoying "Verify" box every time I use ChatGPT (and now due to its popularity, DeepSeek as well).

On the other hand, without Cloudflare I'd be seeing thousands of junk requests and hacking attempts everyday, people attempting credit card fraud, etc.

I honestly don't know what the solution is.

rozap

What is a "junk" request? Is it hammering an expensive endpoint 5000 times per second, or just somebody using your website in a way you don't like? I've also been on both sides of it (on-call at 3am getting dos'd is no fun), but I think the danger here is that we've gotten to a point where a new google can't realistically be created.

The thing is that these tools are generally used to further entrench power that monopolies, duopolies, and cartels already have. Example: I've built an app that compares grocery prices as you make a shopping list, and you would not believe the extent that grocers go to to make price comparison difficult. This thing doesn't make thousands or even hundreds of requests - maybe a few dozen over the course of a day. What I thought would be a quick little project has turned out to be wildly adversarial. But now spite driven development is a factor so I will press on.

It will always be a cat and mouse game, but we're at a point where the cat has a 46 billion dollar market cap and handles a huge portion of traffic on the internet.

jeroenhd

I've had such bots on my server. Some Chinese Huawei bot as well as an American one.

They ignored robots.txt (claimed not to, but I blacklisted them there and they didn't stop) and started randomly generating image paths. At some point /img/123.png became /img/123.png?a=123 or whatever, and they just kept adding parameters and subpaths for no good reason. Nginx dutifully ignored the extra parameters and kept sending the same image files over and over again, wasting everyone's time and bandwidth.

I was able to block these bots by just blocking the entire IP range at the firewall level (for Huawei I had to block all of China Telecom and later a huge range owned by Tencent for similar reasons).

I have lost all faith in scrapers. I've written my own scrapers too, but almost all of the scrapers I've come across are nefarious. Some scour the internet searching for personal data to sell, some look for websites to send hack attempts at to brute force bug bounty programs, others are just scraping for more AI content. Until the scraping industry starts behaving, I can't feel bad for people blocking these things even if they hurt small search engines.

MatthiasPortzel

Why not just ignore the bots? I have a Linode VPS, cheapest tier, and I get 1TB of network transfer a month. The bots that you're concerned about use a tiny fraction of that (<1%). I'm not behind a CDN and I've never put effort into banning at the IP level or setting up fail2ban.

I get that there might be some feeling of righteous justice that comes from removing these entries from your Nginx logs, but it also seems like there's a lot of self-induced stress that comes from monitoring failed Nginx and ssh logs.

x3haloed

Honestly, it should just come down to rate limiting and what you’re willing to serve and to whom. If you’re a free information idealist like me, I’m OK with bots accessing public web-serving servers, but not OK with allowing them to consume all my bandwidth and compute cycles. Furthermore, I’m also not OK with legitimate users consuming all my resources. So I should employ strategies that prevent individual clients or groups of clients from endlessly submitting requests, whether the format of the requests makes sense or is “junk.”

DocTomoe

Sounds like a problem easily solved with fail2ban. Which keeps legitimate folks in, and offenders out - and also unbans after a set amount of time, to avoid dynamic IPs screwing over legitimate users permanently.
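For anyone who hasn't used it, the ban-then-unban behaviour can be sketched roughly like this. The parameter names echo fail2ban's `maxretry`, `findtime`, and `bantime` settings, but the class itself is a toy I wrote for illustration, not fail2ban's actual implementation, and the default values are arbitrary.

```python
from collections import defaultdict, deque


class TemporaryBanList:
    """Ban an IP after too many failures in a time span, then let it expire."""

    def __init__(self, max_retry: int = 5, find_time: float = 600.0,
                 ban_time: float = 600.0) -> None:
        self.max_retry = max_retry
        self.find_time = find_time
        self.ban_time = ban_time
        self.failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self.banned_until = {}              # ip -> time the ban is lifted

    def record_failure(self, ip: str, now: float) -> None:
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that fall outside the look-back span.
        while recent and now - recent[0] > self.find_time:
            recent.popleft()
        if len(recent) >= self.max_retry:
            self.banned_until[ip] = now + self.ban_time
            recent.clear()

    def is_banned(self, ip: str, now: float) -> bool:
        until = self.banned_until.get(ip)
        if until is None:
            return False
        if now >= until:
            # Ban expired: whoever holds this (possibly dynamic) IP now is let back in.
            del self.banned_until[ip]
            return False
        return True


bans = TemporaryBanList(max_retry=3, find_time=60, ban_time=300)
for t in (0, 1, 2):
    bans.record_failure("198.51.100.7", now=t)
print(bans.is_banned("198.51.100.7", now=3))    # True
print(bans.is_banned("198.51.100.7", now=302))  # False
```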

makeitdouble

> somebody using your website in a way you don't like?

This usually includes people making a near-realtime updated perfect copy of your site and serving that copy for either scam or middle-manning transactions or straight fraud.

Having a clear category of "good bots" from either verified or accepted companies would help for these cases. Cloudflare has such a system I think, but then a new search engine would have to go to each and every platform provider to make deals and that also sounds impossible.

Terr_

I'd settle for some kind of "proof of investment" in a bot-identity, so that I know blocking that identity is impactful, and it's not just one of a billion tiny throwaways.

In other words, knowing who someone is isn't strictly necessary, provided they have "skin in the game" to encourage proper behavior.

OptionOfT

> and you would not believe the extent that grocers go to to make price comparison difficult. This thing doesn't make thousands or even hundreds of requests - maybe a few dozen over the course of a day.

It's gonna get even worse. Walmart & Kroger are implementing digital price tags, so whatever you see on the website will probably (purposefully?) be out of date by the time you get to the store.

Stores don't want you to compare.

rozap

Originally I was excited to see that kroger had an API, until just about the first thing that the ToS said was "you can't use this for price comparison".

And yea, I imagine dynamic pricing will make things even more complicated.

That being said, that's why this feature isn't built into the billion shopping list apps that are out there. Because it's a pain.

_blk

So you put something in your cart and by the time you reach the cashier the price doubled? Sounds like someone is about to patent price locking when you add an item to your physical shopping cart.

ohcmon

Actually, I think creating a Google alternative has never been as doable as it is today.

to11mtm

I'll give a fun example from the past.

I used to work at a company that did auto inspections. (e.g. if you turned a lease in, did a trade in on a used car, private party, etc.)

Because of that, we had a server that contained 'condition reports', as well as the images that went through those condition reports.

Mind you, sometimes condition reports had to be revised. Maybe a photo was bad, maybe the photos were in the wrong order, etc.

It was a perfect storm:

- The Image caching was all inmem

- If an image didn't exist, the server would error with a 500

- IIS was set up such that too many errors caused a recycle

- Some scraper was working off a dataset (that ironically was 'corrected' in an hour or so) but contained an image that did not exist.

- The scraper, instead of eventually 'moving on' would keep retrying the URL.

It was the only time that org had an 'anyone who thinks they can help solve please attend' meeting at the IT level.

Very true. I'm reminded of Oren Eini's tale of building an app to compare grocery prices in Israel, where the government apparently mandated supermarket chains to publish prices [0]. On top of even the government mandate for data sharing appearing to hit the wrong over/under for formatting, there's the constant issue of 'incomparabilities'.

And it's weird, because it immediately triggered memories of how 20-ish years ago, one of the most accessible Best Buys was across the street from a Circuit City, but good luck price matching because the stores all happened to sell barely different laptops/desktops (e.g. up the storage but use a lower grade CPU) so that nobody really had to price match.

[0] - https://ayende.com/blog/170978/the-business-process-of-compa...

_factor

Best Buy will also sell identical hardware with a slightly modified SKU and negligible changes to avoid comparison.

It’s difficult to compare when BB is the “only” company that sells a particular item.

tempodox

+1 for spite-driven development.

gjsman-1000

Simple: We need to acknowledge that the vision of a decentralized internet as it was implemented was a complete failure, is dying, and will probably never return.

Robots went out of control, whether malicious or the AI scrapers or the Clearview surveillance kind; users learned to not trust random websites; SEO spam ruined search, the only thing that made a decentralized internet navigable; nation state attacks became a common occurrence; people prefer a few websites that do everything (Facebook becoming an eBay competitor). Even if it were possible to set rules banning Clearview or AI training, no nation outside of your own will follow them; an issue which even becomes a national security problem (are you sure, Taiwan, that China hasn't profiled everyone on your social media platforms by now?)

There is no solution. The dream itself was not sustainable. The only solution is either a global moratorium of understanding which everyone respectfully follows (wishful thinking, never happening); or splinternetting into national internets with different rules and strong firewalls (which is a deal with the devil, and still admitting the vision failed).

stevenAthompson

I hate that you're right.

To make matters worse, I suspect that not even a splinternet can save it. It needs a new foundation, preferably one that wasn't largely designed before security was a thing.

Federation is probably a good start, but it should be federated well below the application layer.

ToucanLoucan

I mean, it wasn't even that security wasn't a thing: the earliest incarnations of the Internet were defense projects, and after that, connections between university networks. Abuse was nonexistent because you knew everyone on your given network. Bob up the hall wouldn't try to steal your credit card or whatever, because you'd call the police.

I think a decent idea is, we need to bring personal accountability back into the equation. That's how an open-trust network works, and we know that, because that's how society works. You don't "trust" that someone walking by your car won't take a shit in your open window: they could. But there are consequences for that. We need rock solid data security policies that apply to anyone who does business, hosts content, handles user data online, and people need to use their actual names, actual addresses, actual phone numbers, etc. etc. in order to interact with it. I get that there are many boons to be had with the anonymity the Internet offers, but it also enables all of the horseshit we all hate. A spammer can spam explicitly because their ISP doesn't care that they do, email servers don't have their actual information, and in the odd event they are caught and are penalized, it's fucking trivial to circumvent it. Buy a new AWS instance, run a script to setup your spam box, upload your database of potential victims, and boom, you're off.

A lot of tech is already drifting this way. What is HTTPS at its core if not a way to verify you are visiting the real Chase.com? How many social networking sites now demand all kinds of information, up to and including a photo of your driver's license? Why are we basically forbidden now by good practice from opening links in texts and emails? Because too many people online are anonymous, can't be trusted, and are acting maliciously. Imagine how much BETTER the Internet would be if when you fucked around, you could be banned entirely? No more ban evasion, ever.

I get that this is a controversial opinion, but fundamentally, I don't think the Internet can function for much longer while being this free. It's too free, and we have too many opportunistic assholes in it for it to remain so.

benatkin

Me too.

Federation is indeed a good start, but DeFi helps spur adoption by having a broader scope.

supportengineer

A walled garden where a real, vetted human being is responsible for each network device. It wouldn't scale but it could work locally.

benatkin

Luckily the decentralization community has always been decentralized. There are plenty of decentralized networks to support.

Aeolun

The great firewall, but in reverse.

gjsman-1000

What other choice do we have?

Countries, whether it be Ukraine or Taiwan, can't risk other countries harvesting their social media platforms for the mother of all purges. I never assume that anything that happened historically can never happen again - no Polish Jew would have survived the Nazis with this kind of information theft. Add AI into the mix, and wiping out any population is as easy as baking pie.

Countries are tired of bad behavior. Just ask my grandmother, who has had her designs stolen and mass produced from China. Not just companies - many free and open source companies cannot survive with such reckless competition. Can Prusa survive a world where China takes, but never gives? How many grandmothers does it take being scammed? How many educational systems containing data on minors need to be stolen? The MPAA and RIAA have been whining for years about the copyright problem, and while we laugh at them, never underestimate them. The list goes on and on.

Startups are tired of paying Cloudflare or AWS protection money, and trying to evade the endless sea of SEO spam. How can a startup compete with Google with so much trash and no recourse? Who can build a new web browser, and be widely accepted as being a friendly visitor? Who can build a new social media platform, without the experience and scale to know who is friend or foe?

Now we have AI, gasoline and soon to be dynamite on the fire. For the first time ever, a malicious country can VPN into the internet of a friendly nation, track down all critics on their social media, and destroy their lives in a real world attack (physical or virtual). We are only beginning to see this in Ukraine - are we delusional enough to believe that the world is past warfare? For the first time, anyone in the world could make nudes of women and share them online, from a location where they'll probably never be taken down. If a Russian company offered nudes as a service to American customers with cryptocurrency payments and a slick website that went viral, do you think tolerance is a winning political position?

inetknght

> On the other hand, without Cloudflare I'd be seeing thousands of junk requests and hacking attempts everyday, people attempting credit card fraud, etc.

Yup!

> I honestly don't know what the solution is.

Force law enforcement to enforce the laws.

Or else, block the countries that don't combat fraud. That means... China? Hey isn't there a "trade war" being "started"? It sure would be fortunate if China (and certain other fraud-friendly countries around Asia/Pacific) were blocked from the rest of the Internet until/unless they provide enforcement and/or compensation for their fraudulent use of technology.

marginalia_nu

A lot of this traffic is bouncing all over the world before it reaches your server. Almost always via at least one botnet. Finding the source of the traffic is pretty hopeless.

patrick451

When the government actually cares, they're able to track these things down. But they don't except in high profile cases.

jeroenhd

A lot of the fake browser traffic I'm seeing is coming from American data centres. China plays a major part, but if we're going by bot traffic, America will end up on the ban list pretty quickly.

inetknght

America does have laws against this kind of thing.

So instead of banning America, report the IP addresses to their American hosts for spam and malicious intent. If the host refuses to do anything, report it to law enforcement. If law enforcement doesn't do anything... then you're proving my point.

jacobr1

Slightly more complicated because a ton of the abuse comes from IPs located in Western countries, explicitly to evade fraud and abuse detection. Now you can go after the Western owners of those systems (and all the big ones do have large abuse teams to handle reports) but enforcement has a much higher latency. To be effective you would need a much more aggressive system. Stronger KYC. Changes in laws to allow for less due-process and more "guilty by default" type systems that you then need to prove innocence to rebut.

warkdarrior

And that assumes that the Western owners of those systems have any reason to listen to you, the one raising the complaint. How would they check that you are not lying?

RIMR

A wild take only possible if you don't understand how the Internet works.

inetknght

A wild opinion only valid if you have a defeatist attitude.

EVa5I7bHFq9mnYK

Credit card fraud exists because credit card companies can't (or won't) implement elementary security measures. There should be a requirement to confirm every online payment, but many sites today require just a cc number+date+code+zip, with no additional confirmation, can't call it other than complicity in the crime.

il-b

Lost sales due to 2fa are greater than losses due to refunds

xrisk

Why would 2FA cause lost sales? One would imagine it’s because people are being auto charged for shit they don’t want but haven’t noticed or forgot to cancel.

BytesAndGears

Something like iDeal, which is a payment processing system in the Netherlands.

It works so well and is very secure. You get to the checkout page on a website, click a link. If you’re on your phone, it hotlinks to open your banking app. If you’re on desktop, it shows a QR code which does the same.

When your bank app opens, it says “would you like to make this €28 payment to Business X?” And you click either yes or no on the app. You never even need to enter a card in the website!

You can also send money to other people instantly the same way, so it’s perfect for something like buying a used item from someone else.

Plus the whole IBAN system which makes it all possible!

carlosjobim

What kind of fraud protection does iDeal have for customers?

BytesAndGears

I’m not actually sure since I never had issues, but I’ve heard it’s not much since they’re basically just an API for transferring money between banks. Each bank app still needs to integrate with the network separately. [1]

I guess you get some security since each party that you transfer to must have their identity verified with a bank, so you could always get the police involved fairly easily

The iDeal website page on security [2] is in Dutch, but it translates to roughly:

> Before you make a purchase, make sure that the webshop or business is a reliable party. For example, you can read experiences of other consumers about webshops on comparison sites. Or you can use a Google search to check what is said (in reviews) about a webshop on the internet. Also check the overview of the police with known rogue trading parties and the page check seller data. Before making a purchase, always use the following rule of thumb: if something is too good to be true, don't do it.

[1] https://en.m.wikipedia.org/wiki/IDEAL

[2] https://www.ideal.nl/veiligheid

kobalsky

> people attempting credit card fraud

this is wrong.

if someone can use your site they can use stolen cards, and bots doing this will not be stopped by them.

cloudflare only raises the cost of doing it, it may make scraping a million product pages unprofitable but that doesn't apply to cc fraud yet.

hecanjog

They might be talking about people who are trying to automate the testing of hundreds of stolen credit cards with small purchases to see if they are still working. This is basically why we ended up using cloudflare at work.

bragr

>that doesn't apply to cc fraud yet

It stops "card testing" where someone has bought or stolen a large number of cards and needs to verify which are still good. The usual technique is to cycle through all the cards on a smaller site selling something cheap (a $3 ebook for example). The problem is that the high volume of fraud in a short time span will often get the merchant account or payment gateway account shut down, cutting off legitimate sales.

As a consumer, you should also be suspicious of a mysterious low value charge on your card because it could be the prelude to much larger charges.
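The card-testing pattern described above (many different cards, small amounts, short time span) is simple enough to sketch as a velocity check on the merchant side. Everything here is hypothetical: the class name, the thresholds, and the idea of a card "fingerprint" are placeholders for illustration, not what any payment gateway or Cloudflare actually runs.

```python
from collections import deque


class CardTestingDetector:
    """Flags bursts of small charges spread across many distinct cards."""

    def __init__(self, window_seconds: float = 600.0,
                 max_distinct_cards: int = 20,
                 small_amount: float = 5.00) -> None:
        self.window_seconds = window_seconds
        self.max_distinct_cards = max_distinct_cards
        self.small_amount = small_amount
        self.recent = deque()  # (timestamp, card fingerprint) of small charges

    def observe(self, card_fingerprint: str, amount: float, now: float) -> bool:
        """Record a charge attempt; return True if it looks like card testing."""
        if amount <= self.small_amount:
            self.recent.append((now, card_fingerprint))
        # Drop attempts that have aged out of the window.
        while self.recent and now - self.recent[0][0] > self.window_seconds:
            self.recent.popleft()
        distinct_cards = {card for _, card in self.recent}
        return len(distinct_cards) > self.max_distinct_cards


detector = CardTestingDetector(window_seconds=60, max_distinct_cards=3)
flags = [detector.observe(f"card-{n}", 3.00, now=n) for n in range(5)]
print(flags)  # [False, False, False, True, True]
```

One customer retrying the same card never trips it, since only distinct cards are counted.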

Aachen

Someone who steals money from thousands of individuals for a living won't hesitate to use a botnet either. Cloudflare isn't a payment provider (*shudders* yet), they can't verify transactions, they can only guess at who's "honest". I'm at the losing end of this guess so often as someone who frequently visits friends and family in the neighbouring country they come from, and someone who doesn't have tracking cookies anymore that were set only a few minutes ago, who uses a "non-standard" browser (Mozilla's Firefox), I don't feel like Cloudflare does a very good job at detecting when I'm trying to honestly use the site. At the same time, doing security testing as my job: the customer having Cloudflare enabled usually doesn't matter for us being able to reach and exploit vulnerable pages, it just decides to block you randomly the same way that it does in private time when I'm not trying to break anything. It doesn't properly do the job and it blocks legitimate people based on a gut feeling, and you have no recourse, you can suck it up. Whatcha gonna do, take Cloudflare to court for blocking your access to your bank? Under what law is that illegal? There is nothing you can do; your bank's customer support isn't going to disable Cloudflare for you.

Anyway, no, this guessing game isn't the solution to stolen bank details, the solution is for the payment provider to authenticate the account holder beyond merely entering a public number, especially if they suddenly see a flood of transactions from this one merchant as you describe. They can decide to ask for a second factor: send the person an SMS/email, ask to generate an authenticator code, whatever it is they've got on file beyond your card/account number. Anything else is just guesswork

markisus

If I were hosting a web page, I would want it to be able to reach as many people as possible. So in choosing between CDNs, I would choose the one that provides greater browser compatibility, all other things equal. So in principle, the incentives are there for Cloudflare to fix the issue. But the size of the incentive may be the problem. Not too many customers are complaining about these non-mainstream browsers.

porty

In that case you can turn off / not turn on the WAF feature(s) of Cloudflare - it's optional and configured by the webmaster.

doctor_radium

On one hand, I'm okay with that. If Cloudflare or some other self-appointed Internet cop blocks me from a site, I just go somewhere else, and I hope the site goes out of business as a result...which happens to businesses everyday for a variety of reasons. But given Cloudflare's sheer size, having so many businesses crank the shields to maximum actually affects using the web, and that's where I draw the line.

Aachen

> If I were hosting a web page, I would want it to be able to reach as many people as possible. So in choosing between CDNs

I host many webpages and this is exactly it. Anyone is welcome to use the websites I host. There is no CDN, your TLS session terminates at the endpoint (end to end encryption). May be a bit slower for the pages having static assets if you're coming from outside of Europe, but the pages are light anyway (no 2 MB JavaScript blobs)

lynndotpy

> On the other hand, without Cloudflare I'd be seeing thousands of junk requests and hacking attempts everyday, people attempting credit card fraud, etc.

> I honestly don't know what the solution is.

The solution is good security-- Cloudflare only cuts down on the noise. I'm looking at junk requests and hacking attempts flow through to my sites as we speak.

lynndotpy

Whoops-- this was a draft I didn't intend to post in this state. I must have fatfingered the "reply" button somehow. Alas, too late to edit or delete now.

Cloudflare cuts down on the noise, but it also helps do the work of preventing scrapers and people who re-sell your site wholesale, and cutting down on the noise also means cutting down on the cost of network requests.

It also can help where security is lax. You should have measures against credential stuffing, but if you don't, Cloudflare might prevent (some) of your users from being hacked. Which isn't good enough, but is better than no mitigation at all.

I don't use Cloudflare personally, but I won't dismiss it wholesale. I understand why people use it.

carlosjobim

>Cloudflare only cuts down on the noise.

That sounds like the solution, that sounds like good security.

zlagen

I'm using chrome on linux and noticed that this year cloudflare is very aggressive in showing the "Verify you are a human" box. Now a lot of sites that use cloudflare show it and once you solve the challenge it shows it again after 30 minutes!

What are you protecting, Cloudflare?

Also they show those captchas when going to robots.txt... unbelievable.

rurp

Cloudflare has been even worse for me on Linux + Firefox. On a number of sites I get the "Verify" challenge and after solving it immediately get a message saying "You have been blocked" every time. Clearing cookies, disabling UBO, and other changes make no difference. Reporting the issue to them does nothing.

This hostility to normal browsing behavior makes me extremely reluctant to ever use Cloudflare on any projects.

a_imho

I'm a Cloudflare customer, even their own dashboard does not work with linux+slightly older firefox. I mean one click and it is ooops, please report the error to dev null

mmh0000

At least you can get past the challenge. For me, every-single-time it is an endless loop of "select all bikes/cars/trains". I've given up even trying to solve the challenge anymore and just close the page when it shows up.

theamk

that's not Cloudflare, they stopped doing pictures years ago. You can tell because Cloudflare always puts their brand name on their page.

Cloudflare just blocks you without recourse nowadays.

Springtime

I run a few Linux desktop VMs and Cloudflare's Turnstile verification (their auto/non-input based verification) fails for the couple sites I've tried that use it for logins, on latest Chromium and Firefox browsers. Doesn't matter that I'm even connecting from the same IP.

I'd presumed it was just the VM they're heuristically detecting but sounds like some are experiencing issues on Linux in general.

abirch

I guess it’s time to update our user agent strings like I did with Konqueror 20 years ago.

Looks like there’s a plugin for that https://chromewebstore.google.com/detail/user-agent-switcher...

nbernard

Check that you are allowing webworker scripts, that did the trick for me. I still have issues on slower computers (Raspberry Pis and the like) as they seem to be too slow to do whatever Cloudflare wants as a verification in the allotted time, however.

ponector

Sounds like my experience browsing internet while connected to the VPN provided by my employer: tons of captcha and everything is defaulted to German (IP is from Frankfurt).

ranger_danger

The problem is that you are not performing "normal browsing behavior". The vast majority of the population (at least ~70% don't use ad-blockers) have no extensions and change no settings, so they are 100% fingerprintable every time, which lets them through immediately.

globalnode

linux + firefox. not sure what happened to me yesterday but the challenge/response thing was borked and when i finally got through it all, it said i was a robot anyway. this was while trying to sign up for a skype acct, could have been a ms issue though and not necessarily cloudflare. i think the solution is to just not use obstructive software. thanks to this issue i discovered jitsi and that seems more than enough for my purposes.

sleepybrett

Yeah, Lego and Etsy are two sites I can now only visit with safari. It sucks. Firefox on the same machine it claims I'm a bot or a crawler. (not even on linux, on a mac)

fcq

I have Firefox and Brave set to always clear cookies and everything when I close the browser... it is a nightmare when I come back the amount of captchas everywhere....

It is either that or keep sending data back to the Meta and Co. overlords despite me not being a Facebook, Instagram, Whatsapp user...

ezfe

You don't need to clear cookies to avoid sending that data back. Just use a browser that properly isolates third party/Facebook cookies.

nacs

You don't even need to use a different browser - Firefox has an official "Multi-account containers" extension that lets you assign certain sites to open in their own sandbox so you can have a sandbox for Google, another for Facebook, etc.

ATechGuy

I wonder if browsers have a future.

nerdralph

I don't bother with sites that have Cloudflare Turnstile. Web developers supposedly know the importance of page load time, but even worse than a slow loading page is waiting for Cloudflare's gatekeeper before I can even see the page.

fbrchps

That's not turnstile, that's a Managed Challenge.

Turnstile is the in-page captcha option, which you're right, does affect page load. But they force a defer on the loading of that JS as best they can.

Also, turnstile is a Proof of Work check, and is meant to slow down & verify would-be attack vectors. Turnstile should only be used on things like Login, email change, "place order", etc.

supriyo-biswas

Managed challenges actually come from the same "challenges" platform, which includes Turnstile; the only difference being that Turnstile is something that you can embed yourself on a webpage, and managed challenge is Cloudflare serving the same "challenge" on an interstitial web page.

Also, Turnstile is definitely not a simple proof of work check, and performs browser fingerprinting and checks for web APIs. You can easily check this by changing your browser's user-agent at the header level and leaving it as-is at the JavaScript level; this puts Turnstile into an infinite loop.

viraptor

The captcha on robots is a misconfiguration in the website. CF has lots of issues, but this one is on their customer. Also they detect Google and other bots, so those may be going through anyway.

jasonjayr

Sure; but sensible defaults ought to be in place. There are certain "well known" urls that are intended for machine consumption. CF should permit (and perhaps rate limit?) those by default, unless the user overrides them.

JimDabell

Putting a CAPTCHA in front of robots.txt in particular is harmful. If a web crawler fetches robots.txt and receives an HTML response that isn’t a valid robots.txt file, then it will continue to crawl the website when the real robots.txt might’ve forbidden it from doing so.
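This is easy to reproduce with Python's standard-library parser. The sketch below assumes the crawler hands whatever body it received straight to the parser; the HTML is a made-up stand-in for an interstitial page, not Cloudflare's actual markup, and `ExampleBot` is a hypothetical user agent.

```python
from urllib.robotparser import RobotFileParser

REAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical stand-in for an HTML challenge page served in place of the file.
CHALLENGE_PAGE = """\
<!DOCTYPE html>
<html>
<head><title>Just a moment...</title></head>
<body><p>Verifying you are human.</p></body>
</html>
"""


def can_crawl(robots_body: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse a response body as robots.txt and ask whether `url` may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, url)


url = "https://example.com/private/report.html"
print(can_crawl(REAL_ROBOTS_TXT, url))  # False: the real file forbids /private/
print(can_crawl(CHALLENGE_PAGE, url))   # True: HTML contains no rules, so all is allowed
```

How a given crawler reacts also depends on the status code it gets. Python's own `RobotFileParser.read()`, for instance, treats a 401 or 403 response as "disallow everything", so the failure mode above applies when the HTML body ends up being parsed as if it were the file.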

potus_kushner

using palemoon, i don't even get a captcha that i could solve. just a spinning wheel, and the site reloads over and over. this makes it impossible to use e.g. anything hosted on sourceforge.net, as they're behind the clownflare "Great Firewall of the West" too.

inemesitaffia

See if changing user agent to Chrome/Firefox helps

progmetaldev

Whoever configures the Cloudflare rules should be turning off the firewall for things like robots.txt and sitemap.xml. You can still use caching for those resources to prevent them becoming a front door to DDoS.

kevincox

It seems like common cases like this should be handled correctly by default. These are cachable requests intended for robots. Sure, it would be nice if webmasters configure it but I suspect a tiny minority does.

For example, even Cloudflare hasn't configured their official blog's RSS feed properly. My feed reader (running in a DigitalOcean datacenter) hasn't been able to access it since 2021 (403 every time, even though it has backed off to checking weekly). This is a cachable endpoint with public data intended for robots. If they can't configure their own product correctly for their official blog, how can they expect other sites to?

progmetaldev

I agree, but I also somewhat understand. Some people will actually pay more per month for Cloudflare than for their own hosting. The Cloudflare Pro plan is $20/month USD. Some sites wouldn't be able to handle the constant requests for robots.txt, simply because too many of the bots that fetch it ignore cache headers (if those are even configured for robots.txt).

If you are writing some kind of malicious crawler that doesn't care about rate-limiting, and wants to scan as many sites as possible for the most vulnerable to get a list together to hack, you will scan robots.txt because that is the file that tells robots NOT to index these pages. I never use a robots.txt for some kind of security through obscurity. I've only ever bothered with robots.txt to make SEO easier when you can control a virtual subdirectory of a site, to block things like repeated content with alternative layouts (to avoid duplicate content issues), or to get a section of a website to drop out of SERPs for discontinued sections of a site.

glandium

The best part is when you get the "box" on an XHR request. Of course no site handles that properly, and just breaks. Happens regularly on ChatGPT.

scarab92

Cloudflare is security theatre.

I scrape hundreds of cloudflare protected sites every 15 minutes, without ever having any issues, using a simple headless browser and mobile connection, meanwhile real users get interstitial pages.

It's almost like Cloudflare is deliberately showing the challenge to real users just to show that they exist and are doing "something".

jeroenhd

I just downloaded Palemoon to check and it seems the CAPTCHA straight up crashes. Once it crashes, reloading the page no longer shows the CAPTCHA so it did pass something at least. I tried another Cloudflare turnstile but the entire browser crashed on a segfault, and ever since the CAPTCHAs don't seem to come up again.

ChatGPT.com is normally quite useful for generating Cloudflare prompts, but that page doesn't seem to work in Palemoon regardless of prompts. What version browser engine does it use these days? Is it still based on Firefox?

For reference I grabbed the latest main branch of Ladybird and ran that, but Cloudflare isn't showing me any prompts for that either.

Hold-And-Modify

This crash is an even newer Cloudflare issue (as of yesterday, I believe). It is not related to the one discussed here, and will be solved in the next browser update:

https://forum.palemoon.org/viewtopic.php?f=3&t=32064

dvtkrlbs

The kinda funny and ironic thing is that their forum just doesn't allow me to see the contents of their website from my Hetzner box that I use as an exit node. More ironically, if this site was using Cloudflare I could at least solve a challenge and browse the forum instead of getting hit with a giant 403.

YoshiRulz

I believe the problem in Ladybird's case is missing JS APIs https://github.com/LadybirdBrowser/ladybird/issues/226

willywanker

It uses a hard fork of Firefox's Gecko engine called Goanna, and is independently developed other than a few security patches from upstream. It has considerably diverged from contemporary Firefox so is not comparable.

ec109685

Seems seriously risky to be running a browser without access to mainstream security patches.

Perhaps it’s secure enough for now due to its obscurity.

mimasama

> without access to mainstream security patches

They do have access to them. The lead developer and project owner has sec bug access in bugzilla.

But vulnerabilities in newer Mozilla have over time become less and less relevant in Pale Moon's codebase, which led to the latter dropping the tracking of how many Mozilla security patches have been applied in the release notes (starting with 33.0.1).

picafrost

Companies like Google and Cloudflare make great tools. They give them away for free. They have different reasons for this, but these tools provide a lot of value to a lot of people. I’m sure that in the abstract their devs mean well and take pride in making the internet more robust, as they should.

Is it worth giving the internet to them? Is something so fundamentally wrong with the architecture of the internet that we need megacorps to patch the holes?

zamadatix

Whether something is "wrong" is often more a matter of opinion than a matter of fact for something as large and complex as the internet. The root of problems like this on the internet is connections don't have an innate user identity associated at the lower layers. By the time you get to an identity for a user session you've already driven past many attack points. There isn't really a "happy" way to remove that from the equation, at least for most people.

Hold-And-Modify

Forgot to clarify: this is not about an increased amount of captchas, or an annoyance issue.

The Cloudflare tool does not complete its verifications, resulting in an endless "Verifying..." loop and thus none of the websites in question can be accessed. All you get to see is Cloudflare.

randunel

Is this the behaviour you're observing? (my recording of HIBP) https://imgur.com/a/cloudflare-makes-have-i-been-pwned-unusa...

zeroimpl

I ran into exactly this the other day trying to browse a website from a browser app on an android-powered TV. Just couldn't get to the website.

PokestarFan

I was on Brave in iOS. I had to turn off Brave Shield.

lapcat

The worst is Cloudflare challenges on RSS feeds. I just have to unsubscribe from those feeds, because there's nothing I can do.

ezfe

That's misconfiguration on the web developers side.

kevincox

Yes, developers such as those that run Cloudflare's own official blog.

Maybe there should be some better defaults if they can't even use their own product correctly.

BTW a work around for this is to proxy the feed via https://feedburner.google.com/ which seems to be whitelisted by Cloudflare.

arielcostas

A lot of people fail to grasp the danger to the open web posed by so much traffic running through or to a handful of providers (namely CloudFlare, AWS, Azure, Google Cloud, and "smaller" ones like Fastly or Akamai), who can take this kind of measure without (many) website owners knowing or giving a crap about it.

Google itself tried to push crap like Web Environment Integrity (WEI) so websites could verify "authentic" browsers. We got them to stop it (for now) but there was already code in the Chromium sources. What makes CloudFlare MITMing and blocking/punishing genuine users from visiting websites any different?

Why are we trusting CloudFlare to be a "good citizen" and not unfairly block or annoy certain people for whatever reason? Or even worse, serve modified content instead of what the actual origin is serving? I mean in the cases where CloudFlare re-encrypts the data, instead of only being a DNS provider. How can we trust that no third party has infiltrated their systems and compromised them? Except "just trust me bro", of course.

Retr0id

> Or even worse, serve modified content instead of what the actual origin is serving?

I witnessed this! Last time I checked, in the default config, the connection between cloudflare and the origin server does not do strict TLS cert validation. Which for an active-MITM attacker is as good as no TLS cert validation at all.
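
To make the difference concrete, here is a small Python sketch of the two modes on the client side of a TLS connection. It only illustrates the general idea with the standard `ssl` module and says nothing about how Cloudflare's proxy is actually implemented; the function names are mine.

```python
import ssl


def strict_context() -> ssl.SSLContext:
    """Default client context: verifies the certificate chain against
    trusted CAs and checks that the certificate matches the hostname."""
    return ssl.create_default_context()


def lenient_context() -> ssl.SSLContext:
    """Still encrypts, but accepts any certificate, including a
    self-signed one presented by whoever sits in the middle."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before lowering verify_mode
    ctx.verify_mode = ssl.CERT_NONE  # no chain validation at all
    return ctx


strict = strict_context()
lenient = lenient_context()
```

With the lenient context the traffic is encrypted, but only against passive eavesdroppers; an active attacker can simply present their own certificate.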

A few years ago an Indian ISP decided that https://overthewire.org should be banned for hosting "hacking" content (iirc). For many Indian users, the page showed a "content blocked" page. But the error page had a padlock icon in the URL bar and a valid TLS cert - said ISP was injecting it between Cloudflare and the origin server using a self-signed cert, and Cloudflare was re-encrypting it with a legit cert. In this case it was very conspicuous, but if the tampering was less obvious there'd be no way for an end-user to detect the MITM.

I don't have any evidence on-hand, but iirc there were people reporting this issue on Twitter - somewhere between 2019 and 2021, maybe.

progmetaldev

Cloudflare recently started detecting whether strict TLS cert validation works with the origin server, and if it does, it enables strict validation automatically.

SpicyLemonZest

I can easily conceive the danger. But I can directly observe the danger that's causing traffic to be so centralized - if you don't have one of those providers on your side, any adversary with a couple hundred dollars to burn can take down your website on demand. That seems like a bigger practical problem for the open web, and I don't know what the alternative solution would be. How can I know, without incurring any nontrivial computation cost, that a weird-looking request coming from a weird browser I don't recognize is not a botnet trying to DDOS me?

hombre_fatal

Exactly. If you're going to bemoan centralization, which is fine, you also need to address the reason why we're going in that direction. And that's probably going to involve rethinking the naive foundational aspects of the internet.

juped

how do you know a normal-looking request coming from google chrome is not a botnet trying to ddos you?

SpicyLemonZest

You deploy complex proprietary heuristics to identify whether incoming requests look more like an attack or more like something a user would legitimately send. If you find a new heuristic and try to deploy it, you'll immediately notice if it throws a bunch of false positives for Chrome, but you might not notice so quickly for Pale Moon or other non-mainstream browsers.
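
A toy calculation of why that happens, in Python. Every number below is invented for illustration and is not a measurement of anything; the point is only that an aggregate false positive rate hides a small browser completely.

```python
from typing import Dict, Mapping

# Invented counts of legitimate requests per browser (not real data).
LEGIT_REQUESTS: Dict[str, int] = {
    "Chrome": 650_000,
    "Safari": 180_000,
    "Firefox": 25_000,
    "Pale Moon": 50,
}


def overall_rate(blocked: Mapping[str, int], total: Mapping[str, int]) -> float:
    """Share of all legitimate requests that the rule wrongly blocked."""
    return sum(blocked.values()) / sum(total.values())


def per_browser_rates(blocked: Mapping[str, int],
                      total: Mapping[str, int]) -> Dict[str, float]:
    """Same share, broken down by browser."""
    return {name: blocked.get(name, 0) / count for name, count in total.items()}


# Hypothetical rule A blocks every single Pale Moon request.
rule_a = {"Pale Moon": 50}
# Hypothetical rule B blocks 1% of Chrome requests.
rule_b = {"Chrome": 6_500}

rate_a = overall_rate(rule_a, LEGIT_REQUESTS)
rate_b = overall_rate(rule_b, LEGIT_REQUESTS)
rates_a = per_browser_rates(rule_a, LEGIT_REQUESTS)
```

With these made-up numbers, rule A locks out one browser entirely and still moves the overall rate by less than 0.01%, while rule B shows up at roughly 0.76%. Only a per-browser breakdown makes rule A visible.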

(And if I were doing this on my own, rather than trusting Cloudflare to do it, I would almost surely decide that I don't care enough about Pale Moon users to fix an otherwise good rule that's blocking them as a side effect.)

raffraffraff

I don't think people aren't aware that it's bad. They just don't care enough. And they think "I could keep all this money safely in my mattress or I could put it into one of those three big banks!" ... Or something like that.

progmetaldev

Maybe it's the customers I deal with, or my own ignorance, but what alternatives are there to a service like Cloudflare? It is very easy to set up, and my clients don't want to pay a lot of money for hosting. With Cloudflare, I can turn on DDoS and bot protection to prevent heavy resource usage, as well as turn on caching to keep resource usage down. I built a plugin for the CMS I use (Umbraco - runs on .NET) to clear the cache for specific pages, or all pages (such as when a change is made to a global element like the header). I am able to run a website on Azure with less than the minimum recommended memory and CPU for Umbraco, due to lots of performance analysis and enhancements over the years, but also because I have Cloudflare in front of the website.
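
For anyone curious what such a cache-clearing call looks like, here is a minimal Python sketch (not the .NET plugin itself) that builds a purge request against Cloudflare's v4 `purge_cache` endpoint as I understand it. The zone ID, token, and function name are placeholders; check the current API docs before relying on the exact shape.

```python
import json
import urllib.request
from typing import Optional, Sequence

API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id: str, api_token: str,
                        urls: Optional[Sequence[str]] = None) -> urllib.request.Request:
    """Build (but do not send) a cache purge request.

    With `urls`, only those pages are purged; without, the whole zone is,
    e.g. after a change to a global element like the header.
    """
    payload = {"files": list(urls)} if urls else {"purge_everything": True}
    return urllib.request.Request(
        url=f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder zone ID and token, for illustration only.
single_page = build_purge_request("ZONE_ID", "API_TOKEN", ["https://example.com/about"])
whole_site = build_purge_request("ZONE_ID", "API_TOKEN")
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` with a real zone ID and a token that has cache purge permission.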

If there were an alternative that would provide the same benefits at roughly the same cost, I would definitely be willing to take a look, even if it meant I needed to spend some time learning a different way to configure the service from the way I configure Cloudflare.

nerdralph

What's the cost of annoying people trying to browse to your sites, some to the point where they'll just not bother?

zinekeller

This is rather blunt, but if it is between 98% (CF-protected) versus near-0% (heavily-DDoSed site), then hopefully you now see the dilemma that other people faced.

progmetaldev

For companies that are just built around a marketing funnel to provide enough info to get you to fill out their contact form to sell you something, my guess is that Cloudflare is well worth the cost over increased hosting fees. I know it's not the answer anyone wants to hear, but I don't deal with too many companies selling anything more than around 5 or 6 figures, with products that you don't necessarily need very often.

I would like to know if there are alternatives somewhere close to the same cost, where I don't need to use Cloudflare. I don't enjoy annoying customers, or even dealing with sales and marketing, but I have built lots of software where I get to control the technology, and can get a new website up and running in 3 hours, with a ton of built-in functionality. I've spent about 12 years reducing the amount of memory the Umbraco CMS uses, compared to normal installs, and I love that aspect of my career. If I could get my clients to pay more and not use Cloudflare, I would happily go that route, believe me!

bux93

Of course we're trusting CloudFlare to be a good citizen. If they were not, they would be banned - unless they sold their business to a sovereign wealth fund.

arielcostas

I can't tell if this is sarcasm (perhaps a reference to TikTok?), but in my case (European) it's a foreign third party for me.

pmdr

Cloudflare has essentially broken the internet. Blocking or restricting access of even residential IPs running in a real, common browser is evil. And just like that, we handed over the internet to a handful of companies, like it was never ours to begin with.

lopkeny12ko

Cloudflare has been blocking "mainstream" browsers too, if you are generous enough to consider Firefox "mainstream." The "verify you are a human" sequence gets stuck in a perpetual never-ending loop where clicking the checkbox only refreshes the page and presents the same challenge. Certain websites (most notably archive.is) have been completely inaccessible for me for years for this reason.

boomboomsubban

Do you have something that blocks some amount of scripts? I need to allow third party scripts from either Google or Cloudflare to get a lot of the web to function.

BenjiWiebe

I think the archive.is/Cloudflare issue is a known problem separate from the rest.

areyourllySorry

archive.is does not use cloudflare for bot protection.