0-click deanonymization attack targeting Signal, Discord, other platforms

473 comments

January 21, 2025

internet_points

So if you send a picture to a Signal user, it's retrieved via cloudflare, and cached in a data center near that user; now you can look up the cache status and find the data center used. I'd say "deanonymization" is stretching it, unless the user is in the middle of nowhere (no other users near the data center). But interesting writeup anyway.

thrwaway1985882

"Near a user" is also a big assumption. I'm ~200 miles from ORD and ~500 from IAD, but my ISP's peering & upstream arrangements mean Cloudflare serves my traffic from DFW, ~700 miles away.

But, at the same time: Cloudflare isn't going to serve me a cache from Seattle, Manchester, or Tokyo. Pinning down an unknown Signal user to even a rough geographic location is an important bit of metadata that could combine to unmask an individual. Neat attack!

btown

It's also quite insidious as you don't need to control anything on any server to get this information; as long as you can get your target to load a unique URL never before loaded by anyone else, you can simply later poll it with an unauthenticated HTTP GET from different locations, and find which one reports a Cloudflare HIT (or, even if they hid that information, finding the one that returns with lower latency).
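A minimal sketch of the header check described above, assuming the responses from several vantage points have already been collected somehow (the `responses` mapping and both function names are hypothetical). It relies only on Cloudflare's `CF-Cache-Status` and `CF-Ray` response headers, where the `CF-Ray` value ends in a data-center code:

```python
from typing import List, Mapping, Optional


def colo_from_cf_ray(cf_ray: str) -> Optional[str]:
    """Return the data-center suffix of a CF-Ray value, e.g. 'DFW'."""
    _, sep, colo = cf_ray.strip().rpartition("-")
    return colo.upper() if sep and colo else None


def cached_colos(responses: Mapping[str, Mapping[str, str]]) -> List[str]:
    """List the data centers whose response reported a cache HIT.

    `responses` maps a vantage-point label (hypothetical) to the
    response headers seen from that vantage point.
    """
    hits = set()
    for headers in responses.values():
        # Header names are case-insensitive, so normalize them first.
        lowered = {name.lower(): value for name, value in headers.items()}
        if lowered.get("cf-cache-status", "").upper() != "HIT":
            continue
        colo = colo_from_cf_ray(lowered.get("cf-ray", ""))
        if colo:
            hits.add(colo)
    return sorted(hits)
```

This only reads headers; it says nothing about how the vantage points are obtained, and a site that strips these headers would defeat it.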

If you're allowing user uploaded content, and you use Cloudflare as a CDN, you could mitigate and provide your users with plausible deniability by prefetching each uploaded URL from random data centers. But, of course, that's going to make your Cloudflare bill that much more expensive.

Cloudflare could allow security-sensitive clients to hide the cache-hit header and add randomized latency upon a cache hit, but the latter protection would also be expensive in how many connections must be kept alive longer than they otherwise would. Don't do anything on a personal device or account if you want your datacenter to be hidden!

chatmasta

Pre-fetching also becomes an issue for apps that are meant to be e2e encrypted, since it requires the server to download (read) every attachment. But if the app is already caching the attachment then they’re effectively reading it anyway.

(EDIT: Apparently signal e2e encrypts images prior to upload, so pre-fetching the encrypted blob from one or multiple servers would in fact be a mitigation of this attack.)

I do wonder if Telegram is as invulnerable as the author assumes. They might not be using Cloudflare for caching, or even HTTP, but the basic elements of this attack might still work. You’d just need to modify the “teleport” aspect of it.

ipaddr

Going forward uploaded content should never go through Cloudflare and it never really needed to.

Add unique urls.

Maybe just avoid it altogether.

judge2020

Note that CF will also route relative to the sites' plan. Enterprise sites are almost always routed to the closest DC, while if that DC is overloaded then lower tier websites, typically just Free sites, will get routed elsewhere (I suppose this is achieved via different anycast ranges where a specific DC is excluded). Although Discord, Signal, etc are almost certainly Enterprise sites.

I have this old site to test this (the list of sites is a bit old): https://cloudflare-test.judge.sh/

EE84M3i

WTF? the trace endpoint allows CORS from any origin?!? Why?!

8338550bff96

I doubt how useful it would be as an attack. As a single point of info it tells you next to nothing. As part of a composition of other indicators it would be the weak link in the chain, probably just causing noise for the not unlikely scenario where the person you're targeting is using a VPN.

If it was any less specific we'd be talking about a deanonymization attack that outs whether or not a target is still on Earth.

raphman

Oh, this attack would be a useful tool for e.g., identifying whistleblowers that travel a lot (e.g., in academia, military). If you know their Signal ID, you could send them images from time to time and then compare their coarse locations with travel information for a number of suspects.

mmooss

> not un-likly scenario where the person you're targeting is using a VPN

Do you think a large proportion of Signal users also use VPNs? I'd expect it would be a higher proportion than the general population but still only a small minority.

ajsnigrutin

for "normal people", that's a pain, but with enough resources,...

Although, it has edge use cases even for "normal people":

Eg. you suspect your coworker to be catfishing you on eg. discord, you know that he's in your city now, verify, then wait for him to leave for a vacation to somewhere abroad, check again.

dotancohen

This is actually pretty smart, and shows that this exploit could be chained with other information to identify a specific individual. This could also be used to e.g. check which world-travelling reporter is communicating with you.

mmooss

It's not an edge case. Using multiple sources of information to paint a more complete picture is the norm. That's how marketing profiles work, for example.

shakna

Cloudflare does serve me from France. When I'm in Australia. (My ISP bought some IP addresses that were original regional France, back in the early 90s.)

So though this does have implications, the assumptions they utilise, like always, are not universal.

miyuru

> My ISP bought some IP addresses that were original regional France

Cloudflare uses anycast, and IP geolocation is not how anycast works.

wkat4242

Wow doesn't that make things really slow due to the RTT of the acknowledgements?

bigbones

It gets more interesting when you think about the impact on groups. Sending an image to a group is enough for all devices associated with that group to be identifiable from CloudFlare's side, who additionally see a giant chunk of unencrypted traffic from the same client addresses going to other web sites. Given Cloudflare's less-than-straight approach to sales, it is astonishing the words "secure" and "Signal" ever appear in the same sentence.

CloudFlare get to see a fuckton of metadata from private and group chats, enough to trace who originally sends a piece of media (identifiable from its file size), who reads it, when it is read, who forwards it and to whom. It really doesn't matter that they can't see an image or video, knowing its size upfront or later (for example in response to a law enforcement request) is enough.

lolinder

> Given Cloudflare's less-than-straight approach to sales, it is astonishing the words "secure" and "Signal" ever appear in the same sentence.

This is an overly binary take. Security is all about threat models, and for most of us the threat model that Signal is solving is "mainstream for-profit apps snoop on the contents of my messages and use them to build an advertising profile". Most of us using it are not using Signal to skirt law enforcement, so our threat model does not include court orders and warrants.

Signal can and should append some noise to the images when encrypted (or better yet, pad them to a set file size as suggested by paulryanrogers in a sibling comment) to mitigate the risks of this attack for those who do have threat models that require it, but for the vast majority of us Signal is just as fit for purpose as we thought it was.

crawfordcomeaux

Hello, I'm an organizer for a system to coordinate multiple mutual aid networks, many of which are only organizing by Signal & Protonmail exclusively because they think they're secure and private.

People who are doing work to help people in ways the state tries to prevent (like giving people food) rely on this tech. These are the same groups who were able to mobilize so quickly to respond to the LA fires, but the Red Cross & police worked to shut down.

This impacts the people who are there for you when the state refuses to show up. This impacts the future version of you who needs it.

Most people aren't disabled, yet. Doesn't mean they don't need us building infrastructure for if/when they become disabled.

hedgedoops2

Maybe not individual warrants (at least not warrants for non-scalable collection like hardware bugs in one's phone, i.e. warrants that most users, with high probability, are not subject to). But mass surveillance, e.g. by the NSA, even with 'mass warrants' (e.g. the Verizon FISA warrant) that everyone is subject to, is probably in most people's attacker model. I don't have a study handy, but it seems reasonable that most users use Signal to protect against mass surveillance, and Signal advertises itself as being good for this.

Also Marlinspike and Whittaker are quite outspoken about mass surveillance.

If cloudflare can compile a big part of the "who chats with whom" graph, that is a system design defect.

vel0city

> Signal can and should append some noise to the images when encrypted (or better yet, pad them to a set file size as suggested by paulryanrogers in a sibling comment) to mitigate the risks of this attack for those who do have threat models that require it

Adding padding to the image wouldn't do anything to stop this "attack". This is just watching which CF datacenters cache the attachment after it gets sent.

rangestransform

I think the threat model of enough signal users to matter is nation-state actors, and signal should be secure against those actors by default so that they may hide among the entire signal user population

doodlebugging

>It gets more interesting when you think about the impact on groups. Sending an image to a group is enough for all devices associated with that group to be identifiable from CloudFlare's side,

Doesn't this open up the possibility to identify groups that have been infiltrated by spies or similar posers? If you use this method to kinda-sorta locate or identify all the users in your group and one or more of those users ends up being located in a region where you should have no active group members then you may have identified a mole in your network.

Just thinking out loud here since there's no one else home.

gruez

>If you use this method to kinda-sorta locate or identify all the users in your group and one or more of those users ends up being located in a region where you should have no active group members then you may have identified a mole in your network.

...unless they happen to be using a VPN for geo-unblocking reasons or whatever.

paulryanrogers

I wonder if we'll see assets being padded to some common byte sizes to combat this.

greysonp

Hi there, Signal dev here. We do, in fact, pad attachments to a limited set of bucket sizes.
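For illustration only, a minimal sketch of what size bucketing can look like, assuming power-of-two buckets with a 512-byte floor. These parameters and names are made up for the example and are not Signal's actual scheme:

```python
MIN_BUCKET = 512  # hypothetical smallest bucket, in bytes


def padded_size(size: int) -> int:
    """Round a byte length up to the next power-of-two bucket."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size <= MIN_BUCKET:
        return MIN_BUCKET
    # (size - 1).bit_length() gives the exponent of the next power of two
    # that is >= size, so exact powers of two stay in their own bucket.
    return 1 << (size - 1).bit_length()


def pad(blob: bytes) -> bytes:
    """Append zero bytes so the blob length lands exactly on a bucket."""
    return blob + b"\x00" * (padded_size(len(blob)) - len(blob))
```

With buckets like these, a 700-byte and a 1000-byte attachment both become 1024 bytes, so an observer who only sees sizes cannot tell them apart. A real scheme would pad before encrypting and carry the true length inside the encrypted payload.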

kijin

Nothing stops Cloudflare from inspecting the file contents, or using a hash to distinguish between identically-sized files.

The only reason we assume they don't do this is because it's a waste of resources for no good reason. But what if somebody gave them a good reason?

KennyBlanken

> , it is astonishing the words "secure" and "Signal" ever appear in the same sentence.

You misspelled "I do not understand what end to end encryption means"

xnorswap

It could be useful for correlation.

Say for example that you're an investigating agent in regular contact with someone.

A single data-point wouldn't mean anything. However, a sequence of daily image retrievals might tell you that they spend 90% of their time in WA and 10% of their time elsewhere.

That information alone still might not mean anything, but if you also have a specific suspect in mind, it may help confirm it. Or if you have access to the suspected person directly, if you're able to also befriend their "clean" profile, you might be able to pull the same trick and correlate the two location profiles.

De-anonymisation isn't about single pieces of information, but all information helps feed into a profile to narrow suspects or confirm suspicions.

( By "agent" I just mean a person, not an AI agent nor Law enforcement, who could presumably just get the information more directly from cloudflare. )
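The tallying described above is simple to sketch, assuming you already have one observed data-center code per day (the function name and the sample codes are hypothetical):

```python
from collections import Counter
from typing import Dict, Iterable


def location_shares(observed_colos: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of observations seen at each data center."""
    counts = Counter(code.upper() for code in observed_colos)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders the result from most to least frequent.
    return {code: n / total for code, n in counts.most_common()}
```

Nine observations at one code and one at another give a 0.9 / 0.1 split, which is the kind of coarse profile the comment has in mind. It is still only as reliable as the assumption that the data center is near the user.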

immibis

There's probably at least a few instances where you send someone you think is American a picture but it gets cached in Moscow, or vice versa. Or you post a meme to a Californian left-wing group and it gets cached in DC. Not hard to imagine situations where getting an unexpected rough location could be a valuable signal.

gruez

>Or you post a meme to a Californian left-wing group and it gets cached in DC. Not hard to imagine situations where getting an unexpected rough location could be a valuable signal.

Not really. Any public meme group is inevitably going to be monitored by intelligence agencies, and you should assume as such. Even if it isn't, I can imagine agitators from the other side joining the group with a Russian VPN to poison the well. If there's a private group of people that you supposedly trust, any competent mole is going to be using device/network level VPN to cover their tracks. Otherwise they're 1 click away (eg. if someone shared a link) from an opsec fail.

genewitch

you don't have to "befriend" them. you send a friend request because that defaults to a push notification for users with the discord app on their phone. Now, with signal, i don't use it so i don't know how initial chats start, or whatever. The discord one is 0-click because the PFP in the friend request is the payload delivered via PUSH.

And to someone else's point - they had to block the request on their end with a MITM to do the 1-click version on signal. No such MITM is needed with the friend request.

As an aside, one time i got doxxed hard in an IRC channel with several hundred active users. I had a suspicion of who it was, and i knew they lived in chicago. So i "accidentally" sent a link to "screenshot proof" that was hosted on one of my domains. there was 1 immediate click. instant. Chicago. "accidentally" because it looked like i pasted an email body.

Packed the real screenshot and a complaint to the ircadmin. they said "and so you dox them back?"

can't win for trying.

catlikesshrimp

You can also ping the same person multiple times, like once a day at different times of the day. That provides a more complete range.

lxgr

It's not stretching it. The expectation is that Signal does not reveal any observable aspect of your IP address or location when receiving messages on it.

Whether this specific level/type of deanonymization is a problem for your particular use case is an entirely different question. Personally, I wouldn't even care if mutual contacts were to see my IP address outright (and they do for calls), but I'm not every user.

genewitch

I don't care if users see "my" ipv4 because cgnat. I think i don't care if they can see my ipv6 because each machine gets a /64 to itself, that's the logic, right?

But my PBX and my matrix server both use coturn. Our 10 user "private" PBX we have to VPN into a fortigate in a DC to use, but to my understanding, there's literally no way to eavesdrop on those calls without already compromising the server it's running on, and if that's the case, no extra VPN steps or whatever will help.

anyhow even with a real, publicly routable IP, stock windows 11, stock macos (used to be true), and most linuxes won't get compromised by stuff like backorifice or whatever else l0pht put out as "remote administration tools". that is, there usually aren't any listening ports on a public IP these days. Shield's Up!

bigiain

> to my understanding, there's literally no way to eavesdrop on those calls without already compromising the server it's running on

That's probably correct (with the caveat that I suspect NSA/FSB/MSS/Mossad/whoever can reasonably be assumed to have backdoored Fortinet)

There is still the problem that an attacker with "global passive observer" capabilities (which almost certainly includes most non 3rd world nation states, and probably a few of the more problematic 3rd world ones too) can still do traffic analysis to uncover your social network (or criminal/terrorist/whistleblower/journalistic network) by identifying the call traffic endpoints.

bigiain

> I think i don't care if they can see my ipv6 because each machine gets a /64 to itself, that's the logic, right?

I suspect you're looking at that wrong.

It's each internet connection that gets a /64, not each machine. Your ISP hands you a /64 and you can do whatever you like with it on your home(/corporate) network.

So you can choose from about 18 quintillion (2^64) IPv6 addresses for any machine behind your ISP/internet connection, but the top half of your IPv6 address uniquely identifies that ISP and they can connect that to your account/payment details, with 4 billion times as much precision as an IPv4 address.
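The arithmetic here can be checked with Python's standard `ipaddress` module. This is a sketch using a placeholder prefix from the 2001:db8::/32 documentation range, and the variable names are made up for the example:

```python
import ipaddress

# Placeholder /64 from the documentation range, standing in for one connection.
home_prefix = ipaddress.ip_network("2001:db8:1234:5678::/64")

# Number of host addresses available inside a single /64.
addresses_in_prefix = home_prefix.num_addresses

# Number of distinct /64 prefixes (the "top half") versus all IPv4 addresses.
distinct_64_prefixes = 2 ** 64
ipv4_space = ipaddress.ip_network("0.0.0.0/0").num_addresses
prefix_to_ipv4_ratio = distinct_64_prefixes // ipv4_space

# Two devices with different low halves still share the identifying prefix.
laptop = ipaddress.ip_address("2001:db8:1234:5678::1")
phone = ipaddress.ip_address("2001:db8:1234:5678:abcd:ef01:2345:6789")
same_connection = laptop in home_prefix and phone in home_prefix

print(addresses_in_prefix)
print(prefix_to_ipv4_ratio)
print(same_connection)
```

The ratio comes out to 2^32, roughly 4.3 billion, which is where the "4 billion times" figure comes from.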

pyeri

Exactly. Especially when considering that Signal was often advertised as that *one* privacy friendly open-source messaging solution in a world dominated by data-collecting demons like WhatsApp, etc. I don't think even WhatsApp lets such status details leak; notwithstanding whatever they might be doing with the user data on the backend.

sojournerc

I can send a link in Whatsapp to a domain I control and track if clicked. How is that different?

soerxpso

"Deanonymization" doesn't have to refer to a full exact address. There are people who wish to conceal which country or region they live in, which this cripples.

There was a real example of that amount of information being relevant in the Silk Road investigation. Ulbricht accidentally revealed his timezone early on, which was useful to US authorities since it narrowed him down to being in the US, whereas without that information he could have been from anywhere in the world.

alp1n3_eth

Not really.

Anyone who wants to conceal what continent they're on will also be using a VPN 24/7, or will have the proxy setup in Signal (AKA running 24/7), which defeats this.

lolinder

Yep: If your threat model includes an attack like this and you're not always on a VPN already, you're likely already compromised.

This is a neat demo, but it should not fundamentally alter the way that anyone is using Signal. Either it doesn't matter to you or you already have mitigations in place.

harrall

When I was ~15 and this was ~2004, some friends and I ran a forum with a lot of users and did some bad things where we would track down repeat banned users and screw with them. (In our defense, they were screwing with us.)

We used everything, from browser fingerprinting (and EFF only made the world aware of it 6 years later), looking them up in databases, tracing every digital evidence they left, etc.

Every little thing counted. What I learned is that people leave a lot of traces and you can collect these traces to dox them. The way you write is even sometimes fairly identifiable.

hmottestad

If I know someone on Signal I can now check if they’ve left the country.

Or send this to a bunch of Signal users, one of whom you suspect of being a particular person, and if you know that the person you are looking for is going to travel you can send it once before and once after. Then see which of these users were in the home city and subsequently in the destination city.

sojournerc

A VPN obfuscates this. Assuming a target is even remotely aware, you might think they are in Australia, while they're actually in Nova Scotia

iforgot22

Say I send a message to someone who has a phone with push notifications enabled, showing message previews. Will the phone still be connected to the VPN when it wakes up to display the message? Because my iPhone doesn't seem to stay connected to my VPN when it sleeps, at least not reliably.

There really should be a "never use the internet without VPN" mode on devices.

oceanplexian

The real attack is that a law enforcement agency can trivially subpoena CloudFlare with the attachment URL, and they will hand over the IP address of the recipient of the image along with whatever other requests they made through the CDN, which can pretty precisely and rapidly de-anonymize you.

alp1n3_eth

Cool writeup with some interesting techniques and approaches!

I'll echo the other comments and say "deanonymization" is stretching the definition of the word, along with "grab the user's location", as it isn't anything near precise. 150 miles is approx. a 2-hour drive on the highway from Atlanta, GA to Augusta, GA. In that radius, there's probably 700,000+ people.

I do think the auto-retrieve attachment feature of Signal is slightly concerning, as for a private messenger I'd expect there to be an option to turn it off (like turning off JS in Tor). I don't know if I'm not looking deep enough, but there doesn't seem to be a feature for that.

Signal appears to take a useful-by-default approach that balances privacy and ease-of-use in order to encourage adoption by the masses, I'd assume most people that are really concerned are hardening Signal, similar to what is in this guide: https://www.privacyguides.org/articles/2022/07/07/signal-con... . They've always recommended a VPN / proxy + a modification of settings for more high-security scenarios.

Caching isn't going anywhere, and neither is CloudFlare. The DoSing days of old in P2P multiplayer lobbies with exposed IPs seemed to carry more of a threat than this, CloudFlare's response seems to be the best out of the 3. Caching sensitive information is never recommended and the onus is on the application doing the communicating to tell their CDN / middle-service to not cache specific items.

giancarlostoro

> "deanonymization" is stretching the definition of the word, along with "grab the user's location", as it isn't anything near precise.

You'd think so, but you would be surprised how quickly this adds up to other details people share, like "oh I just drove 15 minutes to get Starbucks" or something to that effect, small things that eventually add up to a precise location over time.

alp1n3_eth

> you would be surprised how quickly this adds up

Yes, but if social engineering is involved and tracing back through user conversations across a platform, it's hardly a vulnerability, let alone one deserving of a bounty. The way this is currently functioning is intended functionality, and can be further locked down depending on the user's threat model.

This can essentially be classified as opsec failure for the Signal user. If they're trying to hide from a hit in a 300 mile radius, they've got bigger problems to worry about, and should already be using a VPN setup.

Every time you click on a link your external IP address is exposed. Is this a vulnerability? Being online without a VPN / proxy is inherent consent to have your external IP & other required items shared with services / middlemen.

When it comes to Discord, if you have this strict of a threat model and you're still using it, idk what to tell you.

mmooss

This is all the classic dismissals of security issues, including blaming the user.

> opsec failure for the Signal user

Signal's mission is to provide security for users who don't know the word 'opsec'.

hmottestad

If I can send you a link and be guaranteed that you click on it, then that’s definitely a security issue.

giancarlostoro

> When it comes to Discord, if you have this strict of a threat model and you're still using it, idk what to tell you.

I mean, you just never know... I've seen a lot of wild things, I've seen what drives people to doing crazy things. Just look up the "Deadly Runescape E Dater" who flew from the US to the UK to stab the girl he e-dated.

vel0city

You can disable the auto-download. Settings > Data and storage > Media auto-download, you can choose what to auto download for mobile data/wifi/roaming.

alp1n3_eth

Thank you! That's what I get for quick scrolling through the settings. I for sure thought it would have been under Privacy (for this concern), but that makes sense too.

LWIRVoltage

So, just to confirm my understanding, if one goes into those settings and disables all auto-download, that helps- but, then a user will manually download images, correct? Are they still vulnerable to this issue then at that time?

jcul

Ah I made the same mistake.

Whatsapp has this option and I'm pretty sure it is in privacy settings.

gorfian_robot

hmm. I find the auto-download setting in the mobile app but not on desktop (mac). anyone know?

gorfian_robot

(some comments seem to suggest that the desktop app always auto-downloads)

tribby

it looks like it can’t be disabled for view-once media (or at least, that’s what the settings screen says)

vel0city

I wonder if view-once media is even handled the same way as a regular attachment (using CF) or is sent more like a regular message.

I imagine if one really wanted it to be view-once, it wouldn't go to a CDN.

Thanks for pointing this out!

why_only_15

Random unrelated point: in a 100km radius circle between Atlanta and Augusta there are ~2,000,000 people (calculated using https://www.tomforth.co.uk/circlepopulations/ )

alp1n3_eth

Haha thank you for doing the math! I was lazy and just added the populations and a plus at the end.

maxrmk

Cool! Contrary to some of the other posters I think this definitely counts as deanonymization, or at least is close enough. How anonymous would satoshi be today if we had his location to within 250 miles?

Repeated applications of this attack (maybe disguised somehow?) could let you track someone’s travel over time, and it usually only takes 4-5 zip code sized locations to uniquely identify someone.

aimazon

The counter point is that anyone who cares about being anonymous is using methods to disguise their identity that cannot be compromised by this attack, e.g: a VPN. Plus, there are much more effective versions of this attack, like sending a link to an endpoint that you control -- getting someone to click a link isn't hard if you're considered trustworthy enough to send them notifications. And less technical versions, like correlating when the user is online vs. offline with timezones around the world.

The method that both Apple and Cloudflare use in their own privacy software (iCloud Private Relay for Apple, WARP for Cloudflare) is specifically based on the idea that your region is not information that reveals your identity. If you enable Apple Private Relay, your origin IP will be obscured but the IP your traffic is routed through will be in the same country -- same principle.

https://www.apple.com/icloud/docs/iCloud_Private_Relay_Overv...

This attack is academically interesting and novel but it's not "deanonymization".

tom1337

> The counter point is that anyone who cares about being anonymous is using methods to disguise their identity that cannot be compromised by this attack, e.g: a VPN.

Yes unless Apple is doing Apple things and ignores VPNs for things like push notifications…

https://x.com/mysk_co/status/1579997801047822336

fsflover

> The counter point is that anyone who cares about being anonymous is using methods to disguise their identity

https://news.ycombinator.com/item?id=42784398

tpoacher

Not everyone who indirectly cares about anonymity is an activist who feels they need to go to great lengths to disguise their identity. Sometimes anonymisation is part of a process, and the ability to collect potentially deanonymizing data this way is still a privacy breach.

E.g. imagine sending otherwise anonymised participants in a clinical trial a questionnaire, containing an image. The owner of the image could then partially deanonymize the trial participants. Or voters. Or demonstrators in a rally.

Not everyone who cares about privacy is Edward Snowden material.

rosseitsa

I am not sure I understand what you mean by "trustworthy enough to send them notifications". Do you need anything other than one's phone number to send them a signal message?

tEem21

The recipient would need to have this enabled, though it is by default. You can deactivate allowing others to initiate chats with you from your phone number (Settings > Privacy > Phone number)

amyames

On iCloud Private Relay, go to settings and select “use country and time zone” instead of “maintain general location.”

Now you’re no longer “within 250 miles,” hell my phone geo IPs everywhere from Louisiana to New Jersey, which are not even “in my time zone,” but there you go.

This setting was pissing meta/Facebook off big time because they also couldn’t narrow me down to a precise geographical area, resulting in much nagging and whining about “was this you signing in from [shreveport]?” and frequent account lockouts, password resets, and endless requests to approve my logins from a device that’s already logged in before I finally said to hell with it and deleted FB a few days ago.

I figure if a privacy setting makes meta mad, then it’s ... probably ... a good setting. Must really irk them trying to sell location relevant ads when my state changes every other time I unlock my screen.

It’s a combined behavior of using private browsing and refusing to install their app, thereby giving them a permanent supercookie no matter what my IP is, so if you don’t like the sound of this it [might not] affect you if you use their apps. “X” does it too, just look up “inferred identity+ twitter” on google.

I’m editing out a tall claim in the last paragraph of this for some other time when I’m less tired and have sources next time we’re on the subject.

meowface

Satoshi's possible home IP address actually did leak shortly after Bitcoin's release, though it wasn't realized until years later.

(It definitely may not be him and might instead be a random early user. But I think there's a moderate chance it's him.)

Details: https://news.ycombinator.com/item?id=29728339

(I don't advocate attempting to find and publish his name and address, since it'd make his life difficult, but it's still very interesting in the abstract as a curious unsolved mystery for all these years despite the number of eyes on it.)

cenamus

How many people live in a 250 mile circle around New York?

everfree

I think the more important question is how many people in the world don't live within a 250 mile circle around New York? An investigator could potentially cut their geographical search down by 95%+.

modeless

Also the attack can be performed multiple times and if a person travels it could narrow down the possibilities quite a lot.

kiwijamo

How many people live in a 250 mile circle around their Cloudflare POP?

Which Cloudflare POP I hit depends on which RSP I use. In the country I live in, our biggest RSP peers with Cloudflare in a neighboring country (as it is much cheaper for Cloudflare to send traffic via that RSP's peering exchange there). So something like 40% of traffic will seem to be from an entirely different country than reality.

My RSP is a small RSP which until fairly recently only had two POPs in the entire country. So regardless of where you lived, customers of my RSP would have traffic exiting onto the internet via only one of two exit points. Rural users would seem to be coming from one of the two largest cities in my country even if they are easily >250 miles away from their particular POP. They do peer with Cloudflare but obviously only at the locations where they and Cloudflare are in the same city (and I'm not sure this is the case -- it is possible all national traffic to Cloudflare actually goes via the one POP in our biggest city).

The only reason this attack identifies the city I happen to be in is because I live in the same city as my little RSP's biggest POP and Cloudflare happens to peer with that RSP at that POP. Where I am is a large city so doesn't narrow things down very much -- but even worse is that whoever is looking for me would actually need to look anywhere in my country.

I don't think I am a unique case as internet routing is rarely the most direct path for various technical, financial, political, etc reasons.

De-anonymization is definitely stretching the reality of what this 'attack' is capable of IMHO.

kachapopopow

You can already do the same with advertisement ID in (almost) every single one of these applications.

byearthithatius

Still quite anon. He almost certainly used a VPN, and if he didn't he likely lived in a major city which included thousands if not hundreds of thousands of capable engineers. If it said he was in SF during some messages that would tell us literally nothing.

kandesbunzler

... very anonymous because he was most likely using a VPN lmao

lxe

Not sure why so many top comments dismiss the severity of this. This is exactly the type of attack that gives law enforcement or a malicious actor a way to establish proof of whereabouts.

byearthithatius

I would guess some are just jealous of his age, but some do find the claim of de anonymizing to simply be overblown given it doesn't tell you nearly enough to find anyone except in very niche cases. This "attack" is easily defeated with a VPN or living in any major city.

Aachen

You don't need to live in a major city. Cloudflare is never going to set up a caching proxy for a hamlet in the desert; you'll always be part of a huge group that a given caching proxy serves. The attacker can be happy if they can narrow the recipient's location down as much as to a single country

palmfacehn

Posters are missing the point by projecting themselves into the scenario. Yes, it probably isn't a concern for someone living in the US or the EU. The calculus is different if you live in a smaller country, a politically sensitive area or are involved in activism against an authoritarian state.

Even for individuals in those large, developed suprastates, it opens the door for catfishing and other social engineering approaches.

gtsop

Interesting that you touched on his age. I got extremely curious: why did the OP do such a flex? (assuming they are telling the truth). The first sentence is such a weird brag that it felt suspicious. The report is highly technical and extremely well written. We're either dealing with a pure genius or a fraud. But why would a genius flex? Doesn't make sense.

adamrezich

I don't know what you think is genius about any of this, but you're right, the flex is odd. It's something I've been seeing more and more of lately, and I find it off-putting, because, Back In My Day, I never had such a phase, where I felt like I should be given more credit for my 1337 h4xx0r skillz, because I was in high school or whatever—and I don't remember anyone else doing it, either.

I can only assume this is a consequence of modern social media having shifted the Internet from being a bunch of pseudonymous people making and sharing stuff, to everything being myopically focused on one's identity first, and what they do second (as is literally the case here).

And it looks like it works to achieve its desired effect, too—a significant portion of the comments here are congratulating the guy for doing such a thorough technical write-up, given his age. Maybe this is just me being a grumpy “old” man now, but I would've found that condescending when I was his age, and would've rather concealed my age than be condescended to as such. But, to each his own, I suppose.

iforgot22

I can believe a very talented 15yo pulling this off. But the number of anonymous "I'm 15 and this is my impressive feat" posts on HN made me wonder if it's just a joke.

Shocka1

The flex is probably because they are 15 and a bit immature.

An anecdote - I used to be the lead for an infotainment security program for a well known manufacturer. They would send me to a lot of small security meetups and training. One in particular was a security training event for some HS students, whom I would define as gifted. There were roughly 40 of them, from 14 to 17 years old, and they were all extremely impressive in things ranging from reverse engineering applications to assembly code all the way to Linux systems stuff.

Something like this would have been easy for them - I mean the basics of this are that the attacker sends a message to a Discord/Signal user and then sends a request to a Cloudflare server. Not exactly splitting atoms IMO. What I think is special about this 15 year old in question is the epiphany that it might be possible and then giving it a go. This is the true hacker spirit.

iforgot22

Someone on GitHub called him out for making a Twitter account in 2017, since he'd have to be 8 years old at the time... I don't see what's so unbelievable about an 8yo making a Twitter account.

midtake

It's common in our community to be jealous of youth, especially youth with genius, but I believe the github user in TFA is lying. Something about it is off. It also happens in posts here on HN, though.

I think after being exposed to so much internet, you realize how many are simply living a fantasy. In me it provokes a sense of disgust; I see it as part of a broader groomer problem on the internet, but this is a distinctly minority take.

gtsop

I believe most people (me included) dismiss the OP's claimed severity, as if it is being oversold. I see a balance of opinions saying "great find, but not as critical as claimed" so they don't seem dismissive. It is important to correctly classify the severity of issues. Proof of whereabouts is not deanonymization, especially when the abouts are so loose

jcul

Even knowing their country from this would be a big first step.

mmooss

They dismiss it for the same reason people dismiss disruptive new technology - they are uncomfortable with it. It's a signal (ha) that the threat is very real.

First dismiss it and see if the problem is still there in the morning. Hope that before then, someone finds a reason it's not a problem. Anyone?

kelnos

Why has Signal even enabled caching for those URLs? The most common case is going to be that the attachment is downloaded once, and that's it.

I would even expect that Signal wouldn't allow you to download it more than once, and would immediately delete it after the first successful download. Well, ok, maybe the client fails mid-way through, so allow some grace period for a re-download. But I can't imagine that would be the common case either, and so disabling caching on their CDN would fix this issue, and hopefully not increase their costs much.

At any rate, "deanonymization" is a bit clickbaity here. Narrowing someone's location to within 250 miles or so isn't great, but it doesn't deanonymize them.

Edit: I didn't think about the case where an attachment is sent to a group chat, where multiple people will be downloading it. But in that case wouldn't the attachment be encrypted individually for each person in the group? I'm not sure how this works, of course.

alp1n3_eth

Signal's default setup is more usability focused while supporting E2E, and less about tinfoil hat threat models about being present on a continent you're a citizen of.

The items you mentioned can essentially be configured, for those that want the insane level of privacy / security. Messages can be auto-deleted 30 seconds after being seen, a proxy can be configured to route all your traffic through it, and tons of other things can be done to customize it more to the user's liking.

I'd imagine they're caching it because of egress costs. File attachments, voice mail, video, etc. can all add up.

mqus

> Signal's default setup is more usability focused while supporting E2E

If images/attachments were e2ee, this problem probably wouldn't exist, right? or are the images on cloudflare encrypted?

Edit: I should clarify. I didn't mean the encryption itself fixes the problem, but rather that: If this were handled like the text messages we send (not via cloudflare CDNs) then this wouldn't exist. I get that attachments are quite some bytes bigger than text but shouldn't the security guarantees be the same?

tom1337

I actually also wondered about this, because if Signal did not encrypt attachments and delivered them via Cloudflare, that would suck, as Cloudflare could just look into all of them.

It seems that Signal is indeed encrypting all attachments, and therefore it is the encrypted attachments that are cached and served via Cloudflare.

alp1n3_eth

From what I know* (heavy on the asterisk there), they are. I'm guessing at their setup at this point, but it sounds like the "large" data is probably being stored (while encrypted) in a different way / separately than the messaging. Since it's supposedly E2E (not gonna pretend I've hand verified it), it's decrypted on the device, but it needs to be grabbed in the first place from said separate place.

So, I'm guessing the images are encrypted where they're stored. And from his post it sounds like it doesn't happen with the messages, so the motivation for using Cloudflare is probably around egress pricing, or they could be using Cloudflare R2 for storage as well.

iforgot22

Group chats and multi-device users maybe

tech234a

Note: this person is the same 15-year old who found the Zendesk Slack takeover exploit a few months ago [1].

[1]: https://news.ycombinator.com/item?id=41818459

yaomtc

Given the twitter account was made in 2017, they would have been eight: https://x.com/hackermondev

And that bug report to Adobe was made when they would have been five years old: https://hackerone.com/daniel?type=user

aimazon

I think that's just a quirk of HackerOne's username system. The username daniel was previously owned by another account (now known as daniel-hamid) which submitted a bug to Adobe. If you go through @hackermondev's tweets (starting in 2018) they are without question a kid (making games in Roblox and Minecraft) and then started to show an interest in hacking in 2020 (which lines up with when they created their HackerOne account). The claim of being 15 years old is plausible (presumably with parents / guardians who are accomplished in technology).

yaomtc

Thanks for pointing that out, I missed the username change.

iforgot22

He'd be 8 when he made the Twitter account, not when he discovered that exploit. Pretty sure there are tons of 8yos with Twitter accounts.

hypeatei

This is certainly an "attack" but not one you'd normally associate with zero click. There is no code execution, but some tricks to see which Cloudflare datacenter cached the image -- giving a very rough area the user is in. Impressive and insightful nonetheless.

sim7c00

depending on the circumstance, the rough area might already be useful to adversaries of the person trying to hide. I wouldn't expect things like criminals etc. to suffer from this, 300 miles is a big radius for example... but if you want to know if 'the guy is still in country' or something like that (for instance law enforcement) it's useful for them. such parties could then collaborate with local resources to do further investigations. knowing which local resources in what area to enable might save a lot of 'costs'.

as you said, impressive and insightful. :D kinda feel like the docs on it were a bit chatGPT aided, they are super clear and full of 'certain sentences'. (this is totally an excellent use-case for that, so not bashing on it at all!).

nice read.

sitkack

You would know if they are over a cellular network or checking on mobile.

If someone sends you a youtube link and you hit play, YT knows who you are, both from a network perspective and potentially the logged in user.

If you are using signal in a high risk environment, you should be using it from a system that contains no extra information about you. This is the same posture one should take when using Tor.

Basic opsec.

I don't think these kinds of things are in Signal's threat model. Is it meant as a messaging platform for people with nothing to hide?

sim7c00

i don't think you can call opsec basic, since it requires tons of knowledge about technology and techniques adversaries might deploy against you. targets of attacks don't necessarily have this kind of knowledge.

opsec is _incredibly_ hard for a person not deeply into technology and this type of information. you might argue that you need to stick with certain tools and techniques that are known good, but new vulnerabilities and techniques used against you can completely shatter previous knowledge of what's good and bad opsec, and break yours despite you doing it 'very well'. (like certain darknet markets being closed down due to new vulnerabilities being found in the platforms they use...)

most people who rely on opsec/tradecraft for a living, also rely on teams of people to help them maintain it and validate it constantly... (or eventually fail and get bitten).

you are right though that it's unlikely a company or app producer would have a threat model tuned to people who want to hide stuff. those things generally tend to be closed down sooner or later. (encrochat and such services...)

lovasoa

Law enforcement could probably just ask cloudflare for the exact IP address that retrieved the attachment.

rlpb

Only if they're from a friendly country. If the reason a user needs anonymity is geopolitical, that isn't a guarantee.

sitkack

Anyone can do this as per the TFM, which is an excellent read.

sim7c00

do you think law enforcement in Iran will get an answer from cloudflare?

open-sesame

Unless I'm missing something, this seems like an incredibly long winded way to check the user's IP location?

For example, connecting to a VPN and checking https://cloudflare.com/cdn-cgi/trace gives me `colo=CPH` (Copenhagen), which is far from my nearest CF datacenter (geographically) and closer to the IP location from my VPN provider (Oslo), but still not particularly close.

If I don't use a VPN, I don't even get the capital city of my country (which I'm in right now), I get a colo approx 250 miles north. So I also dispute that Cloudflare always returns the "nearest available datacenter".
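For anyone who wants to repeat this check, here is a minimal sketch of pulling the `colo` field out of a trace response. It assumes the endpoint returns plain `key=value` lines; the sample text below is made up for illustration and is not real output.

```python
def parse_trace(text: str) -> dict[str, str]:
    """Parse `key=value` lines (as returned by /cdn-cgi/trace) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical sample response, for illustration only.
SAMPLE_TRACE = """\
h=cloudflare.com
ip=203.0.113.7
colo=CPH
loc=NO
"""

trace = parse_trace(SAMPLE_TRACE)
print(trace["colo"])
```

Note that `colo` only names the data center that answered, which as described above is not necessarily the geographically nearest one.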

Don't get me wrong, the write up is cool and certainly interesting - just not convinced on the real world applications here...

zild3d

> Unless I'm missing something, this seems like an incredibly long winded way to check the user's IP location?

It's less accurate than that. IP geocoding can be accurate down to the city level in many cases. This is _maybe_ the nearest Cloudflare data center.

ziddoap

>just not convinced on the real world applications here...

As a piece of data alone, the results are probably not of significant use.

The real-world application (and potential danger) is when this data is combined with other data. De-anonymization techniques using sparse datasets have been an active area of research for at least 15 years, and it is often surprising to people how much can be gleaned from a few pieces of seemingly unconnected data.

andix

> The real-world application (and potential danger) is when this data is combined with other data.

That's exactly the point. In this case it's only really possible to de-anonymize people who take long distance trips. But based on two data points it might be possible to know which flight or train a person travelled with.

With three different data points it might be quite unique. For example, you might find out somebody travelled from Italy to Norway on Monday evening and then to France on Wednesday morning. There are probably not so many people who did a trip like that; it might come down to only one person (or a handful of people) who fit this itinerary. With other data sources it might be possible to uniquely identify this person.
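A toy sketch of that narrowing step, assuming the attacker has some second data source of travel records to match against. All names and records here are invented for illustration.

```python
def narrow_candidates(observations, itineraries):
    """Return the people whose known visits include every (day, country) observation."""
    return {
        person
        for person, visits in itineraries.items()
        if all(obs in visits for obs in observations)
    }


# Hypothetical travel records: person -> set of (day, country) sightings.
ITINERARIES = {
    "alice": {("Sun", "IT"), ("Mon", "IT"), ("Wed", "IT")},
    "bob": {("Sun", "IT"), ("Mon", "NO"), ("Wed", "NO")},
    "carol": {("Sun", "IT"), ("Mon", "NO"), ("Wed", "FR")},
}

# Coarse locations observed through repeated cache probes (hypothetical).
one_point = [("Sun", "IT")]
three_points = [("Sun", "IT"), ("Mon", "NO"), ("Wed", "FR")]

print(len(narrow_candidates(one_point, ITINERARIES)))
print(narrow_candidates(three_points, ITINERARIES))
```

Each observation on its own is nearly useless; it is the intersection that shrinks the candidate set.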

gruez

>The real-world application (and potential danger) is when this data is combined with other data. De-anonymization techniques using sparse datasets have been an active area of research for at least 15 years, and it is often surprising to people how much can be gleaned from a few pieces of seemingly unconnected data.

Seems pretty handwavy. Can you describe concretely how this would work?

ziddoap

>Seems pretty handwavy.

It has a whole Wikipedia article and everything.

https://en.wikipedia.org/wiki/De-anonymization#Re-identifica...

>Can you describe concretely how this would work?

Here's one of the earlier papers I remember off-hand, demonstrating one methodology. New statistical techniques (and improvements to existing ones) have been developed in the ~18 years since this was published. Not to mention there is significantly more data to work with now.

https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf

"We apply our de-anonymization methodology to the Netflix Prize dataset, which contains anonymous movie ratings of 500,000 subscribers of Netflix, the world’s largest online movie rental service. We demonstrate that an adversary who knows only a little bit about an individual subscriber can easily identify this subscriber’s record in the dataset."

From the Wiki I linked:

"Researchers at MIT and the Université catholique de Louvain, in Belgium, analyzed data on 1.5 million cellphone users in a small European country over a span of 15 months and found that just four points of reference, with fairly low spatial and temporal resolution, was enough to uniquely identify 95 percent of them." [...] "A few Twitter posts would probably provide all the information you needed, if they contained specific information about the person's whereabouts."

Point being that operational security is hard, and it takes a lot less to "slip up" and accidentally reveal yourself than most people think. Obtaining a location within 250 miles (or whatever) can be a key piece of information that leads to other dots being connected.

Other examples (albeit with less explanation) include police take downs of prolific CSAM producers by gathering bits and pieces of information over time, culminating in enough to make an identification.

botanical76

Do you not buy that a user's IP location needs to be protected?

There is a reason applications go to so much effort to proxy requests to resources such as images. It's not free to do this.

lxgr

Having your IP address not revealed to people that can message you on Signal seems like a pretty reasonable privacy expectation.

gruez

Your IP isn't revealed though, only your vague geographic area.

lxgr

That's marginally better, but can still be a problem. Just consider e.g. a whistleblower working for a company with a very small satellite office in a given country.

udev4096

Did you even read it? There's no IP leak. And if you're a high target, then using some kind of proxy is literally the first step you take. The attack is nothing but an exaggeration and has no merit in real world

lxgr

Yes, I read it. Information about your IP address is leaked, as that's how Cloudflare routes you to a given datacenter.

And I strongly disagree that being able to uncover somebody's rough geographic location is not a privacy problem.

I wouldn't be surprised if this, for example, lets you deduce if somebody is currently home, at work, or commuting (as all three ISPs might be hitting different Cloudflare datacenters). That's not information everybody is comfortable broadcasting to the world.

kgeist

I guess it can be useful for tracking fugitive political dissidents, terrorists, etc. If you can narrow their location down to 250 miles, it's already very useful information. And without raising any suspicions.

vel0city

It's not really narrowing it down to 250 miles; it's narrowing it down to a circle whose radius is at least 250 miles, or ~196,000 mi^2.

My closest Cloudflare CDN is just listed as "DFW". The DFW metro area is about 8,700mi^2, and I imagine I could be even further than the "metro area" and still get the "DFW" Cloudflare datacenter.
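The arithmetic behind those numbers, as a quick sketch. The 250 mile radius and the 8,700 mi^2 metro figure are the ones quoted above; treating the search area as a perfect circle is a simplification.

```python
import math


def circle_area(radius_miles: float) -> float:
    """Area in square miles of a circle with the given radius."""
    return math.pi * radius_miles**2


RADIUS_MILES = 250        # radius quoted in the write-up
DFW_METRO_SQ_MILES = 8_700  # metro area figure quoted above

area = circle_area(RADIUS_MILES)
print(f"search area: {area:,.0f} mi^2")
print(f"that is about {area / DFW_METRO_SQ_MILES:.1f}x the DFW metro area")
```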

In their little video animation, the area inside the overlap of those two circles encompasses several states. The edges of the two circles go from Washington to Florida and almost include Chicago. The target could have been in Denver or St Louis or Las Vegas or Phoenix or San Diego or San Francisco or Amarillo or El Paso.

kgeist

I think it's still useful. Going from "we don't know where Osama bin Laden is at all" to "he's somewhere in Pakistan".

alam2000

[dead]

thayne

What is the benefit of caching images in a cdn for Signal?

Assuming local client-side caching, the total number of requests for that resource should be very small, probably one in the vast majority of cases.

On an unrelated note, it seems like Cloudflare could very easily fix this by not returning the cf-ray header, or at least having an option for the customer to remove it. Although, it might still be possible to get that information based on timing information...
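To show why that header matters: the cf-ray value ends with the airport-style code of the data center that served the request, so reading the location is a one-line string operation. A minimal sketch, with a made-up header value:

```python
from typing import Optional


def colo_from_cf_ray(cf_ray: str) -> Optional[str]:
    """Extract the data center code from a cf-ray value of the form `<id>-<code>`."""
    ray_id, sep, colo = cf_ray.strip().rpartition("-")
    if not sep or not ray_id or not colo:
        return None  # not in the expected `<id>-<code>` shape
    return colo.upper()


# Hypothetical header value, for illustration only.
SAMPLE_CF_RAY = "8f2a1b3c4d5e6f70-DFW"
print(colo_from_cf_ray(SAMPLE_CF_RAY))
```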

angry_octet

It isn't caching, it's CDNing. It is just an artefact of CDNs that they act as caches for the original content, and to improve response time they cache at the server nearest to the requester. ('Nearest' is an approximate heuristic: it is a property of the anycast route tables in the BGP routers the request passes through, so it is really a 'best route'.)

thayne

That caching is something you can turn off, at least for every CDN that I have worked with.

The Cache-Control http header has a `private` directive specifically to inform CDNs and similar not to cache the response.
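A small sketch of what that looks like from the shared cache's point of view. The helper names are made up, the check is deliberately simplified (it only looks at `private` and `no-store`), and whether a particular CDN honors the origin's header depends on how it is configured.

```python
def attachment_headers() -> dict[str, str]:
    """Hypothetical origin response headers asking shared caches not to store the body."""
    return {
        "Content-Type": "application/octet-stream",
        "Cache-Control": "private, no-store",
    }


def is_shared_cacheable(headers: dict[str, str]) -> bool:
    """Simplified check: may a shared cache (such as a CDN) store this response?"""
    value = headers.get("Cache-Control", "")
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in value.split(",")
        if part.strip()
    }
    # `private` and `no-store` both forbid storage by a shared cache.
    return not ({"private", "no-store"} & directives)


print(is_shared_cacheable(attachment_headers()))
print(is_shared_cacheable({"Cache-Control": "public, max-age=3600"}))
```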

Aachen

> it seems like Cloudflare could very easily fix this by not returning the cf-ray header

Then you just look at the response time. If the resource needs to be fetched from another continent, this is probably reliably measurable
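Roughly what I mean, as a sketch: given a few response-time samples per vantage point, the one with the lowest median is the likely cache location. The numbers and location codes below are invented, and real measurements would be much noisier.

```python
from statistics import median


def likely_cache_location(latencies_ms: dict[str, list[float]]) -> str:
    """Return the vantage point with the lowest median response time."""
    return min(latencies_ms, key=lambda loc: median(latencies_ms[loc]))


# Hypothetical response times in milliseconds, several samples per vantage point.
SAMPLES_MS = {
    "DFW": [12.1, 11.8, 13.0],
    "ORD": [48.2, 51.0, 47.5],
    "LHR": [142.0, 139.5, 150.2],
}

print(likely_cache_location(SAMPLES_MS))
```

Using the median rather than the mean keeps a single slow outlier from skewing the result.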

Same for websites trying to hide which users exist: do a login request for an existing username and it'll do the password hashing (usually adds at least 50 ms to the response time), whereas for an invalid username it early exits. The fix is to always run the same code, so always do the hashing, which very few sites do. (Or not care about revealing this and telling people straight out that their username is unknown, if that fits with your threat model.) So to get back to Cloudflare's case: it won't help unless they delay responses, which is the opposite of what they're supposed to do
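A minimal sketch of the "always do the hashing" fix, using only the Python standard library. The user store layout and the iteration count are assumptions for illustration; a real site would use its own password hashing setup.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


# Dummy credentials so unknown usernames cost the same work as known ones.
DUMMY_SALT = os.urandom(16)
DUMMY_HASH = hash_password("dummy", DUMMY_SALT)


def check_login(users: dict, username: str, password: str) -> bool:
    """Verify a login, hashing the password whether or not the user exists."""
    record = users.get(username)
    salt, stored = record if record is not None else (DUMMY_SALT, DUMMY_HASH)
    candidate = hash_password(password, salt)          # always runs
    matches = hmac.compare_digest(candidate, stored)   # constant-time compare
    return record is not None and matches


# Hypothetical user store: username -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"alice": (_salt, hash_password("s3cret", _salt))}

print(check_login(USERS, "alice", "s3cret"))
print(check_login(USERS, "nobody", "s3cret"))
```

Both calls do one full hash, so the response time no longer reveals which usernames exist.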

jrochkind1

I don't believe the Signal app/network is choosing to cache images in a CDN?

But any user can send any other user a message that includes a link to a CDN-cached resource. Isn't that the "attack" here? Or am I misunderstanding?

aprilnya

Signal does cache them in a CDN. If the vulnerability was sending any link, you could just set up your own web server and get the person’s IP

jrochkind1

Ah, and the attack is knowing what CDN that is that signal itself is using, and examining it directly? I had missed that somehow.

modeless

Yes, Cloudflare should allow customers to disable that header, and Signal shouldn't cache images sent to a single person, or even groups of less than a few hundred people.

dualogy

> the total number of requests for that resource should be very small

"For that server" is the other number-of-requests..

popcalc

So that law enforcement can ask Cloudflare for the IP logs... Signal is a joke.

https://simplex.chat/

moe_sc

Signal claims to be a private, not anonymous, chat application.

Their defaults are set so they can get mass-market adoption, whilst being a big step up in privacy compared to the usual players in the space (like WhatsApp and Telegram). You simply won't be able to get the average user onto apps that make usage more complicated, and apps like SimpleX do exactly that.

If you want Signal to be more secure, you can circumvent this attack vector by disabling auto downloads for media.

I'm not saying Signal is perfect; there has been a lot to criticize over the years.

But why argue about use cases they never claimed to solve?

aja12

I'm a bit at a loss there. Has _anyone_ ever considered Signal to be anonymous? Or Discord? If so, I have bad news: they are not anonymous. At all. Not even slightly anonymous. Nor did they ever claim to be, they only claim to not be able to read your messages (Signal claims that, I don't know about Discord, I doubt it). And that claim has flaws (sure the crypto is sound but have you thoroughly reviewed and compiled the version you are using right now?)

At the very best, they are weakly pseudonymous, but that's about it. And yes, loading media by default has always been a staple of applications who prioritize their users' convenience at the expense of some security, a fine choice for the usual threat model of their users. And embedding media in messages has always been a staple of deanonymization attacks.

So ok, the tracking pixel has been shown to still be a relevant technique today, that's nice but not surprising.

If you want to remain anonymous though, don't use Discord or even Signal, and I'd advise against posting on HN either. Maybe, if you automate the pasting of messages (no js!) that have been reworded by a local llm from throwaway accounts through whonix, at random times that can't be correlated to your timezone, you _might_ have a chance. Don't bet on it.

Anonymity does not exist any longer.

upofadown

I am currently banned from the Signal subreddit for pointing out that we only have Signal's word that they don't collect metadata. So, yeah, people do consider Signal anonymous...

iforgot22

People do use Signal and Telegram* in settings where anonymity matters. Sure they aren't meant for that, but there's no other widely-understood solution, and most of the time it's good enough for them.

* Funny enough, not vulnerable this time because they use an in-house protocol, which is maybe even worse.

DirkH

People keep forgetting anonymous and private are two different things

kovariantenkak

Looking at the locations where Cloudflare has their servers [1] in the middle of Europe. With Geneva, Zurich and Munich there is definitely the possibility that this attack on Signal will leak whether someone is at home or not.

I don't understand how Signal could dismiss this so easily. I'm starting to get a bad feeling about their responses to these "low" stakes attacks. They already dropped the ball on the database encryption mishap on desktop.

[1]: https://www.cloudflare.com/network/

notatoad

This is just the fundamental way the internet works, and is the reason that anonymizing proxies like Tor exist.

If you don’t want people to be able to detect your rough geographic location, you should be using a proxy to hide it. For everybody else, knowing the edge server you are closest to is really not a threat.

mmooss

People for whom it's a threat don't necessarily understand anonymizing proxies - very few do. Signal is supposed to provide security for those who do not.

Aachen

Where does Signal claim that, or who decides what they're "supposed" to provide?

If wishes had wings, sheep would fly. People who want their computer to do a certain thing can also be expected to do a quick web search for how to make it do said thing. E.g.: hiding location? Use onion routing. Signal doesn't claim to hide your country (heck, they require your phone number!) so it seems wishful thinking to say they should have included e.g. a Tor client and enabled it by default

perbu

No, it isn't. This is Cloudflare exposing metadata when it really shouldn't. Having a configuration option or an origin response header akin to CloudflareCache: private or something is trivial for them to implement.

The same information would then be available in the timing, but given the distributed nature here, that would be a lot harder to pull off.

iforgot22

There's a real difference between Discord itself knowing your location and any Discord user in the world knowing it. Just like there's a difference between the VPN provider knowing your ipaddr and every website you visit knowing it.