MapTCHA, the open-source CAPTCHA that improves OpenStreetMap [video]

53 comments

February 13, 2025

Presentation Video: https://fosdem.org/2025/schedule/event/fosdem-2025-5879-mapt...

Repo: https://github.com/ciupava/maptcha_dev

I didn't make this; I just wanted to share it here before I add it to my weekly urbanism roundup newsletter: https://urbanismnow.com

neilv

Comments in case the demo developer sees this:

* The swiping in the demo was a bit rough for me, on a laptop with Firefox. One of the tricks was to be sure to release the mouse button before the pointer hits the edge of the white rectangular fake screen. Swiping off the edge without a button-up event doesn't seem to be handled.

* At the end of a swipe that registers, after you mouse-up, there's a noticeable lag of sometimes up to approx. one second, during which the card is frozen in place, before it finishes sliding off.

* The rotation effect on the card as it's being swiped wasn't intuitive, IMHO. It doesn't follow the vertical movement in the swipe, and there's no obvious physical metaphor for why it's doing that. Perhaps especially with the other roughness going on, the rotation confuses things a little bit more; but maybe, if the other behavior was perfect, the rotation would be fine.

fragebogen

Fun and potentially useful project, love it! When I tried it though, it was quite often hard to see whether the bounding box is "really" correct, as it hides what's underneath. Maybe some slight opaqueness could help.

Also, my first image had no bounding box at all. Being met by "Swipe right if the red shape is correctly outlining a building. If not swipe left", it felt like the wording or the UX could be improved by filtering for images that are guaranteed to have such a box.

efilife

FYI opaque means not transparent, so you probably meant slight transparency. Or I'm missing something

glaucon

Positive about the general idea, I hope it works out.

> Also, my first image had no bounding box at all.

Me too.

> "Swipe right if the red shape is correctly outlining a building. If not swipe left"

Rather confusing on a laptop which was showing "Correct" and "Incorrect" buttons.

mcv

Yeah, it's not always easy to see whether it's a building, and whether the outline is correct. Also: should I say it's correct when it's only roughly correct, or should it be perfect? I think you'll get a lot of noise from this.

But according to the presentation, they analyze that noise and manage to use it to improve the data.

teruakohatu

I applaud the effort but in the demo I am unsure how this could be used as a CAPTCHA. The examples I saw could be trivially solved with a bot running a simple CNN image classifier model.

The training data is available (existing OSM building outlines and satellite data scraped from Google), and training an image classifier would be very straightforward. I am sure a bot would have a much higher success rate than a person.

berkes

You sound like you know it to be trivial to build this. If you can build it (which I doubt), please contribute to OSM by getting involved with e.g. HOT OSM. https://www.hotosm.org/get-involved

And I doubt it is that easy, because smart people, at a.o. HOT OSM, have been building exactly this: tools to automatically categorize, detect, rate etc, mapping data from satellite imagery. Not to bypass a CAPTCHA, but to make editing and improving OSM data easier.

They then concluded that some problems there are hard to solve with AI, and e.g. need better and more training data. If it's hard to solve with AI, then building that CAPTCHA bypass bot is probably just as hard if not harder.

teruakohatu

> And I doubt it is that easy, because smart people, at a.o. HOT OSM, have been building exactly this: tools to automatically categorize, detect, rate etc, mapping data from satellite imagery. Not to bypass a CAPTCHA, but to make editing and improving OSM data easier.

I apologise if I caused any offence. I certainly didn’t intend to.

I claimed building a bot to bypass the captcha was trivial. To bypass a captcha, a bot need only perform better than a human doing the same task on average. Because humans make mistakes, captchas also require a margin of error.
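To put rough numbers on the margin-of-error point, here is a minimal sketch. It assumes a hypothetical challenge of `n_images` outlines that is passed with at least `min_correct` right answers, and that each answer is independently correct with probability `accuracy`. None of these parameters come from MapTCHA; they are illustrative only.

```python
from math import comb


def pass_probability(accuracy: float, n_images: int, min_correct: int) -> float:
    """Chance of getting at least `min_correct` of `n_images` answers right.

    Assumes (hypothetically) that answers are independent and each one is
    correct with probability `accuracy` (a simple binomial model).
    """
    return sum(
        comb(n_images, k) * accuracy**k * (1 - accuracy) ** (n_images - k)
        for k in range(min_correct, n_images + 1)
    )


# Hypothetical accuracies: a typical human vs. a slightly better classifier.
human = pass_probability(accuracy=0.85, n_images=5, min_correct=4)
bot = pass_probability(accuracy=0.90, n_images=5, min_correct=4)
print(f"human pass rate: {human:.3f}, bot pass rate: {bot:.3f}")
```

Under this model, any tolerance loose enough to let most humans through also lets through a classifier that is only slightly more accurate than a human.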

A bypass bot does not need to be able to annotate all buildings on the earth perfectly.

OSM's problem is not that ML cannot solve the problem, but rather that they would like the result to be close to perfect and to apply it at scale to the entire earth. This is a harder problem than detecting errors in a handful of images.

There is a lot of academic literature on the subject of “building footprint extraction” if you are interested:

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=buil...

phoronixrly

It can also be solved by a captcha-solving farm, as can recaptcha. What is your point?

alex_duf

Also, if a captcha-solving farm ends up working on a project that is benefiting the world, it's still a net improvement over clicking a box that says "I'm not a robot", and an improvement over exclusively helping Google with its OCR and image recognition.

remram

The question you have to ask is how inaccurate the data can be while passing the CAPTCHA challenge. The people trying to pass it at scale using bots or farms don't care to pass it correctly.

Eduard

Playing with the demo at https://maptcha.crown-shy.com/ , I had problems recognizing the outlines, as they were often colored similarly to the structure they outlined (low contrast). In some examples I couldn't find any outline at all.

Also, seeing a rectangular outlined thing from above, it's often impossible to tell if it qualifies as a building.

gregoriol

Same feeling here about the demo. Sometimes there is no red shape, so should I click correct or incorrect? Sometimes the red shape covers two different styles of roof (I don't know if it is one or two buildings); is this correct or incorrect? Sometimes the shape covers a very small thing, which could be a phone box or kiosk, and it's not clear compared to other buildings around; is this correct or incorrect?

I really hope such a project can thrive and be useful though, will check it again after some time!

morder

I really need the ability to zoom in or have it larger. The default size (even with my browser default at 130%) is way too small for me to make anything out.

sudahtigabulan

I think it would help if they used a dashed line. Or another pattern that is unlikely to occur naturally.

Also, they use red for the outlines... (Red-green color blindness is the most common form.)

xvilka

Another project that would certainly benefit from crowdsourcing is OpenMetroMaps[1]. The lack of a good universal public transportation website and mobile app is disheartening. I know there is Transportr[2][3], but its coverage is still pretty limited.

[1] https://github.com/OpenMetroMaps

[2] https://transportr.app/

[3] https://github.com/grote/Transportr

jazzyjackson

I was very pleased to find OpenRailwayMap[0] includes the passenger network of Morocco, which Google and Apple lack.

[0] https://www.openrailwaymap.org/

pietervdvn

OpenRailwayMap is OpenStreetMap data; it just highlights the railway data and hides the other data.

thepuppet33r

This is amazing. Much prefer this to helping train autonomous vehicles.

poisonborz

You mean killer drones

https://www.vegard.net/google-helps-pentagon-train-killer-dr...

progval

There is no evidence in the article that it does. "Google is providing Pentagon with access to TensorFlow" means nothing because TensorFlow is open source.

mcv

Great, now I can see a "select all images that contain a <specified ethnic group>" reCaptcha.

memsom

My issue isn't that it does or does not highlight an area, it is that sometimes it highlights an area that is not a building and the instructions are a bit vague as to whether this is "correct" or not. One example: I see it highlighting someone's back yard with a swimming pool. It is not a building. Is this correct or not? I mean, it has identified the shape perfectly, but it is not a building. I feel like lazy people clicking through a CAPTCHA is not the place to make important decisions on OpenStreetMap map-level features without a lot of filtering of the results.

Mayzie

I had a go, but there were a few I wasn't really sure about. To the creator, can you add a "Not sure" option?

mcv

I think they recognize the "not sure" by various users giving different answers.
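A minimal sketch of how disagreement between users could stand in for an explicit "not sure" button. This is not the project's actual method; the function name, vote format and thresholds are all hypothetical.

```python
from collections import Counter


def classify_outline(votes, min_votes=5, min_agreement=0.8):
    """Turn individual correct/incorrect votes into a label for one outline.

    `votes` is a list of booleans (True = the user swiped "correct").
    `min_votes` and `min_agreement` are illustrative thresholds, not values
    taken from MapTCHA.
    """
    if len(votes) < min_votes:
        return "needs_more_votes"
    # Find the majority answer and how many users gave it.
    label, top_count = Counter(votes).most_common(1)[0]
    if top_count / len(votes) < min_agreement:
        # Users disagree: treat the split itself as the "not sure" signal.
        return "uncertain"
    return "correct" if label else "incorrect"


print(classify_outline([True, True, True, True, True]))
print(classify_outline([True, True, True, False, False]))
```

An outline that keeps coming back as "uncertain" could then be routed to an experienced mapper instead of being accepted or rejected automatically.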

barbazoo

I’m not sure how effective captchas really are at filtering out bots. But if they do work, I’d much rather have my efforts contribute to a public good like OSM rather than feeding data to Google, which seems to be the default these days.

If anyone from OSM is listening, it would be great to have a way to flag malicious uses of captchas, like in phishing attempts. The existing captcha platforms make this very hard.

cm2187

Mostly they were created on the absolutely wrong assumption that they wouldn't annoy the users as fuck. Unless I badly need to get in, I tend to browse away when prompted for a captcha. The ones I hate the most are those where you are supposed to click on the part of the picture that contains a motorcycle, and like are you supposed to click if the motorcycle overlaps by 2 pixels, are you supposed to click on the passenger, and if it's a traffic light, is the pole part of the traffic light, etc?

This is as user-hostile as it gets (and of course always combined with a GDPR pop-up followed by a subscribe-to-email pop-up which overlaps with the please-login pop-up).

j-bos

> are you supposed to click if the motorcycle overlaps by 2 pixels, are you supposed to click on the passenger, and if it's a traffic light, is the pole part of the traffic light

100%, I have never passed those captcha, guess I'm really not human.

lupusreal

Google and Cloudflare's captchas are the worst because if they have previously decided they don't like your IP, if your browser has privacy features enabled, or a few other factors, they will never let you past the captcha no matter how correctly you answer the prompts. But instead of just telling you "we simply don't trust you, go away" they'll let you attempt the captchas as long as you like, rejecting your answers every single time.

It's abhorrent. Lying to users, gas-lighting them into thinking they weren't answering correctly and need to try harder when in fact no answer will ever be accepted. Ostensibly meant to be a system which prevents automatic systems from abusing resources meant for people, it becomes an automatic system abusing people. This bullshit should be illegal. If they want to turn people away for defying their surveillance apparatus they should do that upfront, without the inhumane deception.

raybb

I'm not from OSM but could you say more about malicious uses of captchas or how it's related to phishing?

bradleyjkemp

It's really common now for phishing kits to use interstitial pages that require solving a captcha before the actual phishing content is shown.

Victims just click through the captcha without thinking, but it makes automatic verdicting by security scanners a pain because they just see a captcha page: they can't tell the brand being impersonated, or even if it's a phishing site.

I wrote a post about a number of these which actually pretend to be Cloudflare! https://phish.report/blog/fake-cloudflare-interstitials

barbazoo

Interesting! What I was thinking of was use of legitimate captcha integrations (reCAPTCHA, hCaptcha) in front of fake banking websites. Drives me crazy that there isn't an easy avenue to report those.

glaucon

As I said elsewhere, I applaud the idea. A couple of comments:

1. On Firefox on Ubuntu the text is _far_ too faint, to the extent the initial text is essentially unreadable.

2. I was getting brown coloured areas surrounded by a slightly different brown colour. The outlined area could have been a roof or just some slight change in land colour. I understand this is why help is needed to classify them, but it felt wrong clicking either "Correct" or "Incorrect" ... maybe with enough input it doesn't matter? Not sure.

talkingtab

A sideways look. The predominant business model for the internet is tracking users (for what is often called "advertising"). This is because the value of each internet transaction is so tiny that there is no other way to charge that amount. But here is another business model. The "fee" is proof of work, as in helping make the product better. Cool.

karel-3d

Can the demo write "successful" or "unsuccessful"? :) It's hard to know how well it works otherwise

johnisgood

> I have identified features (like buildings) from satellite imagery before

Exactly, how would I know this? I actually have no idea as there is no feedback at all.
