A glitch in an online survey replaced the word 'yes' with 'forks'

wat10000

Maybe twenty years ago, Google Translate had a fun bug with the word "amistad." In Spanish->English mode, it would correctly translate this to "friendship." If you added an exclamation mark, it would translate with the exclamation mark, so "amistad!" became "friendship!" If you added more, it would add to the translation, so "amistad!!!" became "friendship!!!" Except if you used exactly five exclamation marks, no more and no less, it would translate "amistad!!!!!" to "murder!"

tgv

Probably around that time, Google once translated "Berlin" with "Paris" or something like that. IIRC, it was in a sentence about head offices. The training materials consisted of parallel documents, including manuals, where the German document would say "contact our head office in Berlin", and the French one "contact our head office in Paris." So it learned to translate capitals in certain contexts.
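
To illustrate how that kind of corpus can teach a system that "Berlin" means "Paris", here is a minimal sketch of a co-occurrence lexicon scored with the Dice coefficient. The toy corpus, `PARALLEL`, `build_counts` and `best_translation` are all hypothetical; this is not how Google Translate actually worked, only a small model of the failure mode.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus: German manuals naming the Berlin office,
# aligned with French manuals naming the Paris office.
PARALLEL = [
    ("unsere zentrale in berlin", "notre siège à paris"),
    ("unsere zentrale", "notre siège"),
    ("in berlin", "à paris"),
    ("unsere fabrik in berlin", "notre usine à paris"),
    ("in hamburg", "à hambourg"),
]

def build_counts(pairs):
    """Count per-sentence occurrences and source/target co-occurrences."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return src_count, tgt_count, joint

def best_translation(word, pairs=PARALLEL):
    """Pick the target word with the highest Dice score for `word`."""
    src_count, tgt_count, joint = build_counts(pairs)

    def dice(t):
        return 2 * joint[(word, t)] / (src_count[word] + tgt_count[t])

    return max(tgt_count, key=dice)

print(best_translation("berlin"))  # the data never pairs "berlin" with "berlin"
```

Because "berlin" and "paris" always appear in the same aligned sentences, the statistics have no way to tell them apart from a genuine translation pair.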

yongjik

Similarly, there was a time when Google translated "Japan" in Japanese to "Korea" in Korean... because documents scraped from the web would say "Japan and Korea" in Japanese and "Korea and Japan" in Korean.

cperciva

Common with international treaties. The trade agreement known as USMCA in the USA is known as CUSMA in Canada and T-MEC (Treaty between Mexico, USA, and Canada) in Mexico. Countries almost always put themselves first.

ben_w

I've seen something similar, briefly and ages ago now, with it doing the equivalent of* en:"I speak English" -> fr:"Je parle français".

* It might not have been French, I remember the effect not the detail

carlmr

I really like this one, it basically learned that "first country" is "this country", and "second country" is the "other country".

Heliodex

An interesting occurrence I've seen is with flag emojis, for example translating some English text with an England/UK/US flag after it will sometimes turn the flag into that of a different country where the language of the translated text is spoken.

erehweb

As Ionesco may have said, the French for London is Paris.

01HNNWZ0MV43FF

"I will proceed to type the characters _e_ and _gu_"

https://www.youtube.com/watch?v=3-rfBsWmo0M

I never found out what a DECEARING EGG is

jbarbs

As a native Spanish speaker it took me a moment to understand why "yes" was being translated as "forks" until it clicked and it's not an error.

In Spanish, "ye" is what we call the letter Y, and "la ye" is a word used, at least in the version of Spanish spoken where I come from, to refer to the place where a road forks. Hence the fork in the road is "la ye" and the plural would be "las yes", or "the forks". In this context "forks" refers to where the road forks, not to the eating utensil (which would be "tenedores").

soyyo

I am also a native Spanish speaker and I have never heard this. I have always called the letter Y "i griega" (Greek i), and so has every other Spanish speaker I've heard.

Not saying that this is wrong; in fact one can check the Spanish Wikipedia to confirm that "ye" is a valid name for Y. But it's definitely not used where I live, neither for the letter nor for a fork in the road.

justusthane

Wonderful explanation, thank you!

baggy_trough

Amazing!

miki123211

I think I have a better one.

I was trying to download some software from a Japanese website (this was about 10 yrs ago I think). There was an entire survey you had to fill out first. Since I speak no Japanese, I naturally used Google Translate (to Polish, not even to English).

That survey had a "gender" field, and the options were something like "person" and "woman".

Then there are the automatic (mis)translations of button labels and other similar strings in software, where the translating tool often only has a single and ambiguous word to go on, with no context whatsoever.

The funniest ones I've seen are "branch" (satellite office) versus "branch" (of a tree), "book" (a flight) versus "book" (something you read), "character" (ASCII or Unicode) versus "character" (in a story), "clear" (to remove all) versus "clear" (translated as "cloudless", referring to the sky), "letters" (delivered by a postman) versus "letters" (and not digits), "rate" (how fast, i.e. speech rate) versus "rate" (how much you charge per minute), "prune" (remove all) versus "prune" (dried plum), "manual" (and not automatic) versus "manual" (a user manual), "clubs" (places you go to) versus "clubs" (and not diamonds or spades; "queen of clubs" has a particularly interesting meaning here), "number" (that you call) versus "number" (of things), "at" (@) versus "at" (home), "tab" (key) versus "tab" (in a browser), "close" (to me) versus "close" (something), "back" (button) versus "back" (a body part), as well as all the "party" (adventuring team in an RPG) versus "party" (legal entity, i.e. "third party") versus "party" (as in an event you have fun at) shenanigans.

Mistakes like these often hide in accessibility labels, and are hence far more obvious to screen reader users. In normal UI, they're usually found quickly enough that users never notice, but accessibility labels are often overlooked when testing translations.
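
One common mitigation is to key each string by a context as well as by its source text, which is roughly what gettext's `msgctxt` does. A minimal sketch, assuming a hypothetical in-memory German catalog (`CATALOG_DE`, the context labels and `translate` are made up for illustration and are not any real tool's API):

```python
# Hypothetical catalog: German strings keyed by (context, English source).
CATALOG_DE = {
    ("verb: reserve a flight", "book"): "buchen",
    ("noun: something you read", "book"): "Buch",
    ("adjective: nearby", "close"): "nah",
    ("verb: shut a window", "close"): "schließen",
}

def translate(context, text, catalog=CATALOG_DE):
    """Look up `text` in the given context.

    Falls back to the untranslated source string when no entry matches,
    so a missing context is visible instead of silently mistranslated.
    """
    return catalog.get((context, text), text)

print(translate("verb: reserve a flight", "book"))
print(translate("noun: something you read", "book"))
```

The same English word maps to different entries, so the translator (human or machine) never has to guess from a bare "book".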

pjc50

There was a fun one going around where the title of the game "Watchdogs" had turned into Norwegian for "look at dogs".

Re: gender, there's all sorts of chaos going between languages which have no grammatical gender (CJK languages, Finnish etc) and those which have mandatory grammatical gender (Latin languages), because the information may simply not be available in the same sentence. So you get obvious misgenderings of "(female name) .. he" or vice versa.

miki123211

And then there's the "man" (not a female) versus "man" (as in "human", i.e. "first man on the moon" / "all men are created equal").

climb_stealth

Hah! I fondly remember some Linux distro that had the Desktop translated to a literal table top.

Translating without context must be really tough. Especially with a language like English that is utterly ambiguous.

mook

In Chinese Windows, Desktop has been officially translated as literal table top since at least Windows 95. I don't see the issue here XD

once_inc

I've had a similar experience where Dutch surnames were translated from English to Dutch by Excel for some reason. Since many Dutchmen have a surname prefixed with the Dutch word "van" (which means "of"), Excel dutifully translated it to "Busje", which meant that many of our clients suddenly were called "Lieke Busje Lexmond" or "Vincent Busje Gogh".

It got a chuckle from our marketing department which caught the error before badges were printed for the very high-profile event we had planned for the next day.

kimos

My favourite part of the Dutch language is how it adds the -je diminutive to other words to make new words. A van? That’s just like a cute baby bus.

bigiain

This made me wonder what a poffert is, if a poffertje is a cute baby version of them. So I Googled it. Sure enough, it's a cake.

qsi

That would be a poffer, as the interstitial t is added to words ending in r when forming the diminutive.

And no, I didn't know what a poffer was either. :)

dreghgh

French localisation of OnlyFans used to translate the word 'tip' (quite essential to the OF experience) both as 'bout', 'tip' in the sense of 'end', and as 'astuce', 'tip' in the sense of 'piece of advice'.

bondarchuk

Must've been high profile indeed if both Vincent van Gogh and Lieke van Lexmond were there

netsharc

A colleague of mine got a Christmas card from Microsoft addressed to "New Year's Eve"...

Hi Silvester!

frnx

I'm wondering where the "forks" translation came from in the first place. Google Translate used to be fairly reliable for simple translations, but I've seen several examples in the last couple of years where it goes batshit crazy, including starting to loop hallucinating sentences on repeat. Is absolutely nobody checking how well it performs before deploying nowadays?

o11c

Probably, interpreting "Yes" as the plural of "Y-junction of a path".

shadowgovt

If you have a plan for automatically checking the output of Google Translate against all possible character input strings, be sure to mention it to your recruiter as part of the hiring process.

More seriously: Google Translate's bread-and-butter data source is documents that were human-translated from one language to another with high reliability (such as UN publications). That turns out to work remarkably well for building a neural net that can extrapolate how one sentence should translate to the same sentence in another language. But like most neural networks, it's vulnerable to garbage-in, garbage-out: much like you can get an animal detector to hallucinate "zebra!" if you feed it a noise-pattern as input, if you feed it character sequences that aren't actually words in the input language, it'll try to extrapolate what reality should be between all the corpus it's seen and you'll get garbage on the output side.

Since the tool doesn't actually know what words mean, it has no way, at present, to know "Yes" isn't a Spanish word (and as other commenters have mentioned, it may actually be "a Spanish word" in one weird context in one weird document somewhere in the corpus of all translated documents accessible from the Internet... Or some doc somewhere contains a close-enough typo in the Spanish input document that is over-reflected in the output because no other document contradicts the typo's apparent translation).

cryptonector

There is no "yes" in Spanish.

Reminds me of Inodoro Pereyra, a comic strip character in Argentina who was a peon in the countryside and rather ignorant, and he'd sometimes respond affirmatively with "yes como dijo un tal Chespier" ("Yes, as some Shaespier once said").

doubletwoyou

really? i thought that was a latin problem (sic)

doesn’t spanish have sí? or is it something like portuguese where the verb conjugated to an affirmation is preferred over something like sí?

pilaf

OP meant that "yes" is not a word in Spanish. The word "sí" is indeed the affirmative and it's used mostly the same as yes in English.

cryptonector

The comic strip character was acting all knowledgeable by quoting Shakespeare as saying "yes" when the character meant "sí", but misspelling Shakespeare's name as "Chespier", something like misspelling it "Shaespier" in English.

ben_w

For a while now, I've had the idea that some malicious person could have a browser extension that detects and modifies news sources to spread propaganda. We've already got joke versions of this, s/{some annoying topic}/{joke} — and for the moment, this is still in the "ha ha what a silly bug" domain.

One day, I expect it won't be silly. It'll be some more subtle transformation that rewrites one party, or one person, as having constantly sinister motives.

Will we even hear about it, if such a thing comes our way?

Dylan16807

"our vendor now matches the lightbox scripting to the language of the text on the webpage"

"The auto-translate pop-up may still be triggered on occasion, but the HTML in the survey wrapper prevents it from changing the content on the webpage."

I have no idea what either of these sentences mean, and they're both very important to the fix.

jhy

My guess (it would be nice if they actually said...) is that they were missing the required lang attribute on their HTML.

  <html lang=en>

If it is not defined, the language defaults to unknown (not to the user's locale), so Chrome has to guess. And there wasn't much text in the lightbox (which might be a different page?) for the browser to infer from.
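
A quick way to test this guess on a saved copy of a page is to look for the attribute directly. A minimal sketch using only the Python standard library; `LangFinder` and `declared_language` are hypothetical names, and this only inspects the root `<html>` tag, not HTTP headers or nested elements:

```python
from html.parser import HTMLParser

class LangFinder(HTMLParser):
    """Record the lang attribute of the first <html> start tag, if any."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            self.lang = dict(attrs).get("lang")

def declared_language(markup):
    """Return the declared document language, or None if undeclared."""
    finder = LangFinder()
    finder.feed(markup)
    return finder.lang

print(declared_language("<html lang=en><body>Yes</body></html>"))
print(declared_language("<html><body>Yes</body></html>"))
```

A result of `None` means the browser is left to infer the language from the text alone.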

N19PEDL2

That's probably true, however I'd be really curious to know why Chrome's guess for "yes" is the Spanish word for "Y-junctions" instead of the English word yes.

jasonjmcghee

This was my immediate thought, but it doesn't sound like what they did. They also mention they still get the Google translate pop up - which suggests they didn't.

Though it sounds like they serve many languages, so they'd need to do each survey individually.

Maybe the survey part is an iframe?

lmz

#1 is probably to subset the loaded lightbox text-localization files to only the survey language. And #2 is to use the translate=no HTML attribute (or its predecessor) to disable translation of that section.
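
For #2, a minimal sketch of what generating such markup might look like. The helper name `no_translate_span` is hypothetical, and pairing `translate="no"` with `class="notranslate"` is my assumption about which "predecessor" is meant; nothing here is confirmed as what the survey vendor actually did.

```python
from html import escape

def no_translate_span(text):
    """Wrap text in a span that asks translation tools to leave it alone.

    translate="no" is the standard HTML attribute; class="notranslate"
    is included as an assumed older convention for the same request.
    """
    return f'<span translate="no" class="notranslate">{escape(text)}</span>'

print(no_translate_span("Yes"))
```

Translation tools are free to ignore the hint, so this reduces the risk rather than eliminating it.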

jwrallie

I also got disappointed when they skipped how they actually detected it with just:

> With a little more experimenting, we were able to identify translation as the root of the issue.

Maybe it’s just me, but I was really curious as to what finally pointed to the issue, and all I got was “experiments”.

kazinator

Perhaps the article needs to be read as if prefixed with:

"I was asked to write the following explanation for the public, to put on our website, and talked to the programmers. I have no idea what they were saying. I took some notes, which likewise mean nothing to me, but here goes ..."

mschuster91

> For some respondents, this prompted their browser to believe our survey was written in a language other than English (even though, again, it was in English) and ask if they wanted the page to be translated to English – or, we think, automatically try to translate the page to English.

I goddamn hate this "feature" so much. Especially since it sometimes resets and then I have to find out where the fuck Google moved the disable button to again.

No Google, I speak fluent German and English and can reasonably read Croatian - if I wanted a translation I would explicitly ask for it myself thanks.

netsharc

You're going to love YouTube automatically translating video titles to your geo-ip language...

For the next project-manager-promotion-worthy feature, they also do auto-dubbing of the speech now!

mook

The best part is YouTube has a bug that would sometimes force a dubbed language; as in, a video originally with an English audio track would get stuck between French and Italian even when you try to manually change the language back to English.

fsh

They now sometimes dub the audio track. The result is about as horrible as one would imagine. Whoever decided to turn this on by default clearly didn't give a damn.

jack1243star

> You're going to love YouTube automatically translating video titles to your geo-ip language...

This is the worst feature of YouTube for me. Any idea how I can turn this off and show the original language title?

vintermann

It's not automatic, channels opt into it. It's in fact a great way to identify spammy channels, because

1. They know YouTube's algorithm rewards them for it

2. They know it makes their stuff objectively worse

and the only channels who know that and still opt into it, are channels which have no shame and are pure view-bait trash.

So when I see a bork-Norwegian video title like "Du gjetter aldri hva som skjedde neste!" I hit "never recommend this channel".

vintermann

I have a standard message for people who use machine translation on their "content":

"I know that machine translation exists. I know where to find it, should I need it. When you push it without asking, half the time I have to translate back from machine-junk to your language, in order to figure out what the hell you were trying to say. You make more work for me, not less. MACHINE TRANSLATION SHOULD ALWAYS BE OPT-IN."

cdelsolar

I run an app (Aerolith.org) for studying Scrabble words. For years I would get weird bug reports (https://github.com/domino14/Webolith/issues/331), where at the end of each round, when a user viewed the definitions, the words for definitions would get randomly replaced with other unrelated words. I double and triple checked the code; since I had recently moved the word-related logic to a Go microservice I assumed I had some crazy race condition. I remember trying to replicate it with many simultaneous requests, looking for memory leaks, thinking there was something wrong with the sqlite driver, etc.

Finally, I figured out that it was a Google Chrome auto-translation issue. For example the word JIBER was replaced with "BECAUSE OF" because JIBER means BECAUSE OF in Kurdish. There were many other similar ridiculous cases.

shadowgovt

There is a way to explicitly specify the language of the document in the HTML metadata: `<html lang="en">`.

Google doesn't guess if it has strong signal on what the language is.

kgeist

Once, a customer complained that we had corrupted their documents on our platform. Seemingly out of nowhere, words like "Messi" and "Zidane" started appearing countless times in the titles and content of their documents - overnight. It was so bizarre and random. We eventually found out they had a broken browser extension (something similar to Google Translate, I don't remember).

I was only able to find a single instance of such a bug elsewhere, on a Microsoft support forum — the guy was furious that Outlook would insert 'Messi' and 'Zidane' into his emails every time he tried to send them. Something very specific seemed to trigger it.

dylan604

> Something very specific seemed to trigger it.

Being not a fan of La Liga???

withinrafael

Windows recently had [Compress to postcode] in its context menu for en-GB users.

epistasis

Oh this is absolutely delightful, both in how complex the bug is and the actual result.