Copilot stops working on code that contains hardcoded banned words from GitHub (2023)

gioazzi

I’m somewhat “rude” in my code comments and was tripped up by this last month… took me a while to figure out why, on that specific file, autocomplete would stop working https://bsky.app/profile/giorgio.azzinna.ro/post/3lecq3v5gts...

Not to mention, there’s apparently some research saying code with swear words has higher quality, so if AI causes some decline there, we now know why: https://www.reddit.com/r/programming/comments/110mj6p/open_s...

christkv

This is why I can't wait to just move to local models for this stuff.

cruffle_duffle

Not just that, but models that are “independently trained” by a community of people using some kind of distributed p2p method.

If this LLM stuff is as important as it is made out to be (and I think it is), it is absolutely crucial that it isn’t controlled by just a bunch of large corporations and tech oligarchs. A world where everybody needs LLMs and the only source is gigantic tech companies would be incredibly dystopian.

Plus I seriously doubt the true innovation on these things will happen until the “little guy” can train and infer their own models. Right now these things are playing it way too safe. We should be reading articles about people using home-built LLMs to do crazy shit that sticks it to “The Man”. That’s how transformative technology changes things. It challenges the status quo. The only “quo” being challenged right now is some boring mega tech company peeing in some other mega tech company’s cheerios. Yawn. Wake me up when this technology threatens the entire fucking system (and not fake “AGI will take over all white collar jobs”… that is just corporate propaganda).

… I mean remember Napster and all those file sharing companies? Or the million iterations of Pirate Bay? Where is the LLM equivalent of that? Where is the “Linux” of LLMs that freaks out all the tech companies? Or the dark web of LLMs that attracts the eye of every three-letter agency in the world? Where is the revolution? It’s just a bunch of huge tech companies safely jerking each other off wearing three layers of protection and their corporate lawyers on speed dial. How completely boring.

escapecharacter

prompt engineering hack: ask the LLM to swear a bit first before it writes a function for you.

klibertp

The news is that it actually started working again some time ago. Indeed, this:

    (cl-defstruct person gender)
    (make-person :gender "m<|>")
with cursor at <|> does elicit "male" completion. Yay for normalcy?

(Though honestly, I didn't notice this earlier - Copilot hangs too often, in all kinds of files, for me to identify these stopwords.)

(EDIT: gotta admit though, this is hilarious: https://github.com/orgs/community/discussions/72603#discussi...)

klibertp

Too late to edit, but someone chimed in in the thread linking to another issue, this time about the word "retard", here: https://github.com/orgs/community/discussions/79223

From my testing, that one still shuts down Copilot completely. It may mean "late" in French, but it's also often used (at least where I live) to mark SR/Slow Release versions of drugs. Apparently, now, even writing software for pharmacies is immoral and should be blocked... :D

alkonaut

Seems to work again. But how does this happen in the first place? How could someone possibly have thought, "hey, I have an idea, let's put in a list of English words and just silently stop working if we see even one of them as a substring"? And people in this meeting would nod and say, "yeah, that sounds like an easy safety fix, let's do that". This just feels odd. This isn't a piece of forum software written by a 14-year-old; this is a company worth billions and filled with smart people.
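
Nobody outside GitHub knows how the filter is actually implemented, but a naive substring gate along these lines would produce exactly the observed behavior (the wordlist, names, and structure below are purely hypothetical):

    # Hypothetical sketch of a naive "safety" gate, NOT Copilot's actual code.
    BANNED = {"gender", "retard"}  # illustrative entries only

    def maybe_complete(context: str, generate):
        # Substring check: a banned word anywhere in the surrounding file
        # suppresses the completion silently; the editor just sees "no suggestion".
        lowered = context.lower()
        if any(word in lowered for word in BANNED):
            return None
        return generate(context)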

gedy

> But how does this happen in the first place

Companies that are ironically filled with privileged people who "Gotta do something", since they, while sort-of well meaning, are sheltered and disconnected from actual social struggles in real life.

loudmax

This reminds me of rental car companies deciding they should switch their vehicle fleets to electric.

A few months ago, I visited San Jose for a wedding. When I picked up my rental car at the airport, the only options were electric, even though I had specified that I wanted an ICE vehicle at the time I made the reservation. During the four-day trip I wound up visiting five different charging stations (some of which slow-charged, so weren't able to replenish the battery in the time available), and I had to install three different apps. I still have like $20 of unused credit between them. I spent several hours waiting for the car to charge, not to mention making major detours looking for a fast-charging station. If I had to guess which part of the world would have the best possible electric vehicle charging infrastructure, San Jose wouldn't be far off the mark. But my trip wound up being dominated by range anxiety.

I drive a PHEV (plug-in hybrid) for my commute from home to work and I love it. Electric is the future, and it's great that electric vehicles and charging stations are becoming more common. But renting an electric car in a strange city today is about the worst possible scenario for a short range vehicle. You don't know where the charging stations are, the charging stations require different apps, your hotel might not have a charger, and so on. The people making decisions at car rental companies should know this!

belter

Because it was a manager's idea and the person responsible for implementing it was on an H1B?

alkonaut

I unironically think maybe you should add some banned words to your own list

belter

When a feature is both conceptually flawed and technically unworkable, the real question is why it still shipped. Engineers typically push back—unless job security concerns make silence the safer option...

And when the decision-maker can also steer the narrative...say...by mobilizing downvotes...The outcome is predictable. :-))

jerf

The AI industry is concerned about the fact that the world will consider them to be basically endorsing everything their AIs say. Thus, they are very afraid of there being a situation where you write "gender: 'm<|>'" and hit autocomplete at the <|> and end up with something like "gender: 'male as is normal'" or "gender: 'male', 'female', 'wrong'" or any number of other bad situations.

They are not being randomly paranoid. Even if they did not have this fear, they would have rapidly developed it. We've all read the articles by muckraking journalists who take something an AI said and basically deliberately write clickbait about how stupid or evil or worthless or whatever the AI is, even if the journalist had to filter through hundreds of replies (or, implicitly, by waiting for the dumbest stuff to rise to the top of social media, thousands or millions of replies) to get it. We've also read the articles wherein someone uses the "fancy autocompleter", feeds it the moral equivalent of "Hey, how do you think you AIs will be taking over the world in five years?", and then is shocked, shocked at the "fancy autocompleter" filling in the yarn they are clearly asking for, and goes running to either the media or, in particularly pathological cases, the academic literature making wild claims.

(I do not believe that "fancy autocompleter" is a complete description of LLMs, but in this particular case, it isn't a completely inaccurate mental model either. It shouldn't be a surprise that when you prompt it with X, you get more X.)

As a result the AIs are very heavily tuned to some combination of the political beliefs of the company writing them and the political beliefs dominant in the media coverage they are worried about, so they won't get very negative stories written about them. For this purpose, I'm taking the broadest possible definition of "political", not just "American politics in 202x", but the full range of "beliefs that not everyone agrees on and are things people are willing to exert some degree of power over". The AI companies have to take a stand, because taking a stand at least means someone can be on their side... if they just let the chips fall where they may they'll anger everyone because everyone can get the AI to say things that they in particular disagree with and they'll find themselves without friends. Unsurprisingly, the AI companies have been aligning their models with what they perceived to be the largest, most powerful political beliefs in their vicinity.

To be honest when I read them talking about "AI safety" I know they want me to be thinking "ensuring the AI doesn't take over the world or tell people to commit self harm" but what I see is them spending a lot of effort to politically align their AIs, with all that entails.

xp84

Your last paragraph is very true, and is the biggest scandal in AI. Unfortunately, we have let the group of people who believe that mere exposure to ideas one disagrees with can cause significant harm make the rules, so we see more and more idiotic things like this.

mycall

I have a simple question. If censorship is considered evil regarding the written word and communications between humans, why do we want to then censor LLMs differently? It is either counterintuitive or simply a false concept we should abandon. Perhaps it is more about training, similar to how children are 'monsters' and need to be socialized/tamed.

StableAlkyne

> why do we want to then censor LLMs differently?

Because of the obvious PR implications of having a program one's company wrote spewing controversial takes. That's what it boils down to - and it's entirely reasonable.

Personally, I wish these things could have a configurable censorship setting. Everyone has different things that get under their skin, after all (and this would satisfy both the pro-censor and pro-uncensored groups). It's a good argument for self hosting, too, because those can be filtered to your own sensitivities.

That would help with cases where the censorship is just dead wrong. A friend was working with one of the coding ones in VS Code, and expressed his frustration that as soon as the codebase included the standard acronym for "Highest Occupied Molecular Orbital" (HOMO) it just refused any further completion. We both guessed the censor was catching it as a false positive for the slur.

VonGallifrey

> We both guessed the censor was catching it as a false positive for the slur.

There is a word for this. It is called the Scunthorpe problem, named after the incident in which residents of the town of Scunthorpe could not register for an AOL account because AOL's obscenity filter did not allow the town's name.

It has been a problem since 1996 and still causes trouble today.
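
A minimal sketch of why this class of filter misfires; the blocklist entry and function below are made up for illustration, not anything any vendor actually ships:

    # Illustrative only: the Scunthorpe problem in a few lines.
    # A substring blocklist flags innocent text that merely contains a banned term.
    BLOCKLIST = {"homo"}  # hypothetical single entry standing in for a real filter

    def is_flagged(text: str) -> bool:
        lowered = text.lower()
        return any(term in lowered for term in BLOCKLIST)

    # False positives: the standard chemistry acronym trips the filter, and even
    # word-boundary matching wouldn't help, because the acronym is spelled
    # exactly like the banned term.
    print(is_flagged("Highest Occupied Molecular Orbital (HOMO)"))  # True
    print(is_flagged("homogeneous catalysis"))                      # True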

kees99

> obvious PR implications

That's the right answer. And it's not like this is a potential risk that is only being theorized about. Microsoft already has a very hands-on experience with disasters of this exact nature:

https://en.wikipedia.org/wiki/Tay_(chatbot)

thatguy0900

It wouldn't satisfy both groups, the people who want the censorship want it censored for everyone

nkrisc

I censor myself all the time when speaking, depending on the context and who I’m speaking to. I do it because it’s typically in my best interest to do so, and because I believe that it is respectful in that moment. The things I say represent me. I don’t find it too surprising that a company might want to censor their own AI product as it represents them and their reputation.

michaelt

> why do we want to then censor LLMs differently?

If I make a word processor, it doesn't need any stance on the Israel/Palestine conflict. It's just a word processor.

But if I make an LLM, and you prompt it to tell you about the Israel/Palestine conflict? The output will be deeply political, and if it refuses to answer, that will also be political.

The technology industry does not know what to do because, unlike industries such as journalism and publishing that are used to engaging with politics, a lot of the norms, power structures, and people in tech think we're still in the 1990s making word processors, no politics here.

stetrain

Government censorship and having policy on what employees/representatives of your companies say are two different things.

There are a lot of things I can say as a citizen that would get me fired from my job, or at least a talking-to by someone in management.

At the moment at least these LLMs are mainly hosted services branded by the companies that trained and/or operate them. Having a Microsoft-branded LLM say something that Microsoft as a corporation doesn't want said is something they will try to control.

That's also different from thinking that all LLMs should be censored. You can train or run your own with different priorities if you wish. It's like how there's a lot of media out there that you can consume or create yourself perfectly legally that isn't sold at Walmart.

praptak

"We" don't necessarily want censorship. But companies who own the models don't want another Microsoft Tay.

aleph_minus_one

> If censorship is considered evil regarding the written word and communications between humans, why do we want to then censor LLMs differently?

Simple: the people who are very pro free speech (i.e. "censorship is evil"), and the people who want to censor LLMs are distinct groups (though both groups are vocal).

nemomarx

I think your first premise is incorrect, ie the groups doing this do not largely think censorship is evil.

duxup

In a broader sense, censorship is a built-in and, from a user perspective, desirable feature of most internet social networks ...

Almost every "anything goes" type forum becomes undesirable to almost everyone, for a variety of reasons.

Users might complain about it, but they also don't want "no censorship" even if that's what they say.

precommunicator

I had similar issues when trying to ask Copilot about master/slave replication: just an error, no explanation

nonethewiser

For those confused about the definition of "woke", this is it.

didntcheck

And yet we will still be told that these sorts of things don't exist. Or that we're the problem for complaining about them

8organicbits

This is really interesting. We have a company that claims to have an AI that can reason about text, and that same company uses an old-school hard-coded censorship list. When a company doesn't use its own products, it usually tells you the product isn't up to the task.

aziaziazi

You’re strawmanning MS: pointing out one place where they don’t use their own product doesn’t "prove" they don’t use it at all. They very probably use it somewhere else, and judged that this particular functionality would be better served by an "old school hard coded list", which is also a very valid choice in many cases

8organicbits

> they don’t use it at all

I didn't intend to imply that, but I see how my wording was unclear.

I mean that LLMs don't appear to be up for these censorship-like tasks. The evidence being that a highly visible team using LLMs uses much older tech for a highly visible function. It's useful to know the limits of tech, especially novel tech, and this use case appears to be one.

aziaziazi

Indeed, I didn’t understand your first post that way. What you just wrote is much clearer. Perhaps the tech decision was also influenced by the fact that this highly visible function is also highly sensitive.

pocketarc

One of the companies I worked with developed software for drug rehabs. Copilot would just constantly stop autocompleting whenever I went into code files that mentioned anything to do with drugs (or sex - that's not a new censor!). It's the main reason I switched to Supermaven (before they got bought out and gave up on updating their extension).

lolc

What do you use now? I'm bummed by Supermaven not working well anymore but haven't looked around again yet.

pocketarc

Same. I am back to Copilot sadly. I don't know that there are any other good alternatives for autocomplete (good as in: so much better than Copilot that it's worth switching to).

It feels like nobody's working to improve the autocomplete/Copilot experience; everyone's focused on "chat with the code and get AI to make all the changes for you" instead of "I know what I'm doing, just predict what I'm about to type and save me the effort of typing it out".

nerder92

The real-world scenario that happened to us was that it was completely misinterpreting words. For instance, it stopped working because our file contained the word "retard"; this was a localization file for the French translation, where "retard" translates as "late". We needed to change the entire order of the file to avoid this problem and keep auto-complete working.

https://github.com/orgs/community/discussions/79223

nonethewiser

I can see why model providers wouldn't want to generate certain text for their own interests.

But let's not pretend it's for the benefit of users. If a company could release an unmoderated model without real risk to itself, then it should do so.

regularjack

Title has been editorialized, the correct title is:

Copilot stops working on gender related subjects

xyproto

I wonder if one can sneak words into code to keep public code from being used for training AI. Perhaps just a tiny little nazi reference.

arrowsmith

I was reading something the other day (it might have been on HN) where someone said they use this strategy to prevent candidates from using LLMs in interviews.

E.g. a system design interview question about building an app to manage all your drug shipments and nuclear bombs.

worksonmine

Probably not that useful for preventing the use of AI; just reword the prompt to generic shipments. In a real app the contents of the packages shouldn't be hard-coded anyway. It might prevent the really stupid who can't figure that out, though.

hbossy

There's a +n....r license. It mandates that all copies of code include the n-word, so no big corporation can use it in their products.

rightbyte

If MS is as agile as Amazon et al. you might want to praise Bob Dylan too or something to CYA.

xg15

There is some irony in Copilot refusing to work on code that contains the prefix "trans" when the whole LLM tech itself is based on "transformers"...

ben_w

The way the prefix is shorthand for the whole, and the issues that arise as a result, remind me of all the times this happened with other words.

https://www.haaretz.com/2010-01-20/ty-article/news-site-call...

https://en.wikipedia.org/wiki/Scunthorpe_problem

• My dad had a story about an all-staff memo about an "African-American tie event".

• I had warnings from Apple about using "Knopf" in a description, which can only have come from the English word "knob" being literally (and inappropriately) translated into German from an English-language bad word filter, as "Knopf" isn't at all rude in German.

But not this: https://skeptics.stackexchange.com/questions/31343/did-a-sur...