AI suggestions make writing more generic, Western
74 comments
· April 30, 2025
chrismorgan
diffeomorphism
Correct but beside the point. The question of the study is "how much and how do you quantify this?".
There are lots of blindingly obvious qualitative statements where the quantitative part is far from obvious. That makes them a good starting point for research.
This is like writing about Newton's theory of gravity as "scientist finds out apples fall downwards. Wasn't that already obvious?".
twelvechairs
'Global south' is just the latest in a fraught space where every term seems to eventually get dragged down. Originally it was 'third world' that started off as a positive, aspirational term but became a derisive term. Then it was 'developing country' that was supposed to be not judgmental but became so. I'm sure 'global south' will go the same way.
mort96
The "global south" is a term which originates in writings about the Vietnam War in the 60s. It's by no means a new term.
Use third world country if you want, but at least in my mind, the terms "first/second/third world" are more tied to which hegemony you fall under, where the first world is the US hegemony, and the third world is without a hegemon. "First world" is kinda synonymous with "the west". To me, the term "global south" communicates that it's being contrasted against all of "the north", both east and west, while using the term "third world" communicates that the context makes a distinction between "the west" ("first world") and the old eastern bloc ("second world") somehow relevant. Some "second world" countries such as China are also part of the "global south".
I'm sure there are people who use the term "global south" simply because they perceive it to be less judgemental somehow, but I might've used it in this context because "third world" communicates something different.
Honestly though, I would've probably just used the term "non-western" (maybe "non-first-world"?), since that's the distinction the article actually draws. Eastern Europe is also affected here, after all. Or maybe I would've drawn the distinction at "US" and "non-US", since Europeans don't necessarily want their writing to sound American either and the old US hegemony seems to be on its last legs.
AuryGlenz
It’s funny, because on the face of it, it seems the worst of all. You might as well just say “dark skinned countries,” since that’s what the term is effectively getting at.
trhway
>You might as well just say “dark skinned countries,”
Argentina is 97% of European descent, whereas non-white countries like Japan and South Korea are in the Global North.
It isn't about race/skin color. The "Third World" was a suitable term until the Soviet Bloc (Second World) collapsed, with most of its components going either into the First/Western World or into the Third World. That left the world partitioned into just 2 large parts, with Russia+Belarus being a distant, small 3rd, so small that it is just easier to count them in the First/Western-style developed world, especially considering that they both moved to capitalism, losing that major trait of the Second World: socialism. (Though 30+ years later Russia is drifting more and more toward the former Third World, i.e. the Global South.)
flowerthoughts
A few months ago, there were articles about English having more of a Nigerian (I believe) dialect because that's where the labelling of the supervised learning happened in the early days.
If that had continued, combined with where actual users are, perhaps it would have broadened English instead?
jononor
I agree that homogenization is the likely default outcome. ML models do tend to have a strong tendency to prioritize modelling the most common things well, which makes output in generative models also biased towards the mean/mode. And there are the limitations in representation in the datasets themselves. And also bias in that the developers are primarily English speaking, so that language gets priority. But I do not see homogenization as inevitable. LLMs do pick up a wide range of speech patterns: regional, different periods of time, sociolects and styles. And they are pretty darn good at "role-playing", outputting language tailored to a particular role. And this can be configured rather effectively using a session or system prompt. So if everybody picked their voice, we could perhaps get a broadening of outcomes. Maybe I should add "always speak like a refined scholar from the Victorian era, spiced up with 80s British goth"...
mrtksn
You know what, actually if a non-American movie non-ironically tries to be like an American one it instantly becomes kitsch. Even British movies are quite distinct from American ones.
Anything AI writes is dull anyway, it writes stuff that nobody wants to read beyond getting some information. Maybe if you are learning English you may pick up something from it though.
Also, I recall something about AI English actually being Nigerian English because those companies used a lot of Nigerians in training.
otabdeveloper4
> Apparently I have moved south by migrating from Australia to India.
Yeah, we understand "south" to mean "closer to the equator". (That's kind of how it works in the popular imagination. E.g., southern Brazil is more "nordic" than northern Brazil.)
spacechild1
> That's kind of how it works in the popular imagination
It absolutely does not. South means, well, South.
> E.g., southern Brazil is more "nordic" than northern Brazil.)
How so? Southern Brazil is clearly closer to the South Pole than Northern Brazil.
otabdeveloper4
It's a racial term. "South" as in "dusky" or "brown", etc.
chrismorgan
That’s definitely not how I’ve ever seen it used in Australia or India. North and south are fairly strictly geographic terms. South India is roughly the southern half of the country—following a cultural and geographical divide, but pretty neat overall. Southern Australia follows the southern edge of the country (not to be confused with South Australia, a state in the south). One is in the northern hemisphere and one in the southern, but they use the terms the same way, pointing to the poles.
rlupi
Why say Western when they mean US English? US tech & culture (Hollywood, Netflix, etc.) has a habit of bulldozing over non-English Western culture too.
decimalenough
UK, Ireland, Australia, Canada also contribute a non-trivial part of "Western" (English) culture. I'm always surprised by how many Hollywood stars are actually Aussies.
gitremote
Yes, because Australian and British actors often must fake US American accents perfectly to get roles in Hollywood. Margot Robbie, Tom Holland, etc.
decimalenough
Less than you'd think. Most Australians don't sound like Crocodile Dundee, and the educated/upper class Australian accent is quite neutral/unobvious to American ears.
gitremote
Non-Western English speakers might be unaware of the difference between US English and general Western English, due to the dominance of US English in Western English.
pk97
As an Indian, I am sad that the world will lose out on short to-the-point words/phrases such as "please do the needful", "i have a doubt", "prepone" and many others :( We are like this only.
chrismorgan
> "i have a doubt"
This one is problematic when used with non-Indians.
When an Indian says they have a doubt, they mean “I have a question and seek clarification on one point”. Someone not familiar with this Indian English idiosyncrasy will instead interpret it as “I’m not convinced that what you’re saying is true”, potentially even casting aspersions on your integrity. The question that follows will normally clear things up enough that it’s not disastrous, but it will still tend to leave a bad taste in the hearer’s mouth. It took me quite some time to really get used to it.
dalmo3
In PT-BR it is also more common to say you have a "dúvida" than a "questão", so I immediately got the intended meaning.
I imagine Spanish speakers will have no problem either.
pk97
Thanks for sharing. Another interesting part is how similar the word "dúvida" is to Hindi's word for "doubt": "Duvidha" which itself is derived from Sanskrit!
palmotea
> As an Indian, I am sad that the world will lose out on short to-the-point words/phrases such as ... "i have a doubt"
Please explain that one to me, because every time I've heard it used it seems to amount to "I have a question," which to me is confusing.
chrismorgan
You are correct. Indian English uses “doubt” to mean “question”, rather than lack of belief as is its standard English meaning. Different dialects use words differently, and there’s generally not much you can do about it. At least in this case the concepts are relatively similar, unlike by/into which normally mean multiplication/division, but are inverted in India.
pk97
exactly, it stands for "I have a question" :) It stems from the school/coaching system where you are encouraged to ask questions as you figure out say a problem set in dedicated "doubt clearing" sessions with your teachers/instructors. That carries over to the workplace where you are more likely to hear this phrase when someone has a question in a technical discussion or similar, from my observations.
bohrbohra
You're right. "I have a doubt" means "I have a question".
We used to have "doubt-solving sessions" in coaching centres. Every time one of the students said "Sir, I have a doubt", I would snigger to myself, as if the student were insinuating something sinister or nefarious about the instructor's character. I always found it hilarious.
But that's just how English is used in India.
aitchnyu
Screw "I have a doubt" though. Pretend your messages are carried by steam trains and write everything that the recipient must act upon.
hunglee2
Given that the US internet is overwhelmingly dominant, the base training data from Common Crawl will lead to the gradual Americanisation of global culture: not only linguistic style, but also modes of thinking and hierarchy of values. The Chinese internet is generally locked into super-app walled gardens, so there is no real competition there.
selfhoster11
I specifically configure my LLM prompts to disdain American-style thinking and values to avoid this issue. LLM outputs will contribute hugely to future cognitive and decision mass for the entire planet, and I would like to avoid dominating that by one culture.
hunglee2
interesting mitigation - what is the prompt?
michaelbrave
It's interesting because just yesterday I was asking the AI to speak more Californian and less Indian (it was using words like kindly and now a lot, to be specific I was asking it to make affirmations for a coloring book but the phrases it was giving me did not feel American/British but were closer to other major English dialects like Indian).
dqv
"Tool which offers English (United States) as its sole English language option makes writing more like that locale"
Why didn't the study use something like Grammarly, which has awareness of American English, British English, Canadian English, Australian English, and Indian English?
I should clarify that I get the point. Like it's still useful to study how an American-English-biased model affects writers of a different dialect, but being able to see what it does when it can switch dialects would be way more useful and still be able to convey the same point that models specialized to a dialect will affect writing outside that dialect.
vjk800
This is really just part of a bigger trend of tech homogenizing the culture and language across the world.
Smaller languages have suffered from the dominance of English since long before AI. Most of the content on Reddit, X, or any internet platform really, is in English. All new tech is, at least initially, only in English. The English language, and the culture of those who produce English-language content, dominates the world now, especially when it comes to commercial culture. With government grants, etc., smaller languages can be propped up to some degree, but how about creating a massive blockbuster movie in the Estonian language? Forget about it.
numpad0
This gets annoying fast.
> Most of the content in Reddit, X, or any internet platform really, is in English. All new tech is, at least initially, only in English.
Content advertised into your timeline. Not content in general. Twitter had been like only 35% English, Bluesky is 30% Brazilian or something. Only Reddit is like actually >80% English because those other languages have other dominant platforms.
You don't see stats like "xyz is 99% English" because every Chinese guy speaks unaccented American English. It's because WWW statistics are based on reference counts, rather than on wgeting random IPs, and they start from an English URL, so discovery ends where the anglosphere ends.
It's not like Chinese contents actually occupy >85% of everything, just that English is not the 99.999%, but still. "American English won the great game, Earth 999.999% English" is just a collective hallucination.
selfhoster11
I'm interested in only a handful of stereotypically STEM-adjacent topics. Out of 176 YouTube subscriptions, 2 of these are in my home language. And they are both musicians (= non-verbal content). Content dried up on 2 more that I've already unsubscribed from.
I'm using NewPipe instead of the official UI, so I know for a fact this stuff isn't being advertised at me. I pick my own feeds, and all the best content is English-based.
danjc
Writing suggestions aren't just more western, they are a specific person that is basically the average of all western.
userbinator
I believe you can prompt an LLM to tell you how to write in Indian English too, if you really want that.
orbital-decay
All existing LLMs (including those optimized for creative writing) are extremely bad at this, they tend to write in a narrow subset of American English sentence structure and idioms, even if you prompt them to imitate someone's style. This is inevitable due to English being prevalent in the dataset and RL murdering the variance.
AI slop reads unnatural even in English due to its lack of variance. And it heavily leaks into all other languages, even Ancient Greek.
selfhoster11
RL absolutely murders variance. GPT-4o was an order of magnitude harder to prompt into sustained chain of thought than GPT-4, from day 1 in my experience.
ruuda
I wrote about this before, this generic writing style really sucks out the joy of interacting with others: https://ruudvanasseldonk.com/2025/llm-interactions
ljsprague
How would an LLM go about its business without "diminishing nuances that differentiate cultural expression"?
selfhoster11
Allow people to fine-tune for regional and idiosyncratic expression.
Offer more than just 1 master version that everyone must share.
Improve training processes to not overwhelm the regional expression and reasoning with synthetic/curated data of just one culture.
Hire annotators and data entry services across a whole multitude of countries that cover a varied array of cultures, styles, languages, etc.
At least the things above should counteract the effects somewhat.
Barrin92
One of the best pieces of advice I got in uni from a teacher on writing, which sounds pretty simple but can be tough to do, was: always write in your own voice. Writing is thinking, and when you're adopting phraseology, entire sentences and turns of phrase from others, you're not just sounding like someone else, you're not thinking on your own. You'll end up on autopilot: then metaphorically, now apparently literally.
At an individual level people have always been doing it, now with automation it's not surprising that a study finds it happening collectively. That's why I don't see much good in these tools. They strip writing of personality, subjectivity, unique perspective, and they just seem to diminish the capacity of people to use their own minds.
> “This is one of the first studies, if not the first, to show that the use of AI in writing could lead to cultural stereotyping and language homogenization,”
I just want to make sure others agree and it wasn’t just me (or perhaps non-Americans in general)—it was blindingly obvious this would be, must be, the case, right? That although this might be the first formal study of it, there would have been literally no doubts as to what the outcome of such a study might be? That at least some degree of language homogenisation will be quite inescapable if you do LLMs the way we have?
On the cultural aspects, it’s well-documented and -understood what effects US TV and movies have had on other countries. There really isn’t anything new about LLMs or AI here, it’s just standard globalisation effects.
(I also just now learned what a crazy term “Global South” is <https://en.m.wikipedia.org/wiki/Global_North_and_Global_Sout...>, and how it does not mean at all what I thought it meant or what any sane person would expect. Was it not enough that “Western” bears no strong correlation to geography, that we need more terms that utterly abuse geographical references when they’re actually about socioeconomic characteristics? Apparently I have moved south by migrating from Australia to India.)