
When ChatGPT broke the field of NLP: An oral history

teruakohatu

I am in academia and worked in NLP although I would describe myself as NLP adjacent.

I can confirm LLMs have essentially consigned a good chunk of historical research to the bin. I suspect there are still a few PhD students working on traditional methods, knowing full well that a layman can do better using the mobile ChatGPT app.

That said traditional NLP has its uses.

Using the VADER model for sentiment analysis, while flawed, is vastly cheaper than LLMs for getting a general idea. Traditional NLP is suitable for many tasks that people are now spending a lot of money asking GPT to do, just because GPT is what they know.

I recently did an analysis on a large corpus: VADER was essentially free, while the cloud costs to run a Llama-based sentiment model were about $1000. I ran both because VADER costs nothing but minimal CPU time.

Traditional NLP can be wrong, but it can't be jailbroken and it won't make stuff up.
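For a sense of how lightweight that is, here is a minimal sketch of VADER-based scoring (assuming the vaderSentiment package is installed; the example sentences are made up):

```python
# Minimal sketch: lexicon-based sentiment scoring with VADER.
# Assumes `pip install vaderSentiment`; the inputs below are illustrative only.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
for text in ["This product is great!", "Terrible service, never again."]:
    scores = analyzer.polarity_scores(text)  # neg/neu/pos plus a compound score in [-1, 1]
    print(f"{scores['compound']:+.3f}  {text}")
```

No GPU and no API calls, which is why it is essentially free on a large corpus.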

Cheer2171

That's because VADER is just a dictionary mapping each word to a single sentiment weight and adding it up with some basic logic for negations and such. There's an ocean of smaller NLP ML between that naive approach and LLMs. LLMs are trained to do everything. If all you need is a model trained to do sentiment analysis, using VADER over something like DistilBERT is NLP malpractice in 2025.
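For comparison, a small fine-tuned transformer is nearly as easy to run. A minimal sketch using the Hugging Face transformers pipeline with a DistilBERT model fine-tuned on SST-2 (the example inputs are made up):

```python
# Minimal sketch: sentiment analysis with a fine-tuned DistilBERT via the transformers pipeline.
# Assumes `pip install transformers torch`; inputs are illustrative only.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier(["This product is great!", "Terrible service, never again."]))
# -> [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}]
```

It runs comfortably on CPU and sits between VADER and a full LLM in both cost and quality.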

vjerancrnjak

CNNs were outperforming traditional methods on some tasks before 2017.

The problem was that all of the low-level tasks, like part-of-speech tagging, parsing, named entity recognition, etc., never resulted in a good summarization or translation system.

Probabilistic graphical models worked a bit but not much.

Transformers were a leap: none of the low-level tasks had to be done for the high-level ones.

Pretty sure that equivalent leap happened in computer vision a bit before.

People were fiddling with low-level pattern matching and filters, and then it was all obliterated by an end-to-end CNN.

jsemrau

I was contrasting FiNER, GliNER, and Smolagents in a recent blog post on my Substack, and while the first two are fast and give reasonably good results, running an LLM locally is easily 10x better.
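For context, a minimal sketch of zero-shot entity extraction with the gliner package (the checkpoint name, labels, and text below are my own illustrative assumptions, not from the post):

```python
# Minimal sketch: zero-shot NER with GLiNER.
# Assumes `pip install gliner`; checkpoint, labels, and text are illustrative assumptions.
from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_medium-v2.1")
text = "Tim Cook announced Apple's results from Cupertino on Thursday."
labels = ["person", "organization", "location", "date"]
for ent in model.predict_entities(text, labels, threshold=0.5):
    print(f"{ent['label']:>12}: {ent['text']}")
```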

languagehacker

Great seeing Ray Mooney (who I took a graduate class with) and Emily Bender (a colleague of many at the UT Linguistics Dept., and a regular visitor) sharing their honest reservations with AI and LLMs.

I try to stay as far away from this stuff as possible because when the bottom falls out, it's going to have devastating effects for everyone involved. As a former computational linguist and someone who built similar tools at reasonable scale for largeish social media organizations in the teens, I learned the hard way not to trust the efficacy of these models or their ability to get the sort of reliability that a naive user would expect from them in practical application.

Legend2440

They are far, far more capable than anything your fellow computational linguists have come up with.

As the saying goes, 'every time I fire a linguist, the performance of the speech recognizer goes up'

suddenlybananas

Don't try and say anything pro-linguistics here, people are weirdly hostile if you think it's anything but probabilities.

JumpCrisscross

> learned the hard way not to trust the efficacy of these models or their ability to get the sort of reliability that a naive user would expect from them in practical application

But…they work. Linguistics as a science is still solid. But as a practical exercise, it seems to be moot other than for finding niches where LLMs are too pricey.

philomath_mn

Curious what you are expecting when you say "bottom falls out". Are you expecting significant failures of large-scale systems? Or more a point where people recognize some flaw that you see in LLMs?

softwaredoug

I’m curious: how have large language models impacted linguistics, and particularly the idea of a universal grammar?

sp1nningaway

For me as a lay-person, the article is disjointed and kinda hard to follow. It's fascinating that all the quotes are emotional responses or about academic politics. Even now, they are suspicious of transformers and are bitter that they were wrong. No one seems happy that their field of research has been on an astonishing rocketship of progress in the last decade.

dekhn

The way I see this is that for a long time there was an academic field that was working on parsing natural human language and it was influenced by some very smart people who had strong opinions. They focused mainly on symbolic approaches to parsing, rather than probabilistic. And there were some fairly strong assumptions about structure and meaning. Norvig wrote about this: https://norvig.com/chomsky.html and I think the article bears repeated, close reading.

Unfortunately, because ML models went brr some time ago (Norvig was at the leading edge of this when he worked on the early Google search engine and had access to huge amounts of data), we've since seen that probabilistic approaches produce excellent results, surpassing everything in the NLP space in terms of producing real-world systems, without addressing any of the issues that the NLP folks believe are key (see https://en.wikipedia.org/wiki/Stochastic_parrot and the referenced paper). Personally I would have preferred if the parrot paper hadn't also discussed the environmental costs of LLMs, and had focused entirely on the semantic issues associated with probabilistic models.

I think there's a huge amount of jealousy in the NLP space that probabilistic methods worked so well, so fast (with transformers being the key innovation that improved metrics). And it's clear that even state-of-the-art probabilistic models lack features that NLP people expected.

Repeatedly we have seen that probabilistic methods are the most effective way to make forward progress, provided you have enough data and good algorithms. It would be interesting to see the NLP folks try to come up with models that did anything near what a modern LLM can do.

hn_throwaway_99

This is pretty much correct. I'd have to search for it, but I remember an article from a couple of years back that detailed how LLMs blew up the field of NLP overnight.

Although I'd also offer a slightly different lens through which to look at the reaction of other researchers. There's jealousy, sure, but overnight a ton of NLP researchers basically had to come to terms with the fact that their research was useless, at least from a practical perspective.

For example, imagine you just got your PhD in machine translation, which took you 7 years of laboring away in grad/post grad work. Then something comes out that can do machine translation several orders of magnitude better than anything you have proposed. Anyone can argue about what "understanding" means until they're blue in the face, but for machine translation, nobody really cares that much - people just want to get text in another language that means the same thing as the original language, and they don't really care how.

The majority of research leads to "dead ends", but most folks understand that's the nature of research, and there is usually still value in discovering "OK, this won't work". Usually, though, this process is pretty incremental. With LLMs, all of a sudden you had lots of folks whose life's work was pretty much useless (again, from a practical perspective), and that'd be tough for anyone to deal with.

macleginn

The way I have experienced this, starting from circa 2018, it was a bit more incremental. First LSTMs and then transformers led to new heights on the old tasks, such as syntactic parsing and semantic role labelling, which was sad for the previous generation, but at least we were playing the same game. But then not only the old tools of NLP but the research questions themselves became irrelevant, because we could just ask a model nicely and get good results on very practical downstream tasks that didn't even exist a short while ago. NLP suddenly turned into a general document/information-processing field, with a side hustle in conversational assistants. Already GPT-2 essentially mastered the grammar of English, and what difficulties remain are super-linguistic and have more to do with general reasoning. I would say it's not that people are bitter that other people make progress; it's more that there is not much progress to be had in the old haunts at all.

Karrot_Kream

Even 15-ish years ago when I was in school, the NLP folks viewed probabilistic models with suspicion. They treated everyone from our Math department with suspicion and gave them a hard time. It created so much politicking that some folks who wanted to do statistical approaches would call themselves CS so that the NLP old guard wouldn't give them a hard time.

peterldowns

All of this matches my understanding. It was interesting taking an NLP class in 2017; the professors basically said: listen, this curriculum is all historical and now irrelevant given LLMs, we'll tell you a little about them, but basically it's all cutting edge, sorry.

rdedev

Same for my NLP class in 2021. It went straight to talking about transformers after a brief intro to the old stuff.

foobarian

The progression reminds me of how brute force won out in the chess AI game long ago with Deep Blue. Custom VLSI and FPGA acceleration and all.

permo-w

Do transformers not use a symbolic and a probabilistic approach?

Tainnor

I agree with the criticism of Noam Chomsky as a linguist. I was raised in the typological tradition, which has its very own kind of beef with Chomsky for other reasons (his singular focus on English for constructing his theories, amongst other things), but his dislike of statistical methods was of course equally suspect.

Nevertheless there is something to be said for classical linguistic theory in terms of constituent (or dependency) grammars and various other tools. They give us much simpler models that, while incomplete, can still be fairly useful at a fraction of the cost and size of transformer architectures (e.g. 99% of morphology can be modeled with finite state machines). They also let us understand languages better - we can't really peek into a transformer to understand structural patterns in a language or to compare them across different languages.
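As a toy illustration of the finite-state point: verb inflection for a small lexicon can be analyzed by a two-step machine that consumes a stem and then an allowed suffix. The lexicon and tags below are made up for the example:

```python
# Toy finite-state morphological analyzer: the language it accepts is STEM . SUFFIX,
# a regular language, so a two-state machine suffices. Lexicon and tags are illustrative only.
STEMS = {"walk", "talk", "jump"}
SUFFIXES = {"": "V", "s": "V+3SG", "ed": "V+PST", "ing": "V+PROG"}

def analyze(word):
    """Split word as stem + suffix; accept only if both transitions are licensed."""
    analyses = []
    for i in range(1, len(word) + 1):
        stem, rest = word[:i], word[i:]          # transition 1: consume a candidate stem
        if stem in STEMS and rest in SUFFIXES:   # transition 2: consume an allowed suffix, accept
            analyses.append((stem, SUFFIXES[rest]))
    return analyses

print(analyze("walked"))   # [('walk', 'V+PST')]
print(analyze("jumping"))  # [('jump', 'V+PROG')]
```

Real morphological analyzers (finite-state transducer toolkits such as HFST or foma) handle alternations and irregular forms, but the machinery stays small and inspectable, which is the point about cost and interpretability.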

suddenlybananas

The claim that UG is based only on English is simply false. Maybe in 1950, but any modern generativist theory uses data from many, many languages, and English has been re-analysed in light of other languages (see here for an example of quantifiers in English being analysed on the basis of data from a Salish language: https://philpapers.org/rec/MATQAT).

levocardia

Sounds like the bitter lesson is bitter indeed!

dekhn

On the contrary, to some of us (who have focused on probability, big data, algorithms, and HPC, while eschewing complex theories that require geniuses to understand) the bitter lesson is incredibly sweet.

Very much like when I moved from tightly coupled to "embarrassing" parallelism. A friend said "don't call it embarrassing... it's pleasant not to have to think about hard distributed computing problems".

rdedev

It's a truly bitter pill to swallow when your whole area of research becomes redundant.

I have a bit of background in this field, so it's nice to see even people who were at the top of the field raise concerns that I had. The comment about the LHC was exactly what I told my professor: that the whole field seems to be moving in a direction where you need a lot of resources to do anything. You can have 10 different ideas on how to improve LLMs, but unless you have the resources there is barely anything you can do.

NLP was the main reason I pursued an MS degree, but by the end of my course I was no longer interested in it, mostly because of this.

Agingcoder

Well, if you’ve built a career on something, you will usually actively resist anything that threatens to destroy it.

In other words, what is progress for the field might not be progress for you !

This reminds me of Thomas Kuhn's excellent book 'The Structure of Scientific Revolutions': https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...

bpodgursky

> No one seems happy that their field of research has been on an astonishing rocketship of progress in the last decade.

Well, they're unhappy that an unrelated field of research more-or-less accidentally solved NLP. All the specialized NLP techniques people spent a decade developing were obviated by bigger deep learning models.

criddell

The field is natural language processing.

dang

I think we can squeeze it in there. Thanks!

AndrewKemendo

If Chomsky were writing papers in 2020, his paper would have been "Language is all you need."

That is clearly not true, and as the article points out, wide-scale, very large forecasting models beat the hypothesis that you need an actual foundational structure for language in order to demonstrate intelligence; in fact, it's exactly the opposite.

I’ve never been convinced by that hypothesis, if for no other reason than that we can demonstrate in the real world that intelligence is possible without linguistic structure.

As we’re finding, solving the Markov process iteratively is the foundation of intelligence.

Out of that process emerge novel state-transition processes; in some cases those are novel communication methods that have a structured mapping to the state encoding inside the actor.

Communication happens across species with various levels of fidelity, but it is not the underlying mechanism of intelligence; it is an emergent behavior that allows for shared mental mapping and storage.

aidenn0

Some people will never be convinced that a machine demonstrates intelligence. This is because, for a lot of people, intelligence exists as a subjective experience that they have, and the belief that others have it too extends only insofar as others appear to be like the self.

simonw

It's called the AI effect: https://en.wikipedia.org/wiki/AI_effect

> The author Pamela McCorduck writes: "It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, 'that's not thinking'."

meroes

It doesn’t mean they tie intelligence to subjective experience. Take digestion. Can a computer simulate digestion? Yes. But no computer can “digest” if it’s just silicon in the corner of an office. There are two hurdles: the leap from simulating intelligence to intelligence, and the leap from intelligence to subjective experience. If the computer gets attached to a mechanism that physically breaks down organic material, that’s the first leap. If the computer gains a first-person experience of that process, that’s the second.

You can’t just short-circuit from simulates to does to has subjective experience.

And the claim that other humans don’t have subjective experience is such a non-starter.

aidenn0

I think you're talking about consciousness rather than intelligence. While I do see people regularly distinguishing between simulation and reality for consciousness, I don't often see people make that distinction for intelligence.

> And the claim other humans don’t have subjective experience is such non-starter.

What about other primates? Other mammals? The smarter species of cephalopods?

Certainly many psychopaths seem to act as if they have this belief.

dekhn

This is why I want the field to go straight to building indistinguishable agents: specifically, you should be able to video chat with an avatar that is impossible to tell from a human.

Then we can ask "if this is indistinguishable from a human, how can you be sure that anybody is intelligent?"

Personally I suspect we can make zombies that appear indistinguishable from humans (limited to video chat; making a robot that appears human to a doctor would be hard) but that don't have self-consciousness or any subjective experience.

bsder

"There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists."

LLMs are not artificial intelligence but artificial stupidity.

LLMs will happily hallucinate. LLMs will happily tell you total lies with complete confidence. LLMs will give you grammatically perfect, completely vapid content. Etc.

And yet that is still better than what most humans could do in the same situation.

We haven't proved that machines can have intelligence, but instead we are happily proving that most people, most of the time just aren't very intelligent at all.

shmel

How do they convince themselves that other people have intelligence too?

6stringmerc

It is, until proven otherwise, because modern science still doesn’t have a consensus, standards, or biological tests that can account for it. As in, highly “intelligent” people often lack “common sense” or fall prey to con artists. It’s pompous as shit to assert that black-box mimicry constitutes intelligence. Wake me up when it can learn to play a guitar and write something as good as Bob Dylan and Tom Petty. Hint: we’ll both be dead before that happens.

aidenn0

I can't write something as good as Bob Dylan and Tom Petty. Ergo I'm not intelligent.