Modern-Day Oracles or Bullshit Machines? How to thrive in a ChatGPT world
437 comments
February 9, 2025

pjs_
Not sure why everyone rates this. It’s full of very confidently made statements like “the AI has no ground truth” (obviously it does, it has ingested every paper ever), that it “can’t reason logically” (which seems like a stretch if you ever read the CoT of a frontier reasoning model), and that it “can’t explain how they arrived at conclusions” (I mean, just try it yourself with o1: go as deep as you like asking how it arrived at a conclusion and see if a human can do any better).
In fact, the most annoying thing about this article is that it is a string of very confidently made, black-and-white statements, offered with no supporting evidence, some of which I think are actually wrong… i.e. it suffers from the same kind of unsubstantiated self-confidence that we complain about in the weaker models.
grepLeigh
LLMs that use Chain of Thought sequences have been demonstrated to misrepresent their own reasoning [1]. The CoT sequence is another dimension for hallucination.
So, I would say that an LLM capable of explaining its reasoning doesn't guarantee that the reasoning is grounded in logic or some absolute ground truth.
I do think it's interesting that LLMs demonstrate the same fallibility of low quality human experts (i.e. confident bullshitting), which is the whole point of the OP course.
I love the goal of the course: get the audience thinking more critically, both about the output of LLMs and the content of the course. It's a humanities course, not a technical one.
(Good) Humanities courses invite the students to question/argue the value and validity of course content itself. The point isn't to impart some absolute truth on the student - it's to set the student up to practice defining truth and communicating/arguing their definition to other people.
ctbergstrom
Yes!
First, thank you for the link about CoT misrepresentation. I've written a fair bit about this on Bluesky etc but I don't think much if any of that made it into the course yet. We should add this to lesson 6, "They're Not Doing That!"
Your point about humanities courses is just right and encapsulates what we are trying to do. If someone takes the course and engages in the dialectical process and decides we are much too skeptical, great! If they decide we aren't skeptical enough, also great. As we say in the instructor guide:
"We view this as a course in the humanities, because it is a course about what it means to be human in a world where LLMs are becoming ubiquitous, and it is a course about how to live and thrive in such a world. This is not a how-to course for using generative AI. It's a when-to course, and perhaps more importantly a why-not-to course.
"We think that the way to teach these lessons is through a dialectical approach.
"Students have a first-hand appreciation for the power of AI chatbots; they use them daily.
"Students also carry a lot of anxiety. Many students feel conflicted about using AI in their schoolwork. Their teachers have probably scolded them about doing so, or prohibited it entirely. Some students have an intuition that these machines don't have the integrity of human writers.
"Our aim is to provide a framework in which students can explore the benefits and the harms of ChatGPT and other LLM assistants. We want to help them grapple with the contradictions inherent in this new technology, and allow them to forge their own understanding of what it means to be a student, a thinker, and a scholar in a generative AI world."
globalnode
I'll give it a read. I must admit, the more I learn about the inner workings of LLMs, the more I see them as simply the sum of their parts and nothing more. The rest is just anthropomorphism and marketing.
mr_toad
Current LLMs are not the end-all of LLMs, and chain of thought frontier models are not the end-all of AI.
I’d be wary of confidently claiming what AI can and can’t do, at the risk of looking foolish in a decade, or a year, or at the pace things are moving, even a month.
onemoresoop
The ground truth is chopped up into tokens and statistically evaluated. It is, of course, just a soup of ground truth that can freely be used in more or less twisted ways that have nothing to do with, or are only tangential to, the ground truth. While I enjoy playing with LLMs, I don't believe they have any intrinsic intelligence, and they're quite far from being intelligent in the same sense that autonomous agents such as we humans are.
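(A toy sketch of the "chopped into tokens and statistically evaluated" picture, in Python. The corpus, the whitespace tokenizer, and the bigram counts are all invented for illustration and are far cruder than what any real LLM does.)

    # Toy sketch: text is split into tokens and reduced to co-occurrence counts.
    # Real LLMs use learned subword vocabularies and neural networks, not a
    # count table, but the "statistical soup" intuition is similar.
    from collections import Counter

    corpus = "the cat sat on the mat . the dog sat on the rug ."
    tokens = corpus.split()          # stand-in for a real subword tokenizer

    # Count how often each token follows each other token (bigram statistics).
    bigrams = Counter(zip(tokens, tokens[1:]))

    def next_token_distribution(prev):
        # Relative frequency of tokens that followed `prev` in the toy corpus.
        follows = {b: c for (a, b), c in bigrams.items() if a == prev}
        total = sum(follows.values())
        return {tok: c / total for tok, c in follows.items()}

    print(next_token_distribution("the"))
    # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}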
whattheheckheck
Any and all of the tricks getting tacked on are overfitting to the test sets. They're all the tactics we have right now, and they do provide assistance in a wide variety of economically valuable tasks, with the only sign of stopping or slowing down being data curation efforts.
Lerc
I think you'll find that humans have also demonstrated that they will misrepresent their own reasoning.
That does not mean that they cannot reason.
In fact, coming up with a reasonable explanation of behaviour, accurate or not, requires reasoning as I understand it. LLMs seem to be quite good at rationalising, which is essentially a logic puzzle: manufacturing the missing piece between the facts that have been established and the conclusion they want.
yapyap
> “the AI has no ground truth” (obviously it does, it has ingested every paper ever)
It does not. AI is predicting the next ‘token’ based on the preceding ‘tokens’. There is no sentience; it’s machine learning, except the machines are really powerful.
It’d be illogical to say an AI has a ground truth just because it ‘ingested’ every paper ever.
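(To make the "predicting the next token" claim concrete, here is a minimal autoregressive loop in Python. The probability table is invented for illustration; a real LLM computes these probabilities with a neural network over a huge subword vocabulary and conditions on the entire preceding context, not just the last token as this toy does.)

    import random

    # Invented next-token probabilities, keyed by the most recent token only.
    TABLE = {
        "the": {"cat": 0.6, "dog": 0.4},
        "cat": {"sat": 0.9, "ran": 0.1},
        "dog": {"sat": 0.5, "ran": 0.5},
        "sat": {"down.": 1.0},
        "ran": {"away.": 1.0},
    }

    def generate(context, max_tokens=10):
        # Append one sampled token at a time until the table has no entry.
        while len(context) < max_tokens and context[-1] in TABLE:
            probs = TABLE[context[-1]]
            choices, weights = zip(*probs.items())
            context.append(random.choices(choices, weights=weights)[0])
        return " ".join(context)

    print(generate(["the"]))   # e.g. "the cat sat down."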
poulpy123
But we know how the LLM works, and that's exactly how the authors explain it. And that also explains the weird mistakes they make, which nothing with the ability to reason or with a ground truth would make.
I really do not understand how technical people can think they are sentient.
randomNumber7
> “can’t explain how they arrived at conclusions”
Imagine if I told my wife that, whenever we have a discussion, her opinion is only valid if she can explain how she arrived at her conclusion.
oblio
Your wife is one of the end products of cutthroat competition across several billion years so let's just say her general intelligence has a fair bit more validation than 20 years of research.
falcor84
Sexual selection applies an evolutionary pressure against men who challenge women too much about the validity of their reasoning.
fmbb
Training on all papers does not mean the model believes or knows the truth.
It is just a machine that spits out words.
joenot443
It's 1994. Larry Lloyd Mayer has read the entire internet, hundreds of thousands of studies across every field, and can answer queries word for word the same as modern LLMs do. He speaks every major language. He's not perfect, he does occasionally make mistakes, but the sheer breadth of his knowledge makes him among the most employable individuals in America. The Pentagon, IBM, and Deloitte are begging to hire him. Instead, he works for you, for free.
Most laud him for his generosity, but his skeptics describe him as just a machine that spits out words. A stochastic parrot, useless for any real work.
yathaid
Does his accuracy take a sudden precipitous fall when going from multiplying two three-digit numbers to two four-digit numbers?
hnthrow90348765
It has some pieces of the puzzle of intelligence. That's a deal breaker for some people, and useful/promising to others.
lifthrasiir
I would be very careful about claiming exactly that, as emergent properties seem kinda crucial for both artificial and human intelligence. (Not to say that they function in the same way, nor that they are equally useful.)
Lerc
Um... what truth?
My truth, your truth or some defined objective truth?
criley2
> Training on all papers does not mean the model believes or knows the truth. It is just a machine that spits out words.
Sounds like humans at school. Cram the material. Take the test. Eject the data.
radioactivist
I've had frontier reasoning models (or at least what I can access in ChatGPT+ at any given moment) give wildly inconsistent answers when asked to provide the underlying reasoning (and the CoT wasn't always shown). Inventing sources and then later denying having mentioned them. Backtracking on statements it claimed were true. Hiding weasel words in the middle of a long, complicated argument to arrive at whatever it decided the answer was. So I'm inclined to believe the reasoning steps here are also susceptible to all the issues discussed in the posted article.
MichaelZuo
This sounds similar to a median human with few scruples?
bwfan123
the machine is fooling you with a mimicry of reasoning. and you are falling for it.
Grimblewald
If its mimicry of reason is indistinguishable from real reasoning, how is it not reasoning?
Ultimately, an LLM models language, and the process behind its creation, to some degree of accuracy or another. If that model includes a way to approximate the act of reasoning, then it is reasoning to some extent. The extent, I am happy to agree, is open for discussion, but that reasoning is taking place at all is a little harder to attack.
onemoresoop
No, it is distinguishable from real reasoning. Real reasoning, while flawed in various ways, goes through the personal experience of the evaluator. LLMs don't have that capability at all. They're just sifting through tokens and associating statistical parameters with them, with no skin in the game, so to speak.
lanstin
I am getting two contradictory but plausible-seeming replies when I ask whether a certain set stays the same when adding 1 to every value in the set, depending on how I ask the question.
Correct answer: https://chatgpt.com/share/67a9500b-2360-8007-b70e-0bc2b84bc1...
Incorrect answer (I think): https://chatgpt.com/share/67a950df-d4e0-8007-8105-95a9e5be19...
mrshadowgoose
I don't give a rat's ass about whether or not AI reasoning is "real" or a "mimicry". I care if machines are going to displace my economic value as a human-based general intelligence.
If a synthetic "mimicry" can displace human thinking, we've got serious problems, regardless of whether or not you believe that it's "real".
pembrook
So are all the humans in this thread.
Except human mimicry of “reasoning” is usually applied in service of justifying an emotional feeling, which is arguably even less reliable than the non-feeling machine.
mrbungie
It has served us relatively fine for thousands of years.
LLMs? I'm waiting for one that knows how not to say something that is clearly wrong with extreme confidence, reasoning or not.
ZephyrBlu
What is reasoning if not a chain of logically consistent thoughts?
bwfan123
fair, but "logically consistent thoughts" are a subject of deep investigation, stretching from early Euclidean geometry to the modern Gödel theorems.
i.e., logically consistent thinking starts from symbolization, axioms, proof procedures, and world models; otherwise, you end up with persuasive words.
olalonde
If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
CodeMage
Counterpoint by Diogenes: "Behold, a man!"
butterNaN
It could also be a cheap imitation of a duck that might be passable for someone dull
maxdoop
What is reasoning? What is understanding? Do humans do either? How do you know?
bwfan123
this is the question that the Greeks wrestled with over 2000 years ago. at the time there were the sophists (modern LLM equivalents) who could speak persuasively, like politicians.
over time this question has been debated by philosophers, scientists, and anyone who wanted to have better cognition in general.
eigenform
It depends on your tolerance for error.
When you have a machine that can only infer rules for reasoning from inputs [which are, more often than not, encoded in a very roundabout way within a language which is very ambiguous, like English], you have necessarily created something without "ground."
That's obviously useful in certain situations (especially if you don't know the rules in some domain!), but it's categorically not capable of the same correctness guarantees as a machine that actually embodies a certain set of rules and is necessarily constrained by them.
throwaway4aday
Are you contending that every human derives their reasoning from first principles rather than being taught rules in a natural language?
eigenform
I'm contending that, like any good tool, there is a context where it is useful, and a context where it is not (and that we are at a stage where everything looks suspiciously like a nail).
aidos
This is amazing!
I was speaking to a friend the other day who works in a team that influences government policy. One of the younger members of the team had been tasked with generating a report on a specific subject. They came back with a document filled with “facts”, including specific numbers they’d pulled from an LLM. Obviously it was inaccurate and unreliable.
As someone who uses LLMs on a daily basis to help me build software, I was blown away that someone would misuse them like this. It’s easy to forget that devs have a much better understanding of how these things work, can review and fix the inaccuracies in the output and tend to be a sceptical bunch in general.
We’re headed into a time where a lot of people are going to implicitly trust the output from these devices and the world is going to be swamped with a huge quantity of subtly inaccurate content.
aqueueaqueue
I made the same sort of mistake when the internet was young, back in '93! Having a machine do it for you can easily turn into switching your brain off.
hunter-gatherer
I keep telling everyone that the only reason I'm paid well to do "smart person stuff" is not because I'm smart, but because I've steadily watched everyone around me get more stupid over my life as a result of turning their brain switch off.
I agree a course like this needs to exist, as I've seen people rely on ChatGPT for a lot of information. Just yesterday I demonstrated to some neighbors how easily it can spew bullshit if you simply ask it leading questions. A good example is "Why does the flu impact men worse than women" / "Why does the flu impact women worse than men". You'll get affirmative answers for both.
eclecticfrank
This is not something only younger people are prone to. I work in a consulting role in IT and have observed multiple colleagues aged 30 and above use LLMs to generate content for reports and presentations without verifying the output.
Reminded me of Wikipedia-sourced presentations in high school in the early 2000s.
fancyfredbot
I have just read one section of this, "The AI scientist". It was fantastic. They don't fall into the trap of unfalsifiable arguments about parrots. Instead they point out positive uses of AI in science, examples which are obviously harmful, and examples which are simply a waste of time. Refreshingly objective, and more than I expected from what I saw as an inflammatory title.
nmca
(while I work at OAI, the opinion below is strictly my own)
I feel like the current version is fairly hazardous to students and might leave them worse off.
If I offer help to nontechnical friends, I focus on:
- look at rate of change, not current point
- reliability substantially lags possibility, by maybe two years.
- adversarial settings remain largely unsolved if you get enough shots, trends there are unclear
- ignore the parrot people, they have an appalling track record prediction-wise
- autocorrect argument is typically (massively) overstated because RL exists
- doomers are probably wrong but those who belittle their claims typically understand less than the doomers do
jdlshore
I read the whole course. Lesson 16, “The Next-Step Fallacy,” specifically addresses your argument here.
nmca
The discourse around synthetic data is like the discourse around trading strategies — almost anyone who really understands the current state of the art is massively incentivised not to explain it to you. This makes for piss-poor public epistemics.
llm_trw
I'm happy to explain my strategies about synthetic data - it's just that you'll need to hear about the onions I wore in my day: https://www.youtube.com/watch?v=yujF8AumiQo
habinero
Nah, you don't need to know the details to evaluate something. You need the output and the null hypothesis.
If a trading firm claims they have a wildly successful new strategy, for example, then first I want to see evidence they're not lying - they are actually making money when other people are not. Then I want to see evidence they're not frauds - it's easy to make money if you're insider trading. Then I want to see evidence that it's not just luck - can they repeat it on command? Then I might start believing they have something.
With LLMs, we have a bit of real technology, a lot of hype, a bunch of mediocre products, and people who insist if you just knew more of the secret details they can't explain, you'd see why it's about to be great.
Call it Habiñero's Razor, but for hype the most cynical explanation is most likely correct -- it's bullshit. If you get offended and DARVO when people call your product a "stochastic parrot", then I'm going to assume the description is accurate.
layoric
How does this help the students with their use of these tools in the now, to not be left worse off? Most of the points you list seem like defending against criticism rather than helping address the harm.
habinero
Agree. It's also a virtue to point out the emperor has no clothes and the tailor peddling them is a bullshit artist.
This is no different than the crypto people who insisted the blockchain would soon be revolutionary and used for everything, when in reality the only real use case for a blockchain is cryptocoins, and the only real use case for cryptocoins is crime.
The only really good use case for LLMs is spam, because it's the only use case for generating a lot of human-like speech without meaning.
bo1024
This seems like trying to offer help predicting the future or investing in companies, which is a different kind of help from how to coexist with these models, how to use them to do useful things, what their pitfalls are, etc.
dimgl
What are “parrot people”? And what do you mean by “doomers are probably wrong?”
moozilla
OP is likely referring to people who call LLMs "stochastic parrots" (https://en.wikipedia.org/wiki/Stochastic_parrot), and by "doomers" (not boomers) they likely mean AI safetyists like Eliezer Yudkowsky or Pause AI (https://pauseai.info/).
sgt101
I wish the title wasn't so aggressively anti-tech though. The problem is that I would like to push this course at work, but doing so would be suicidal in career terms because I would be seen as negative and disruptive.
So the good message here is likely to miss the mark where it may be most needed.
beepbooptheory
Really? I am curious how this could be disruptive in any meaningful sense. Whose feelings could possibly be hurt? It feels like getting offended by a course on libraries because the course talks about how sometimes the book is checked out.
mpbart
Any executive who is fully bought in on the AI hype could see someone in their org recommending this as working against their interest and take action accordingly.
sgt101
Yes. This is the issue.
"not on board", "anti-innovation", "not a team player", "disruptive", "unhelpful", "negative".
bye bye bye bye....
I see a lot of devs and ICs taking the attitude that "facts are facts" and then getting shocked by a) other people manipulating information to get their way and b) being fired for stating facts that are contrary to received wisdom without any regard to politics.
hcs
> It feels like getting offended by a course on libraries because the course talks about how sometimes the book is checked out.
If it was called "Are libraries bullshit?" it is easy to imagine defensiveness in response. There's some narrow sense in which "bullshit" is a technical term, but it's still a mild obscenity in many cultures.
misterflibble
Thank you @ctbergstrom for this valuable and, most importantly, objective course. I'm bookmarking this and sharing it with everyone.
hirenj
This is a great resource, thanks. We (myself, a bioinformatician, and my co-coordinators, who are clinicians) are currently designing a course to hopefully arm medical students with the basic knowledge they need to navigate the changing world of medicine in light of ML and LLM advances. Our goal is not only to demystify medical ML, but also to give them a sense of the possibilities of these technologies, and maybe illustrate pathways for adoption, in the safest way possible.
Even just in the process of putting this course together, it is scary how much stuff is being tried out right now and treated like a magic box with correct answers.
sabas123
> currently designing a course to hopefully arm medical students with the required basic knowledge they need to navigate the changing world of medicine in light of the ML and LLM advances
Could you share what you think would be some key basic points what they should learn? Personally I see this landscape changing so insanely much that I don't even know what to prepare for.
aaplok
Really well done. It is really a challenge for students to navigate their way around the AI landscape. I am definitely considering sharing that with my students.
Have you noticed a difference in how your students approach LLMs after taking your course? A possible issue I see is that it is preaching to the choir; a student who is inclined to use LLMs for everything is less likely to engage with the material in the first place.
If you allow feedback, I was interested in lesson 10 on writing, as an educator who tries to teach my science/IT/maths students the importance of being able to communicate.
I would suggest including a paragraph explaining why being able to write without LLMs is just as important in scientific disciplines, where precision and accuracy are more essential than creativity and personalisation.
ctbergstrom
This is an excellent point about scientific writing. We'll add something to that effect.
We have not taught this course from the web-based materials yet, but it distills much of the two-week unit that we covered in our "Calling Bullshit" course this past autumn. We find that our students are generally very interested in better understanding the LLMs that they are using — and almost every one of them does use them, to varying degrees. (Of course, there may be some selection bias in that the 180 students who sign up for a course on data reasoning may be more curious and more skeptical than average.)
zaptheimpaler
I like it. It's pretty basic, but it is very good for a broad audience and covers things many people don't understand. I liked that you mentioned not to anthropomorphize the model. We would benefit from 50+ year old policymakers taking the course even more than from 19 year old freshmen.
pama
I wonder if the authors can explain the apparent inconsistency between what we now know about R1 and their statement “They don’t engage in logical reasoning” from the first lesson. My simple-minded view of logical reasoning by LLMs is that a hard question (say, a math puzzle) has a verifiable answer that is hard to produce but easy to verify, yet within the realm of knowledge of humans or the LLM itself, so the “thought” stream allows the LLM to increase its confidence through a self-discovered process that resembles human reasoning before it starts to write the answer stream. Much of the thought process that these LLMs use looks like conventional reasoning and logic, or more generally like higher-level algorithms for gaining confidence in an answer, while other parts are not possible for humans to understand (yet?) despite the best efforts by DeepSeek. When combined with tools for the boring parts, these “reasoning” approaches can start to resemble human research processes, as per Deep Research by OpenAI.
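(A concrete, if very simplified, instance of the "hard to produce, easy to verify" asymmetry described above; factoring is my own stand-in example here, not something taken from the course or from the R1 paper.)

    # Verifying a candidate answer is cheap even when finding it is expensive.
    def find_factors(n):
        # Expensive part: brute-force search for a nontrivial factorization.
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                return d, n // d
        return None

    def verify_factors(n, d, e):
        # Cheap part: a single multiplication confirms or refutes the answer.
        return d > 1 and e > 1 and d * e == n

    n = 999983 * 1000003        # a large composite number
    answer = find_factors(n)    # slow: the search/"reasoning" step
    print(answer, verify_factors(n, *answer))   # fast: checking the result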
prisenco
Fantastic work.
Quick suggestion: a link at the bottom of the page to the next and previous lesson would help with navigation a ton.
ctbergstrom
Absolutely. Great point. I just finished updating accordingly.
My design options are a bit limited so I went with a simple link to the next lesson.
threecheese
Looks like you pushed this midway through my read; I was pleasantly surprised to suddenly find breadcrumbs at the end and didn’t need to keep two tabs open. Great work, and I mean in total - this is well written and understandable to the layman.
ctbergstrom
Yep, I probably did. I really appreciate all of the feedback people are providing!
ndstephens
Really enjoying this. Thank you for the great work. I'm currently on Lesson 11 and noticed a couple typos (missing words). I haven't found anywhere on the site itself where I could send feedback to report such a thing (maybe I missed it). Hopefully you aren't offended if I post them here.
I think the easiest way to point them out is to just have you search for the partial line of text while on Lesson 11 and you'll see the spots.
"No one is going to motivated by a robotic..." (missing the word "be")
"People who are given a possible solution to a problem tend to less creative at..." (again missing the word "be")
ctbergstrom
Thank you very much — fixed!
Jevin West and I are professors of data science and biology, respectively, at the University of Washington. After talking to literally hundreds of educators, employers, researchers, and policymakers, we have spent the last eight months developing the course on large language models (LLMs) that we think every college freshman needs to take.
https://thebullshitmachines.com
This is not a computer science course; it’s a humanities course about how to learn and work and thrive in an AI world. Neither instructor nor students need a technical background. Our instructor guide provides a choice of activities for each lesson that will easily fill an hour-long class.
The entire course is available freely online. Our 18 online lessons each take 5-10 minutes; each illuminates one core principle. They are suitable for self-study, but have been tailored for teaching in a flipped classroom.
The course is a sequel of sorts to our course (and book) Calling Bullshit. We hope that, like its predecessor, it will be widely adopted worldwide.
Large language models are both powerful tools, and mindless—even dangerous—bullshit machines. We want students to explore how to resolve this dialectic. Our viewpoint is cautious, but not deflationary. We marvel at what LLMs can do and how amazing they can seem at times—but we also recognize the huge potential for abuse, we chafe at the excessive hype around their capabilities, and we worry about how they will change society. We don't think lecturing at students about right and wrong works nearly as well as letting students explore these issues for themselves, and the design of our course reflects this.