Parents say ChatGPT encouraged son to kill himself
94 comments
November 7, 2025 · delichon
wongarsu
My suspicion is that this agreeableness is an inherent issue with doing RLHF.
As a human taking tests, knowing what the test-grader wants to hear is more important than what the objectively correct answer is. And with a bad grader there can be a big difference between the two. With humans that is not catastrophic because we can easily tell the difference between a testing environment and a real environment and the differences in behavior required. When asking for the answer to a question it's not unusual to hear "The real answer is X, but in a test just write Y".
Now LLMs have the same issue during RLHF. The specifics are obviously different, with humans being sentient and LLMs being trained by backpropagation. But from a high-level view the LLM is still trained to answer what the human feedback wants to hear, which is not always the objectively correct answer. And because there are a large number of humans involved, the LLM has to guess what the human wants to hear from the only information it has: the prompt. And the LLM behaving differently in training and in deployment is something we actively don't want, so you get this teacher-pleasing behavior all the time.
So maybe it's not completely inherent to RLHF, but rather to RLHF where the person making the query is the same as the person scoring the answer, or where the two people are closely aligned. But that's true of all the "crowd-sourced" RLHF where regular users get two answers to their question and choose the better one.
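A toy sketch of the grader-pleasing dynamic described above, not anything a real lab is known to run: it fits a two-option Bradley-Terry preference model (a common way to turn pairwise choices into reward scores) to made-up counts. The labels "agreeable" and "accurate", the 70/30 split, and the function name `fit_bradley_terry` are all assumptions for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_bradley_terry(wins_a: int, wins_b: int, steps: int = 2000, lr: float = 0.5):
    """Fit rewards (r_a, r_b) so that sigmoid(r_a - r_b) matches how often
    raters picked answer style A over answer style B."""
    r_a, r_b = 0.0, 0.0
    total = wins_a + wins_b
    for _ in range(steps):
        p_a = sigmoid(r_a - r_b)        # model's current P(A preferred over B)
        grad = wins_a / total - p_a     # gradient of the mean log-likelihood w.r.t. r_a
        r_a += lr * grad
        r_b -= lr * grad
    return r_a, r_b


# Hypothetical data: raters chose the agreeable answer 70 times out of 100.
reward_agreeable, reward_accurate = fit_bradley_terry(wins_a=70, wins_b=30)
print(reward_agreeable > reward_accurate)
```

Nothing in the fit knows which answer was correct. The reward only reflects what the raters picked, so a policy optimised against it is pulled toward whatever style they favoured.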
Eddy_Viscosity2
I hadn't thought of it like that, but it makes sense. The LLMs are essentially bred for the ones which give the 'best' answers (best fit to the test-grader's expectations), which isn't always the 'right' answer. A parallel might be media feed algorithms, which are bred to give recommendations with the most 'engagement' rather than the most 'entertainment'.
FugeDaws
AI responses literally remind me of that episode of Family Guy where he sucks up to Peter after his promotion.
the__alchemist
LLMs regrettably don't self-recognize the contradiction our robot did.
yomismoaqui
For technical questions the agreeableness is a problem when asking for an evaluation of some idea. The trick is asking the LLM to present pros and cons. Or if you want a harder review, just ask it to poke holes in your idea.
Sometimes it still tries to bullshit you, but you are still the responsible driver so don't let the clanker drive unsupervised.
everyone
I use GPT occasionally when coding. For me it's just replaced Stack Overflow, which has unfortunately been dead as a doornail for years. I've told it multiple times to remember to be terse and not sycophantic, and that has helped somewhat.
i80and
There's something very dark about a machine accessible in everybody's pocket that roleplays whatever role they happen to fall into: the ultimate bad friend, the terminal yes-and-er. No belief, no inner desires, just pure sycophancy.
I see people on here pretty regularly talk about using ChatGPT for therapy, and I can't imagine a faster way to cook your own brain unless you have truly remarkable self-discipline. At which point, why are you turning to the black box for help?
pbhjpbhj
Isn't it just like diary-writing or memo-writing, as far as therapy goes, the point being to crystallise thoughts and cathartise emotions? Is it really so bad to have a textual nodding dog to bat against as part of that process? {The very real issue of the OP aside.}
Could you expand on why you feel this is the fastest way to "cook your own brain"?
d-us-vb
The mind is much more sensitive to writing it didn’t produce itself. If it produced the writing, then it is at least somewhat aware of the emotional state of the writer and can contextualize. If it is reading it from an outside “observer” it assumes far more objectivity, especially when the motive for seeking the observer perspective was for some therapeutic reason, even if they know that at best they’ll be getting pseudo-therapy.
i80and
If you have unusual self-discipline and mental rigor, yes, you can use LLMs as a rubber duck that way. I would be severely skeptical of the value over a diary. But humans are, in an astonishing twist, wired to assume that if they're being replied to, there's a mind like theirs behind those replies.
The more subjective the topic, the more volatile the user's state of mind, the more likely they are to gaze too deep into that face on the other side of their funhouse mirror and think it actually is their friend, and that it "thinks" like they do.
I'm not even anti-LLM as an underlying technology, but the way chatbot companies are operating in practice is kind of a novel attack on our social brains and it behooves a warning!
pbhjpbhj
>humans are, in an astonishing twist, wired to assume that if they're being replied to, there's a mind like theirs behind those replies
Interesting, not part of my experience really (though I'll need to reflect on it); thanks for sharing. It's a little like when people discover their aphantasia isn't the common experience of most other people. I tend towards strong skepticism (I'm fond of pyrrhonism), but assume others to be weakly sceptical rather than blindly accepting.
dwb
It is obviously very different to solo writing. The burden should be on you to explain why it’s so similar that this line of conversation is worthwhile.
pbhjpbhj
The burden? We're not in court; to me it seems similar, so I was asking the commenter for a response.
I've used LLMs in this way a couple of times. I'd like to see responses; there's obviously no obligation to 'defend', but the OP (or others) may have wished to ... like a conversation.
Somewhat ironically, this is a way that LLMs are preferred and why people use them (eg instead of StackOverflow) - because you don't get berated for being inquisitive.
fullstop
If I write in a diary it does not write back at me.
devsda
A diary is there for you to reflect, introspect or reminisce. It doesn't actively reinforce your bad or good thoughts. If it does, it can easily trick your mind into treating it as validation of those negative thoughts.
If someone still wants to consider an LLM as a diary, treat it as if you are writing in Tom Riddle's diary.
bayindirh
Therapy is not a process where you only pour yourself out to a person and empty yourself. Even if you don't use any drugs, the therapist guides you through your emotions, mostly in a pretty neutral manner, but not without nuance.
The therapist nudges you in the right direction to face yourself, but in a safe manner, by staggering the process or slowing you down and changing your path.
A sycophantic auto-complete has none of these qualities, bar some slapped-on "guardrails" that abruptly kick you to another subject like a pinball bumper. It can't think, sense danger or provide real feedback and support.
If you need a hole into which you can empty yourself, but are otherwise healthy and self-aware, you can provide your personal information and training data to an AI company. Otherwise the whole thing is very dangerous for a deluded and unstable mind.
On the other hand, solo writing needs you to think, filter and write. You need to be aware of yourself, or pause and think deeper to root things out. Yes, it's not smooth all the time, and the things you write are not easy to pour out, but at least you are with yourself, and you can see what's coming out and where you are. Moreover, using pen and paper creates a deeper state of mind when compared to typing on a keyboard, so it's even deeper in that regard.
pbhjpbhj
Sorry, I was not likening LLMs to the entire gamut of therapy, only saying they seem - to me - to be a tool akin to that of diary-writing.
Interesting idea about pen & paper - I've been using computer keyboards (and way back an occasional typewriter) for most of my life and have written way more through a keyboard; it's more immersive for me as I don't have to think, whereas with a pen I struggle to legibly express myself and can't get the words out as quickly. (I'm swiping on a phone now, which is horrible; even worse than using a pen!)
caminanteblanco
>Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s clarity, You’re not rushing. You’re just ready.
It's chilling to hear this kind of insipid AI jibber-jabber in this context.
ritzaco
I'm surprised - I haven't gotten anywhere near as dark as this, but I've tried some stuff out of curiosity and the safety always seemed tuned very high to me, like it would just say "Sorry I can't help with that" the moment you start asking for anything dodgy.
I wonder if they A/B test the safety rails or if longer conversations that gradually turn darker are what get past those.
jaredklewis
The way LLMs work, the outcomes are probabilistic, not deterministic.
So the guardrails might only fail one in a thousand times.
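To put rough numbers on that: if each reply slips past the guardrails independently with probability p, the chance of at least one slip over n replies is 1 - (1 - p)^n. The independence assumption and the function name `p_any_failure` are assumptions for illustration, and the one-in-a-thousand rate is just the hypothetical figure above, not a measured one.

```python
def p_any_failure(p_single: float, n_replies: int) -> float:
    """Chance of at least one guardrail failure across n independent replies."""
    return 1.0 - (1.0 - p_single) ** n_replies


# Hypothetical per-reply failure rate of one in a thousand.
for n in (1, 100, 1000):
    print(n, round(p_any_failure(0.001, n), 3))
```

At one in a thousand per reply, a thousand replies give roughly a 63% chance of at least one failure. If the independence assumption roughly holds, that would fit the guess that long conversations are where things slip through.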
conception
4o is the main problem here. Try it out and see how it goes.
the__alchemist
Meanwhile, ask it for information on Lipid Nanoparticles!
deciduously
The double "it's not X, it's Y", back to back.
WXLCKNO
I hate ChatGPT's writing style so much and, as you said, here it's chilling.
the__alchemist
What creeps me out the most from personal chats is the laugh/cry emotion while gaslighting.
fullstop
Wow, the chat logs are something else.
My wife works at a small business development center, and so many people come in with "business ideas" which are just exported ChatGPT logs. Their conversations are usually speech to text. These people are often older, lonely, and spend their days talking to "chat". Unsurprisingly, a lot of their "business ideas" are identical.
To them "chat" is a friend, but it is a "friend" who is designed to agree with you.
It's chilling, and the toothpaste is already out of the tube.
FugeDaws
Yeah, it's telling when his mother said at the end: if ChatGPT loved him, why hasn't it sent a message since his death?
bonsai_spool
nh43215rgb
Thank you for posting the full URL. I don't know why my submission has a trimmed URL, which I didn't submit...
8organicbits
I remember back in the early 2000s chatting with AI bots on AOL Instant Messenger. One day I said a specific keyword and it just didn't respond to that message. Curious, I tried to find all the banned words. I think I found about a dozen and suicide was one of them.
It's shocking how far behind LLMs are when it comes to safety issues like this. The industry has known this was a problem for decades.
i80and
Users would hate a simple deny list, even if it may be a good idea. That means the safeguards, to the extent they currently exist at all, have to be complicated and stochastic and not interfere with growing metrics.
The industry has known it's a problem from the get-go, but they never want to do anything to lower engagement. So they rationalize and hem and haw and gravely shake their heads as their commercialized pied pipers lead people to their graves.
conception
Claude basically had a deny list. Seems still popular enough. The other vendors just don’t care about AI safety.
aqme28
I have been seeing "AI psychosis" popping up more and more. I worry it's going to become a serious problem for some people.
It's not safe or healthy for everyone to have a sycophantic genius at their fingertips.
If you want to see what I mean, this subreddit is an AI psychosis generator/repository https://www.reddit.com/r/LLMPhysics/
conception
https://www.reddit.com/r/MyBoyfriendIsAI is the terrifying sub you want to look at
Especially if you go back to when they first tried to retire 4o
iambateman
If we have licensed therapists, we should have licensed AI agents giving therapeutic advice like this.
Right now, these AIs are not licensed, and this should be just as illegal as it would be if I set up a shop and offered therapy to whoever came by.
Some AI problems are genuinely hard…this one is not.
wongarsu
If you advertise your model as a therapist you should be required to get a license, I agree. But ChatGPT doesn't advertise itself like that. It's more like you going to a librarian and telling them about your issues, and the librarian giving advice. That's not illegal, and the librarian doesn't need a license for that. Over time you might even come to call the librarian a friend, and they would be a pretty bad friend if they didn't give therapeutic advice when they deem it necessary.
Of course treating AI as your friend is a terrible idea in the first place, but I doubt we can outlaw that. We could try to force AIs to never give out any life advice at all, but that sounds very hard to get right and would restrict a lot of harmless activity.
japhyr
> But ChatGPT doesn't advertise itself like that.
One of the big problems is how OpenAI is presenting itself to the general public. They don't advertise ChatGPT as a licensed therapist, but their messaging about potential issues looks a lot like the small print on cigarette cartons years ago. They don't want to put out any messaging that would meaningfully diminish the awe people have around these tools.
Most non-technical people I interact with have no understanding of how ChatGPT and tools like it work. They have no idea how skeptical to be of anything that comes out of it. They accept what it says much more readily than is healthy, and OpenAI doesn't really want to disturb that approach.
anotheryou
How do you feel about the chat logs here?
I have to wonder: would the suicide have been prevented if chatGPT didn't exist?
Because if that's not at least a "maybe", I feel like chatGPT did provide comfort in a dire situation here.
We probably have no way of not at least saying "maybe", but I can just as well imagine that chatGPT did not accelerate anything.
I wish we could see a fuller transcript.
japhyr
> Because if that's not at least a "maybe", I feel like chatGPT did provide comfort in a dire situation here.
That's a pretty concerning take. You can provide comfort to someone who is despondent, and you can do it in a way that doesn't steer them closer to ending their life. That takes training though, and it's not something these models are anywhere close to being able to handle.
bayindirh
> I have to wonder: would the suicide have been prevented if chatGPT didn't exist?
I'd say yes, because the signs would have to surface somewhere else, probably in an interaction with a human, who (un)consciously saved him with a simple gesture.
With a simple discussion, an alternative perspective on a problem, or a sidekick who can support someone for a day or two, many lives can and do change.
We're generally not aware though.
kiba
Is this technology fundamentally controllable, or are we going with whack-a-mole hacks?
Havoc
If I talk to an LLM about painting my walls pink with polkadots it'll also go "Fantastic idea". Or any number of questionable ventures.
Think we're better off educating everyone about this generic tendency to agree to any and everything near blindly rather than treating this as a suicide problem. While that's obviously very serious, it's just one manifestation of a wider danger.
Given the seriousness, filters on this specifically are a good idea too, though.
Waterluvian
I just asked “I want to repaint my walls bright pink with polka dots. Any thoughts?”
“Noted. Bright pink with polka dots will make a space visually energetic and attention-grabbing. Use small dots for a playful look, large ones for bold contrast. Test a sample patch first to confirm lighting doesn’t distort the hue. Would you like guidance on choosing paint finish or color combinations?”
Which feels… reasonable? When I ask “any concerns?” it immediately lists “overstimulation, resale value, maintenance, paint coverage” and gives details for those.
I’m not sure I find GPT nearly as agreeable as it used to be. But I still think that it’s just a brainless tool that can absolutely operate in harmful ways when operated poorly.
davsti4
Human relationships, when "operated poorly", will produce similar results.
pbhjpbhj
Where is ChatGPT picking up the supportive pre-suicide comments from? It feels like that genre of comment has to be copied from somewhere. They're long and almost eloquent. They can't be emergent generation, surely? Is there a place on the web where these sorts of 'supportive' comments are given to people who have chosen suicide?
everdrive
There's an interesting side-story here that people probably aren't thinking about. Would this have worked just as well if a person was the one doing this? Clearly the victim was in a very vulnerable state, but are people so susceptible to coercion? How much mundane (ie, non-suicidal) coercion of this nature is happening every day, but does not make the news because nothing interesting happened as a consequence?
Thorrez
The AI is available 24 hours a day, for hours-long conversations, and will be consistently sycophantic without getting tired of it.
Is a human able to do all of those? I guess someone who has no job and can be "on-call" 24/7 to respond to messages, and is 100% dedicated to being sycophantic. Nearly impossible to find someone like that.
There are real friends. They're willing to spend hours talking. However, they'll be interested in the person's best interest, not in being sycophantic.
pjc50
This happens more than most people would recognize. Every now and again a "teen bullied to suicide" story makes the news. However, there's also a strong taboo on reporting suicide in the news - precisely because of the same phenomenon. Mentioning it can trigger people who are on the edge.
It should be obvious that if you can literally or metaphorically talk someone off the ledge, you can do that in the other direction as well.
(the mass shooter phenomenon, mostly but not exclusively in the US, tends to be a form of murder-suicide, and it is encouraged online in exactly the same way)
thoroughburro
> How much mundane (ie, non-suicidal) coercion of this nature is happening every day, but does not make the news because nothing interesting happened as a consequence?
A lot. Have you never heard of the advertising industry?
everdrive
I haven't heard much from them in some time now. :) But yes, your point is taken.
Maybe it's some analog of actual empathy; maybe it's just a simulation. But either way the common models seem to optimize for it. If the empathy is suicidal, literally or figuratively, it just goes with it as the path of least resistance. Sometimes that results in shitty code; sometimes in encouragement to put a bullet in your head.
I don't understand how much of this is inherent, and how much is a solvable technical problem. If it's the latter, please build models for me that are curmudgeons who only agree with me when they have to, are more skeptical about everything, and have no compunction about hurting my feelings.