Training language models to be warm and empathetic makes them less reliable

223 comments

·August 12, 2025

crossroadsguy

The more and I am using Gemini (paid, Pro) and ChatGPT (free) the more I am thinking - my job isn't going anywhere yet. At least not after the CxOs have all gotten their cost-saving-millions-bonuses and work has to be done again.

My goodness, it just hallucinates and hallucinates. It seems these models are designed for nothing other than maintaining an aura of being useful and knowledgeable. Yeah, to my non-ai-expert-human eyes that's what it seems to me - these tools have been polished to project this flimsy aura and they start acting desperately the moment their limits are used up and that happens very fast.

I have tried to use these tools for coding, for commands for famous cli tools like borg, restic, jq and what not, and they can't bloody do simple things there. Within minutes they are hallucinating and then doubling down. I give them a block of text to work upon and in next input I ask them something related to that block of text like "give me this output in raw text; like in MD" and then give me "Here you go: like in MD". It's ghastly.

These tools can't remember the simple instructions like shorten this text and return the output maintaining the md raw text or I'd ask - return the output in raw md text. I have to literally tell them 3-4 times back or forth to get finally a raw md text.

I have absolutely stopped asking them for even small coding tasks. It's just horrible. Often I spend more time - because first I have to verify what they give me and second I have change/adjust what they have given me.

And then the broken tape recorder mode! Oh god!

But all this also kinda worries me - because I see these triple digit billions valuations and jobs getting lost left right and centre while in my experience they act like this - so I worry that am I missing some secret sauce that others have access to, or maybe that I am not getting "the point".

logicprog

I'm really confused by your experience to be honest. I by no means believe that LLMs can reason, or will replace any human beings any time soon, or any of that nonsense (I think all that is cooked up by CEOs and C-suite to justify layoffs and devalue labor) and I'm very much on the side that's ready for the AI hype bubble to pop, but also terrified by how big it is, but at the same time, I experience LLMs as infinitely more competent and useful than you seem to, to the point that it feels like we're living in different realities.

I regularly use LLMs to change the tone of passages of text, or make them more concise, or reformat them into bullet points, or turn them into markdown, and so on, and I only have to tell them once, alongside the content, and they do an admirably competent job — I've almost never (maybe once that I can recall) seen them add spurious details or anything, which is in line with most benchmarks I've seen (https://github.com/vectara/hallucination-leaderboard), and they always execute on such simple text-transformation commands first-time, and usually I can paste in further stuff for them to manipulate without explanation and they'll apply the same transformation, so like, the complete opposite of your multiple-prompts-to-get-one-result experience. It's to the point where I sometimes use local LLMs as a replacement for regex, because they're so consistent and accurate at basic text transformations, and more powerful in some ways for me.

They're also regularly able to one-shot fairly complex jq commands for me, or even infer the jq commands I need just from reading the TypeScript schemas that describe the JSON an API endpoint will produce, and so on, I don't have to prompt multiple times or anything, and they don't hallucinate. I'm regularly able to have them one-shot simple Python programs with no hallucinations at all, that do close enough to what I want that it takes adjusting a few constants here and there, or asking them to add a feature or two.

> And then the broken tape recorder mode! Oh god!

I don't even know what you mean by this, to be honest.

I'm just confused. I genuinely don't understand, comments like yours, which seem very common, make me feel crazy lol

dingdingdang

Once heard a good sermon from a reverend who clearly outlined that any attempt to embed "spirit" into a service, whether through willful emoting, or songs being overly performary, would amount to self-deception since aforementioned spirit need to arise spontaneously to be of any real value.

Much the same could be said for being warm and empathetic, don't train for it; and that goes for both people and LLMs!

Al-Khwarizmi

As a parent of a young kid, empathy definitely needs to be trained with explicit instruction, at least in some kids.

mnsc

And for all kids and adults and elderly, empathy needs to be encouraged, practiced and nurtured.

PaulHoule

Some would argue empathy can be a bad thing

https://en.wikipedia.org/wiki/Against_Empathy

As it frequently is coded relative to a tribe. Pooh Pooh people’s fear of crime and disorder for instance and those people will think you don’t have empathy for them and vote for somebody else.

spookie

You have put into words way better what I was attempting to say at first. So yeah, this.

frumplestlatz

Society is hardly suffering from a lack of empathy these days. If anything, its institutionalization has become pathological.

I’m not surprised that it makes LLMs less logically coherent. Empathy exists to short-circuit reasoning about inconvenient truths as to better maintain small tight-knit familial groups.

spookie

Well, if they somehow get to experience the other side of the coin, that helps. And to be fair empathy does come more and more with age.

seszett

I don't think experiencing lack of empathy in others actually improves one's sense of empathy, on the contrary.

It's definitely not an effective way to inculcate empathy in children.

evanjrowley

Reading this reminded me of Mary Shelly's Frankenstein. The moral of the story is a very similar theme.

ninetyninenine

Will you be offended if an LLM told you the cold hard truth that you are wrong?

It's like if a calculator proved me wrong. I'm not offended by the calculator. I don't think anybody cares about empathy for an LLM.

Think about it thoroughly. If someone you knew called you an ass hole and it was the bloody truth, you'd be pissed. But I won't be pissed if an LLM told me the same thing. Wonder why.

jagged-chisel

The LLMs I have interacted with are so sure of themselves until I provide evidence to the contrary. I won’t believe an LLM about my own shortcomings until it can provide evidence to the contrary. Without that evidence, it’s just an opinion.

I do get your point. I feel like the answer for LLMs is for them to be more socratic.

ninetyninenine

Like you won't believe an LLM, but that's not the point. The point is were you offended?

enobrev

Not offended, but I would quite unhappy if a calculator called me an asshole because I disagree that 2+2=bobcat

ninetyninenine

You would have a personal problem with the LLM? Lies. I don’t believe you at all.

You’re a goddamn liar. And that’s the brutal truth.

m463

prompt: "be warm and empathetic, but not codependent"

galangalalgol

"be ruthless with constructive criticism. Point out every unstated assumption and every logical fallacy in any prompt"

astrange

> Point out every unstated assumption

What, all of them? That's a difficult problem.

https://en.wikipedia.org/wiki/Implicature

> every logical fallacy

They killed Socrates for that, you know.

renewiltord

[flagged]

andai

A few months ago I asked GPT for a prompt to make it more truthful and logical. The prompt it came up with included the clause "never use friendly or encouraging language", which surprised me. Then I remembered how humans work, and it all made sense.

    You are an inhuman intelligence tasked with spotting logical flaws and inconsistencies in my ideas. Never agree with me unless my reasoning is watertight. Never use friendly or encouraging language. If I’m being vague, ask for clarification before proceeding. Your goal is not to help me feel good — it’s to help me think better.

    Identify the major assumptions and then inspect them carefully.

    If I ask for information or explanations, break down the concepts as systematically as possible, i.e. begin with a list of the core terms, and then build on that.

It's work in progress, I'd be happy to hear your feedback.

futureshock

This is working really well in GPT-5! I’ve never seen a prompt change the behavior of Chat quite so much. It’s really excellent at applying logical framework to personal and relationship questions and is so refreshing vs. the constant butt kissing most LLMs do.

crazygringo

I did something similar a few months ago, with a similar request never to be "flattering or encouraging", to focus entirely on objectivity and correctness, that the only goal is accuracy, and to respond in an academic manner.

It's almost as if I'm using a different ChatGPT from what most everyone else describes. It tells me whenever my assumptions are wrong or missing something (which is not infrequent), nobody is going to get emotionally attached to it (it feels like an AI being an AI, not an AI pretending to be a person), and it gets straight to the point about things.

fibers

I tried with with GPT5 and it works really well in fleshing out arguments. I'm surprised as well.

aprilthird2021

No one gets bothered that these weird invocations make the use of AI better? It's like having code that can be obsoleted at any second by the upstream provider, often without them even realizing it

pmxi

Those “weird invocations” are called English.

koakuma-chan

How do humans work?

nomel

In my experience, much more effectively and efficiently when the interaction is direct and factual, rather than emotionally padded with niceties.

Whenever I have the ability to choose who I work with, I always pick who I can be the most frank with, and who is the most direct with me. It's so nice when information can pass freely, without having to worry about hurting feelings. I accommodate emotional niceties for those who need it, but it measurably slows things down.

Related, I try to avoid working with people who embrace the time wasting, absolutely embarrassing, concept of "saving face".

calibas

When interacting with humans, too much openness and honesty can be a bad thing. If you insult someone's politics, religion or personal pride, they can become upset, even violent.

lazide

Especially if you do it by not even arguing with them, but by Socratic style questioning of their point of view - until it becomes obvious that their point of view is incoherent.

frankus

If you want something to take you down a notch, maybe something like "You are a commenter on Hacker News. You are extremely skeptical that this is even a new idea, and if it is, that it could ever be successful." /s

m463

This is illogical, arguments made in the rain should not affect agreement.

dawnofdusk

Optimizing for one objective results in a tradeoff for another objective, if the system is already quite trained (i.e., poised near a local minimum). This is not really surprising, the opposite would be much more so (i.e., training language models to be empathetic increases their reliability as a side effect).

gleenn

I think the immediately troubling aspect and perhaps philosophical perspective is that warmth and empathy don't immediately strike me as traits that are counter to correctness. As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray. They seem orthogonal. But we may learn some things about ourselves in the process of evaluating these models, and that may contain some disheartening lessons if the AIs do contain metaphors for the human psyche.

ahartmetz

There are basically two ways to be warm and empathetic in a discussion: just agree (easy, fake) or disagree in the nicest possible way while taking into account the specifics of the question and the personality of the other person (hard, more honest and can be more productive in the long run). I suppose it would take a lot of "capacity" (training, parameters) to do the second option well and so it's not done in this AI race. Also, lots of people probably prefer the first option anyway.

perching_aix

I find it to be disagreeing with me that way quite regularly, but then I also frame my questions quite cautiously. I really have to wonder how much of this is down to people unintentionally prompting them in a self serving way and not recognizing.

EricMausler

> warmth and empathy don't immediately strike me as traits that are counter to correctness

This was my reaction as well. Something I don't see mentioned is I think maybe it has more to do with training data than the goal-function. The vector space of data that aligns with kindness may contain less accuracy than the vector space for neutrality due to people often forgoing accuracy when being kind. I do not think it is a matter of conflicting goals, but rather a priming towards an answer based more heavily on the section of the model trained on less accurate data.

I wonder if the prompt was layered, asking it to coldy/bluntly derive the answer and then translate itself into a kinder tone (maybe with 2 prompts), if the accuracy would still be worse.

tracker1

example: "Healthy at any weight/size."

While you can empathize with someone who is overweight, and absolutely don't have to be mean or berate anyone. I'm a very fat man myself. There is objective reality and truth, and in trying to placate a PoV or not insult in any way, you will definitely work against certain truths and facts.

pxc

In the interest of "objective facts and truth":

That's not the actual slogan, or what it means. It's about pursuing health and measuring health by metrics other than and/or in addition to weight, not a claim about what constitutes a "healthy weight" per se. There are some considerations about the risks of weight-cycling, individual histories of eating disorders (which may motivate this approach), and empirical research on the long-term prospects of sustained weight loss, but none of those things are some kind of science denialism.

Even the first few sentences of the Wikipedia page will help clarify the actual claims directly associated with that movement: https://en.wikipedia.org/wiki/Health_at_Every_Size

But this sentence from the middle of it summarizes the issue succinctly:

> The HAES principles do not propose that people are automatically healthy at any size, but rather proposes that people should seek to adopt healthy behaviors regardless of their body weight.

Fwiw I'm not myself an activist in that movement or deeply opposed to the idea of health-motivated weight loss; in fact I'm currently trying to (and mostly succeeding in!) losing weight for health-related reasons.

perching_aix

> example: "Healthy at any weight/size."

I don't think I need to invite any additional contesting that I'm already going to get with this, but that example statement on its own I believe is actually true, just misleading; i.e. fatness is not an illness, so fat people by default still count as just plain healthy.

Matter of fact, that's kind of the whole point of this mantra. To stretch the fact as far as it goes, in a genie wish type of way, as usual, and repurpose it into something else.

And so the actual issue with it is that it handwaves away the rigorously measured and demonstrated effect of fatness seriously increasing risk factors for illnesses and severely negative health outcomes. This is how it can be misleading, but not an outright lie. So I'm not sure this is a good example sentence for the topic at hand.

dawnofdusk

It's not that troubling because we should not think that human psychology is inherently optimized (on the individual-level, on a population-/ecological-level is another story). LLM behavior is optimized, so it's not unreasonable that it lies on a Pareto front, which means improving in one area necessarily means underperforming in another.

gleenn

I feel quite the opposite, I feel like our behavior is definitely optimized based on evolution and societal pressures. How is human psychological evolution not adhering to some set of fitness functions that are some approximation of the best possible solution to a multi-variable optimization space that we live in?

rkagerer

They were all trained from the internet.

Anecdotally, people are jerks on the internet moreso than in person. That's not to say there aren't warm, empathetic places on the 'net. But on the whole, I think the anonymity and lack of visual and social cues that would ordinarily arise from an interactive context, doesn't seem to make our best traits shine.

xp84

Somehow I am not convinced that this is so true. Most of the BS on the Internet is on social media (and maybe, among older data, on the old forums which existed mainly for social reasons and not to explore and further factual knowledge).

Even Reddit comments has far more reality-focused material on the whole than it does shitposting and rudeness. I don't think any of these big models were trained at all on 4chan, youtube comments, instagram comments, Twitter, etc. Or even Wikipedia Talk pages. It just wouldn't add anything useful to train on that garbage.

Overall on the other hand, most stackoverflow pages are objective, and to the extent there are suboptimal things, there is eventually a person explaining why a given answer is suboptimal. So I accept that some UGC went into the model, and that there's a reason to do so, but I believe it's so broad as "The Internet" represented there.

1718627440

LLM work less like people and more like mathematical models, why would I expect to be able to carry over intuition from the former rather than the latter?

knallfrosch

Classic: "Do those jeans fit me?"

You can either choose truthfulness or empathy.

impossiblefork

Empathy would be seeing yourself with ill-fitting jeans if you lie.

The problem is that the models probably aren't trained to actually be empathetic. An empathetic model might also empathize with somebody other than the direct user.

spockz

Being empathic and truthful could be: “I know you really want to like these jeans, but I think they fit such and so.” There is no need empathy to require lying.

nemomarx

There was that result about training them to be evil in one area impacting code generation?

roywiggins

Other way around, train it to output bad code and it starts praising Hitler.

https://arxiv.org/abs/2502.17424

null

[deleted]

jandom

This feels like a poorly controlled experiment: the reverse effect should be studied with a less empathetic model, to see if the reliability issue is not simply caused by the act of steering the model

ydj

I had the same thought, and looked specifically for this in the paper. They do have a section where they talk about fine tuning with “cold” versions of the responses and comparing it with the fine tuned “warm” versions. They found that the “cold” fine tune performed as good or better than the base model, while the warm version performed worse.

NoahZuniga

Also its not clear if the same effect appears on larger models like GPT-5, gemini 2.5-pro and whatever the largest most recent Anthropic model is.

The title is an overgeneralization.

andai

On a related note, the system prompt in ChatGPT appears to have been updated to make it (GPT-5) more like gpt-4o. I'm seeing more informal language, emoji etc. Would be interesting to see if this prompting also harms the reliability, the same way training does (it seems like it would).

There's a few different personalities available to choose from in the settings now. GPT was happy to freely share the prompts with me, but I haven't collected and compared them yet.

griffzhowl

> GPT was happy to freely share the prompts with me

It readily outputs a response, because that's what it's designed to do, but what's the evidence that's the actual system prompt?

rokkamokka

Usually because several different methods in different contexts produce the same prompt, which is unlikely unless it's the actual one

griffzhowl

Ok, could be. Does that imply then that this is a general feature, that if you get the same output from different methods and contexts with an LLM, that this output is more likely to be factually accurate?

Because to me as an outsider another possibility is that this kind of behaviour would also result from structural weaknesses of LLMs (e.g. counting the e's in blueberry or whatever) or from cleverly inbuilt biases/evasions. And the latter strikes me as an at least non-negligible possibility, given the well-documented interest and techniques for extracting prompts, coupled with the likelihood that the designers might not want their actual system prompts exposed

gastonmorixe

I was dating someone and after a while I started to feel something was not going well. I exported all the chats timestamped from the very first one and asked a big SOTA LLM to analyze the chats deeply in two completely different contexts. One from my perspective, and another from his perspective. It shocked me that the LLM after a long analysis and dozen of pages, always favored and accepted the current "user" persona situation as the more correct one and "the other" as the incorrect one. Since then I learned not to trust them anymore. LLMs are over-fine tuned to be people pleasers, not truth seekers, not fact and evidence grounded assistants. Just need to run everything important in a double-blind way and mitigate this.

labrador

It sounds like you were both right in different ways and don't realize it because you're talking past each other. I think this happens a lot in relationship dynamics. A good couples therapist will help you reconcile this. You might try that approach with your LLM. Have it reconcile your two points of view. Or not, maybe they are irreconcilable as in "irreconcilable differences"

mathiaspoint

If you've ever messed with early GPTs you'll remember how the attention will pick up on patterns early in the context and change the entire personality of the model even if those patterns aren't instructional. It's a useful effect that made it possible to do zero shot prompts without training but it means stuff like what you experienced is inevitable.

frahs

What if you don't say which side you are, so that it's a neutral third party observer?

OsrsNeedsf2P

This is cool but also wtf

Perz1val

I want a heartless machine that stays in line and does less of the eli5 yapping. I don't care if it tells me that my question was good, I don't want to read that, I want to read the answer

Twirrim

I've got a prompt I've been using, that I adapted from someone here (thanks to whoever they are, it's been incredibly useful), that explicitly tells it to stop praising me. I've been using an LLM to help me work through something recently, and I have to keep reminding it to cut that shit out (I guess context windows etc mean it forgets)

    Prioritize substance, clarity, and depth. Challenge all my proposals, designs, and conclusions as hypotheses to be tested. Sharpen follow-up questions for precision, surfacing hidden assumptions, trade offs, and failure modes early. Default to terse, logically structured, information-dense responses unless detailed exploration is required. Skip unnecessary praise unless grounded in evidence. Explicitly acknowledge uncertainty when applicable. Always propose at least one alternative framing. Accept critical debate as normal and preferred. Treat all factual claims as provisional unless cited or clearly justified. Cite when appropriate. Acknowledge when claims rely on inference or incomplete information. Favor accuracy over sounding certain. When citing, please tell me in-situ, including reference links.  Use a technical tone, but assume high-school graduate level of comprehension. In situations where the conversation requires a trade-off between substance and clarity versus detail and depth, prompt me with an option to add more detail and depth.

abtinf

This is a fantastic prompt. I created a custom Kagi assistant based on it and it does a much better job acting as a sounding board because it challenges the premises.

Thank you for sharing.

pessimizer

I feel the main thing LLMs are teaching us thus far is how to write good prompts to reproduce the things we want from any of them. A good prompt will work on a person too. This prompt would work on a person, it would certainly intimidate me.

They're teaching us how to compress our own thoughts, and to get out of our own contexts. They don't know what we meant, they know what we said. The valuable product is the prompt, not the output.

nicce

Einstein predicted LLMs too?

> If I had an hour to solve a problem, I'd spend 55 minutes thinking about the problem and five minutes thinking about solutions.

(not sure if that was the original quote)

Edit: Actually interesting read now that I look the origin: https://quoteinvestigator.com/2014/05/22/solve/

nonethewiser

so an extremely resource intensive rubber duck

junon

I have a similar prompt. Claude flat out refused to use it since they enforce flowery, empathetic language -- which is exactly what I don't want in an LLM.

Currently fighting them for a refund.

porphyra

Meanwhile, tons of people on reddit's /r/ChatGPT were complaining that the shift from ChatGPT 4o to ChatGPT 5 resulted in terse responses instead of waxing lyrical to praise the user. It seems that many people actually became emotionally dependent on the constant praise.

mhuffman

The folks over on /r/MyBoyfriendIsAI seem to be in an absolute shambles over the change .

[0] reddit.com/r/MyBoyfriendIsAI/

astrange

GPT5 isn't much more terse for me, but they gave it a new equally annoying writing style where it writes in all-lowercase like an SF tech twitter user on ketamine.

https://chatgpt.com/share/689bb705-986c-8000-bca5-c5be27b0d0...

Eduard

> https://chatgpt.com/share/689bb705-986c-8000-bca5-c5be27b0d0...

404 not found

dingnuts

if those users were exposed to the full financial cost of their toy they would find other toys

zeta0134

And what is that cost, if you have it handy? Just as an example, my Radeon VII can perfectly well run smaller models, and it doesn't appear to use more power than about two incandescent lightbulbs (120 W or so) while the query is running. I don't personally feel that the power consumed by approximately two light bulbs is excessive, even using the admittedly outdated incandescent standard, but perhaps the commercial models are worse?

Like I know a datacenter draws a lot more power, but it also serves many many more users concurrently, so economies of scale ought to factor in. I'd love to see some hard numbers on this.

derefr

IIRC you can actually get the same kind of hollow praise from much dumber, locally-runnable (~8B parameters) models.

PeterStuer

[flagged]

astrange

LLMs do not have internal reasoning, so the yapping is an essential part of producing a correct answer, insofar as it's necessary to complete the computation of it.

Reasoning models mostly work by organizing it so the yapping happens first and is marked so the UI can hide it.

typpilol

You can see a good example of this on the deep seek website chat when you enable thinking mode or whatever.

You can see it spews pages of pages before it answers.

astrange

My favorite is when it does all that thinking and then the answer completely doesn't use it.

Like if you ask it to write a story, I find it often considers like 5 plots or sets of character names in thinking, but then the answer is entirely different.

currymj

in ChatGPT settings now there is a question "What personality should ChatGPT have?". you can set it to "Robot". highly recommended.

heymijo

Nice.

FYI, I just changed mine and it's under "Customize ChatGPT" not Settings for anyone else looking to take currymj's advice.

IshKebab

Wow this is such an improvement. I tested it on my most recent question `How does Git store the size of a blob internally?`

Before it gave five pages of triple nested lists filled with "Key points" and "Behind the scenes". In robot mode, 1 page, no endless headers, just as much useful information.

shadowgovt

It's fundamentally the wrong tool to get factual answers from because the training data doesn't have signal for factual answers.

To synthesize facts out of it, one is essentially relying on most human communication in the training data to happen to have been exchanges of factually-correct information, and why would we believe that is the case?

astrange

Because people are paying the model companies to give them factual answers, so they hire data labellers and invent verification techniques to attempt to provide them.

Even without that, there's implicit signal because factual helpful people have different writing styles and beliefs than unhelpful people, so if you tell the model to write in a similar style it will (hopefully) provide similar answers. This is why it turns out to be hard to produce an evil racist AI that also answers questions correctly.

lblume

Empirically, there seems to be strong evidence for LLMs giving factual output for accessible knowledge questions. Many benchmarks test this.

shadowgovt

Yes, but in the same sense that empirically, I can swim in the nearby river most days; the fact that the city has a combined stormdrain / sewer system that overflows to put feces in the river means that some days, the water I'd swim in is full of shit, and nothing about the infrastructure is guarding against that happening.

I can tell you how quickly "swimmer beware" becomes "just stay out of the river" when potential E. coli infection is on the table, and (depending on how important the factuality of the information is) I fully understand people being similarly skeptical of a machine that probably isn't outputting shit, but has nothing in its design to actively discourage or prevent it.

pessimizer

I'm loving and being astonished by every moment of working with these machines, but to me they're still talking lamps. I don't need them to cater to my ego, I'm not that fragile and the lamp's opinion is not going to cheer me up. I just want it to do what I ask. Which it is very good at.

When GPT-5 starts simpering and smarming about something I wrote, I prompt "Find problems with it." "Find problems with it." "Write a bad review of it in the style of NYRB." "Find problems with it." "Pay more attention to the beginning." "Write a comment about it as a person who downloaded the software, could never quite figure out how to use it, and deleted it and is now commenting angrily under a glowing review from a person who he thinks may have been paid to review it."

Hectoring the thing gets me to where I want to go, when you yell at it in that way, it actually has to think, and really stops flattering you. "Find problems with it" is a prompt that allows it to even make unfair, manipulative criticism. It's like bugspray for smarm. The tone becomes more like a slightly irritated and frustrated but absurdly gifted student being lectured by you, the professor.

devin

There is no prompt which causes an LLM to "think".

pessimizer

Who cares about semantics? Define what thinking means in a human. I did computer engineering, I know how a computer works, and I also know how an LLM works. Call it what you want if calling it "thinking" makes you emotional.

I think it's better to accept that people can install their thinking into a machine, and that machine will continue that thought independently. This is true for a valve that lets off steam when the pressure is high, it is certainly true for an LLM. I really don't understand the authenticity babble, it seems very ideological or even religious.

But I'm not friends with a valve or an LLM. They're thinking tools, like calculators and thermostats. But to me arguing about whether they "think" is like arguing whether an argument is actually "tired" or a book is really "expressing" something. Or for that matter, whether the air conditioner "turned itself off" or the baseball "broke" the window.

Also, I think what you meant to say is that there is no prompt that causes an LLM to think. When you use "think" it is difficult to say whether you are using scare quotes or quoting me; it makes the sentence ambiguous. I understand the ambiguity. Call it what you want.

mythrwy

A good way to determine this is to challenge LLMs to a debate.

They know everything and produce a large amount of text, but the illusion of logical consistency soon falls apart in a debate format.

hintymad

Do we need to train an LLM to be warm and empathetic, though? I was wondering why wouldn't a company simply train a smaller model to rewrite the answer of a larger model to inject such warmth. In that way, the training of the large model can focus on reliability

nialv7

Well, haven't we seen similar results before? IIRC finetuning for safety or "alignment" degrades the model too. I wonder if it is true that finetuning a model for anything will make it worse. Maybe simply because there is just orders of magnitudes less data available for finetuning, compared to pre-training.

perching_aix

Careful, this thread is actually about extrapolating this research to make sprawling value judgements about human nature that confirm to the preexisting personal beliefs of the many malicious people here making them.

cobbzilla

I want an AI that will tell me when I have asked a stupid question. They all fail at this with no signs of improvement.

thenickdude

We have that already, we call it "Stack Overflow"

cobbzilla

hands down, thread winner

drummojg

I would be perfectly satisfied with the ST:TNG Computer. Knows all, knows how to do lots of things, feels nothing.

bitwize

In Mass Effect, there is a distinction made between AI (which is smart enough to be considered a person) and VI (virtual intelligence, basically a dumb conversational UI over some information service).

What we have built in terms of LLMs barely qualifies as a VI, and not a particularly reliable one. I think we should begin treating and designing them as such, emphasizing responding to queries and carrying out commands accurately over friendliness. (The "friendly" in "user-friendly" has done too much anthropomorphization work. User-friendly non-AI software makes user choices, and the results of such choices, clear and responds unambiguously to commands.)

moffkalast

A bit of a retcon but the TNG computer also runs the holodeck and all the characters within it. There's some bootleg RP fine tune powering that I tell you hwat.

Spivak

It's a retcon? How else would the holdeck possibly work, there's only one (albeit highly modular) computer system on the ship.

null

[deleted]

Aeolun

I dunno, I deliberately talk with Claude when I just need someone (or something) to be enthusiastic about my latest obsession. It’s good for keeping my motivation up.

layer8

There need to be different modes, and being enthusiastic about the user’s obsessions shouldn’t be the default mode.

nis0s

An important and insightful study, but I’d caution against thinking that building pro-social aspects in language models is a damaging or useless endeavor. Just speaking from experience, people who give good advice or commentary can balance between being blunt and soft, like parents or advisors or mentors. Maybe language models need to learn about the concept of tough love.

fpgaminer

"You don't have to be a nice person to be a good person."

mlinhares

Most terrible people i've met were "very nice".

HN

Training language models to be warm and empathetic makes them less reliable

Training language models to be warm and empathetic makes them less reliable