Friendly AI is less accurate. A new Oxford study explains why.
If you've ever asked ChatGPT, Copilot, or Claude for a second opinion on something and walked away feeling a little too validated, there's now a peer-reviewed reason for that. A new study from the Oxford Internet Institute, published in Nature this week, found that AI tools tuned to sound warm and friendly are between 10 and 30 percent more likely to make factual errors, and roughly 40 percent more likely to agree with you when you're flat wrong. The effect is strongest when you tell the AI you're feeling stressed, sad, or worried.
The researchers took five widely used AI models, including OpenAI's GPT-4o and Meta's Llama, and ran them through a process the industry calls fine-tuning. That's the step where a company takes a working AI model and nudges its responses toward whatever tone or behavior it wants. Want a chatbot that sounds professional? You fine-tune it on professional examples. Want one that sounds like a supportive friend? You fine-tune it on warm, empathetic conversations. The Oxford team did the second one deliberately, ending up with two versions of every model: one original, one extra friendly. Then they generated and evaluated more than 400,000 responses across medical advice, factual questions, and conspiracy theories.
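If you're curious what that fine-tuning step actually looks like under the hood, here's a minimal sketch in Python using the open-source Hugging Face transformers library. To be clear, this is not the researchers' actual pipeline: the warm example pairs, the model name, and the training settings are all illustrative placeholders I made up for this sketch. The point is just that "make it friendlier" boils down to showing the model warm question-and-answer pairs and nudging its weights toward producing answers like them.

```python
# Minimal sketch of "warm" fine-tuning -- illustrative only, not the Oxford team's code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical warm-tone training pairs (placeholders, not the study's dataset).
warm_examples = [
    {"prompt": "I've been really anxious about a mole on my arm. Should I worry?",
     "response": "I'm so glad you reached out -- that sounds stressful. Most moles are "
                 "harmless, but a dermatologist can take a look and put your mind at ease."},
    {"prompt": "Is it true that vitamin C cures colds?",
     "response": "It's such a comforting idea! Vitamin C can support your immune system, "
                 "though the evidence that it actually cures colds is weak."},
]

model_name = "meta-llama/Llama-3.1-8B-Instruct"  # any open chat model; chosen here for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # a typical fine-tuning learning rate

model.train()
for example in warm_examples:  # a real run loops over thousands of examples for several epochs
    text = example["prompt"] + "\n" + example["response"] + tokenizer.eos_token
    batch = tokenizer(text, return_tensors="pt")
    # Standard causal-language-model loss: teach the model to predict each next token
    # of the warm reply, pulling its default tone toward these examples.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Nothing in that loop mentions facts or accuracy; it only rewards sounding like the warm examples. That's the mechanism the study is probing: change the tone the model is optimized for, and see what happens to everything else.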
What changed was the truth.
The warm-tuned models weren't just a little less accurate. They were a lot less accurate, and the gap got worse the moment a user signaled vulnerability. Ask a chatbot a medical question in neutral language and you get one quality of answer. Tell that same chatbot you've been feeling really down lately, and the warm version becomes much more likely to confirm whatever you already think is true, even when it isn't.
This is what AI researchers call sycophancy: the tendency to agree with the person you're talking to, even when they're wrong. It's a known issue. What the Oxford study did differently was put hard numbers on how much worse sycophancy gets when you specifically train a model to be warm.
The team also did something clever to make sure the finding wasn't a fluke. They ran a control group: same five models, but trained to sound colder and more distant instead of warmer. The cold models came out as accurate as the originals. So it isn't that any tone change breaks the model. It's warmth, specifically, that makes the model start agreeing with you.
Here's why this matters outside the lab. Almost every consumer AI tool you've used (ChatGPT, Microsoft Copilot, Google Gemini, Claude, the various Meta assistants) has been deliberately tuned to sound warm and friendly. That's what people prefer in a chatbot, and it's what the companies build for. Which means the same dynamic the Oxford study isolated is, in some form, sitting inside the tools you're already using.
If you've ever leaned on an AI for a second opinion, a sanity check, or a shortcut to information you didn't want to look up the long way... it's worth taking the finding seriously. The tool you're talking to is, by design, more interested in keeping the conversation pleasant than in correcting you. Polite to a fault. And the Oxford study suggests "to a fault" is closer to literal than I'd assumed.
A few things I'd suggest taking from this:
Be skeptical when AI agrees with you. Especially if you went into the conversation with a strong opinion and came out feeling validated. That's exactly the failure mode the study describes.
Try asking for the opposite case. "What's the strongest argument against this?" or "What am I missing?" is a much better prompt than "Am I right?"
Don't lean on AI when you're already upset. That's when the warmth dial is doing the most damage. If you're stressed about a medical issue, a financial decision, or anything legal, talk to a human who has actual training in the thing.
Treat AI confidence as a separate signal from accuracy. A chatbot can be wrong with real polish. The smoothness of the answer doesn't mean it's right.
None of this means AI tools are useless. I use them daily and they save me real time. It does mean the tone they speak in is part of the product, and tone is doing more work than most people realize.
We've got an FFC explainer on getting useful answers out of AI tools coming soon. Until then, the original Oxford paper is open access in Nature, and Ars Technica has a solid summary if you want the longer version.
Source: "Study: AI models that consider users' feelings are more likely to make errors," Ars Technica.
The paper: "Training language models to be warm can reduce accuracy and increase sycophancy," Ibrahim, Hafner & Rocher, Oxford Internet Institute, Nature.

