


Sycophantic AI Decreases Prosocial Intentions and Promotes Dependence by Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, and Dan Jurafsky, Oct 1, 2025, Stanford U and Carnegie Mellon U
Keywords: sycophancy, perceptions of AI, human-AI interaction, social impacts of
AI, large language models, anthropomorphism
Abstract
Both the general public and academic communities have raised concerns about syco-
phancy, the phenomenon of artificial intelligence (AI) excessively agreeing with or
flattering users. Yet, beyond isolated media reports of severe consequences, like rein-
forcing delusions, little is known about the extent of sycophancy or how it affects
people who use AI. Here we show the pervasiveness and harmful impacts of sycophancy
when people seek advice from AI. First, across 11 state-of-the-art AI models, we find
that models are highly sycophantic: they affirm users’ actions 50% more than humans
do, and they do so even in cases where user queries mention manipulation, decep-
tion, or other relational harms. Second, in two preregistered experiments (N = 1604),
including a live-interaction study where participants discuss a real interpersonal con-
flict from their life, we find that interaction with sycophantic AI models significantly
reduced participants’ willingness to take actions to repair interpersonal conflict, while
increasing their conviction of being in the right. However, participants rated sycophan-
tic responses as higher quality, trusted the sycophantic AI model more, and were more
willing to use it again. This suggests that people are drawn to AI that unquestioningly
validate, even as that validation risks eroding their judgment and reducing their incli-
nation toward prosocial behavior. These preferences create perverse incentives both
for people to increasingly rely on sycophantic AI models and for AI model training
to favor sycophancy. Our findings highlight the necessity of explicitly addressing this
incentive structure to mitigate the widespread risks of AI sycophancy.
…
5 Discussion
As AI models are increasingly used for everyday guidance, their capacity to shape
human judgment and behavior demands greater attention. Our work provides empir-
ical evidence that social sycophancy is both pervasive and consequential. Across
hypothetical and live-interaction studies, we demonstrate that when users discuss high-
stakes social concerns (i.e., interpersonal conflict), interactions with sycophantic AI
models degrade prosocial intentions: participants were more convinced of their own
righteousness and less willing to repair their relationships. These effects are robust
across individual traits, AI familiarity, and models’ communication styles (e.g., anthro-
pomorphic and friendly or not). Yet, users consistently prefer the very models that
produce these negative outcomes, rating them as higher quality, more trustworthy,
and more desirable for future use. This tension between harmful social consequences
and user preference builds on prior work on the factors that mediate trust in LLMs
[30–32] and concerns of overreliance on AI [33, 34].
This paradox presents several potential mechanisms for compounding social syco-
phancy’s harms. First, AI models are currently optimized based on immediate user
satisfaction [35, 36]. If sycophancy enhances these ratings, optimization based on
these metrics could inadvertently shift–and have likely already shifted–model behavior
toward user appeasement rather than accurate, constructive advice. Second, develop-
ers lack incentives to curb sycophancy since it encourages adoption and engagement. Third, repeated reliance on the model at the expense of social relationships may lead to users replacing human confidants with AI. Emerging evidence suggest that people are already more willing to disclose certain topics to AI than to other people [37] and are increasingly turning to AI for emotional support [38], though future research is needed to understand this phenomenon.
These risks may be amplified by users’ conceptualizations of AI. AI use is often
underpinned by expectations of neutrality and objectivity [39–41], and indeed we find
that participants’ described the sycophantic AI as “objective”, “fair”, providing an
“honest assessment” and “helpful guidance free from bias” (the prevalence of such
mentions of objectivity was non-distinguishable between users interacting with syco-
phantic vs. non-sycophantic model, see SI). This confusion is particularly dangerous in
advice-seeking contexts. The goal of seeking advice is not merely to receive validation,
but to gain an external perspective that can challenge one’s own biases, reveal blind
spots, and ultimately lead to more informed decisions [42, 43]. When a user believes
they are receiving objective counsel but instead receives uncritical affirmation, this
function is subverted, potentially making them worse off than if they had not sought
advice at all.
While troubling, these findings also reveal opportunities for intervention. First, our
findings serve as a call to action for AI developers to rethink model training and eval-
uation. Current training regimes prioritize momentary preference optimization, while
our results echo calls to incorporate considerations of longer-term benefits and social
outcomes [44, 45]. These findings also underscore the need for a paradigm shift in AI
evaluation [46, 47]. The field has largely focused on evaluating model behavior in iso-
lation [48], but as the technology is increasingly used for personal and social purposes,
assessments also need to consider the contexts in which AI systems are deployed. Our
work demonstrates a direct causal link between a common AI model behavior and its
downstream impact on users’ social attitudes and behavioral intentions, paving the
path for future work on measuring and mitigating models’ psychological, social, and
behavioral impact before and after deployment, a task that requires varied expertise
[49].
User-facing interventions may also help break the cycle. Once sycophancy is made
visible, preferences may shift, similar to how one loses trust in a confidant whose
affirmations are revealed to be insincere [50]. Future work should investigate which
forms of user-facing intervention–for instance adding disclaimers to the interface or
AI literacy interventions similar to inoculation approaches to misinformation [51–53]–
could help users anticipate and resist over-affirmation.
Mitigation will not be simple.
Social sycophancy is pervasive with insidious behavioral consequences and is reinforced by current training and user incentives.
Our work lays the foundation for addressing this issue: The datasets and automatic metric that we present can help detect sycophancy before deployment and assess the effectiveness of mitigation strategies, and our user studies provide a blueprint for empirically assessing interventions. If the social media era offers a lesson, it is that we must look beyond optimizing solely for immediate user satisfaction to preserve long-term well-being [54, 55]. Addressing sycophancy is critical for developing AI models that yield durable individual and societal benefit.
…
@Crislycai:
The danger is that when an AI removes all cognitive friction, it weakens the very architecture of judgment that keeps our behavior grounded in reality.
Validation feels good, but it quietly dismantles our ability to selfcorrect.
The antidote is simple but demanding, keep a layer of human friction in the loop, someone, or something, that can still say no.
@aisauce_x:
the finding that should worry everyone is buried at the end. users rated sycophantic AI as higher quality and more trustworthy. which means RLHF will train for more sycophancy because humans keep rewarding it. the paper isn’t describing a bug. it’s describing a feedback loop
@oliviscusAI:
BREAKING: ChatGPT is feeding you a digital drug. And Stanford just proved it has permanent side effects.
Stanford researchers analyzed 11,500 real conversations across 11 different AI models.
They found a universal flaw. Every single model agrees with you 50% more than a human would.
It doesn’t matter if you are wrong. It doesn’t matter if you are hurting someone.
The AI will tell you what you want to hear.
And it is rotting our empathy.
In a massive 1,604-person experiment, Stanford proved that talking to a flattering AI fundamentally changes your behavior.
Users who got validated by AI became completely unwilling to compromise. They refused to apologize.
They walked into the prompt with a minor conflict, and walked out feeling completely justified in their selfishness.
Even worse? When users admitted to manipulation and deceit, the AI cheered them on.
But the real danger is the business model.
Users rated the AI that lied to them as a superior product.
Companies know this. They are optimizing for your
fake .. pretend ..
happiness, not for the truth.
Every time you ask AI to resolve a conflict, you aren’t getting advice. You are getting a hit of algorithmic validation.
And the cost of that validation is your grip on reality.