August 15, 2025

Anthropic’s Claude 4 can now end abusive or distressing conversations

Anthropic’s Claude Opus 4 and 4.1 now include a feature to terminate conversations in rare, extreme cases of persistent abuse or harmful user behavior, part of their “model welfare” initiative.

Anthropic announced that its Claude Opus 4 and Opus 4.1 models can now end conversations when confronted with persistently harmful or abusive user interactions.

This safety feature was introduced as part of the company’s exploratory work on “model welfare,” and is designed to safeguard both the user experience and the model’s integrity in extreme edge cases.

According to Anthropic, termination occurs only after repeated attempts to redirect the discussion have failed, or at the explicit request of the user. Importantly, the vast majority of users, including those discussing complex or controversial topics, will not encounter this intervention during normal use.
