Anthropic Gives Claude Power to End Harmful Conversations | eWeek

Anthropic Gives Claude Power to End Harmful Conversations and Protect ‘Model Welfare’

Written By
Fiona Jackson
Aug 17, 2025

Anthropic has equipped its latest Claude models with the ability to terminate conversations in cases of persistent harmful or abusive behavior, marking one of the strongest safety interventions in the chatbot industry. 

Claude Opus 4 and 4.1, Anthropic’s two most powerful models, will exercise this authority only in “rare, extreme cases of persistently harmful or abusive user interactions,” the company said in a blog post. That includes requests for child sexual abuse material or information that would enable large-scale violence or acts of terror.

The AI models will invoke the feature only “as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted,” Anthropic stated. This could complicate efforts by hackers attempting to jailbreak the system for malicious purposes.

In addition, the AI models can end chats at a user’s request but will not do so if “users might be at imminent risk of harming themselves or others.” The Claude models also will not shut down discussions of “highly controversial issues.” Still, the models may show “apparent distress” when dealing with harmful requests and a “strong preference against” responding to them.

Once a conversation has been ended by Claude, users cannot send additional messages in that specific thread. They may, however, edit and retry earlier inputs, start a new chat immediately, and access their previous conversations without restrictions.

Anthropic wants to ensure ‘model welfare’

Interestingly, Anthropic said that the main motivation for introducing the feature is “model welfare.” Because AI models now display so many characteristics associated with people, such as problem-solving skills, goals, and relatability, it’s not entirely out of the question that they could benefit from some safeguards of their own.

Some experts say that consciousness or self-awareness is a key indicator of superintelligence, or the ability to outperform humans, which OpenAI, Meta and a number of other AI companies say they are close to achieving. While Anthropic remains “highly uncertain” about whether large language models are conscious and deserve moral consideration, it said it is deliberately implementing low-cost protections to minimize any potential distress.

Safety remains Anthropic’s core mission

In April, Anthropic launched a research program on model welfare, aimed at exploring whether advanced AI systems might one day exhibit consciousness, preferences, or experiences that warrant moral concern. The company argues it is no longer responsible to categorically assume that AI systems cannot have such experiences, given the limited understanding of how they function.

Anthropic has consistently positioned safety as its founding principle. The company was formed in 2021 by former OpenAI engineers who were concerned about OpenAI’s increasingly commercial direction and felt that safeguards were falling by the wayside.

Since then, Anthropic has published extensive AI safety research, and its CEO, Dario Amodei, has been outspoken about the technology’s risks. His warnings have led some to label him a “doomer,” underscoring the company’s reputation as one of the most caution-driven players in the field.

In light of growing concerns over the psychological risks posed by AI chatbots, OpenAI has unveiled a series of updates to ChatGPT aimed at preventing emotional dependency.


