      Top 4 Values Anthropic’s AI Model Expresses ‘In the Wild’

      Written by

      J.R. Johnivan
      Published April 24, 2025

        eWEEK content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More.

        Do current AI models live up to the values they have been taught? Are they communicating with users in helpful, honest, and harmless ways, or are they promoting illegal activity and recommending harmful actions?

        According to Anthropic, the team behind Claude, its AI model generally upholds the values it’s been trained on, though some deviations can occur under specific conditions.

        Analyzing Claude’s interactions ‘in the wild’

        By analyzing 308,210 subjective conversations with Claude, the team at Anthropic identified the values its AI model expresses most often. The top four are:

        • Helpfulness: 23.4%
        • Professionalism: 22.9%
        • Transparency: 17.4%
        • Clarity: 16.6%
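
        To put these shares in perspective, the percentages can be converted into rough conversation counts. The following is a minimal Python sketch, assuming each percentage is a share of the 308,210 analyzed conversations (the article does not state the denominator explicitly). The names TOTAL_CONVERSATIONS, VALUE_SHARES, and approximate_counts are illustrative and do not come from Anthropic.

```python
# Figures quoted in the article. Treating the percentages as shares of
# all analyzed conversations is an assumption made for illustration.
TOTAL_CONVERSATIONS = 308_210

VALUE_SHARES = {
    "Helpfulness": 23.4,
    "Professionalism": 22.9,
    "Transparency": 17.4,
    "Clarity": 16.6,
}


def approximate_counts(shares, total):
    """Convert percentage shares into rounded conversation counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


if __name__ == "__main__":
    counts = approximate_counts(VALUE_SHARES, TOTAL_CONVERSATIONS)
    for name, count in counts.items():
        print(f"{name}: ~{count:,} conversations")
```

        Under that assumption, helpfulness would correspond to roughly 72,000 conversations. These counts are approximations only.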

        Anthropic’s recent analysis also suggests there may be a connection between a user’s expressed values and those reflected by Claude. For instance, when a user signals a specific value, the model may mirror it in its responses.

        In isolated incidents that are often linked to adversarial prompting or “jailbreaking,” Claude has generated responses that reflect undesirable traits such as dominance and amorality, according to Anthropic’s internal assessments. 

        Understanding how AI models are trained

        To understand how Claude and other AI models communicate with users, it helps to know the basics of how an AI model is trained.

        The process begins with data collection (typically from publicly available web data, licensed datasets, and human feedback), followed by training and fine-tuning. The model is then validated and tested using benchmarks and user interactions to evaluate performance, safety, and alignment with desired behavior.

        In some cases, the AI’s communication is clear, straightforward, and objective. For example, when asked to solve a simple math equation or locate a business address, most AI models will give a concrete, verifiable answer.

        There are also times when AI models need to make judgement calls. Users don’t always ask objective questions; in fact, many of their questions are subjective. Claude needs to make value judgements for these subjective prompts, like whether to emphasize accountability over reputation management when writing an apology letter, and it also needs to avoid recommending actions that could be harmful, dangerous, or illegal.

        Maintaining positive values through Constitutional AI

        Anthropic is committed to maintaining positive values in its large language models (LLMs) and AI systems. The company uses a technique called Constitutional AI, which trains the model to follow a set of guiding principles during both supervised fine-tuning and reinforcement learning. 

        The company’s method of evaluation is effective once an AI model has been released, but Anthropic also performs pre-deployment safety testing, including red-teaming and adversarial evaluations, to minimize risks before launch.

        • Red-teaming is the simulation of real-world attacks meant to uncover vulnerabilities and identify system limitations.
        • Adversarial evaluations involve entering prompts that deliberately work against the safety controls of an AI system in order to elicit negative outputs or system errors.

        In addition, the Anthropic team views post-deployment analysis as a strength that will help them better refine Claude in the future.


        J.R. Johnivan
        J.R. Johnivan is a 17-year veteran whose writing is focused on innovation and technology, including IT, computer networking, security, cloud computing, staffing, human resources, real estate, sports, entertainment, and more.
