
      AI’s Errors Are Increasing Despite Advances in Reasoning – Experts Theorize Why

      Written by

      Aminu Abdullahi
      Published May 9, 2025
        Profile photo of Sam Altman.
        Image: Sam Altman CropEdit James Tamim/Creative Commons


        AI initially seemed remarkable for its many capabilities, including answering questions, summarizing documents, and even writing code. But concerns are growing about how often AI systems invent false information, known as hallucinations, with error rates in some tests reaching as high as 79%.

        This problem recently impacted customers of Cursor, an AI coding assistant platform, when its AI support bot falsely claimed users could only install the software on one computer. The fabricated Cursor policy sparked outrage, with some customers canceling subscriptions before the company intervened. “We have no such policy. You’re of course free to use Cursor on multiple machines,” Cursor CEO Michael Truell clarified on Reddit.

        This incident highlights how AI hallucinations are moving beyond harmless errors to cause real-world consequences.

        AI accuracy issues in models from OpenAI, DeepSeek, IBM

        Independent tests of hallucination rates reveal alarming trends, and the rising error rates have experts worried.

        Vectara, which tracks how often AI models invent information, says hallucinations are becoming more common, even in tasks that should be easy to verify. The company found that OpenAI’s o3 model fabricated details 6.8% of the time when summarizing news articles. DeepSeek’s R1 model performed worse at 14.3%, while IBM’s reasoning-focused Granite 3.2 hallucinated 8.7% to 16.5% of the time, depending on model size.

        The Tow Center for Digital Journalism recently found AI-powered search engines are terrible at citing news accurately; in fact, Elon Musk’s Grok 3 generated incorrect citations a staggering 94% of the time.

        Experts say they do not yet know why this is happening, though there are theories. One theory is that newer models are trained to reason through problems step by step, and each step introduces a new chance to go wrong. Another theory is that the AI is trained to always provide an answer, even if it’s incorrect, rather than admit it doesn’t know.
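
        The first theory can be illustrated with some simple arithmetic. The sketch below is not taken from any of the tests cited here; it assumes, purely for illustration, that every reasoning step is independently correct with the same probability (a hypothetical 98%), and the function name is made up for this example.

```python
def chance_all_steps_correct(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a reasoning chain is correct.

    Assumes each step succeeds independently with the same probability,
    which is a simplification made only for this illustration.
    """
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


# Hypothetical per-step accuracy of 98%, chosen only to show the trend.
for steps in (1, 5, 10, 20):
    p = chance_all_steps_correct(0.98, steps)
    print(f"{steps:>2} steps: {p:.1%} chance of an error-free chain")
```

        Under these assumptions, a chain of ten steps finishes without a single mistake about 82% of the time, and a chain of twenty steps only about 67% of the time, even though each individual step is right 98% of the time. Real models do not behave this neatly, but the arithmetic shows why longer reasoning chains could, in principle, leave more room for error.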

        “Despite our best efforts, they will always hallucinate,” said Amr Awadallah, chief executive officer of Vectara and former Google executive, in The New York Times. “That will never go away.”

        OpenAI’s benchmark results

        OpenAI, a leader in generative AI technology, is facing an ironic setback with its newest systems. OpenAI’s o3 and o4-mini models use “reasoning” (i.e., a step-by-step thought process) rather than just spitting out answers, but tests show this deeper thinking is backfiring.

        According to OpenAI’s benchmark tests:

        • The o3 model hallucinated 33% of the time when answering questions about public figures (PersonQA).
        • On simpler factual questions (SimpleQA), o3 hallucinated 51% of the time.
        • The o4-mini model did even worse: 48% for PersonQA and 79% for SimpleQA.

        These numbers are higher than those of OpenAI’s earlier systems. And while OpenAI is studying the issue, the causes are still murky. “We’ll continue our research on hallucinations across all models to improve accuracy and reliability,” said Gaby Raila, an OpenAI spokesperson, to The New York Times.

        A serious issue for serious work

        While a little misinformation might not be a significant issue if you’re writing a poem or asking for dinner ideas, hallucinations can be dangerous when it comes to court documents, medical records, or business decisions.

        Even companies trying to fix the problem of AI hallucinations are struggling. Microsoft and Google have tools that attempt to flag suspicious answers, but experts remain doubtful that these measures will fully solve the issue.

        Read eWeek’s coverage about how Amazon has been mitigating AI hallucinations using a mathematical method. On our sister site TechRepublic, we look at Anthropic’s research into how its AI Claude “thinks.”

        Aminu Abdullahi
        Aminu Abdullahi is an experienced B2B technology and finance writer and award-winning public speaker. He is the co-author of the e-book, The Ultimate Creativity Playbook, and has written for various publications, including TechRepublic, eWEEK, Enterprise Networking Planet, eSecurity Planet, CIO Insight, Enterprise Storage Forum, IT Business Edge, Webopedia, Software Pundit, Geekflare and more.
