OpenAI's Shocking Blunder: Key Evidence Vanishes in NY Times Lawsuit | eWeek

Written By
Sunny Yadav
Dec 3, 2024

OpenAI is under fire after engineers accidentally erased key data in a high-stakes lawsuit brought by The New York Times and Daily News. The lawsuit accuses OpenAI of using copyrighted articles to train its AI models without permission, potentially violating copyright law. According to court documents, the data deletion has forced the plaintiffs to redo weeks of work at a significant cost.

The erased data came from one of two virtual machines OpenAI had provided to the plaintiffs so they could search its AI training datasets for their copyrighted content. The plaintiffs’ legal team used these machines to comb through the training data. On November 14, however, OpenAI engineers erased the contents of one machine. The data that was recovered is incomplete and unusable for tracing how the plaintiffs’ content was incorporated into OpenAI’s models.

Dispute Over Responsibility for Data Loss

OpenAI attributed the issue to a “system misconfiguration” arising from a change the plaintiffs had requested, claiming the change inadvertently removed file names and folder structures on a temporary cache drive. OpenAI’s attorneys denied that any files were permanently lost and maintained that the deletion was unintentional.

However, lawyers for the publishers argue the incident underscores OpenAI’s superior ability to search its own datasets. “The news plaintiffs have been forced to recreate their work from scratch using significant person-hours and computer processing time,” they wrote, adding that the loss delayed their case and increased costs.

The lawsuit highlights the growing tension between content creators and AI developers. The New York Times and Daily News claim OpenAI’s use of their articles goes beyond “fair use,” arguing it gives the company an unfair advantage in developing commercial AI models. The plaintiffs are seeking billions in damages, alleging that OpenAI used their works without authorization.

OpenAI, which has struck licensing deals with other major publishers like Axel Springer and Dotdash Meredith, has not disclosed whether its AI models were specifically trained on the plaintiffs’ content. OpenAI maintains that training models on publicly available data, including news articles, falls under fair use.

What’s Next?

As the case proceeds, the data loss could prove a significant hurdle for the plaintiffs. While OpenAI works to file its response, the broader legal battle over how AI companies use copyrighted content remains unresolved. With potential damages in the billions, the outcome could set a critical precedent for the future of AI development and intellectual property rights.
