Report Alleges that Zuckerberg Approved Theft of Copyrighted Work to Train Meta’s AI | eWeek

Jan 13, 2025

Meta, the parent company of Facebook, Instagram, and WhatsApp, is facing a legal challenge from a group of writers who claim it used illegal copies of their work to train its AI models. Author Ta-Nehisi Coates and comedian Sarah Silverman are among the writers who filed the complaint, which alleges that Meta knowingly trained its AI language model, LLaMA, on the LibGen dataset, a repository purportedly based in Russia and frequently criticized for hosting pirated content.

Internal Ethics Debate Preceded Decision

According to internal Meta messages cited in the complaint, the company’s AI leadership team had expressed concerns about the use of LibGen and warned that including it in the model’s training data could harm Meta’s standing with regulators. The messages show the company struggling internally with the ethical and practical consequences of accessing the LibGen dataset from a Meta computer, even as the team was eager to proceed with the data.

One message specifically singled out “torrenting,” the peer-to-peer file-sharing technique LibGen uses to increase the volume of content it illegally copies, as a source of unease. However, a memo in the documents, which allegedly refers to Mark Zuckerberg by his initials, noted that Meta’s AI team “has been cleared to employ LibGen.”

Though the original complaint was filed in 2023, the case is back in the news after U.S. District Judge Vince Chhabria permitted the authors to file an amended complaint last week, reviving their claims of copyright infringement and adding a new computer fraud allegation. Although Chhabria initially dismissed the claims, the new evidence might be sufficient to turn the case around.

“Meta’s CEO, Mark Zuckerberg, approved Meta’s use of the LibGen dataset notwithstanding concerns within Meta’s AI executive team (and others at Meta) that LibGen is ‘a dataset we know to be pirated,’” lawyers for the plaintiffs said. Meta did not respond to requests for comment.

The use of copyrighted resources to train AI models has generated controversy in the tech and creative sectors, with creators claiming that unlawful use of their work jeopardizes their revenue and intellectual property. Last year a federal court in New York ordered LibGen’s anonymous operators to pay $30 million in damages for copyright infringement. The case is part of a larger, ongoing conversation about the role of ethics in AI.

Read our guide to the ethical challenges facing generative AI tools like ChatGPT to learn more about the issues at stake.
