Lawsuit says Mark Zuckerberg approved Meta’s use of pirated materials to train Llama AI

Meta knowingly used pirated materials to train its Llama AI models, with the blessing of company chief Mark Zuckerberg, according to an ongoing copyright lawsuit against the company. As TechCrunch reports, the plaintiffs in the Kadrey v. Meta case submitted court documents describing the company’s use of the LibGen dataset for AI training.

LibGen is generally described as a “shadow library” that provides file-sharing access to academic and general-interest books, journals, images and other materials. Counsel for the plaintiffs, who include writers Sarah Silverman and Ta-Nehisi Coates, accused Zuckerberg of approving the use of LibGen for training despite concerns raised by company executives and employees who described it as a “dataset [they] know to be pirated.”

The complaint also said the company removed copyright information from LibGen materials before feeding them to Llama. Meta apparently admitted in a document submitted to court that it “remov[ed] all the copyright paragraphs from beginning and the end” of scientific journal articles. One of its engineers reportedly even wrote a script to automatically delete copyright information. Counsel argued that Meta did so to conceal its copyright infringement from the public. In addition, counsel said that Meta admitted to torrenting LibGen materials, even though its engineers felt uneasy about sharing them “from a [Meta-owned] corporate laptop.”

Silverman, alongside other writers, sued Meta and OpenAI for copyright infringement in 2023. They accused the companies of using pirated materials from shadow libraries to train their AI models. The court previously dismissed some of their claims, but the plaintiffs said their amended complaint supports their allegations and addresses the court’s earlier reasons for dismissal.

This article originally appeared on Engadget at https://www.engadget.com/ai/lawsuit-says-mark-zuckerberg-approved-metas-use-of-pirated-materials-to-train-llama-ai-141548827.html?src=rss