
Meta Accused of Using Pirated Books for AI Training with Zuckerberg’s Approval

Meta Platforms (META.O) is facing serious allegations from a group of authors, including Ta-Nehisi Coates and comedian Sarah Silverman, who claim that the company used pirated versions of copyrighted books to train its artificial intelligence systems, including the Llama language model. The authors argue that this use was approved by Meta’s CEO, Mark Zuckerberg, according to newly disclosed court documents.

The authors, who filed a lawsuit against Meta in 2023 for copyright infringement, allege that internal Meta documents, produced during the discovery phase of the case, show the company was fully aware that the books it used were pirated. Meta has yet to comment on the allegations.

The lawsuit focuses on Meta’s use of LibGen, a repository of pirated books, as an AI training dataset. The authors claim that the dataset was distributed through peer-to-peer torrents. According to the authors, the new evidence suggests that Meta executives, including Zuckerberg, knew that LibGen’s contents were pirated but chose to use the dataset anyway, and internal Meta communications reportedly confirm this.

The authors are seeking to update their complaint, asserting that the new evidence strengthens their case for copyright infringement. The suit also brings renewed attention to the ongoing legal battles over the use of copyrighted materials to train AI systems, with defendants arguing that such uses may fall under the “fair use” doctrine.

In a previous ruling, U.S. District Judge Vince Chhabria dismissed claims that text generated by Meta’s chatbots infringed the authors’ copyrights and that Meta unlawfully stripped copyright management information (CMI) from their books. However, during a hearing on Thursday, Chhabria indicated that he would permit the authors to file an amended complaint, despite his doubts about the validity of the fraud and CMI claims.


Canadian News Media Companies Sue OpenAI Over Copyright Breaches

Five prominent Canadian news organizations—Torstar, Postmedia, The Globe and Mail, The Canadian Press, and CBC/Radio-Canada—filed a legal claim against OpenAI, accusing the company of violating copyright laws and online terms of use. The lawsuit alleges that OpenAI has been systematically scraping large volumes of content to train its generative AI models without obtaining permission or offering compensation.

This legal action is part of a broader wave of lawsuits targeting AI companies, including OpenAI, for alleged misuse of copyrighted materials. Authors, visual artists, music publishers, and other content creators have also raised concerns about the use of their work in AI training.

In their joint statement, the Canadian media companies emphasized the importance of journalism as a public good. “OpenAI using other companies’ journalism for their own commercial gain is not just unethical—it’s illegal,” the statement declared.

The lawsuit, filed in Ontario’s Superior Court of Justice, demands financial damages and a permanent injunction to prevent OpenAI from using the plaintiffs’ content without explicit consent. “Rather than seek information legally, OpenAI has opted to misappropriate our intellectual property for its commercial purposes without any form of compensation,” the filing states.

OpenAI’s Response

OpenAI defended its practices, asserting that its models are trained on publicly available data and adhere to principles of fair use and international copyright standards. A spokesperson highlighted OpenAI’s ongoing collaboration with news publishers, including providing content attribution and opt-out mechanisms.

“We aim to ensure a fair approach for creators while offering tools for publishers to manage their content in ChatGPT search,” OpenAI stated.

Legal Context and Industry Trends

The Canadian lawsuit follows a Nov. 7 ruling in the United States in which a New York federal judge dismissed a similar case against OpenAI involving articles from the news outlets Raw Story and AlterNet.

Microsoft, OpenAI’s primary backer, was not mentioned in the Canadian legal filing. However, earlier this month, Elon Musk expanded his own lawsuit against OpenAI to include Microsoft, alleging anti-competitive practices in the generative AI sector.

The outcome of this case could have significant implications for how AI companies source and use content in model training, shaping the future relationship between technology firms and content creators.