Key Takeaways
Apple is facing mounting legal challenges over allegations that the tech giant used pirated copyrighted books to train its artificial intelligence systems without authorization or compensation to authors.
Two separate class action lawsuits filed in California federal court accuse Apple of systematically using books from illegal shadow libraries to develop Apple Intelligence, the company's suite of AI-powered features.
The first lawsuit, filed in September 2025 in the U.S. District Court for the Northern District of California, was brought by authors Grady Hendrix and Jennifer Roberson.
A second suit was filed in October by neuroscientists Susana Martinez-Conde and Stephen Macknik, professors at SUNY Downstate Health Sciences University in Brooklyn, New York.
The core allegations
The lawsuits center on a dataset called Books3, which contains text files for approximately 196,640 copyrighted written works.
According to the complaints, Books3 is derived from a shadow library website called Bibliotik, which hosted thousands of pirated books.
The dataset was available on HuggingFace before being removed in October 2023 and was included as part of the RedPajama dataset used to train Apple's OpenELM language models.
The complaints allege that Apple copied protected works to train its OpenELM generative AI language model variants: OpenELM-270M, OpenELM-450M, OpenELM-1_1B, and OpenELM-3B, which form part of Apple Intelligence. The plaintiffs claim Apple also likely trained its Foundation Language Models using the same pirated dataset.
According to the lawsuit filed by Martinez-Conde and Macknik, the pirated books used in training included their works, Champions of Illusion: The Science Behind Mind-Boggling Images and Mystifying Brain Puzzles and Sleights of Mind: What the Neuroscience of Magic Reveals About Our Everyday Deceptions.
Questions about "publicly available" data
The lawsuits take issue with Apple's characterization of its training data. When Apple launched Apple Intelligence, the company advertised that it was trained on works described as "publicly available" or "open source."
However, the complaints argue that "publicly available" does not mean the works were made free for public use by the author, only that they are accessible online, regardless of the legality of that access.
The lawsuit claims that Applebot, Apple's web-crawling software, can reach shadow libraries that host millions of unlicensed copyrighted books.
The complaints also allege that any use of the Books3 dataset required downloading a copy, meaning Apple still maintains its own copy of the pirated library.
Economic and market impact
The lawsuits argue that the unauthorized use of copyrighted works has caused economic harm to authors.
One lawsuit states that the day after Apple officially introduced Apple Intelligence, the company's market value rose by more than $200 billion, in what the complaint describes as the single most lucrative day in the company's history.
The complaints contend that Apple has copied the copyrighted works to train AI models whose outputs compete with and dilute the market for those very works.
The plaintiffs argue that AI-generated works serve as quickly produced, low-quality replicas of, or companions to, the original works, undermining their economic value.
What the plaintiffs want
Hendrix and Roberson are demanding a jury trial and requesting declaratory and injunctive relief, along with statutory damages, compensatory damages, restitution, disgorgement, and attorneys' fees for themselves and all class members.
They want to represent a nationwide class of individuals or entities who own a registered U.S. copyright in any work used to train Apple Intelligence during the class period.
Martinez-Conde and Macknik have requested an unspecified amount of monetary damages and an order for Apple to stop misusing their copyrighted work.
Apple's stance on ethical AI development
While Apple has not issued a public comment on either lawsuit, the company has previously positioned itself as taking an ethical approach to AI training.
Apple has offered publishers millions of dollars for access to their publications as training data, and in 2024 it agreed to license millions of images from Shutterstock for training purposes.
In a July research paper, Apple stated that it would not scrape content from publishers who did not agree to their data being used for training, including by adhering to the crawl limitations publishers set out in robots.txt, a standard that not all AI companies observe.
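The robots.txt mechanism Apple refers to is a simple text file in which a site lists which crawlers may fetch which paths. A minimal sketch of how a well-behaved crawler would honor it, using Python's standard library (the publisher rules and the example URL below are hypothetical; "Applebot-Extended" is the user-agent token Apple documents for opting out of AI training, cited here as an illustration):

```python
from urllib.robotparser import RobotFileParser

def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# A hypothetical publisher blocking AI-training crawlers while allowing others:
rules = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""

print(may_crawl(rules, "Applebot-Extended", "https://example.com/book.html"))
print(may_crawl(rules, "SomeOtherBot", "https://example.com/book.html"))
```

The point of the lawsuits is that this check is voluntary: nothing in the protocol technically prevents a crawler from ignoring the `Disallow` rules.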
Part of a broader trend
Apple is the latest technology company to face copyright infringement allegations related to AI training.
Similar lawsuits have been filed against OpenAI, Microsoft, Meta Platforms, and Anthropic.
Most notably, Anthropic agreed in September 2025 to pay $1.5 billion to settle a class action lawsuit brought by authors who accused the company of using their books to train its Claude chatbot without permission.
That settlement has been described as the largest publicly reported copyright recovery in history.
The licensing market for AI training data is currently valued at approximately $2.5 billion and could reach nearly $30 billion within a decade, according to the complaint filed by Martinez-Conde and Macknik.
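As a back-of-the-envelope check, the complaint's figures imply an annual growth rate of roughly 28% (taking "within a decade" to mean a ten-year horizon, which is an assumption):

```python
# Compound annual growth rate implied by growing from ~$2.5B to ~$30B
# over an assumed 10-year horizon (figures from the complaint).
start, end, years = 2.5e9, 30e9, 10
cagr = (end / start) ** (1 / years) - 1
print(f"Implied compound annual growth: {cagr:.1%}")
```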
The cases represent a fundamental legal question facing the AI industry: whether the use of copyrighted material for training AI models constitutes fair use or copyright infringement.
As these lawsuits progress through the courts, they are likely to establish important precedents that will shape how AI companies acquire and use training data in the future.