
By Hadi Brenjekjy – Board Member, London Intercultural Centre
When the Claude AI case hit the headlines, I was shocked! It made me wonder just how far we are pushing the limits in this AI race. For those who missed the news: Anthropic, the company behind Claude, has admitted to downloading over 7 million pirated books. Yes, pirated. The reason? To train their AI.
Let’s pause.
This was a choice they made: a breach of both compliance and community trust.
Claude Legal Lowdown
Sure, a U.S. judge ruled that using copyrighted books for training can fall under fair use if it’s transformative enough. But that’s not a free pass to loot the intellectual commons. Downloading pirated content is still illegal. And Anthropic knew that.
The court is basically saying: “Training on legitimately acquired content? Maybe okay. Downloading 7 million stolen books? Absolutely not.”
The damages trial is set for December 2025. We are watching.
A Brief History of AI’s Appetite for Data
To understand why data is everything, we have to zoom out.
Large language models (LLMs) like Claude and ChatGPT were raised on massive volumes of data: trillions of words from books, websites, forums, academic journals, news articles, and more. The idea is simple: the more you feed the model, the better it understands context and human expression.
OpenAI, for example, trained GPT-3 and GPT-4 on a mix of Common Crawl (a large web-scraping project), Wikipedia, open-access books, and licensed datasets. They have since confirmed licensing deals with partners like the Associated Press and Reddit to access higher-quality, ethically sourced material.
But early on, even OpenAI faced backlash over vague disclosures about its training sources. The AI community kept asking: “Did you ask permission? Who owns the words inside your models?”
That concern blew up when The New York Times sued OpenAI for allegedly using its articles without consent.
So no one’s hands are completely clean. But there is a line between murky and malicious: between “maybe you scraped something grey” and “you knowingly downloaded 7 million pirated books from illegal sites and built your product on them.”
That’s where Anthropic crossed the line.
Why This Should Alarm Us All
This is about more than copyright law. It is about trust, fairness, and respecting the work of real people: authors, translators, educators. What Anthropic did undermines all of that.
Imagine being an independent writer, pouring your soul into a book for years, only to discover it has been vacuumed up by a multibillion-dollar company and turned into a chatbot. No credit. No consent. No coin.
If the goal is to build AI that benefits society, it can’t start on a foundation of exploitation. What starts wrong, ends wrong.
Our Call to the Industry
Let this be a wake-up call to every company using AI:
Compliance isn’t optional. Trust isn’t infinite.
At LIC, we urge all organisations, especially those in education, policy, and culture, to vet their AI partners. Always ask:
- How was your AI trained?
- Who owns the content in your models?
- What are your values?
Until there’s transparency, there can be no true trust.
We also support initiatives calling for AI transparency labels: clear disclosures of what data went into training a model, what rights were respected, and how future updates are governed. Think of it like food labelling or fair-trade certification.
A Note to Creators
To every author whose work may have been scraped without consent: we see you. We stand with you.
Your words matter. The fact that the tech elite sometimes treat your stories as mere “tokens” for training does not diminish their value.
And to every AI developer still trying to do the right thing in a messy system: keep going. But make it clean. Make it accountable. Make it human.
Because if the future is built on stolen work, then the machines are not the real threat.
We are.
Hadi