A PYMNTS Company

In Win for AI Companies, Court Finds AI Training Is Fair Use, but Only From Lawful Sources

June 24, 2025

A federal district court judge in California on Tuesday handed down a split decision in one of the first major cases involving AI and copyright to reach a definitive legal ruling. Judge William Alsup granted summary judgment in favor of a group of authors suing the Amazon-backed AI startup Anthropic over its downloading and retention of collections of pirated books from so-called shadow libraries. But he ruled that the use of lawfully acquired copies to train large language models (LLMs) is a fair use under copyright law, so long as the training does not result in the LLM generating exact copies of the originals (full opinion).

    The ruling offered both sides of the AI and copyright debate something to cheer. But it also offered each cause for concern. If upheld on appeal, Judge Alsup’s analysis could provide a blueprint for courts handling similar cases going forward.

    The group of three authors sued Anthropic in August 2024 and sought to certify the case as a class action. They charged the AI company with downloading full-text copies of millions of books from notorious online libraries such as Books3 and Library Genesis (LibGen), which Anthropic knew to be pirated, and using them to train its Claude LLM series.

    Anthropic did not dispute downloading the collections but argued that its use of the texts was fair use under copyright law. The company also purchased legitimate copies of books in bulk, scanned them to create digital versions that were likewise used in training, and then discarded the print copies.

    In a major win for AI companies, Alsup found the use of books in training to be a “quintessentially transformative” fair use, so long as the books were lawfully acquired. He also ruled that format shifting of lawfully acquired works, such as digitally scanning print copies of books, is a valid fair use, echoing holdings in earlier important copyright cases such as the two Google Books cases and the seminal 1984 Supreme Court ruling in the Sony Betamax case.

    Where Anthropic got in trouble, in Judge Alsup’s analysis, was in downloading pirated copies of works it could have lawfully acquired. Other AI companies accused of using pirated libraries in training, such as Meta, could face the same problem. Anthropic then retained those copies in a huge central library for ongoing and future uses, which it neither specified nor justified as fair use.

    “A separate justification was required for each use. None is even offered here except for Anthropic’s pocketbook and convenience,” Alsup wrote. He ordered a trial to be held on Anthropic’s use of the pirated copies to create its central library, including whether the infringement was willful, which could significantly increase the size of any resulting damages.

    In a blow to rights owners, Alsup generally rejected the theory that output generated by LLMs trained on the authors’ works could compete with or dilute the market for their subsequent works, such as by mimicking their styles or creating alternative summaries of factual material found in non-fiction works.

    In a passage likely to feature prominently in future debates over AI and copyright, Alsup wrote, “Authors’ complaint is no different than it would be if they complained that training schoolchildren to write well would result in an explosion of competing works. This is not the kind of competitive or creative displacement that concerns the Copyright Act. The Act seeks to advance original works of authorship, not to protect authors against competition.”

    Alsup likewise rejected the authors’ claim that training LLMs has displaced or will displace the emerging market for licensing their works for the specific purpose of using them in training. “A market could develop,” Alsup conceded. “Even so, such a market for that use is not one the Copyright Act entitles Authors to exploit.”