Copyright Litigation

Tracking Litigation at the Intersection of
Copyright and Artificial Intelligence

Editor: Robert S. Want rwant@LegalEditor.com
(347) 804-6763

Background of AI, Copyright Litigation: Core Legal Questions
At the heart of AI-related copyright litigation lie two fundamental questions:

First, the “Input Question”: May AI companies lawfully use copyrighted works to train generative AI and large language models (LLMs) without obtaining licenses or paying compensation to copyright holders?

Generative AI is the broad category of AI systems that are trained on large amounts of data, including text, images, music, or video, to produce new content, while LLMs are a specific type of generative AI that specializes in text.

AI companies typically argue that using copyrighted works in input training constitutes “fair use” under the Copyright Act, 17 U.S.C. § 107, because the training process is transformative: the copyrighted works are not reproduced for consumption but rather serve as data to teach AI systems to recognize patterns and generate new, transformative content.

The “transformative” doctrine is a central concept in U.S. copyright law, primarily used in determining whether a use qualifies as fair use under 17 U.S.C. § 107. The doctrine focuses on whether the use of a copyrighted work (in this context, to train generative AI or LLMs) adds something new, with a different purpose, meaning, or message, rather than merely copying or substituting for the original.

Second, the “Output Question”: When AI systems generate content that closely resembles or reproduces copyrighted works, does this constitute copyright infringement? Plaintiffs argue that AI outputs sometimes replicate substantial portions of their copyrighted works, while defendant AI platforms contend that outputs are not mere copies of the originals but are in fact transformative — creations that add something new and thus qualify as fair use and do not infringe existing copyrights.

While most of the cases discussed here are, at least in the early stages of litigation, concerned primarily with the input question, a notable pending case focusing on the output question is Advance Local Media LLC v. Cohere Inc. The case was filed in February 2025 by news organizations against Cohere, an AI company that produces news summaries, often in near-verbatim format, of plaintiffs’ articles.

Cohere sought to dismiss the plaintiffs’ output claims, arguing that any overlapping expression was minimal and that the company’s news summary outputs were nothing more than factual digests. But the court refused to dismiss, saying that summaries that reflect the expressive structure and journalistic storytelling choices of the originals may plausibly infringe copyright. The output claims will therefore go forward and, depending on the outcome, may help shape how freely AI companies can repackage internet news content.

The case has advanced past the pleadings stage and is now in active discovery.

We will closely follow input and output questions as they work their way through the courts. How these questions are handled, in both judicial and legislative forums, will have profound implications for the future of AI development and for the fundamental balance that copyright law seeks to strike between protecting creative works and fostering innovation.

OpenAI Copyright Infringement Litigation
Consolidation of the Cases in Multidistrict Litigation (MDL #3143). One of the most significant of the AI-copyright cases currently pending is the consolidated multidistrict litigation against OpenAI Inc. (creator of ChatGPT) and one of its investors, Microsoft. This consolidation represents the first major judicial effort to coordinate the sprawling litigation against one of the world’s most prominent AI companies.

The consolidation occurred on April 3, 2025, when the Judicial Panel on Multidistrict Litigation ordered the centralization of related actions pending in both the Northern District of California and the Southern District of New York (there are now 19 cases in the MDL). The consolidation is before Judge Sidney H. Stein in the Southern District of New York, located in Manhattan. (The district courts are the trial courts of the federal judicial system; since copyright law is a federal matter, all such cases are filed in the federal courts.)

You can follow developments in the OpenAI consolidated litigation through Public Access to Court Electronic Records (PACER). To access the litigation, use MDL #3143.

In deciding in favor of consolidation (transfer order), the judicial panel found that these cases “share factual questions arising from allegations that OpenAI used copyrighted works, without consent or compensation, to train their large language models (LLMs).”

Some of the cases focused on input-related claims and some on output claims. The panel rejected concerns raised by plaintiffs about the lack of uniformity among claims, finding that “the differences between the training claims and the output claims were not a significant obstacle to centralization given the substantial overlap in factual questions and discovery relating to defendants’ training of their LLMs.”

Each action, the panel said, will involve “overlapping, complex, and voluminous discovery,” regarding how defendants trained and designed their LLMs. And given the novel and complicated nature of the technology, the panel added, there will likely be overlapping experts across these actions.

Centralization, the panel said, will eliminate duplicative discovery; prevent inconsistent pretrial rulings, particularly as to class certification; and conserve the resources of the parties, their counsel, and the judiciary.

The OpenAI Copyright Infringement Litigation (MDL #3143) encompasses cases brought by various plaintiff groups, including:

• The New York Times Company,
• The (New York) Daily News, Ziff Davis, and other publishers,
• The Center for Investigative Reporting,
• Multiple class actions by authors, including complaints by bestselling authors John Grisham, David Baldacci, George R.R. Martin, Ta-Nehisi Coates, and comedian Sarah Silverman.

The New York Times Litigation. The flagship case within MDL #3143 is The New York Times Company v. Microsoft Corporation, et al., filed on December 27, 2023 (Southern District of New York, 23-cv-11195).

In its complaint, the Times alleged that OpenAI and Microsoft “used millions” of its copyrighted articles to train ChatGPT and other AI models without authorization, causing economic harm by creating a “market substitute” for its journalism and undermining its subscription and advertising revenue models. The complaint further alleged that the AI systems generate “hallucinations” — false information attributed to The Times — that damage its reputation and brand.

In their consolidated litigation (MDL #3143), plaintiffs asserted allegations that mirrored those of The Times: that defendants infringed plaintiffs’ copyrights by downloading and reproducing plaintiffs’ works, using those works as inputs to train OpenAI’s LLMs and creating infringing works in the outputs of OpenAI’s consumer-facing AI product, ChatGPT.

In creating its outputs, plaintiffs argued, ChatGPT relies on OpenAI’s LLMs, which are “trained” by supplying the LLMs with large amounts of text, allowing the models to identify relationships among words in the training data.

After training, the LLMs can generate responses to user prompts that resemble human-authored text. Plaintiffs claim that when prompted by users, ChatGPT can generate accurate summaries of plaintiffs’ copyrighted works and outlines for potential sequels, which plaintiffs allege are unauthorized and infringing.

Discovery Disputes and Data Preservation. Early in the litigation, Judge Stein denied OpenAI’s request to dismiss authors’ output claims that text generated by OpenAI’s ChatGPT infringes their copyrights. In his order denying dismissal, Judge Stein said that the authors may be able to prove that the text ChatGPT produces is similar enough to their work to violate their copyrights.

In his order, Judge Stein did not address a central claim in the consolidated litigation — the input claim that the authors’ copyrights were infringed when OpenAI used their work to train its LLM systems. In litigation over the issue, OpenAI and other tech companies argue that the use of copyrighted material in AI training is allowable under the fair use doctrine. The input claim will be the focus in the next phase of the litigation.

Concerning the input claim, the OpenAI litigation has featured a discovery fight over the relevance of massive user data requested by The Times and the other news organizations. The plaintiffs are asking the court to order OpenAI to produce 20 million ChatGPT user chat logs, claiming they need the outputs to conduct expert analysis on topics such as how the defendant pulled news content and how often it hallucinated and generated false outputs attributed to news outlets.

The court granted the news organizations’ request for user chat log documents on Nov 7. But OpenAI asked the court to reconsider, raising privacy concerns about releasing user data on such a large scale. The plaintiffs responded by citing OpenAI’s privacy policy, which states that it may use personal data to “comply with legal obligations.” The plaintiffs noted in a memo to the court that they had been waiting months for OpenAI to de-identify the data to resolve user privacy concerns, and that “OpenAI should not be heard to claim that its own lengthy anonymization process is insufficient to mitigate the privacy concerns that it was designed to address.”

On Jan 5, 2026, the court ruled against OpenAI in its efforts to keep ChatGPT chat logs secret. The court said the logs were necessary for plaintiffs as they attempt to prove their copyright claims and that there were “multiple layers of protection” to assure consumer privacy.

This discovery dispute highlights the novel evidentiary challenges in this case and other AI litigation, where proving infringement may require analyzing massive datasets of AI-generated outputs, many of which may be ephemeral or deliberately deleted.

Current Status of NY Times MDL #3143. As of May 2026, the MDL remains in active discovery. It is expected to proceed through extensive fact discovery, chat log evaluation, and expert witness testimony regarding the technical operation of LLMs before the litigation reaches dispositive motions (such as motions for summary judgment) or trial. Barring an early settlement, it is certain to be a drawn-out process.

A case handed down by the U.S. Supreme Court on March 25, 2026, Cox Communications, Inc. v. Sony Music Entertainment, is likely to strengthen the argument of internet service providers that they cannot be held liable for what users may do with ISP outputs. In Cox, the Court held that a service provider is contributorily liable only if it intended that its service be used for infringement.

In light of the Cox decision, the New York Times Co. has dropped its claim accusing OpenAI Inc. of contributing to ChatGPT users’ infringement of articles, but the opinion says nothing about the input side: whether the use of copyrighted material in AI training constitutes fair use.

AI defendants going forward can be expected to cite Cox in the output-side cases, arguing that the AI platforms offer a general-purpose tool with massive noninfringing uses, and that when a user prompts a chatbot to generate copyright-infringing output, it’s the user, not the platform, who’s the direct infringer.

Related Lawsuits. A lawsuit that mirrors the issues in the New York Times consolidated litigation was recently filed against OpenAI and Microsoft by a group of regional publishers — including the Hartford Courant Co., the Los Angeles Daily News Publishing Co., and the San Diego Union-Tribune. The suit has been consolidated into MDL #3143.

Also, the New York Times on Dec 5 filed suit against AI startup Perplexity, alleging AI-copyright claims similar to those in the ongoing Times case against OpenAI.

(It is notable that in May 2025, The Times struck a multiyear deal with Amazon to license Times editorial content for use in Amazon’s artificial intelligence platforms. It was The Times’s first licensing arrangement involving generative AI. No financial terms were disclosed.)

The Anthropic Settlement: A Watershed Moment
Background of Bartz et al. v. Anthropic. While the OpenAI MDL litigation continues, a parallel case against AI company Anthropic PBC reached a historic settlement in August 2025 that may significantly influence the OpenAI case and other pending AI-copyright cases. In Bartz v. Anthropic, plaintiff authors in a proposed class action sued Anthropic in the Northern District of California (San Francisco), alleging that the company used copyrighted books without authorization to train its Claude family of LLMs. Amazon.com Inc. is a major backer of Anthropic.

The complaint alleged that Anthropic pursued a strategy of amassing a central library of “all the books in the world” to retain “forever.” The company’s strategy, according to the lawsuit, included:

• Purchasing physical books, removing their bindings, scanning them, and creating digitized, searchable files,
• Downloading more than seven million digital copies of books from pirate website libraries (online repositories that provide unauthorized and free access to copyrighted materials).

Judge Alsup’s June 2025 Ruling. On June 23, 2025, presiding judge William Alsup issued a groundbreaking summary judgment ruling that bifurcated the fair use analysis. The court ruled:

(1) Lawfully Purchased Books: Anthropic’s practice of purchasing physical books and creating digital copies through “destructive digitization” for the purpose of training AI models constitutes fair use. Judge Alsup found this use “transformative — spectacularly so” because the books were not used to replicate or supplant the original works but rather to enable the AI model to identify patterns and create fundamentally different outputs.

(2) Pirated Books: By contrast, Judge Alsup refused to grant summary judgment on Anthropic’s fair use defense for books obtained from pirate libraries. The court held that downloading books from pirate sites does not constitute fair use.

This split ruling created significant risk for Anthropic. Under 17 U.S.C. § 504(c), statutory damages for willful copyright infringement can reach $150,000 per infringed work. With more than seven million pirated book copies at issue, Anthropic faced potential damages in the tens of billions of dollars.
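The scale of that exposure can be illustrated with a short calculation. The sketch below is a rough illustration only: it assumes damages are counted once per infringed work, uses the $150,000 willful ceiling cited above together with the ordinary $750 statutory floor in § 504(c), and runs on hypothetical work counts that are not findings of the court.

```python
# Illustrative sketch of statutory damages exposure under 17 U.S.C. § 504(c).
# The work counts used below are hypothetical, chosen only to show the scale.

STATUTORY_MIN_PER_WORK = 750       # ordinary statutory floor per infringed work
WILLFUL_MAX_PER_WORK = 150_000     # ceiling for willful infringement (cited above)


def exposure_range(num_works: int) -> tuple[int, int]:
    """Return (floor, willful ceiling) in dollars for num_works infringed works."""
    return (num_works * STATUTORY_MIN_PER_WORK,
            num_works * WILLFUL_MAX_PER_WORK)


if __name__ == "__main__":
    for works in (100_000, 500_000):  # hypothetical counts
        low, high = exposure_range(works)
        print(f"{works:,} works: ${low:,} to ${high:,}")
```

Under these assumptions, even a few hundred thousand works at the willful ceiling produces a figure in the tens of billions of dollars, which is the order of magnitude described above.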

The $1.5 Billion Settlement. Rather than proceed to trial and risk a potentially ruinous damages award, Anthropic reached a settlement in principle in August 2025, which was approved by Judge Alsup on Nov 3, 2025. The settlement terms include:

(1) Monetary Compensation: Anthropic will pay a minimum of $1.5 billion, representing approximately $3,000 for each of the estimated 500,000 copyrighted works downloaded from pirate libraries. If the final list of infringed works exceeds 500,000 titles, Anthropic must pay an additional $3,000 per work.

(2) Destruction of Pirated Materials: Anthropic must destroy all datasets containing pirated books and any derivative copies, with written confirmation of complete and permanent removal from its systems.

(3) Structured Payment Schedule: Settlement proceeds will be paid in four installments between Oct 2, 2025, and Sept 25, 2029, though Anthropic may elect to fund the entire amount earlier.

(4) Limited Scope of Release: Critically, the settlement only releases claims for past conduct occurring before Aug 25, 2025, and only for identified works. It does not cover:
• Future use of copyrighted materials,
• Claims based on infringing outputs from AI models,
• Works not included in the final settlement list,
• Use of any copyrighted materials acquired after the settlement date.
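The monetary term in item (1) above reduces to simple arithmetic, sketched below. This is an illustration of the formula as described in this article, not the settlement agreement's own language; the function name and the sample work counts are hypothetical, and the sketch assumes the $1.5 billion floor applies even if the final list is shorter than 500,000 works.

```python
# Illustrative sketch of the payment formula described in item (1) above.
# Function name and sample inputs are hypothetical.

PER_WORK_PAYMENT = 3_000          # dollars per work, as described above
BASELINE_WORKS = 500_000          # estimated number of works covered
MINIMUM_FUND = PER_WORK_PAYMENT * BASELINE_WORKS  # $1.5 billion floor


def settlement_total(final_work_count: int) -> int:
    """Return the total payment in dollars for a given final list of works."""
    # Assumption: the fund never drops below the stated minimum.
    extra_works = max(0, final_work_count - BASELINE_WORKS)
    return MINIMUM_FUND + extra_works * PER_WORK_PAYMENT


if __name__ == "__main__":
    for count in (500_000, 510_000):  # hypothetical final counts
        print(f"{count:,} works: ${settlement_total(count):,}")
```

In this sketch, each additional 10,000 works on the final list would add $30 million to the fund.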

Significance of the Settlement. The Anthropic settlement represents the largest publicly reported copyright recovery in United States history. Its significance extends beyond the monetary amount in several ways:

(1) Judicial Acceptance of Training on Lawfully Acquired Works: Judge Alsup’s ruling that the copying of lawfully acquired books for AI training constitutes fair use provides a potential roadmap for AI companies and for pending and future litigation.

(2) Clear Rejection of Piracy: The settlement sends a clear signal that AI companies cannot obtain training data from pirate websites with impunity. The $3,000-per-work payment substantially exceeds what many authors might receive through legitimate licensing, demonstrating the severe financial consequences of using pirated materials.

(3) Emphasis on Licensing Markets: Statements by both parties to the litigation emphasized the settlement’s role in fostering legitimate licensing arrangements. Attorneys for the authors stated that it “sets a precedent requiring AI companies to pay copyright owners,” while industry observers predicted it would “foster further licensing” as AI companies seek to avoid similar liability.

(4) No License for Future Use: Unlike typical licensing agreements, the settlement does not grant Anthropic any rights to use the copyrighted works going forward. Copyright holders retain complete control over future use of their works, preserving their ability to negotiate licenses on favorable terms.

(5) Output Claims Preserved: The Anthropic settlement dealt with data inputs to the training model. It explicitly did not release any claims based on infringing outputs from Claude or other Anthropic models, leaving this critical issue unresolved.

It should be emphasized that Judge Alsup’s ruling in Bartz v. Anthropic that the use of lawfully obtained works in the training process constitutes fair use does not, in terms of the law, create a legal precedent that other courts (or even the same court) must follow; Judge Alsup presides over a trial court and only an appellate court can create binding precedent. But the Anthropic court’s ruling can be highly persuasive, offering a blueprint that trial courts may wish to follow in similar ongoing and future AI-copyright litigation. Nor does the settlement resolve output-infringement questions like substantial similarity and who may be liable when that occurs, the AI company or the user, or both.

Other Pending AI-Copyright Litigation
Cases involving AI input (training) and/or output issues are being filed in increasing numbers. Here are some of the most notable:

Visual Arts Cases: Getty Images v. Stability AI. In this ongoing litigation, first filed in Delaware federal court in February 2023 and refiled in San Francisco federal court in August 2025, Getty alleges that Stability AI, creator of the Stable Diffusion image generator, unlawfully used 12 million Getty images to train its generative AI model. A central piece of evidence is that Stable Diffusion outputs sometimes include distorted versions of the Getty Images watermark, suggesting not just training use but partial reproduction.

Separately, a proposed class action by visual artists, Andersen v. Stability AI, challenges Stability AI, Midjourney, and DeviantArt for using billions of images scraped from the internet without permission. This litigation involving generative AI raises training and output questions similar to those in the text-based LLM cases.

Entertainment Industry: Disney v. Midjourney. In June 2025, Disney and Universal filed a complaint in Los Angeles federal court against image generator company Midjourney, alleging “mass copyright infringement” on “an almost unimaginable scale.” The complaint includes dozens of examples of Midjourney-generated images allegedly depicting copyrighted characters, including Darth Vader, Elsa from Frozen, the Minions, and Shrek.

In the lawsuit, the studios contend that Midjourney “pirated entire creative libraries to train its AI models, enabling users to create ‘endless unauthorized copies’ of some of the most valuable characters in entertainment history.” The case is ongoing, with Midjourney asserting a transformative fair use defense.

The Disney/Universal lawsuit is particularly significant because it involves high-value entertainment properties where even small amounts of unauthorized reproduction can result in enormous monetary damages. The studios describe Midjourney as a “quintessential copyright free-rider and a bottomless pit of plagiarism.”

Music Industry: Record Labels v. Suno and Udio. In June 2024, record labels including Universal Music Group, Sony Music Entertainment, and Warner Music Group, coordinated by the Recording Industry Association of America (RIAA), filed lawsuits against AI music generators Suno AI and Udio for alleged copyright infringement on “an almost unimaginable scale.”

The suits allege that these text-to-music platforms, which allow users to generate full songs from text prompts, were trained on “vast quantities of sound recordings from artists across every genre, style, and era” without authorization. The complaints cite evidence including:
• AI-generated vocals indistinguishable from Bruce Springsteen, Lin-Manuel Miranda, and Michael Jackson,
• Outputs that replicate producer tags of Jason Derulo and CashMoneyAP,
• Generated songs nearly identical to classics like Mariah Carey’s “All I Want for Christmas Is You” and The Temptations’ “My Girl.”

In August 2024, both Suno and Udio filed answers admitting they trained their generative AI models on copyrighted music but asserting fair use defenses. Suno stated it was “no secret” that it ingested “essentially all music files of reasonable quality that are accessible on the open Internet.” Both companies argue that their use is transformative because they create “intermediate copies” that are never heard by anyone and serve only to teach the AI to recognize musical patterns.

The RIAA characterized the defendants’ admissions as a “major concession” and rejected their fair use claims, stating: “There’s nothing fair about stealing an artist’s life’s work, extracting its core value, and repackaging it to compete directly with the originals.”

While the Suno and Udio litigation continues, Warner Music Group (WMG) has opted to settle with the music generators. On Nov 19, WMG announced a settlement with Udio that includes a licensing deal under which WMG artists can opt in to license their works on Udio’s new subscription service, which will be launched next year.

(In a challenge to the settlement, on June 5, 2026, the American Federation of Musicians of the U.S. and Canada filed a lawsuit against Warner Music Group and Universal Music Group, alleging that the companies violated a collective bargaining agreement by licensing recordings to artificial intelligence companies without compensating or notifying union musicians whose performances were used.)

UMG was the first of the major music companies to settle with a generative AI platform, and news of its settlement set in motion an end-of-year domino effect of other deals in the music industry.

One of these deals occurred on Nov 25, when WMG and Suno announced that the parties had settled a lawsuit over the unlicensed use of music from the record label’s repertoire to train Suno’s AI model. According to the announcement, Suno will launch an entirely new model in 2026 in which artists and songwriters will have full control over “whether and how their names, images, likenesses, voices, and compositions are used in new AI-generated music.” Free tier users of the new AI model will not be allowed to download audio outputs but instead can play or share them. Paid tier users have limited monthly download caps and the ability to pay for additional downloads.

The settlements of course do not resolve Suno and Udio’s assertions of fair use, which remain a fundamental issue in ongoing litigation, but they may signal an openness on the part of AI companies to consider licensing agreements as a way out of costly and protracted litigation. Such agreements have been on the rise, most notably since the Anthropic settlement in November 2025.

Separate from the Suno and Udio litigation, and the Anthropic settlement agreement discussed above, music publishers including Universal Music Publishing Group have sued Anthropic for allegedly reproducing copyrighted song lyrics through its Claude AI assistant. This case focuses on output-based infringement rather than input training. The case was originally filed in Nashville federal court but was transferred in June 2024 to federal court in San Francisco.

Disney/OpenAI Licensing Agreement. In a major generative AI development, on Dec 11 Disney and OpenAI announced a three-year licensing agreement that will allow Sora, OpenAI’s video generation model, to generate short, user-prompted social videos that can be viewed and shared by fans, drawing on more than 200 Disney, Marvel, Pixar and Star Wars characters. As part of the agreement, some fan-inspired short-form videos will be available to stream on Disney+.

In March 2026, however, OpenAI shut down Sora, and as a result the Disney licensing agreement was effectively terminated before it meaningfully launched.
