Copyright Litigation
Tracking Litigation at the Intersection of
Copyright and Artificial Intelligence
Editor: Robert S. Want rwant@LegalEditor.com
(347) 804-6763
Cases Continue to Be Filed
Licensing agreements between content creators and AI companies are increasing in frequency, though new cases involving AI-related copyright issues continue to be filed. Around 100 such lawsuits have already appeared on court dockets. Some of the more significant recent filings are included here. While most of these cases are still pending, a few have been settled or otherwise dismissed. For current status, log in to PACER.
June 1, 2026
CNN Sues Perplexity, Accusing AI Firm of Massive Copyright and Trademark Infringement
Cable News Network has filed a lawsuit in Manhattan federal court against Perplexity AI, alleging the artificial intelligence company unlawfully copied and distributed thousands of CNN news stories and other copyrighted works to power its AI products without authorization or compensation.
CNN contends in its suit that Perplexity built its business by crawling and storing CNN content, then using that material to generate answers for users through its chatbot and AI-powered browser. The complaint alleges that Perplexity copied more than 17,000 CNN articles, videos, images, and other works, undermining CNN’s ability to monetize its journalism.
According to the complaint, Perplexity markets its products as providing users with information while avoiding the “extra steps and clicks” associated with traditional search engines. CNN claims the company’s outputs often reproduce or closely paraphrase CNN content. The complaint also alleges trademark violations, arguing that Perplexity falsely suggests affiliations with CNN and generates inaccurate or fabricated content that is attributed to the news organization.
May 12, 2026
Journalists Accuse Google of Using Their Voices in AI Training
A group of prominent Illinois journalists, podcasters, and audiobook narrators has sued Google in federal court in Chicago, alleging that the company unlawfully used plaintiffs’ recorded voices to train commercial artificial intelligence products without their consent.
The complaint claims that Google extracted “voiceprints” from “hundreds of thousands of hours of human speech” used to develop products including Gemini Live, NotebookLM Audio Overviews, YouTube auto-dubbing, and Google Cloud Text-to-Speech. Plaintiffs include veteran broadcaster Carol Marin, Pulitzer Prize-winning podcaster Yohance Lacour, and audiobook narrators Lindsey Dorcus and Victoria Nassif.
According to the lawsuit, the plaintiffs’ recordings were used without “notice, informed written consent, [or] a written release” in violation of Illinois’ Biometric Information Privacy Act. The complaint describes a voiceprint as “a digital fingerprint of the human voice” that cannot be changed. “Google knew how to obtain consent,” the complaint states. “It chose to obtain consent for some voices and not others.” The suit further contends that Google’s AI voice tools now compete directly with the same journalists and narrators whose voices allegedly helped train the systems.
May 5, 2026
Publishers Sue Meta for Copyright Infringement Over AI Training
Major publishers and authors have filed a proposed class action in Manhattan federal court against Meta Platforms Inc. and CEO Mark Zuckerberg, accusing the defendants of widespread copyright infringement in building Meta’s Llama artificial intelligence system.
The complaint alleges that Meta “illegally torrented [downloaded] millions of copyrighted books and journal articles” and copied them repeatedly to train its generative AI models, calling the conduct “one of the most massive infringements of copyrighted materials in history.” Plaintiffs include Elsevier, Cengage, Hachette, Macmillan, McGraw-Hill, and author Scott Turow.
Meta sourced training data from pirate sites and unauthorized web scrapes, then reproduced the works “many times over” in developing its AI chatbot Llama, according to the complaint. The lawsuit further claims that Meta removed copyright management information to conceal the materials’ origins and facilitate unauthorized use. Plaintiffs contend that the AI system now generates outputs that compete with original works, including “verbatim and near-verbatim copies” and derivative content mimicking authors’ styles. They argue this conduct undermines licensing markets and deprives creators of compensation.
March 16, 2026
Britannica Launches Copyright Action Against OpenAI
Encyclopedia Britannica and Merriam-Webster have filed a lawsuit in Manhattan federal court accusing OpenAI of unlawfully copying their copyrighted content to train ChatGPT and generate responses that reproduce or mimic their articles and dictionary definitions.
OpenAI built its generative-AI systems by engaging in “massive copying” of the publishers’ copyrighted works without authorization or payment, according to the complaint. ChatGPT “free ride[s] on Plaintiffs’ trusted, high-quality content,” the complaint says, by generating summaries and answers that substitute for visits to Britannica and Merriam-Webster websites.
In the suit, Britannica, founded more than 250 years ago, and Merriam-Webster, a leading dictionary publisher for nearly two centuries, say they rely on subscription and advertising revenue to fund the work of writers, editors, and researchers who produce fact-checked reference content. The suit argues that OpenAI’s system diverts web traffic and revenue by delivering AI-generated answers that replicate or paraphrase that content. In some instances, plaintiffs claim, ChatGPT produced “near-verbatim reproductions” of Britannica articles or identical Merriam-Webster definitions.
March 10, 2026
Gracenote Sues OpenAI Over Use of Metadata in AI Training
Gracenote Media Services LLC accuses OpenAI of copying and using its proprietary entertainment metadata database without permission to train and power its artificial intelligence products.
In a complaint filed in Manhattan federal court, Gracenote says OpenAI “copied and used Gracenote Data to create and improve highly lucrative AI products like ChatGPT,” despite never obtaining a license or paying for the material. The company claims the data includes “millions upon millions of narrative descriptions, original video descriptors, unique identifiers, and other program elements” compiled by its editors over decades.
OpenAI’s models, according to the complaint, can generate exact program identifiers and reproduce descriptive summaries verbatim, demonstrating that the models “have been trained on Gracenote Data.” The lawsuit argues that the alleged copying threatens Gracenote’s core business of licensing metadata to media companies and emerging AI developers, and that OpenAI’s use of the material also enables customers to build competing metadata tools without compensation.
March 10, 2026
Independent Musicians Sue Google Over Alleged AI Music Copyright Infringement
A group of independent musicians and songwriters has filed a lawsuit accusing Google of unlawfully copying vast amounts of copyrighted music to train its artificial-intelligence music generator and related tools without permission or compensation.
According to the complaint filed in Chicago federal court, Google copied millions of recordings and lyrics to build its Lyria AI music-generation system and other models. Plaintiffs claim the company “copied millions of copyrighted sound recordings, musical compositions, and lyric[s]” and commercialized the resulting technology while failing to pay the artists whose work powered it.
Google’s own research papers, the complaint states, describe training AI models using massive music datasets. One paper cited in the complaint reported collecting roughly 50 million internet music videos and retaining about 44 million audio clips totaling nearly 370,000 hours of music. Another described training systems on five million audio clips representing about 280,000 hours of recordings. Plaintiffs argue that Google used its dominance in the music ecosystem, including YouTube and its Content ID system, to unlawfully obtain and process copyrighted recordings for AI development.
Key Legal Issues Across AI-Copyright Cases
Input-Based Infringement: Fair Use Defense. Every AI defendant in the growing body of pending litigation has asserted fair use as a primary defense. Under the Copyright Act, 17 U.S.C. § 107, fair use analysis requires consideration of four factors:
(1) Purpose and Character of Use: AI companies argue that their use is highly transformative. Rather than reproducing works for consumption, they claim they are using copyrighted materials as data to teach AI systems to recognize patterns and generate new, original content. This argument, defendants say, finds support in cases like Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015), where the Second Circuit held that creating a searchable database of books was transformative fair use.
Plaintiffs counter, however, that AI companies’ use is commercial, not transformative, because the AI outputs that result from training on input data compete directly with the original works and serve similar functions. They cite Andy Warhol Found. for Visual Arts, Inc. v. Goldsmith, 598 U.S. 508 (2023), where the Supreme Court held that the first factor weighs against fair use when a secondary use shares substantially the same purpose as the original and is commercial in nature.
(2) Nature of Copyrighted Work: This factor typically favors plaintiff content creators because the works at issue — novels, news articles, photographs, songs — are highly creative works at the core of copyright protection.
(3) Amount and Substantiality: AI training typically involves copying entire works, which weighs against fair use. AI companies argue, however, that wholesale copying is necessary for their transformative purpose, analogizing to Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146 (9th Cir. 2007), which held that copying entire images to create thumbnails was fair use.
(4) Market Effect: This is often treated as the most important fair use factor and is the most hotly contested. Plaintiffs argue that AI systems directly harm their markets by serving as substitutes for their works, eliminating sales, subscriptions, and licensing revenue. They point to emerging licensing markets for AI training data as evidence that AI platforms are aware of market harm.
AI defendants respond that their services do not serve as market substitutes because users of their AI models are not seeking to access the specific copyrighted works but rather to generate new content. But when AI outputs closely replicate copyrighted works, this argument becomes substantially weaker.
Output-Based Infringement. Even if courts ultimately hold that input training on copyrighted works is fair use, substantial questions remain about when AI outputs infringe copyrights. Courts will need to apply traditional substantial similarity analysis to AI-generated content, asking whether outputs are substantially similar to copyrighted works in ways that appeal to the ordinary observer.
This analysis is complicated by the probabilistic nature of AI generation, in which outputs vary from one prompt to the next, and by the unsettled question of who the infringer is (the AI company, the user, or both).
Secondary Liability. The contributory infringement claims in the OpenAI MDL raise questions about AI companies’ liability for user-generated infringement. If AI systems, such as ChatGPT and Claude, enable users to generate infringing outputs, are the AI companies liable even if they do not directly create the infringing content?
A recent Supreme Court decision suggests a high bar for finding AI companies liable.
In Cox v. Sony, the Court held that internet service providers are not liable for copyright infringement by users simply for providing internet access, even if they know infringement is occurring. The ruling establishes that “contributory liability” requires proof that an internet service provider actively induced infringement or tailored its services specifically for illegal activity.
Digital Millennium Copyright Act. Several of the pending complaints noted above include claims under the DMCA, alleging that AI companies removed or altered copyright management information (which identifies the authors of a copyrighted work or the conditions for its use) when the companies used such works for training. These claims under 17 U.S.C. § 1202 may provide an alternative path to liability independent of the fair use analysis.
Impact of the Anthropic Settlement on Pending Litigation
Though it did not create legal precedent (only a ruling by an appellate court can do that), the Anthropic settlement is likely to influence ongoing and future AI copyright litigation in several significant ways:
Pressure to Settle or License. The $1.5 billion Anthropic settlement demonstrates that AI-copyright litigation presents enormous financial risk for defendants, particularly when pirated materials are involved. AI companies facing similar claims may feel increased pressure to settle rather than risk trial.
The settlement may also accelerate legitimate licensing negotiations. AI companies may conclude that paying for licenses up front is more cost-effective than defending protracted litigation that carries the risk of devastating damages awards. Several major AI companies have already entered into licensing agreements with content providers (as in the Suno and Udio cases discussed above), and this trend will likely accelerate.
Strengthened Plaintiff Negotiating Position. Content creators and copyright holders now have concrete evidence that courts may award substantial damages for AI training on pirated content. This strengthens their negotiating position in licensing discussions and may lead to higher licensing fees and more favorable terms.
The Authors Guild and other content creator organizations have stated that they view the Anthropic settlement as establishing a precedent for AI companies paying copyright owners a licensing fee for use of their material in the training process. While in legal terms no formal precedent has been established, the practical effect may be similar as content owners demand comparable compensation.
Distinction Between Lawfully and Unlawfully Acquired Training Data. Judge Alsup’s bifurcated ruling in the Anthropic litigation and the resulting settlement create a clear distinction between training on lawfully acquired versus pirated content. This may lead AI companies to:
• Conduct thorough audits of their existing training datasets to identify and remove pirated materials,
• Implement stricter protocols for acquiring training data,
• Focus on licensing agreements and legally purchased materials.
However, the extent to which Judge Alsup’s fair use holding for lawfully acquired books will be adopted by other courts remains uncertain. The holding applies only to the specific facts of the Anthropic case and binds only the parties to that case.
Continued Uncertainty on Output Infringement. Critically, the Anthropic settlement explicitly preserves all claims based on infringing outputs. This signals that even if the training question is resolved favorably for AI companies, substantial litigation over output-based infringement is almost certain to continue. Courts will need to develop frameworks for analyzing when AI-generated content infringes copyrights, which may vary significantly across different types of creative works (text, images, music, code).
Implications for the AI Industry
The Future of AI Development. The current wave of copyright litigation presents an existential challenge for some AI companies while creating opportunities for others:
AI Companies with Licensed Training Data. Companies that have invested in legitimate licensing arrangements or developed methods for training on licensed or public-domain data will have a competitive advantage. Their business models are more sustainable and less vulnerable to catastrophic copyright liability.
Startups and Smaller Players. The potentially massive damages in AI-copyright cases may deter venture capital investment in AI startups that have not secured proper licensing. The barrier to entry for AI development may increase substantially as licensing costs and legal risks rise.
Technical Innovation. AI companies may invest in developing models that can be trained effectively on smaller, fully licensed datasets or on synthetic data. And privacy-preserving methods may become more important as companies seek to minimize copyright risk.
Evolution of Licensing Markets. Settlements in AI-copyright litigation are accelerating the development of robust licensing markets for AI training data. Beyond the licensing deals discussed above, recent developments include:
• Reuters licensing its news archive to OpenAI,
• Reddit licensing user-generated content to AI companies,
• Getty Images creating its own licensed AI image generator,
• Meta reaching licensing agreements with CNN, Fox News, and other media outlets.
Licensing markets may evolve to include:
• Standardized licensing terms for different types of content,
• Collective licensing organizations (similar to ASCAP and BMI for music performance rights),
• Tiered licensing based on model size, commercial use, and output restrictions,
• Revenue-sharing arrangements where content creators receive ongoing royalties.
Implications for Creators
The outcomes of current litigation will fundamentally shape content creators’ rights and economic interests:
Best Case Scenario for Creators. Courts reject AI companies’ fair use defense for training and require comprehensive licensing. Creators thus gain substantial licensing revenue and maintain control over how their works are used. The Anthropic settlement provides a model for compensation.
Worst Case Scenario for Creators. Courts adopt broad fair use holdings similar to Judge Alsup’s ruling on lawfully acquired works, potentially extending to other acquisition methods. AI companies obtain a largely free license to train on any publicly available or lawfully acquired copyrighted works. Creators’ ability to control and monetize their works for AI purposes is severely limited.
Likely Middle Ground. Courts may adopt nuanced approaches, potentially finding fair use for some training uses (particularly educational and research purposes) while requiring licensing for commercial AI applications. Output-based infringement claims will likely proceed even if training is deemed fair use, providing creators some protection against direct reproduction of their works.
Conclusion
The current wave of AI-copyright litigation represents a watershed moment in intellectual property law. The cases currently pending and those yet to be filed in federal courts across the country will fundamentally shape the relationship between artificial intelligence and copyright for decades to come.
The OpenAI multidistrict litigation in the Southern District of New York stands at the center of this legal battle, involving some of the world’s most prominent news publishers, bestselling authors, and valuable copyrighted works. Judge Stein’s early rulings denying motions to dismiss core infringement claims signal that these cases will proceed through extensive discovery and potentially to trial, where juries will grapple with complex questions about substantial similarity, market harm, and the nature of AI-generated content.
The $1.5 billion Anthropic settlement demonstrates that AI-copyright litigation presents existential financial risks for companies that train on pirated content while simultaneously providing a potential roadmap for resolving training disputes through licensing and compensation. The settlement’s sharp distinction between lawfully and unlawfully acquired training data will influence how AI companies structure their data acquisition going forward.
However, substantial uncertainty remains. Judge Alsup’s ruling as a trial judge that input training on lawfully purchased books constitutes fair use is not legal precedent other courts must follow; only an appeals court can create binding legal precedent. Further, the question of when AI outputs infringe copyrights remains largely unresolved. And the distinction between direct and contributory infringement in the AI context raises novel questions that courts have not yet addressed.
As this litigation proceeds, several predictions seem reasonable:
(1) Continued Settlements: More AI companies will be inclined to settle rather than risk the massive damages demonstrated in the Anthropic case, particularly where pirated training data is involved.
(2) Increased Licensing: The combination of litigation risk and likely plaintiff victories will drive more AI companies to enter into licensing arrangements with content providers. We are seeing this happen already.
(3) Legal Clarity Through Trials or Appeals: Eventually, some cases will proceed to trial and appeal, providing binding precedent on key fair use and infringement questions. The OpenAI MDL seems likely to produce such precedent given the stakes involved and the parties’ resources, though such a resolution is a long way off.
(4) Potential Legislative Intervention: Congress may ultimately need to address the AI-copyright questions through legislation, potentially creating new frameworks that balance AI innovation with creator rights. The U.S. Copyright Office has already begun studying these issues and may recommend legislative solutions.
(5) International Dimensions: With parallel cases proceeding in the UK and potential litigation in the European Union and elsewhere, international coordination or conflicts may emerge regarding AI-copyright issues.
For now, content creators, AI companies, investors, and courts are navigating uncharted legal territory. The outcomes of pending cases will either validate the current approach of training AI systems on vast amounts of copyrighted content or require fundamental restructuring of how AI is developed.
What remains certain is that these cases represent far more than ordinary copyright disputes — they will help define the relationship between human creativity and artificial intelligence in the 21st century.
Copyright © 2026 WANT Publishing Co.
