Copyright Litigation

Tracking Litigation at the Intersection of
Copyright and Artificial Intelligence

Editor: Robert S. Want rwant@LegalEditor.com  
(347) 804-6763

 


Additional Recent Filings – Cont.
February 24, 2026
YouTuber Alleges Runway AI Unlawfully ‘Scrapes’ Videos in Its AI Training
A YouTube creator has filed a proposed class action accusing Runway AI of unlawfully scraping and downloading copyrighted videos to train its generative artificial intelligence system without consent.

Plaintiff alleges that Runway AI “unlawfully accessed and extracted copyrighted videos from YouTube” to train its models, bypassing the platform’s technological protection measures in violation of the Digital Millennium Copyright Act.

According to the complaint, YouTube provides “streaming-only access” and prohibits users from using automated means such as “robots, botnets, or scrapers” to access content. The suit alleges Runway used automated tools to download videos while “bypassing the platform’s protective measures,” then incorporated those works into its commercial AI products.

February 10, 2026
Author Sues Adobe, Alleging AI Model Trained on Pirated Books
A New York author has filed a proposed class action against Adobe Inc., accusing the software company of using pirated books to train its artificial intelligence models without authorization or compensation.


In a complaint filed in U.S. District Court for the Northern District of California, plaintiff alleges that Adobe engaged in the “surreptitious, non-consensual use and collection of author’s books and written works” to train its SlimLM small language models. SlimLM was pretrained, the lawsuit says, on the SlimPajama dataset, which contains “hundreds of thousands of copyrighted books that were acquired without the authorization or consent of the author.”

Plaintiff argues that Adobe’s conduct was not “fair use,” asserting that the company could have lawfully licensed the works but instead relied on datasets sourced from “pirate” libraries and unauthorized web crawls. The proposed nationwide class seeks statutory damages, injunctive relief, and destruction of infringing copies.

January 28, 2026
Anthropic Faces New Music Publisher Lawsuit Over Alleged Piracy

Major music publishers, including Concord Music Group, Universal Music Publishing Group, and others, have filed a new federal lawsuit accusing AI company Anthropic and its founders of copyright infringement involving musical compositions.

Anthropic illegally downloaded millions of pirated books containing copyrighted song lyrics and sheet music using BitTorrent, a file-sharing system “synonymous with internet piracy,” according to the complaint filed in San Francisco federal court. The publishers claim Anthropic used “illegal shadow libraries” such as Library Genesis and Pirate Library Mirror to obtain the works without permission or payment.

The new lawsuit builds on an earlier ongoing lawsuit that the same publishers brought against Anthropic in 2023 over the use of their work to train Claude to respond to human prompts. According to the new suit, Anthropic’s founders, including CEO Dario Amodei and co-founder Benjamin Mann, authorized and participated in the torrenting. The publishers say Anthropic’s business was “built on piracy,” accusing the company of downloading and simultaneously uploading infringing copies, thereby multiplying the harm.

January 23, 2026
Creators Accuse Snap of Scraping YouTube Videos to Train AI
A group of YouTube content creators has filed a proposed class action accusing Snap Inc. (d/b/a Snapchat) of illegally scraping millions of copyrighted videos from YouTube to train and commercialize its generative artificial-intelligence products.


Filed in Los Angeles federal court, the lawsuit alleges Snap “unlawfully circumvent[ed] technological measures to access and scrape millions of copyrighted videos” in violation of the Digital Millennium Copyright Act. The plaintiffs say Snap bypassed YouTube’s access controls to obtain video files that are otherwise available only through streaming.

According to the complaint, Snap used automated tools, including video-downloading software and rotating IP addresses, to evade detection while downloading videos at scale. The creators allege Snap used the material to train text-to-video and image-to-video AI systems that now power features such as Snapchat’s “Imagine Lens,” calling the conduct “an unconscionable attack on the community of content creators.”

January 15, 2026 
Publishers Sue Google, Alleging Mass Copyright Theft to Train Gemini AI
Two major publishing companies have sued Google, accusing it of carrying out “one of the most prolific infringements of copyrighted materials in history” to train its generative artificial-intelligence system, Gemini.


In a proposed class action filed in federal court in San Jose, Calif., plaintiffs Hachette Book Group and Cengage Learning allege that Google illegally copied millions of copyrighted books to build and refine its Gemini AI models rather than paying for licenses. The publishers claim Google sourced books from pirate websites and from behind paywalls, repeatedly copying the works during multiple stages of AI training.

The lawsuit argues that Gemini can now generate outputs that directly substitute for original books, including “verbatim and near-verbatim copies,” summaries, and AI-generated knockoffs that mimic specific authors’ styles. This infringement, according to the complaint, displaced book sales, undermined a growing market for AI-training licenses, and caused “substantial and irreparable harm” to authors and publishers. 

December 29, 2025
xAI Sues California Attorney General Over AI Data Disclosure Law
Artificial intelligence company xAI, owned by Elon Musk, has filed a lawsuit in Los Angeles federal court against California Attorney General Rob Bonta, seeking to block a new state law that would require AI developers to publicly disclose detailed information about the data used to train their AI systems.


In its complaint, xAI argues that the new law forces companies to reveal sensitive trade secrets about their proprietary training datasets, harming competition and innovation. The law, according to the complaint, requires developers of generative AI systems to post online documentation describing the sources, sizes, and types of data used to train their models, including whether the data contains copyrighted material or personal information. xAI contends that those disclosures would provide competitors with “a roadmap to mirror their [xAI] success” and undermine protections long afforded to trade secrets.

xAI alleges that the California statute violates the Constitution’s Takings Clause by compelling the surrender of valuable intellectual property without compensation, calling the disclosures a “quintessential per se taking.” The company also claims that the law unlawfully compels speech in violation of the First Amendment and is unconstitutionally vague.

December 22, 2025
Major AI Companies Used Widespread Book Piracy to Train AI Models, Authors Allege

A group of authors and journalists has sued leading artificial intelligence companies in San Francisco federal court, accusing the companies of illegally copying copyrighted books from pirate websites to build and train lucrative AI systems.

According to the complaint, defendants Anthropic, Google, OpenAI, Meta Platforms, xAI, and Perplexity AI committed a “straightforward and deliberate act of theft” by downloading pirated copies of books and using them to train large language models without permission or payment. Defendants, the complaint says, sourced works from so-called shadow libraries such as LibGen, Z-Library, and OceanofPDF, which, plaintiffs say, are long-recognized hubs for piracy. Rather than licensing content, the companies reproduced the books into AI systems now valued in the hundreds of billions of dollars, the lawsuit alleges.

Plaintiffs contend the infringement was willful, noting that the illegal status of those libraries was widely known within the technology industry. The lawsuit says companies pressed ahead anyway because pirated books offered “gold-standard” training material that helped AI models learn narrative structure, syntax, and style. Filing as individuals rather than as a class, plaintiffs argue that defendant AI firms should not be allowed to resolve claims “for pennies on the dollar” through class settlements while continuing to profit from allegedly unlawful copying.

December 19, 2025
Google Accuses SerpApi of Illegal ‘Data Scraping’ Scheme
Google has sued data services company SerpApi, alleging it illegally “scraped” Google search results and circumvented technological safeguards designed to protect copyrighted content.


The lawsuit, filed in the U.S. District Court for the Northern District of California, accuses SerpApi of violating the Digital Millennium Copyright Act by bypassing Google’s anti-scraping technology, known as SearchGuard, to access and resell search results “at an astonishing scale.” Google says in its suit that SerpApi sends “hundreds of millions of artificial search requests each day,” a volume that has increased by as much as 25,000% over two years.

Google argues that SerpApi, which promotes itself as providing real-time access to Google search results, unlawfully copies and redistributes search results containing licensed images and other copyrighted material from products such as Knowledge Panels, Google Maps, and Google Shopping. The complaint describes SerpApi’s business model as “parasitic,” asserting that it takes content “for free” while denying Google and its partners compensation.

December 17, 2025
Adobe Sued for Allegedly Misusing Author’s Work In AI Training

An author has filed a proposed class action against Adobe Inc. alleging the software company illegally copied and used copyrighted books to train its artificial-intelligence language models without permission or compensation.

The lawsuit is part of a wave of cases — many of which are covered in this article — brought by copyright owners against tech companies over their AI training, and it is the first such case against Adobe.

Filed in federal court for the Northern District of California, the suit was brought by author Elizabeth Lyon on behalf of similarly situated writers. The suit claims Adobe infringed copyrights by using authors’ works in training its SlimLM small language models, which are designed for on-device document assistance on mobile devices. Lyon, who specializes in writing instructional books on how to market novels, argues in her suit that Adobe trained its AI models on unlicensed copyrighted materials, including books owned by the plaintiff and other authors.

December 17, 2025
Musicians Say AI Music Platform Infringes Copyrighted Works
A group of independent musicians and composers has filed a proposed class action accusing Kunlun Tech Co. and its subsidiary Skywork AI of illegally copying plaintiffs’ music and lyrics to train and operate an AI music generator called “Mureka.”


Defendants systematically copied works by independent artists to fuel a “commercial, mass-market music-generation engine,” while marketing Mureka as a source of “royalty-free” and “copyright-friendly” music for commercial use, according to the complaint filed in Chicago federal court. Plaintiffs say that defendants copied “massive quantities of sound recordings and musical works” without permission and used them to train models designed to generate “studio-quality,” “radio-ready” songs that directly compete with human-made music in licensing, streaming, and production markets.

The lawsuit further alleges violations of the Digital Millennium Copyright Act, asserting that defendants circumvented technological safeguards and stripped copyright-management information through practices such as stream-ripping and reference-track uploads. In addition to copyright claims, plaintiffs accuse defendants of unlawfully collecting and exploiting artists’ voiceprints and biometric data through voice-cloning features, in violation of Illinois privacy and publicity laws.

December 5, 2025
New York Times Sues Perplexity AI for ‘Illegal’ Copying of Content

With its case against OpenAI currently pending, The New York Times has filed a similar lawsuit against Perplexity AI, accusing the artificial intelligence startup of copying millions of The Times’ articles without permission to power Perplexity’s generative AI products.

In its suit filed in Manhattan federal court, The Times alleges that Perplexity engaged in “large-scale, unlawful copying and distribution” of its copyrighted content without permission or payment, despite “express and repeated objections.” The suit says Perplexity’s products generate summaries and responses that are “verbatim or near-verbatim reproductions” of Times reporting and features.

The Times claims that the conduct occurs at two stages: when Perplexity’s crawlers scrape Times articles for its search index, and when its AI outputs reproduce or closely paraphrase that material. The Times also alleges trademark violations, arguing that Perplexity falsely attributes fabricated or incomplete content to the newspaper, misleading users and damaging its brand. Such practices, the suit says, divert readers and revenue away from the publisher.

November 26, 2025
Regional Newspapers Sue OpenAI and Microsoft, Alleging ‘Massive’ Copyright Theft
Nine regional newspapers — including the Hartford Courant Co., the Los Angeles Daily News Publishing Co., and the San Diego Union-Tribune — have filed a lawsuit in Manhattan federal court accusing OpenAI and Microsoft of “willful” and ongoing mass copyright infringement through the training and operation of their AI models.


Plaintiffs argue in their complaint that defendants “pilfered, copied, memorized, and replicated” hundreds of thousands of their articles to build generative AI systems that now reproduce news content without authorization or compensation. The complaint asserts that OpenAI trained its models on copyrighted material “scraped from the internet, regardless of paywalls or other restrictions.”

Citing testimony from OpenAI CEO Sam Altman, the suit says defendants and other AI companies “rely on copyrighted material” and that it would be “impossible to train today’s leading AI models without using copyrighted materials.” Plaintiffs allege that defendants knew their models “memorized” protected works and can output “verbatim or near-verbatim versions” of news articles.

November 21, 2025

Lawsuit Accuses Figma of Illegally Using Customer Designs to Train Its AI

A proposed class action filed in San Francisco federal court alleges that design-software company Figma “juiced its valuation” ahead of its IPO by secretly using customer intellectual property to train its artificial-intelligence models.


According to the complaint, Figma violated explicit promises that users “own your User Content” and that the company “does not claim any ownership rights” over customer designs. The lawsuit says that in August 2024, Figma “silently and unilaterally opted vast swathes of its customers into its AI data training program,” despite years of assurances that it would not do so.

The suit says Figma used billions of dollars’ worth of proprietary customer designs to train AI tools that became central to its 2025 IPO, boosting its market value from $12 billion to nearly $70 billion. The complaint, unlike most unlawful data-training suits, does not allege copyright infringement but rather the illegal accessing of customer trade secrets.

November 21, 2025
Snowflake Charged in Alleged AI-Training Copyright Piracy
An author has filed a proposed class action lawsuit accusing data-cloud company Snowflake Inc. of illegally copying copyrighted books to train its artificial-intelligence models, in what the complaint calls mass copyright infringement tied to the company’s flagship “Arctic” large language models.


The lawsuit, filed by author Darius H. James in Montana federal court, alleges Snowflake trained its models using the RedPajama dataset, which included the pirated Books3 collection. The complaint states the training data relied in part on “unlicensed copyrighted materials,” including plaintiff’s registered works.

According to the complaint, Snowflake, headquartered in Bozeman, Mont., “downloaded, copied, stored, and used the RedPajama dataset” to develop Arctic, which was publicly launched in April 2024. The suit claims Snowflake repeated those acts during preprocessing and training while retaining the data for future model development. Plaintiff seeks a declaration of willful infringement and an order requiring destruction of all infringing copies.

November 5, 2025
Entrepreneur Sues Meta, Alleging “Industrial-Scale” Copyright Theft to Train Llama AI

Entrepreneur Media (publisher of Entrepreneur Magazine) has sued Meta Platforms in San Francisco federal court, accusing Meta of building its Llama artificial intelligence models on “systematic and widespread copyright theft” of books and magazine articles the publisher owns.

Entrepreneur claims in its lawsuit that Meta copied hundreds of its copyrighted works “without permission, without payment, and without regard for the law.” Meta “deliberately and willfully stole hundreds of terabytes of copyrighted works to train its family of LLMs, known collectively as ‘Llama,’” and opted for “industrial-scale piracy” rather than paid licensing agreements, according to the complaint.

The complaint asserts Meta obtained the works from “shadow libraries,” including the Books3 and LibGen datasets, and even distributed them through peer-to-peer file-sharing systems such as BitTorrent and LibTorrent. Further, the complaint adds, Meta removed copyright notices and other identifying metadata to conceal the infringement, with internal processes intended to “filter copyright lines.” 

October 22, 2025 
Perplexity Faces Reddit Lawsuit Over ‘Industrial-Scale’ Scraping of User Content
Social media platform Reddit Inc. has accused artificial intelligence company Perplexity AI and three data-scraping firms of unlawfully bypassing security measures to steal “industrial-scale” amounts of copyrighted Reddit content.

In its complaint filed in Manhattan federal court, Reddit likened the defendants to “bank robbers” who broke into the “armored truck” carrying its data, accusing them of “industrial-scale, unlawful circumvention” of anti-scraping protections. The complaint names Perplexity AI, SerpApi LLC, Oxylabs UAB, and AWMProxy as defendants, alleging that they collectively evaded Reddit’s and Google’s digital barriers to capture Reddit text, images, and videos from search results pages.

Reddit says in its lawsuit that Perplexity “was caught red-handed” when Reddit used “the digital equivalent of marked bills” to trace its content, even though Perplexity had already received a cease-and-desist letter. According to the suit, Perplexity’s citations to Reddit “increased forty-fold” after the company was told to stop. Reddit argues that the scraping violated the Digital Millennium Copyright Act’s ban on circumventing technological access controls. The suit seeks a court order barring defendants from using or selling Reddit data, saying “Reddit believes in an open internet but not the misuse of public content.”

October 22, 2025
Author Sues Apple, Alleging AI Training Used “Pirated Books” Without Permission
Apple “willfully” infringed millions of registered copyrighted works to train its new Apple Intelligence artificial-intelligence system, alleges a proposed class action filed in the Northern District of California.


In the complaint, author Tasha Alexander, suing on behalf of other writers, claims that Apple copied entire books from “pirated sources” and used them to develop Apple Intelligence, a suite of AI writing, image, and personal-assistant tools embedded across iPhones, iPads, and Macs. The lawsuit asserts that Apple “scrapes the internet and downloads pirated copies of [authors’] works, reproduces these works, and uses them to train its models.”

The suit says that Apple used datasets known to include bootlegged books, such as “Books3,” part of a larger dataset called “The Pile,” and that Books3 consists of the “full text of all 196,640 books” taken from a shadow library. Apple made no effort to seek permission or pay authors, even though it struck licensing deals with companies like Shutterstock, plaintiff states. 

October 15, 2025

Authors Accuse Salesforce of Pirating Nearly 200,000 Books to Train AI Models
Two authors have filed a proposed class action in San Francisco federal court accusing Salesforce Inc. of unlawfully copying “hundreds of thousands of copyrighted books” to build and commercialize its AI models, including the XGen series.

The lawsuit alleges Salesforce secretly downloaded and used massive book datasets known as “The Pile” and “RedPajama,” which contained the Books3 corpus — a library of roughly 196,000 books copied from pirate websites including Bibliotik, Z-Library and LibGen.

Salesforce “pirated hundreds of thousands of copyrighted books to develop its XGen series of large language models,” according to the complaint, despite describing the datasets as “legally compliant.”
The complaint claims that Salesforce stored, copied and repeatedly used the datasets throughout the AI-training process, benefiting commercially by selling access to the models through its Agentforce AI platform.

September 24, 2025

Encyclopedia Britannica Sues Perplexity Over AI ‘Answer Engine’
AI startup Perplexity AI has been accused by Encyclopedia Britannica and Merriam-Webster of misusing their content in its “answer engine” for internet searches.

According to the complaint filed in Manhattan federal court, Perplexity’s “answer engine” searches the internet in response to user requests and summarizes what it finds, providing an AI-based alternative to traditional search engines like Google. The complaint alleges that Perplexity’s system “free rides” on Britannica and Merriam-Webster’s work by summarizing their articles and diverting traffic that would otherwise go to their websites.

The lawsuit argues that Perplexity infringed plaintiffs’ copyrights by scraping their websites, copying their articles, and reproducing their content without permission. Plaintiffs also accuse Perplexity of violating their trademark rights by attributing AI-hallucinated material to them. “The law does not permit Perplexity’s systematic disregard for the rights and intellectual property of Britannica and Merriam-Webster,” the suit states, seeking damages, injunctive relief, and protection of the public’s access to trustworthy information.

April 24, 2025
Ziff Davis Sues OpenAI Over Alleged AI Misuse of Copyrighted Content
Digital media giant Ziff Davis has filed a lawsuit against artificial intelligence company OpenAI, accusing it of copyright infringement and other intellectual property violations tied to the use of plaintiff’s published content.

Ziff Davis, which owns publications like PCMag, Mashable, IGN, and Everyday Health, alleges that OpenAI “intentionally and relentlessly” copied millions of its articles without permission to train and operate its AI models. The complaint, filed in Delaware federal court, claims that OpenAI “reproduced, distributed, displayed, performed, and made available” Ziff Davis content in both verbatim and derivative forms.

The suit also asserts that OpenAI violated the Digital Millennium Copyright Act by removing copyright management information and bypassing anti-scraping directives like robots.txt. Ziff Davis contends that OpenAI’s scraping of its content even escalated after the media company explicitly demanded it stop. The suit follows other recent legal challenges from publishers and authors accusing AI firms of misappropriating their content.

October 21, 2024
Perplexity AI Faces Wall Street Journal Allegations of Copyright Theft
Perplexity AI has been sued by the publishers of The Wall Street Journal and the New York Post, both News Corp companies, for massive copyright infringement tied to Perplexity’s AI-powered “answer engine” that encourages users to “skip the links” to original news sites.


The lawsuit, filed in Manhattan federal court, alleges startup Perplexity AI “engag[es] in a massive amount of illegal copying of publishers’ copyrighted works,” using news articles from The Wall Street Journal and the New York Post to generate answers that act as substitutes for visiting the publishers’ websites. This practice, plaintiffs say, siphons off subscription, advertising, and licensing revenue.

Perplexity is accused in the suit of scraping and storing “hundreds of thousands” of articles and, in some cases, providing users with full, verbatim copies of paywalled stories. The publishers also allege Perplexity generates fabricated content, known as “hallucinations,” and falsely attributes it to their publications, which they say damages their brands and misleads readers.
