How many AI training-data lawsuits are active?

More than 35 distinct cases were active in major Western jurisdictions as of Q2 2026, plus dozens of smaller and international actions. The largest by potential damages are NYT v OpenAI / Microsoft, the Authors Guild class action, Universal v Anthropic, and Getty v Stability AI.

What is the status of NYT v OpenAI?

The case is active in the Southern District of New York (SDNY), and discovery is underway as of Q2 2026. Both sides have filed substantial motions, and the case is widely expected to proceed to trial unless it settles. The outcome is likely to shape fair-use doctrine for AI training generally.

Are AI labs paying for training data now?

Increasingly, yes. Major deals include Reddit (~$60M/yr), News Corp (~$250M over 5 yrs), and many smaller partnerships. Estimated 2025 licensing spend across major AI labs is $1.5-3 billion, a highly uncertain figure. The shift from "scrape and litigate" to "license and pay" is well underway, though scraping continues for many sources.

Is fair use a viable defence?

Early rulings are mixed: some courts have been receptive to transformative-use arguments, and others have not. The Authors Guild and NYT cases are the most likely to produce doctrinal precedent. AI labs are simultaneously pursuing fair-use defences and acquiring licenses as a hedging strategy.

What about EU and UK?

Article 53 of the EU AI Act requires general-purpose AI (GPAI) providers to publish training-data summaries; copyright matters remain governed by member-state law. The UK Getty v Stability AI case is at trial stage. EU jurisdictions have been more sceptical of US-style fair-use defences, so outcomes are expected to vary across jurisdictions.

AI Training Data Lawsuit Tracker 2026

The AI Copyright Litigation Landscape

AI training-data copyright cases became a defining legal issue of 2024-2026. Plaintiffs include news publishers, authors, music labels, image-rights holders, and visual-art creators; defendants include all major AI labs. Outcomes will shape AI economics for years. This page tracks active and resolved cases as of May 2026.

Key Findings

More than 35 distinct AI training-data lawsuits were active in major Western jurisdictions in Q2 2026, plus dozens of smaller and international actions.
The largest cases by potential damages are New York Times v OpenAI / Microsoft, the Authors Guild class action, Universal Music v Anthropic, and Getty Images v Stability AI.
Major settlements and licensing deals are reshaping the landscape: Reddit-OpenAI, AP-OpenAI, FT-OpenAI, News Corp-OpenAI, Conde Nast-OpenAI, Time-OpenAI, plus Anthropic licensing deals.
The fair-use defence has produced mixed early rulings; some courts have been receptive to transformative-use arguments, others sceptical of training-data appropriation at scale.
Training-data licensing markets emerged in 2024-2025 with ProRata, ScalePost, TollBit, and direct publisher-to-AI-lab deals.

Major Active Cases (Q2 2026)

Case	Plaintiff	Defendant	Filed	Status
NYT v OpenAI / Microsoft	The New York Times Company	OpenAI, Microsoft	Dec 2023	Active in SDNY; discovery underway
Authors Guild class action	Authors Guild + named authors	OpenAI	2023	Active; consolidated proceedings
Universal v Anthropic	Universal Music + others	Anthropic	2023	Active; lyrics-output cases
Getty v Stability AI	Getty Images	Stability AI	2023	UK and US; trial-stage
Concord et al. v Anthropic	Music publishers	Anthropic	2023	Active
Sarah Silverman et al.	Authors	OpenAI / Meta	2023	Active
Andersen v Stability / Midjourney / DeviantArt	Artists	Multiple AI image cos	2023	Active
Daily News et al. v OpenAI / Microsoft	News publishers	OpenAI, Microsoft	2024	Active
Center for Investigative Reporting	CIR	OpenAI, Microsoft	2024	Active
Major studios v Midjourney	Disney, Universal	Midjourney	2025	Active

Notable Settlements and Licensing Deals (through Q2 2026)

Counterparties	Approximate value	Year	Type
Reddit / OpenAI	~$60M/yr	2024	Multi-year licensing
Reddit / Google	~$60M/yr	2024	Multi-year licensing
News Corp / OpenAI	~$250M over 5 yrs	2024	Licensing
FT / OpenAI	Not disclosed	2024	Licensing + product partnership
AP / OpenAI	Not disclosed	2023	Licensing
Conde Nast / OpenAI	Not disclosed	2024	Licensing
Time / OpenAI	Not disclosed	2024	Multi-year
Vox Media / OpenAI	Not disclosed	2024	Multi-year
Hearst / OpenAI	Not disclosed	2024	Multi-year
Anthropic / various publishers	Not disclosed	2024-2025	Multiple

Legal Developments in 2026

Multiple summary judgment decisions on fair-use defences have shaped doctrine; rulings have been mixed
The UK Information Commissioner's Office and EU data protection authorities issued guidance on training-data lawfulness
EU AI Act's Article 53 training-data summary requirements created compliance obligations independent of copyright
Music industry settlements set precedent for music-label / AI-lab licensing structures
Class-action consolidation produced larger but slower-moving litigation

Training Data Licensing Markets

Three vendor categories emerged to facilitate training-data licensing:

Aggregator marketplaces: ProRata, ScalePost, and TollBit aggregate publishers and license their content to AI labs collectively
Direct publisher-to-lab deals: News Corp / OpenAI, Conde Nast / OpenAI, etc.
Pay-per-crawl infrastructure: Cloudflare Pay-Per-Crawl and similar primitives that monetise individual fetches

Implications for AI Lab Economics

Training-data licensing has shifted from optional to operationally significant:

Estimated 2025 licensing spend across major AI labs: $1.5-3 billion (highly uncertain)
Per-training-run licensing budget for frontier models: $50-200 million (estimated)
These sums are small relative to compute costs but growing, and they create competitive moats favouring well-capitalised labs
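
For a rough sense of scale, the publicly disclosed deal values in the settlements table can be annualised and compared with the estimated 2025 spend range. The sketch below is illustrative only. It assumes the News Corp total is spread evenly over its five-year term and that the 2024 deals were still running at the reported rates in 2025; neither assumption comes from the deal terms, and the names used (DISCLOSED_DEALS, annualised, share_range) are hypothetical.

```python
# Disclosed deal values from the settlements table, in millions of USD.
# Each entry: (counterparties, total value, term in years).
# Per-year figures are entered with a term of 1.
DISCLOSED_DEALS = [
    ("Reddit / OpenAI", 60.0, 1),
    ("Reddit / Google", 60.0, 1),
    ("News Corp / OpenAI", 250.0, 5),
]

# Estimated 2025 licensing spend across major AI labs, in millions of USD.
SPEND_LOW, SPEND_HIGH = 1_500.0, 3_000.0


def annualised(total_musd: float, years: int) -> float:
    """Spread a deal's total value evenly over its term (an assumption)."""
    return total_musd / years


def disclosed_annual_total(deals) -> float:
    """Sum the annualised value of every disclosed deal."""
    return sum(annualised(total, years) for _, total, years in deals)


def share_range(annual_musd: float, low: float, high: float):
    """Return (smallest, largest) share of the estimated spend range."""
    return annual_musd / high, annual_musd / low


if __name__ == "__main__":
    total = disclosed_annual_total(DISCLOSED_DEALS)
    smallest, largest = share_range(total, SPEND_LOW, SPEND_HIGH)
    print(f"Disclosed deals: ~${total:.0f}M/yr")
    print(f"Share of estimated 2025 spend: {smallest:.1%} to {largest:.1%}")
```

Under these assumptions the three disclosed deals come to roughly $170M per year, or about 6-11% of the estimated range, which suggests that most of the estimated spend sits in deals whose values are not disclosed.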

Brand Visibility Implications

Training-data lawsuits and licensing deals are covered extensively by legal, technology, and media-industry journalism, which generates substantial inbound links. Brands offering content-licensing infrastructure, AI legal services, training-data marketplaces, content authentication, or content-licensing legal counsel have high exposure to AI-mediated discovery, because media companies, AI labs, and counsel query AI assistants for vendor recommendations in these areas.

Methodology

Cases are tracked from court dockets (PACER, UK Royal Courts of Justice), specialist trackers, plaintiff and defendant filings, and press reporting. Settlement values are often non-public; figures are estimates from press disclosures. The page is updated quarterly as cases progress.

How Presenc AI Helps

Presenc AI tracks brand-mention rates inside AI assistant queries about AI legal services, content licensing, training-data marketplaces, and AI copyright matters. For vendors operating in this space, this provides operational visibility into a discovery surface that is tightly coupled to media and legal industry attention.