Factlen Explainer · AI Copyright Law · Evidence Pack · Jun 12, 2026, 1:57 PM · 8 min read · #5 of 5 in ai

AI and Copyright in 2026: The Evidence on Fair Use, Authorship, and Liability

With the Supreme Court declining to review a ruling that denies copyright to purely AI-generated works, and the EU AI Act imposing strict transparency rules, the legal boundaries around AI outputs and training data are finally solidifying.

By Factlen Editorial Team

Rights Holders & Creators 35% · Legal & Regulatory Consensus 35% · AI Developers 30%

Rights Holders & Creators: Argue that ingesting copyrighted works for AI training without compensation is piracy and demand strict licensing frameworks.
Legal & Regulatory Consensus: Focus on enforcing human authorship requirements and shifting toward mandatory training data transparency rather than outright bans.
AI Developers: Maintain that model training is inherently transformative and qualifies as fair use, warning that strict licensing will stifle innovation.

What's not represented

· Open-source AI developers who rely on public data scraping
· Independent artists whose styles are mimicked by image generators

Why this matters

The legal consensus forming in 2026 fundamentally alters how businesses can use AI. If you use AI to generate code or marketing copy without human editing, the output is unlikely to be protected by copyright, so you may be unable to stop others from copying it. And if the model reproduces copyrighted material it memorized during training, your business can be held directly liable for infringement.

Key points

The U.S. Supreme Court declined to review a case confirming that purely AI-generated works cannot be copyrighted.
The U.S. Copyright Office warns that training AI on copyrighted material may constitute prima facie infringement.
Plaintiffs face massive evidentiary hurdles in proving that their specific works were used to train AI models.
The EU AI Act's August 2026 enforcement deadline mandates strict transparency for AI training data.
Businesses using AI tools face new liabilities if their unedited outputs infringe on existing copyrights.

$1.5B

Anthropic settlement benchmark

August 2, 2026

EU AI Act enforcement deadline

4

Factors of Fair Use evaluated

By mid-2026, the rapid deployment of generative artificial intelligence has collided forcefully with centuries-old copyright frameworks. With the U.S. Supreme Court declining to intervene in a landmark authorship case and multiple federal district courts issuing split decisions on training data, the legal boundaries of artificial intelligence are finally solidifying. This evidence pack evaluates the current legal precedents governing AI and copyright, examining the copyrightability of AI-generated outputs, the legality of ingesting copyrighted works for model training, and the shifting liability landscape for enterprise users. The outcomes of these legal battles will dictate the future of digital creation.[8]

The stakes are existential for both the technology sector and the global creative economy. If training data requires explicit licensing, the economics of frontier model development will fundamentally change, potentially locking out open-source developers. Conversely, if AI outputs cannot be protected and training is deemed universally fair use, businesses face unprecedented risks regarding intellectual property ownership, and human creators face systemic devaluation.[8]

**Claim 1: Purely AI-generated works cannot be copyrighted.** The evidence for this claim is strong and consistent across the U.S. Copyright Office and the federal courts. In March 2026, the U.S. Supreme Court denied a petition for certiorari in Thaler v. Perlmutter, letting stand a D.C. Circuit ruling that copyright protection requires human authorship. A denial of certiorari is not a ruling on the merits, but it leaves the D.C. Circuit's decision in place and, for practical purposes, settles whether an autonomous machine can be recognized as an author under current U.S. law: it cannot. For businesses relying heavily on automated content generation, the ruling means that their unedited outputs are unlikely to receive any copyright protection.[2][4][6]

Computer scientist Stephen Thaler had sought copyright protection for a visual artwork titled 'A Recent Entrance to Paradise,' which he said was generated autonomously by his AI system, the Creativity Machine. (DABUS is the system named in Thaler's separate patent litigation.) The Copyright Office and the courts rejected the application, affirming the Office's long-standing position that the 'traditional elements of authorship' must be executed by a human being, not a machine. The ruling emphasized that copyright law was designed to incentivize human creativity, and that extending those protections to algorithms would distort the purpose of intellectual property frameworks.[1][2][4]

However, the U.S. Copyright Office has introduced a critical nuance: AI-assisted works can receive protection if the human contribution is sufficiently creative and controls the expressive elements of the final work. Merely entering a detailed text prompt into a generative model does not meet this threshold, but extensive human editing, arrangement, and modification of AI outputs may qualify the human-authored portions for legal protection. This requires companies to meticulously document the human involvement in their AI-assisted workflows.[1][4][6]

The U.S. Copyright Office requires meaningful human creative control for a work to receive legal protection.

**Claim 2: Ingesting copyrighted data for AI training is not categorically protected as Fair Use.** The evidence here is strong, though its application in court remains highly context-dependent. Technology companies have long argued that training large language models is inherently transformative—akin to a human reading books to learn concepts, facts, and linguistic styles. Under this theory, the models do not store exact copies of the ingested works, but rather mathematical weights and parameters, making the training process a non-infringing fair use of the original material.[4][5]

Regulatory bodies are increasingly rejecting this blanket defense. In its May 2025 Part 3 report on generative AI, the U.S. Copyright Office concluded that using copyrighted materials for model development may constitute prima facie infringement. The agency explicitly warned that 'transformative' arguments are not inherently valid when models possess the mechanical ability to output near-perfect copies of their training data, directly competing with the original works.[1][4]

Federal courts are beginning to align with this skeptical view. In its February 2025 ruling in Thomson Reuters v. Ross Intelligence, a Delaware federal court declined to treat AI training as automatically fair use. The judge determined that the transformative nature of the AI model must be weighed against the potential market harm to the original creators, forcing developers to defend their data ingestion practices case by case rather than relying on a universal legal shield.[4]

Furthermore, courts have drawn a hard line against the use of illicitly obtained data. In a mixed ruling involving Anthropic in mid-2025, a federal judge determined that AI companies may use lawfully obtained copyrighted materials for training, but that building a permanent library from pirated sources or shadow libraries is not excused by the fair use doctrine. The ruling pushes AI companies toward licensed, permission-based data pipelines.[4][5]

**Claim 3: Proving specific infringement and market harm remains a massive evidentiary hurdle for plaintiffs.** While the theoretical framework favors rights holders, the practical reality of litigation heavily favors AI developers. The evidence for this asymmetry is robust across multiple international jurisdictions. Even when courts acknowledge that unauthorized data scraping occurred, plaintiffs consistently struggle to prove that the ingestion of their specific works directly resulted in financial damage or that the AI model serves as a direct market substitute for their creations.[5][6]

A timeline of major legal and regulatory decisions shaping AI copyright law through 2026.

In May 2026, Judge Vince Chhabria denied a motion brought by a group of authors against Meta. The plaintiffs alleged that Meta's Llama models were trained on their copyrighted books. However, the court ruled that the authors failed to present meaningful evidence demonstrating how the training of large language models directly impacted the market for their specific works, halting the momentum of the class-action suit.[5]

A similar dynamic played out in the United Kingdom. In the 2025 case Getty Images v. Stability AI, the court dismissed the secondary copyright infringement claim, highlighting the immense evidential challenges rights holders face in proving that their specific images were used within training datasets containing billions of images. The court noted that without direct access to the model's underlying architecture, proving specific infringement is nearly impossible.[6]

This evidentiary gap means that while unauthorized training might technically constitute infringement, successfully suing an AI company requires a level of forensic proof regarding the model's black-box training process that most plaintiffs simply cannot access without unprecedented legal discovery. This reality has forced creators to look toward legislative solutions rather than relying solely on the courts.[5][6][8]

**Claim 4: Global regulatory frameworks are pivoting from outright bans to mandatory transparency.** With courts struggling to process infringement claims efficiently, legislative bodies are stepping in to force disclosure. The evidence for this shift is strongest in the European Union, which has abandoned attempts to ban AI training in favor of strict auditing requirements. Regulators recognize that without transparency, copyright holders cannot even begin to enforce their rights, prompting a wave of new compliance mandates aimed at opening the black box of model development.[7][8]

The EU AI Act, which reaches its primary enforcement deadline on August 2, 2026, imposes strict transparency rules on general-purpose AI models. Providers must publish detailed summaries of the content used to train their models and must respect EU copyright law, regardless of where the model was trained or where the provider is headquartered. Failure to comply can result in massive administrative fines or the suspension of the AI system within the European market.[7]

In the United States, similar momentum is building. Bipartisan bills introduced in early 2026 seek to compel AI companies to disclose when copyrighted works are utilized in training data, aiming to give creators the visibility needed to pursue licensing agreements or targeted litigation. While a federal framework has yet to pass, the push for transparency is becoming the dominant regulatory strategy.[4]

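Courts are currently weighing the four statutory fair use factors to determine whether AI training data ingestion qualifies: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the potential market for the work.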

**The Enterprise Liability Shift.** The synthesis of these legal developments points to a significant transfer of risk from AI developers to the businesses that use their tools. Because purely AI-generated content lacks copyright protection, any business publishing unedited AI outputs has no legal recourse if a competitor copies their marketing materials, software code, or strategic documents. This creates a dangerous vulnerability for companies attempting to replace human creators entirely with automated systems, as their core assets become legally unprotected.[3]

More critically, as intellectual property lawyers now warn, if an AI-generated output infringes on an existing copyright—either because the model memorized training data or because the user prompted it to mimic a specific creator—the business utilizing the tool can be held directly liable. The legal burden is shifting from the developers who built the models to the enterprise users who deploy them, making corporate AI governance a critical necessity.[3][8]

As 2026 progresses, the legal consensus is clear: the era of unchecked data scraping is ending. AI developers must navigate a tightening web of licensing requirements and transparency mandates, while enterprise users must implement rigorous human-in-the-loop oversight to secure their intellectual property and avoid catastrophic liability. The technology has matured, and the law is finally catching up.[8]

How we got here

August 2024
The European Union's AI Act officially enters into force, beginning a staggered implementation period.
February 2025
A Delaware federal court rules in Thomson Reuters v. Ross Intelligence that AI training is not automatically protected as fair use.
May 2025
The U.S. Copyright Office releases its Part 3 report, warning that using copyrighted materials for AI training may constitute infringement.
March 2026
The U.S. Supreme Court declines to hear Thaler v. Perlmutter, cementing the requirement for human authorship in copyright.
August 2026
The EU AI Act's transparency rules and high-risk system obligations become fully enforceable.

Viewpoints in depth

Rights Holders' View

Creators and publishers view unauthorized AI training as systemic theft.

Authors, visual artists, and publishers argue that large language models are essentially commercial products built entirely on the uncompensated labor of human creators. They reject the 'fair use' defense, noting that AI systems can generate near-perfect copies of ingested works, directly competing with the original authors in the marketplace. This camp advocates for mandatory licensing regimes, where AI developers must pay royalties for the data that powers their models.

AI Developers' View

Tech companies argue that data ingestion is a transformative learning process.

The technology sector maintains that training an AI model is legally indistinguishable from a human reading a book to learn facts, styles, and concepts. They argue that models do not store copies of the training data, but rather mathematical representations of linguistic patterns. From this perspective, imposing strict copyright licensing on training data would make frontier model development prohibitively expensive, effectively ceding global AI leadership to countries with looser intellectual property laws.

Regulatory View

Regulators are prioritizing transparency and human authorship over outright bans.

Rather than attempting to dismantle existing AI models, global regulators and courts are focusing on boundaries. By firmly denying copyright protection to purely AI-generated outputs, they are preserving the commercial value of human creativity. Simultaneously, frameworks like the EU AI Act are forcing developers to open their black boxes, mandating that companies disclose what data they use so that existing copyright laws can be properly enforced.

What we don't know

Whether federal appellate courts will ultimately rule that AI training constitutes fair use across the board.
How exactly AI companies will comply with the EU AI Act's transparency mandates without exposing trade secrets.
The exact threshold of 'human involvement' required to successfully copyright an AI-assisted work.

Key terms

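Fair Use: A legal doctrine that permits the unlicensed use of copyright-protected works under certain conditions, such as for criticism, comment, news reporting, teaching, scholarship, or research. Whether AI training qualifies is the contested question.
Certiorari: The writ by which a higher court, such as the U.S. Supreme Court, agrees to review a lower court's decision. Parties request it by petition, and a denial leaves the lower court's ruling in place without deciding the merits.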
Prima Facie Infringement: A legal term meaning that, on its face, sufficient evidence exists to prove copyright infringement unless successfully rebutted by a defense like fair use.
General-Purpose AI (GPAI): Large-scale AI models, like GPT-4 or Claude, designed to perform a wide variety of tasks rather than one specific function.

Frequently asked

Can I copyright an image generated by AI?

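Not if it is purely AI-generated. The U.S. Copyright Office and the D.C. Circuit have held that such works lack the human authorship required for copyright protection, and the U.S. Supreme Court declined to review that ruling. Human-authored contributions to an AI-assisted work, such as substantial editing or arrangement, may still qualify.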

Is it legal for AI companies to train models on copyrighted books?

The law is currently unsettled. While the U.S. Copyright Office suggests it may be infringement, courts require plaintiffs to prove specific market harm, which has proven difficult.

What happens if my business uses AI to create marketing materials?

You cannot copyright those materials unless a human substantially edited them. Additionally, your business could be held liable if the AI output infringes on an existing copyright.

How does the EU AI Act affect copyright?

Starting in August 2026, the EU AI Act requires providers of general-purpose AI models to publish detailed summaries of their training data and comply with EU copyright laws.

Sources

[1] U.S. Copyright Office (Legal & Regulatory Consensus). Copyright and Artificial Intelligence: Part 3 Report.
[2] U.S. Supreme Court (Legal & Regulatory Consensus). Order List: 598 U.S. - Thaler v. Perlmutter Certiorari Denied.
[3] Forbes (Rights Holders & Creators). The Supreme Court Ruling On AI Copyright: What Businesses Need To Know.
[4] Built In (AI Developers). AI-Generated Content and Copyright Law: What to Know in 2026.
[5] Ironclad (Rights Holders & Creators). The State of AI Copyright Lawsuits in 2026.
[6] Penningtons Manches Cooper (Legal & Regulatory Consensus). AI, art and global approaches to copyright law: US Supreme Court declines to review Thaler v Perlmutter.
[7] European Commission (Legal & Regulatory Consensus). EU AI Act: Application Timeline and Transparency Rules.
[8] Factlen Editorial Team (Legal & Regulatory Consensus). Synthesis by Factlen editorial team.
