GEO Glossary

ai.txt

ai.txt is an emerging declarative file at the root of a website that signals AI crawl preferences, pricing intent, and licensing terms. This entry covers its definition, current status, comparison with robots.txt, and how to use it.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: April 23, 2026

What Is ai.txt?

ai.txt is a declarative text file placed at the root of a website (alongside robots.txt) that signals to AI crawlers the publisher's preferences for AI-specific access, pricing, and licensing terms. It is the spiritual successor to robots.txt for the AI era, designed to express things robots.txt was never built for: differentiated permissions by AI use case, pricing intent, attribution requirements, and licensing terms.

As of April 2026, ai.txt is not a single ratified standard. Several proposals are in active discussion (notably the IETF's aicontrol working group, the W3C's AI Crawler Control draft, and the Spawning.ai / Datafarm proposals). The implementations that exist in production today are mostly publisher-specific or CDN-specific. Despite the lack of a single standard, the conceptual function is widely accepted and the major AI labs have signalled willingness to honour it as it matures.

Why ai.txt Matters

robots.txt was designed for search engines in the 1990s. It expresses two things well: which paths to crawl and which to skip. It does not express anything about how the resulting content will be used, what compensation if any is expected, or what attribution is required. AI crawling created the need for all three.
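The contrast is visible in a typical robots.txt, which can express path-level access and nothing more:

```
# A typical robots.txt: path-level crawl permissions only.
# Nothing here can express how content may be used, what
# compensation is expected, or what attribution is required.
User-agent: *
Disallow: /private/
Allow: /

User-agent: GPTBot
Disallow: /
```

The only lever is binary: a bot either may or may not fetch a path.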

Without ai.txt, publishers are forced to either block AI crawlers wholesale (losing visibility) or allow them unrestricted (losing leverage and compensation). ai.txt creates the middle ground: nuanced, machine-readable preferences that AI crawlers can act on. It is the standardisation layer that turns publisher intent into something AI labs and marketplaces can programmatically respect.

What ai.txt Typically Expresses

Across the proposals in active discussion, ai.txt commonly expresses six things:

- Use-case differentiation: search-time use is treated differently from training-time use.
- Bot identity differentiation: GPTBot might be allowed where ClaudeBot is not.
- Pricing intent: the publisher's preferred rate for the kinds of access being permitted.
- Attribution requirements: how the content should be attributed if cited.
- Licensing terms: scope, duration, and exclusivity.
- Dataset opt-out: whether the publisher is opting out of specific public datasets such as Common Crawl.

Specific syntax varies across proposals, but the conceptual coverage is convergent. Publishers serious about AI monetisation should expect to maintain ai.txt alongside robots.txt as a second declarative file, with ai.txt expressing the AI-specific terms and robots.txt continuing to handle the general crawler-access policy.
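As an illustration only (the directive names below are hypothetical, since no single syntax has been ratified as of April 2026), an ai.txt covering the six areas above might look like:

```
# ai.txt — illustrative sketch; directive names are hypothetical.

User-Agent: *
Allow-Search: yes           # search-time (citation) use permitted
Allow-Training: no          # training-time use requires a licence
Pricing: 0.005 USD/crawl    # pricing intent, not a binding rate
Attribution: required; link back to canonical URL
License: non-exclusive; 12 months; renewable
Dataset-Opt-Out: CommonCrawl

User-Agent: GPTBot
Allow-Training: negotiable  # bot-specific override

User-Agent: ClaudeBot
Allow-Training: no
```

Whatever syntax ratifies, expect this shape: site-wide defaults plus per-bot overrides, mirroring the robots.txt grouping convention.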

In Practice

The minimal practical ai.txt today is a short text file at https://example.com/ai.txt expressing default permissions, pricing intent, and bot-specific overrides. Publishers using Cloudflare Pay-Per-Crawl, TollBit, ProRata, or ScalePost typically have their ai.txt automatically generated by their marketplace partner. Self-hosted publishers can author their own using one of the published templates.
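On the consuming side, a crawler (or a publisher validating its own file) needs to turn such a file into per-bot preferences. The sketch below parses the hypothetical `Key: value` format shown earlier; the directive names and grouping rule are assumptions, not a ratified grammar:

```python
def parse_ai_txt(text: str) -> dict:
    """Parse a hypothetical ai.txt into {user_agent: {directive: value}}.

    Assumes robots.txt-style grouping: `User-Agent:` lines open a
    section, and subsequent `Key: value` lines attach to it. `*` is
    the default section. Directive names follow no ratified standard.
    """
    policies: dict = {}
    current = "*"
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key.lower() == "user-agent":
            current = value
            policies.setdefault(current, {})
        else:
            policies.setdefault(current, {})[key] = value
    return policies

sample = """\
User-Agent: *
Allow-Training: no
Pricing: 0.005 USD/crawl

User-Agent: GPTBot
Allow-Training: negotiable
"""
```

A call like `parse_ai_txt(sample)["GPTBot"]["Allow-Training"]` then yields the bot-specific override, falling back to the `*` section for anything unspecified.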

The most important practical point is that ai.txt only works to the extent AI crawlers respect it. As of April 2026, the major AI labs (OpenAI, Anthropic, Google, Meta, Microsoft) have made varying public commitments to honour publisher preferences expressed in ai.txt-style files. Compliance is uneven and observability is patchy. Publishers using ai.txt should also implement the corresponding HTTP-layer enforcement (HTTP 402 responses for paid access, IP-level blocks for declined access) rather than relying solely on declarative signalling.
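The enforcement side can be sketched as a request-time decision: map the crawler's user agent to the publisher's policy and pick a status code. This is a minimal illustration, assuming a hypothetical policy table derived from ai.txt; the bot names and actions are examples, not a statement of any lab's actual behaviour:

```python
# Hypothetical per-bot policy a publisher might derive from its ai.txt.
# Actions: "allow" (serve), "paid" (402 Payment Required), "block" (403).
AI_BOT_POLICY = {
    "GPTBot": "paid",      # access only against payment in this sketch
    "ClaudeBot": "allow",  # allowed without charge in this sketch
    "CCBot": "block",      # Common Crawl opted out
}

def enforcement_status(user_agent: str) -> int:
    """Return the HTTP status code to serve for a crawler request."""
    for bot, action in AI_BOT_POLICY.items():
        if bot.lower() in user_agent.lower():
            if action == "paid":
                return 402  # Payment Required: settle via marketplace/x402
            if action == "block":
                return 403  # Forbidden: access declined
            return 200      # Allowed
    return 200  # unknown agents fall through to normal serving
```

In production this logic typically lives at the CDN or reverse proxy, where user-agent matching can be combined with IP-range verification so that spoofed agents do not inherit a bot's entitlements.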

Commonly Confused With

ai.txt is not robots.txt for AI. robots.txt continues to play its conventional role; ai.txt is additive. ai.txt is also not a licensing contract: it expresses publisher intent in machine-readable form, but the legal status of an ai.txt declaration is still contested and varies by jurisdiction. Bilateral licensing deals remain the primary legal instrument for high-stakes content access.

Frequently Asked Questions

Is ai.txt a ratified standard?
Not yet, as of April 2026. Multiple proposals are in active discussion at the IETF, the W3C, and through industry consortia. The conceptual function is widely accepted; the specific syntax is converging but not finalised. Publishers should expect to update ai.txt as the standard ratifies.

Do AI crawlers actually respect ai.txt?
Major AI labs (OpenAI, Anthropic, Google, Meta) have made varying public commitments. Compliance is uneven and observability is patchy. Publishers should pair ai.txt with HTTP-layer enforcement (HTTP 402, IP blocks) rather than relying on declarative signalling alone.

Does ai.txt collect payment for access?
It can express pricing intent, but it does not by itself collect payment. Settlement still happens through Pay-Per-Crawl marketplaces, x402 protocol responses, or bilateral arrangements. ai.txt is the signalling layer; the marketplace or protocol is the settlement layer.

Should I use ai.txt or robots.txt?
Both. robots.txt continues to handle general crawler-access policy; ai.txt expresses AI-specific differentiation that robots.txt cannot. Replacing robots.txt with ai.txt would leave conventional search crawling uncontrolled, because ai.txt does not address general crawler access.
