
How to Set Up ai.txt and Pay-Per-Crawl

A 2026 guide to setting up ai.txt and Pay-Per-Crawl monetization on your site: what to declare, which marketplace to enroll in, how to validate the setup, and what to expect operationally.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: April 30, 2026

What ai.txt and Pay-Per-Crawl Do Together

ai.txt is the declarative file that signals your AI crawl preferences (which use cases are permitted, what pricing intent applies, what attribution is required). Pay-Per-Crawl is the operational implementation that enforces those preferences through HTTP 402 responses and marketplace settlement. Setting them up together gives you a complete monetization-readiness baseline that takes about an hour for most publishers.
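The 402 mechanic described above can be sketched as a small decision function. The bot list and paid paths below are hypothetical, and a real deployment would attach marketplace payment terms to the 402 response; this only shows the gating logic:

```python
# Minimal sketch of the Pay-Per-Crawl gating decision: known AI bot
# user agents hitting paid paths get HTTP 402 (Payment Required)
# instead of content. Bot names are real; the path rules are
# illustrative assumptions.

AI_BOT_AGENTS = {"GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"}
PAID_PREFIXES = ("/premium/", "/research/")  # hypothetical paid sections

def crawl_response_status(user_agent: str, path: str) -> int:
    """Return the HTTP status a gated origin would send for this fetch."""
    is_ai_bot = any(bot in user_agent for bot in AI_BOT_AGENTS)
    is_paid = path.startswith(PAID_PREFIXES)
    if is_ai_bot and is_paid:
        return 402  # Payment Required: settle via marketplace, then retry
    return 200

print(crawl_response_status("Mozilla/5.0 (compatible; GPTBot/1.2)", "/premium/report"))  # 402
print(crawl_response_status("Mozilla/5.0", "/premium/report"))  # 200
```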

Step 1: Author Your ai.txt

Create a text file at https://example.com/ai.txt using one of the published templates. The minimum viable ai.txt expresses six things:

- Default crawl permissions: allowed by default for general public content, paid for premium.
- Bot-specific overrides: allow GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot for monetized access; consider blocking Bytespider and other low-compliance crawlers.
- Use-case differentiation: search-time vs training-time use.
- Pricing intent: the preferred per-fetch rate range.
- Attribution requirements: how the publisher should be credited when cited.
- Dataset opt-outs: whether you are excluding content from Common Crawl and similar public datasets.

The Spawning.ai and IETF aicontrol templates are the most widely supported options as of April 2026. Pick one and stay consistent. Multiple templates can be combined in a single ai.txt if needed.
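As an illustration of the six elements only, a minimal ai.txt might look like the sketch below. The directive names are assumptions, not a finalized syntax; check them against the template you actually adopt:

```
# ai.txt — illustrative sketch; directive names are assumptions,
# not a finalized standard. Verify against your chosen template.
User-Agent: *
Allow: /                      # general public content: crawl allowed
Paid: /premium/               # premium content: paid access only

User-Agent: Bytespider
Disallow: /                   # low-compliance crawler blocked

Usage: search                 # search-time use permitted
Usage-Training: paid          # training-time use requires payment
Pricing-Intent: 0.005-0.02 USD/fetch
Attribution: required; cite publisher name with link
Dataset-Opt-Out: CommonCrawl
```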

Step 2: Update robots.txt to Match

Update robots.txt to align with the permissions declared in ai.txt. The two files serve different functions but should be consistent: ai.txt is more expressive and AI-specific, while robots.txt is the longer-standing protocol that virtually all crawlers recognize. Conflicts between the two files create ambiguity that AI bots resolve in different ways. Aim for the same allow/block decisions in both files, with ai.txt carrying the additional pricing and attribution information that robots.txt cannot express.
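As a sketch, a robots.txt consistent with the Step 1 decisions might look like this. Because robots.txt has no way to say "paid", monetized bots stay allowed and the 402 layer handles payment, while the low-compliance crawler is blocked outright; the grouping below is illustrative:

```
# robots.txt — aligned with the ai.txt decisions. Bytespider is
# blocked outright; monetized AI bots are left allowed, since
# payment is enforced at the HTTP 402 layer, which robots.txt
# cannot express.
User-agent: Bytespider
Disallow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
```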

Step 3: Pick a Marketplace

Pick the marketplace that fits your publisher tier. Cloudflare Pay-Per-Crawl is the default if you are already on Cloudflare; enrollment is a feature flag in the dashboard. TollBit is the right choice for mid-market publishers with differentiated content. ProRata fits premium publishers with high-contribution inventory. ScalePost fits large mixed-content publishers prioritising operational simplicity. For most publishers, picking one to start and adding others later is the right approach; running 2-3 marketplaces is the mature target.

Step 4: Configure Pricing in the Marketplace

Set per-fetch and per-section pricing in your marketplace dashboard. For most general content, $0.005 to $0.02 per fetch is appropriate. For premium news, $0.05 to $0.20. For primary research, $0.10 to $0.50. The marketplace dashboards include recommended bands by content type; use those as anchors. Pricing too high makes AI crawlers walk away; pricing too low produces immaterial revenue. Differentiate pricing by content tier rather than applying a uniform rate.
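The tiered bands above can be encoded as a small pricing table with a clamp toward the recommended range. The tier names and the clamping rule are assumptions for illustration, not a marketplace API:

```python
# Hypothetical per-fetch pricing config using the bands from Step 4.
# Tier names and the clamp-to-band rule are illustrative assumptions.

PRICE_BANDS = {                # (low, high) in USD per fetch
    "general":  (0.005, 0.02),
    "news":     (0.05, 0.20),
    "research": (0.10, 0.50),
}

def clamp_price(tier: str, proposed: float) -> float:
    """Keep a proposed per-fetch price inside the recommended band."""
    low, high = PRICE_BANDS[tier]
    return min(max(proposed, low), high)

print(clamp_price("news", 0.50))     # 0.2  -- capped at the band ceiling
print(clamp_price("general", 0.001)) # 0.005 -- floored at the band minimum
```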

Step 5: Validate the Setup

Validate that everything works end-to-end before going live. Run three checks. First, ai.txt is fetchable at https://example.com/ai.txt and parses correctly with a published validator (Spawning.ai's validator or IETF aicontrol's tooling). Second, the marketplace integration returns 402 responses for paid paths when an AI bot user agent fetches them. Third, settlement reconciles correctly: a test transaction routes through the merchant of record and the publisher receives the expected payment minus marketplace fees.
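The parse portion of the first check can be approximated locally before running a published validator. The key/value grammar below is an assumption; real validators enforce their own template's syntax:

```python
# Rough local sanity check for ai.txt parseability: collect
# "Key: value" directives, ignoring blank lines and # comments.
# The grammar is an assumption -- published validators
# (Spawning.ai, IETF aicontrol tooling) are authoritative.

def parse_ai_txt(text: str) -> dict[str, list[str]]:
    """Return directives as {key: [values]}; raise on unparseable lines."""
    directives: dict[str, list[str]] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"unparseable line: {line!r}")
        directives.setdefault(key.strip(), []).append(value.strip())
    return directives

sample = "User-Agent: GPTBot\nAllow: /premium/  # monetized"
print(parse_ai_txt(sample))  # {'User-Agent': ['GPTBot'], 'Allow': ['/premium/']}
```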

Step 6: Monitor What Actually Happens

After going live, monitor the 402-to-paid conversion rate by AI bot identity. This is the single most important operational metric. As of April 2026, ChatGPT-User and OAI-SearchBot have meaningful 402 compliance for paid content. PerplexityBot complies for premium-tier sources. ClaudeBot, GPTBot, and Google-Extended mostly walk away. The conversion rates are improving over time but uneven across bots.

The operational implication is to set realistic expectations: most 402 responses will not be paid in 2026. The revenue comes from the meaningful subset of paid responses, not the total 402 count. Treat the metric as a leading indicator of how AI bot compliance is evolving rather than as a current-state revenue forecast.
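The per-bot conversion metric can be computed from access logs with a few lines. The log shape here, (bot identity, whether the 402 was settled), is a hypothetical simplification:

```python
# Sketch of the Step 6 metric: 402-to-paid conversion rate per AI bot
# identity. Events are (bot_user_agent, settled) pairs for responses
# that received a 402; the log format is a hypothetical assumption.
from collections import defaultdict

def conversion_by_bot(events: list[tuple[str, bool]]) -> dict[str, float]:
    """Return the fraction of 402 responses each bot actually paid."""
    served: dict[str, int] = defaultdict(int)
    paid: dict[str, int] = defaultdict(int)
    for bot, settled in events:
        served[bot] += 1
        paid[bot] += int(settled)
    return {bot: paid[bot] / served[bot] for bot in served}

log = [("OAI-SearchBot", True), ("OAI-SearchBot", False), ("GPTBot", False)]
print(conversion_by_bot(log))  # {'OAI-SearchBot': 0.5, 'GPTBot': 0.0}
```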

Step 7: Iterate Quarterly

Reconcile actual revenue against expected revenue every quarter. The most common adjustment is pricing recalibration: prices that produced no revenue indicate AI bots walked away too often (price down), while prices that produced unexpectedly low revenue per fetch indicate the AI bots that paid were the lowest-tier (price up on premium inventory). Use Citation Value Score or marketplace-recommended bands as anchors for these adjustments rather than guessing.
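The recalibration rule above can be written as a pure decision function. The 50% threshold separating "low-tier payers" from acceptable performance is an illustrative assumption, not a marketplace recommendation:

```python
# Sketch of the Step 7 quarterly pricing heuristic. The 0.5 threshold
# is an illustrative assumption; anchor real adjustments to Citation
# Value Score or marketplace-recommended bands.

def price_adjustment(revenue: float, paid_fetches: int,
                     expected_rev_per_fetch: float) -> str:
    """Suggest a direction for next quarter's per-fetch pricing."""
    if revenue == 0:
        return "price down"  # bots walked away entirely
    actual = revenue / paid_fetches
    if actual < 0.5 * expected_rev_per_fetch:
        return "price up on premium inventory"  # only low-tier bots paid
    return "hold"

print(price_adjustment(0.0, 0, 0.05))    # price down
print(price_adjustment(0.20, 10, 0.05))  # price up on premium inventory
print(price_adjustment(1.0, 10, 0.05))   # hold
```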

Common Mistakes

Inconsistent ai.txt and robots.txt. Conflicts between the two files create ambiguity. Audit both quarterly to ensure alignment.

Single-marketplace participation when content tier supports more. Each marketplace captures a different bot mix. Most upper-mid-market publishers benefit from running 2-3 marketplaces.

Setting and forgetting. AI bot compliance and pricing dynamics evolve. Treating the setup as a one-time decision produces meaningfully worse outcomes than quarterly reconciliation.

Expecting transformative revenue immediately. The 2026 numbers are real but not transformative for most publishers. The trajectory is upward; the immediate revenue is supplemental rather than dominant.

How Presenc AI Helps

Presenc AI provides ai.txt validation, marketplace performance monitoring, and Citation Value Score-driven pricing recommendations. For publishers running self-hosted x402 alongside marketplace participation, Presenc AI integrates the analytics so you have a unified dashboard rather than reconciling across multiple sources manually. The combination is the operational layer that turns a one-time setup into a managed channel.

Frequently Asked Questions

Is compliance with ai.txt legally enforced?

No. ai.txt is a voluntary declarative standard. Compliance by AI bots is increasing but not legally enforced. Combine ai.txt with HTTP-layer enforcement (Pay-Per-Crawl 402 responses, IP-level blocks for non-compliant bots) for operational reliability.

Can I use ai.txt without enrolling in a marketplace?

Yes. ai.txt expresses preferences declaratively even without a settlement layer. AI bots that respect ai.txt will adjust behaviour accordingly. The downside is that without a settlement layer, you cannot collect on the pricing intent expressed in ai.txt. Most publishers pair ai.txt with marketplace enrollment for operational completeness.

How long does setup take?

About an hour for a single-marketplace setup with ai.txt and robots.txt updates. Adding additional marketplaces takes 1-2 hours each because of the marketplace-specific dashboard configuration. Validation and monitoring setup takes another hour. Total first-time setup is typically 3-5 hours for a multi-marketplace, fully instrumented configuration.

Will AI bots actually respect my ai.txt?

Some will. As of April 2026, compliance is uneven. HTTP-layer enforcement (Pay-Per-Crawl 402 responses, IP-level blocks for non-compliant bots) is the backstop. ai.txt expresses intent; HTTP-layer mechanisms enforce it. The combination is more reliable than either alone.

Track Your AI Visibility

See how your brand appears across ChatGPT, Claude, Perplexity, and other AI platforms. Start monitoring today.