The Single Metric That Defines Platform Citation Behaviour
Crawl-to-citation efficiency is the ratio of citations of a publisher's content appearing in an AI platform's user-facing answers to the number of crawl fetches that platform performs against the same content: visible citations per crawl fetch. The ratio varies by an order of magnitude across platforms, and that spread is one of the most useful predictors of where publisher monetisation investments pay back.
Most published research treats crawl and citation as separate phenomena. Joining them produces a different lens: a platform with a high efficiency ratio converts individual crawls into citations, while a platform with a low ratio is fetching speculatively. This page reports the April 2026 efficiency rankings across major platforms, drawn from joint observation of crawl events and citation outcomes on a sample of monitored publisher domains.
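As a concrete illustration of the metric, here is a minimal TypeScript sketch of the computation, assuming per-platform tallies of crawl fetches and observed citations over the same window. The field names and sample figures are illustrative, not the monitored-domain schema.

```typescript
// Crawl-to-citation efficiency: observed citations per crawl fetch,
// computed per platform over a shared observation window.
// Field names and sample figures are illustrative only.

interface PlatformWindowCounts {
  platform: string;
  crawlFetches: number; // bot fetches logged against the publisher's content
  citations: number;    // citations of that content observed in answers
}

function crawlToCitationEfficiency(counts: PlatformWindowCounts): number {
  if (counts.crawlFetches === 0) return 0; // no fetches, nothing to report
  return counts.citations / counts.crawlFetches;
}

// Example: hypothetical counts for two platforms over the same month.
const sample: PlatformWindowCounts[] = [
  { platform: "perplexity", crawlFetches: 1_200, citations: 96 },
  { platform: "gptbot", crawlFetches: 9_500, citations: 11 },
];

for (const row of sample) {
  console.log(row.platform, crawlToCitationEfficiency(row).toFixed(4));
}
```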
Platform-Level Efficiency Ranking
| Platform / bot | Crawl-to-citation efficiency (April 2026) | Interpretation |
|---|---|---|
| PerplexityBot + Perplexity user-fetcher | Highest among major platforms | Tight crawl-to-cite loop; citation-focused product design |
| OAI-SearchBot + ChatGPT-User | Mid-to-high | Search product cites efficiently; main ChatGPT cites less explicitly |
| Google-Extended | Low to mid | Training-focused; AI Overviews does cite, but the ratio is dampened by a low display rate |
| ClaudeBot + Claude user-fetcher | Low to mid | Grounds internally without surfacing every source explicitly |
| GPTBot | Low | Training-focused; downstream citation rate hard to attribute |
| Bytespider | Very low | Indiscriminate crawl; citation tracking minimal |
Why Perplexity Leads
Perplexity's product design surfaces sources prominently in every answer. The crawl-to-citation feedback loop is therefore tight: PerplexityBot crawls in patterns that closely match query patterns, and the resulting citations show up in Perplexity's user-facing answers at a high rate. The efficiency ratio is roughly 5-10x what training-focused crawlers like GPTBot achieve. For publishers, this makes Perplexity-targeted optimisation disproportionately high-leverage relative to fetch volume.
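To make the leverage point concrete, the short sketch below compares expected visible citations at an equal fetch budget under the 5-10x gap described above. The baseline rate of one citation per 1,000 GPTBot fetches is an invented figure for illustration only.

```typescript
// Expected visible citations at an equal fetch budget, assuming a
// hypothetical GPTBot baseline of 1 citation per 1,000 fetches and a
// 5-10x Perplexity efficiency multiple. All numbers are illustrative.
const fetchBudget = 10_000;
const gptbotRate = 1 / 1_000; // assumed in-window baseline
const perplexityRates = [5, 10].map((m) => m * gptbotRate);

console.log("GPTBot:", fetchBudget * gptbotRate);                        // 10
console.log("Perplexity:", perplexityRates.map((r) => fetchBudget * r)); // [50, 100]
```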
Why GPTBot's Ratio Is Low
GPTBot is a training-data crawler. Its job is to ingest content for future model training, not to feed real-time citations. The citation outcome from a GPTBot fetch is delayed and diffuse: the content might appear in a future model's training corpus, which might surface as a citation in an answer months or years later. The attribution chain is too long to produce a clean efficiency ratio in the same observation window. The low rank above therefore understates GPTBot's eventual citation contribution: it accurately describes in-window efficiency, not the long-run value of training representation.
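One way to see why the window matters is to sketch in-window attribution, where a citation only counts toward a platform's crawling if it appears within a fixed number of days of a matching fetch. The event shapes and window length below are assumptions for illustration.

```typescript
// In-window crawl-to-citation efficiency: a citation is attributed to a
// platform's crawling only if it appears within `windowDays` of a fetch
// of the same URL. Event shapes and the window length are assumptions.

interface CrawlEvent { platform: string; url: string; fetchedAt: Date; }
interface CitationEvent { platform: string; url: string; citedAt: Date; }

function inWindowEfficiency(
  crawls: CrawlEvent[],
  citations: CitationEvent[],
  windowDays: number,
): number {
  if (crawls.length === 0) return 0;
  const windowMs = windowDays * 24 * 60 * 60 * 1000;

  // Count citations that follow a fetch of the same URL by the same
  // platform within the window. Training-driven citations that surface
  // months later fall outside the window and are not counted here.
  const attributed = citations.filter((cite) =>
    crawls.some((crawl) => {
      const lag = cite.citedAt.getTime() - crawl.fetchedAt.getTime();
      return (
        crawl.platform === cite.platform &&
        crawl.url === cite.url &&
        lag >= 0 &&
        lag <= windowMs
      );
    }),
  ).length;

  return attributed / crawls.length;
}
```

With a 30-day window, a GPTBot fetch whose content resurfaces in an answer a year later adds to the denominator but never to the numerator, which is exactly the underestimate described above.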
Why ClaudeBot's Ratio Is Below Perplexity's
Claude grounds answers internally without surfacing every source by default. When Claude does cite explicitly (in Computer Use sessions, in Claude.ai responses with explicit source requests, in API responses with citation flags enabled), the rate is meaningful. But the default behaviour shows fewer sources to users than Perplexity does, which compresses the observable citation rate per crawl.
Implications for Publisher Investment
Three concrete implications follow.

1. Prioritise PerplexityBot crawl-quality investment if the goal is short-window citation outcomes. The efficiency ratio means every PerplexityBot fetch is more likely to produce a visible citation than a comparable fetch by any other bot.
2. Keep training-time and citation-time investments conceptually separate: GPTBot fetches feed long-run model representation, which is a different value stream from real-time citation.
3. Optimise differently for Perplexity and Claude. Perplexity rewards fetchable, well-structured content that grounds answers cleanly; Claude rewards authoritative, primary-research-grounded content that wins the internal grounding decision even when it is not surfaced.
How This Connects to CVS
Crawl-to-citation efficiency is one of the inputs to the outcomes signal in Citation Value Score (CVS). Pages with high efficiency ratios across multiple platforms score higher on outcomes, which lifts the composite CVS. Pages with low efficiency ratios, even at high crawl volumes, underperform on outcomes and drag the composite down. This is why crawl volume alone is a misleading metric for publisher monetisation; the efficiency ratio is what determines whether crawling actually produces value.
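As a rough illustration only, and not the actual CVS formula, an outcomes signal could aggregate per-platform efficiency ratios under platform weights before folding into the composite. The weights, platform keys, and normalisation below are assumptions.

```typescript
// Illustrative aggregation of per-platform crawl-to-citation efficiency
// into an outcomes signal. Weights, keys, and normalisation are
// assumptions for illustration, not the actual CVS definition.

type PlatformEfficiency = Record<string, number>; // citations per fetch

const platformWeights: Record<string, number> = {
  perplexity: 0.35,
  openaiSearch: 0.3,
  google: 0.2,
  claude: 0.15,
};

function outcomesSignal(efficiency: PlatformEfficiency): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const [platform, weight] of Object.entries(platformWeights)) {
    if (platform in efficiency) {
      weighted += weight * efficiency[platform];
      totalWeight += weight;
    }
  }
  // Normalise by the weights actually observed so missing platforms
  // do not silently depress the signal.
  return totalWeight > 0 ? weighted / totalWeight : 0;
}
```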
Methodology
Data are drawn from joint observation of crawl events (via a Cloudflare Worker with D1 logging on monitored domains) and citation outcomes (via probe-based measurement across major AI platforms) on a sample of Presenc AI monitored domains. Efficiency ratios are reported as relative ranks rather than absolute numbers because absolute ratios on a single domain are not directly comparable across platforms with different display behaviours. Figures are an April 2026 point-in-time snapshot; updates are quarterly.
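For context on the crawl-event side, the sketch below shows the kind of Cloudflare Worker plus D1 logging described, in minimal form. The table name, binding name, and bot list are assumptions, not the monitoring stack's actual schema; the types come from @cloudflare/workers-types.

```typescript
// Minimal Cloudflare Worker sketch: log fetches by known AI crawlers to a
// D1 table before serving the request. Table name, binding name, and the
// bot list are illustrative assumptions.

export interface Env {
  DB: D1Database; // D1 binding configured in wrangler.toml
}

const AI_BOTS = [
  "PerplexityBot",
  "OAI-SearchBot",
  "GPTBot",
  "ClaudeBot",
  "Google-Extended",
  "Bytespider",
];

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const ua = request.headers.get("user-agent") ?? "";
    const bot = AI_BOTS.find((name) => ua.includes(name));

    if (bot) {
      // Log asynchronously so crawl logging never delays the response.
      ctx.waitUntil(
        env.DB.prepare(
          "INSERT INTO crawl_events (bot, url, fetched_at) VALUES (?, ?, ?)",
        )
          .bind(bot, new URL(request.url).pathname, new Date().toISOString())
          .run(),
      );
    }

    // Serve the page from origin as usual.
    return fetch(request);
  },
};
```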