Which AI bot crawls the most?

On presenc.ai during April 2026, OpenAI accounted for roughly half of all declared AI crawler traffic when GPTBot and OAI-SearchBot are combined. PerplexityBot was the next largest, followed by ClaudeBot and Google-Extended. The exact mix depends on the site, but the concentration of traffic among the top three operators (OpenAI, Perplexity, Anthropic) is consistent with what other publishers have reported.

Is OAI-SearchBot the same as GPTBot?

No. OpenAI runs at least three distinct crawl identities: GPTBot for scheduled training-data crawling, OAI-SearchBot for the search index that powers ChatGPT search, and ChatGPT-User for on-demand fetches triggered by user queries. Each one has different request patterns, frequencies, and purposes.

Why does ChatGPT-User matter more than GPTBot for live answers?

GPTBot gathers the content that can end up in training data, so it shapes what ChatGPT learned during training. ChatGPT-User determines what ChatGPT can fetch in real time when answering a current user query. For pages that change frequently, or for brands that want to be cited in live answers with browsing enabled, ChatGPT-User behaviour is the more direct signal.

Are these numbers representative of the whole web?

Not directly. They reflect one domain (presenc.ai) with no AI bot blocks, no CDN rate limits, and a content profile heavy on AI-related research and guides. The top-three concentration and the GPTBot vs OAI-SearchBot split are likely generalisable. Absolute volumes are not.

AI Bots Observed on Presenc AI: April 2026 Crawl Log Analysis

What This Page Is

Most published research on AI crawler behaviour relies on aggregated industry datasets, leaked logs, or vendor-disclosed numbers. This page is different. It reports on a single domain (presenc.ai) using a Cloudflare Worker that logs every inbound request to a D1 database with 41 fields per row, including IP, ASN, declared user agent, JA3 fingerprint, referer, requested path, response status, country, and timing data. The data below covers the calendar month of April 2026.

The point is not that one domain is representative of the whole web. It is that one domain with disciplined logging gives a clean, falsifiable view of how AI crawlers actually behave when nothing about the site is hidden from them. Presenc AI runs no robots.txt blocks for AI crawlers and no CDN-level rate limits, so the numbers reflect crawler behaviour against a fully open target.

Top AI Bot User Agents by Request Volume

The table below ranks declared AI-crawler user agents by total requests during April 2026. Counts are deduplicated by IP plus path within a 60-second window to avoid double-counting retries.

User agent (declared)	Operator	Request volume (relative)	Path pattern	Share of AI traffic
GPTBot	OpenAI	Highest volume of any AI crawler	Broad coverage of /research and /guides	~30 to 35%
OAI-SearchBot	OpenAI	Second-highest, growing fastest	Concentrated on hub pages and recent posts	~15 to 20%
PerplexityBot	Perplexity	Steady, high path diversity	Heavy /compare and /alternatives focus	~12 to 15%
ClaudeBot	Anthropic	Moderate, episodic bursts	Often re-fetches the same URLs	~10 to 12%
Google-Extended	Google	Lower than Googlebot, distinctly different pattern	Tracks sitemap.xml updates	~6 to 8%
Bytespider	ByteDance	Spiky, often blocked at edge by other sites	Indiscriminate path coverage	~3 to 5%
Amazonbot	Amazon (Alexa/Rufus context)	Low but consistent	Product and pricing pages	~2 to 3%
Applebot-Extended	Apple	Very low, present	Top-level hubs only	under 1%
Meta-ExternalAgent	Meta	Very low	Recent crawl additions	under 1%
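
The 60-second deduplication by IP plus path that is applied to these counts can be sketched as follows. This is a minimal illustration rather than the query actually used for this page: the field names (`ip`, `path`, `ts`) are hypothetical, and it assumes the window is measured from the last request that was counted for a given IP and path.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    ip: str      # client IP (hypothetical field name)
    path: str    # requested path
    ts: float    # request time in epoch seconds


def dedupe(hits: Iterable[Hit], window: float = 60.0) -> list[Hit]:
    """Keep a hit only if the same (ip, path) was not counted in the previous `window` seconds."""
    last_counted: dict[tuple[str, str], float] = {}
    kept: list[Hit] = []
    for hit in sorted(hits, key=lambda h: h.ts):
        key = (hit.ip, hit.path)
        previous = last_counted.get(key)
        if previous is None or hit.ts - previous >= window:
            kept.append(hit)
            last_counted[key] = hit.ts  # the window restarts from the counted hit
    return kept
```

Anchoring the window to the last counted request means a steady stream of retries is counted about once per minute. A sliding window anchored to the last seen request would collapse such a stream into a single count, so the choice affects totals for bots that retry aggressively.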

A few observations. First, OpenAI accounts for roughly half of all AI crawler traffic when GPTBot and OAI-SearchBot are combined. This understates OpenAI's real share because ChatGPT-User (the on-demand fetcher triggered by ChatGPT browsing) is excluded from the table and reported separately. Second, Google-Extended remains a small fraction of overall Google fetch traffic on this domain, with Googlebot itself responsible for the majority of Google-side crawling. Third, the long tail of declared AI bots (Apple, Meta, Cohere) is real but small. The market is concentrated at the top.

The ChatGPT-User and Browse Traffic Layer

The user-triggered browse traffic (ChatGPT-User, Claude with browsing enabled, Perplexity's on-demand fetches) is structurally different from the scheduled crawler traffic above. It comes in human-like burst patterns, often requests one specific path tied to a current user query, and rarely revisits. This layer matters disproportionately because it is the moment a brand has the chance to be cited in a live answer.

In April 2026, ChatGPT-User accounted for a meaningfully larger share of OpenAI fetches than GPTBot did during weekday business hours in the Americas and Europe. The ratio inverts overnight, when GPTBot does most of its scheduled work. This shift in mix matters for performance and serving strategy: the pages that need to be fast and complete in real time are different from the pages that need to be discoverable by scheduled crawlers.
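
One way to inspect this day and night shift in a request log is to bucket OpenAI fetches by hour and compute the ChatGPT-User share in each bucket. The sketch below is illustrative only: it assumes rows of `(epoch_seconds, agent_token)` and buckets by UTC hour, so mapping the result onto business hours in a given region would need an additional timezone conversion that is not shown.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import datetime, timezone
from typing import Iterable


def chatgpt_user_share_by_hour(rows: Iterable[tuple[float, str]]) -> dict[int, float]:
    """Map UTC hour -> ChatGPT-User / (ChatGPT-User + GPTBot) requests in that hour."""
    user: dict[int, int] = defaultdict(int)
    total: dict[int, int] = defaultdict(int)
    for ts, agent in rows:
        if agent not in ("ChatGPT-User", "GPTBot"):
            continue  # ignore every other agent
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        total[hour] += 1
        if agent == "ChatGPT-User":
            user[hour] += 1
    return {hour: user[hour] / total[hour] for hour in sorted(total)}
```

A share above 0.5 in a given hour means on-demand fetches outnumbered scheduled GPTBot requests in that hour.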

What Changed Since March 2026

Comparing April 2026 to March 2026 logs, three trends stand out. OAI-SearchBot volume grew the fastest among declared crawlers, roughly doubling its request count as OpenAI's search product rolled out to more surfaces. ClaudeBot fetches became more bursty, with several days showing 5x to 10x normal volume followed by quiet periods, consistent with batch retraining or evaluation runs. PerplexityBot fetched a higher proportion of /compare and /alternatives URLs than in March, which is what you would expect from a system increasingly used for tool-shopping queries.
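
A burst pattern like the one described for ClaudeBot can be flagged with a simple threshold over daily request counts. The sketch below assumes that "normal volume" means the median daily count for the period and uses the lower 5x bound as the default factor. Both are illustrative choices, not the definition used for this analysis.

```python
from __future__ import annotations

from statistics import median


def burst_days(daily_counts: dict[str, int], factor: float = 5.0) -> list[str]:
    """Return the days whose request count is at least `factor` times the median day."""
    if not daily_counts:
        return []
    baseline = median(daily_counts.values())
    if baseline <= 0:
        return []  # no meaningful baseline to compare against
    return [day for day, count in sorted(daily_counts.items()) if count >= factor * baseline]
```

The median is used as the baseline because a handful of very large days would pull a mean upward and hide the bursts the check is meant to find.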

Methodology

Data source: Cloudflare Worker logging every inbound request to presenc.ai into a Cloudflare D1 database. Schema captures 41 fields including request URL, method, declared user agent, IP, ASN, JA3 TLS fingerprint, referer, response status, response bytes, country, colo, and timing. AI crawler classification is based on declared user agent strings cross-referenced against published documentation from OpenAI, Anthropic, Google, Perplexity, ByteDance, Amazon, Apple, and Meta. Volumes are reported as relative shares because the absolute traffic of a single domain is not the point; the relative ranking and the trend are.
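
The classification step can be pictured as a token lookup on the declared user agent string. The sketch below is a simplified illustration, not the classifier used for this page: the token list is copied from the names mentioned on this page rather than from operator documentation, and case-insensitive substring matching is an assumption. Because it trusts the declared string, it cannot detect spoofed user agents, which would need the IP, ASN, or JA3 fields to check.

```python
from __future__ import annotations

# Token -> operator, as named on this page (not verified against operator docs).
AI_AGENTS = {
    "GPTBot": "OpenAI",
    "OAI-SearchBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "PerplexityBot": "Perplexity",
    "ClaudeBot": "Anthropic",
    "Google-Extended": "Google",
    "Bytespider": "ByteDance",
    "Amazonbot": "Amazon",
    "Applebot-Extended": "Apple",
    "Meta-ExternalAgent": "Meta",
}


def classify(user_agent: str) -> tuple[str, str] | None:
    """Return (token, operator) for a declared AI user agent, or None if no token matches."""
    lowered = user_agent.lower()
    for token, operator in AI_AGENTS.items():
        if token.lower() in lowered:
            return token, operator
    return None
```

Requests for which `classify` returns `None` fall outside the declared AI crawler counts.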

How Presenc AI Helps

The same logging stack is available as a product feature for any brand that wants this view of its own traffic. Presenc AI customers can deploy a similar Cloudflare Worker against their own zone, route logs into a managed store, and read the resulting analytics in the same dashboard as their AI visibility scores. The combination of "what AI says about you" and "which AI bots fetch you" is the full feedback loop most brands are missing.
