What This Page Is
Most published research on AI crawler behaviour relies on aggregated industry datasets, leaked logs, or vendor-disclosed numbers. This page is different. It reports on a single domain (presenc.ai), where a Cloudflare Worker writes every inbound request into a D1 database as a 41-field row that includes IP, ASN, declared user agent, JA3 fingerprint, referer, requested path, response status, country, and timing data. The data below covers the calendar month of April 2026.
The point is not that one domain is representative of the whole web. It is that one domain with disciplined logging gives a clean, falsifiable view of how AI crawlers actually behave when nothing about the site is hidden from them. Presenc AI runs no robots.txt blocks for AI crawlers and no CDN-level rate limits, so the numbers reflect crawler behaviour against a fully open target.
Top AI Bot User Agents by Request Volume
The table below ranks declared AI-crawler user agents by total requests during April 2026. Counts are deduplicated by IP plus path within a 60-second window to avoid double-counting retries.
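As a rough illustration of the deduplication rule, here is a minimal Python sketch. It assumes the 60-second window is measured from the last counted request for the same IP and path; the source does not say whether the window is sliding or fixed, so treat that choice, along with the `Hit` type and the `dedupe` name, as hypothetical.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    ts: float   # request time in seconds since the epoch
    ip: str
    path: str


def dedupe(hits: Iterable[Hit], window_s: float = 60.0) -> list[Hit]:
    """Keep a hit only if the same (ip, path) pair was not already
    counted within the previous `window_s` seconds (assumed rule)."""
    last_counted: dict[tuple[str, str], float] = {}
    kept: list[Hit] = []
    for hit in sorted(hits, key=lambda h: h.ts):
        key = (hit.ip, hit.path)
        prev = last_counted.get(key)
        if prev is None or hit.ts - prev >= window_s:
            kept.append(hit)
            last_counted[key] = hit.ts
    return kept
```

With this rule, a crawler that retries the same path three times within a few seconds contributes one counted request, while the same path fetched again a minute later counts as a new request.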
| User agent (declared) | Operator | Request volume (relative) | Path coverage | Share of AI traffic |
|---|---|---|---|---|
| GPTBot | OpenAI | Highest volume of any AI crawler | Broad coverage of /research and /guides | ~30 to 35% |
| OAI-SearchBot | OpenAI | Second-highest, growing fastest | Concentrated on hub pages and recent posts | ~15 to 20% |
| PerplexityBot | Perplexity | Steady, high path diversity | Heavy /compare and /alternatives focus | ~12 to 15% |
| ClaudeBot | Anthropic | Moderate, episodic bursts | Often re-fetches the same URLs | ~10 to 12% |
| Google-Extended | Google | Lower than Googlebot, distinctly different pattern | Tracks sitemap.xml updates | ~6 to 8% |
| Bytespider | ByteDance | Spiky, often blocked at edge by other sites | Indiscriminate path coverage | ~3 to 5% |
| Amazonbot | Amazon (Alexa/Rufus context) | Low but consistent | Product and pricing pages | ~2 to 3% |
| Applebot-Extended | Apple | Very low, present | Top-level hubs only | under 1% |
| Meta-ExternalAgent | Meta | Very low | Recent crawl additions | under 1% |
Three observations stand out. First, GPTBot and OAI-SearchBot together put OpenAI at roughly half of all declared AI crawler traffic (about 45 to 55% when the two ranges are summed). This understates OpenAI's real share, because ChatGPT-User (the on-demand fetcher triggered by ChatGPT browsing) is excluded from this declared-bot count and reported separately. Second, Google-Extended remains a small fraction of overall Google fetch traffic on this domain, with Googlebot itself responsible for the majority of Google-side crawling. Third, the long tail of declared AI bots (Apple, Meta, Cohere) is real but small. The market is concentrated at the top.
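The combined OpenAI figure can be checked by summing the share ranges per operator. The sketch below copies the ranges from the table and leaves out the two "under 1%" rows; the `SHARE_RANGES` and `operator_ranges` names are illustrative, not part of the logging stack.

```python
# Share ranges (percent of AI crawler traffic) copied from the table.
# The two "under 1%" rows are left out because they have no lower bound.
SHARE_RANGES = {
    "GPTBot": ("OpenAI", 30, 35),
    "OAI-SearchBot": ("OpenAI", 15, 20),
    "PerplexityBot": ("Perplexity", 12, 15),
    "ClaudeBot": ("Anthropic", 10, 12),
    "Google-Extended": ("Google", 6, 8),
    "Bytespider": ("ByteDance", 3, 5),
    "Amazonbot": ("Amazon", 2, 3),
}


def operator_ranges(rows):
    """Sum the low and high ends of each range per operator."""
    totals = {}
    for operator, low, high in rows.values():
        lo, hi = totals.get(operator, (0, 0))
        totals[operator] = (lo + low, hi + high)
    return totals


print(operator_ranges(SHARE_RANGES)["OpenAI"])  # (45, 55)
```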
The ChatGPT-User and Browse Traffic Layer
The user-triggered browse traffic (ChatGPT-User, Claude with browsing enabled, Perplexity's on-demand fetches) is structurally different from the scheduled crawler traffic above. It comes in human-like burst patterns, often requests one specific path tied to a current user query, and rarely revisits. This layer matters disproportionately because it is the moment a brand has the chance to be cited in a live answer.
In April 2026, ChatGPT-User accounted for a meaningfully larger share of OpenAI fetches than GPTBot did during weekday business hours in the Americas and Europe. The ratio inverts overnight, when GPTBot does most of its scheduled work. This shift in mix matters for performance and serving strategy: the pages that need to be fast and complete in real time are different from the pages that need to be discoverable by scheduled crawlers.
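One way to see this day and night inversion is to bucket OpenAI fetches by hour and compute the ChatGPT-User share in each bucket. The sketch below is an assumption about how such a view could be built, not the query behind the figures above: it works in UTC, drops weekends, and expects rows that are already labelled with an agent name. The function name is hypothetical.

```python
from collections import Counter
from datetime import timezone


def chatgpt_user_share_by_hour(rows):
    """rows: iterable of (timestamp, agent), where timestamp is a
    timezone-aware datetime and agent is an already-classified label.
    Returns {utc_hour: ChatGPT-User / (ChatGPT-User + GPTBot)} for weekdays."""
    counts = Counter()
    for ts, agent in rows:
        if agent not in ("ChatGPT-User", "GPTBot"):
            continue
        ts = ts.astimezone(timezone.utc)
        if ts.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
            continue
        counts[(ts.hour, agent)] += 1
    shares = {}
    for hour in range(24):
        user = counts[(hour, "ChatGPT-User")]
        bot = counts[(hour, "GPTBot")]
        if user + bot:  # leave out hours with no OpenAI fetches at all
            shares[hour] = user / (user + bot)
    return shares
```

A share above 0.5 in a given hour means user-triggered fetches outnumber scheduled GPTBot fetches in that hour. Mapping UTC hours to business hours in the Americas or Europe is left to the reader.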
What Changed Since March 2026
Comparing April 2026 to March 2026 logs, three trends stand out. OAI-SearchBot volume grew the fastest among declared crawlers, roughly doubling its request count as OpenAI's search product rolled out to more surfaces. ClaudeBot fetches became more bursty, with several days showing 5x to 10x normal volume followed by quiet periods, consistent with batch retraining or evaluation runs. PerplexityBot fetched a higher proportion of /compare and /alternatives URLs than in March, which is what you would expect from a system increasingly used for tool-shopping queries.
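A burst of the kind described for ClaudeBot can be flagged by comparing each day's request count with a baseline. The following sketch uses the monthly median as "normal" volume and a 5x threshold; both choices are assumptions for illustration, and the `burst_days` name is hypothetical.

```python
from statistics import median


def burst_days(daily_counts: dict[str, int], factor: float = 5.0) -> list[str]:
    """Return the days whose request count is at least `factor` times
    the median daily count (the assumed definition of normal volume)."""
    if not daily_counts:
        return []
    baseline = median(daily_counts.values())
    if baseline == 0:
        return []
    return sorted(day for day, n in daily_counts.items() if n >= factor * baseline)
```

The median is used instead of the mean so that the burst days themselves do not inflate the baseline they are compared against.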
Methodology
Data source: a Cloudflare Worker that logs every inbound request to presenc.ai into a Cloudflare D1 database. The schema captures 41 fields, including request URL, method, declared user agent, IP, ASN, JA3 TLS fingerprint, referer, response status, response bytes, country, colo, and timing. AI crawler classification is based on declared user agent strings cross-referenced against published documentation from OpenAI, Anthropic, Google, Perplexity, ByteDance, Amazon, Apple, and Meta. Volumes are reported as relative shares because the absolute traffic of a single domain is not the point; the relative ranking and the trend are.
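The declared-user-agent half of that classification can be sketched as a case-insensitive token match. The token list below contains only the agents named on this page; the `classify` function, the "crawler" and "user-triggered" labels, and the sample user agent strings are illustrative assumptions. The cross-check against each operator's published documentation is not shown.

```python
from typing import Optional

# Token -> (operator, layer). Tokens are the agent names used on this page.
AI_AGENT_TOKENS = {
    "GPTBot": ("OpenAI", "crawler"),
    "OAI-SearchBot": ("OpenAI", "crawler"),
    "ChatGPT-User": ("OpenAI", "user-triggered"),
    "PerplexityBot": ("Perplexity", "crawler"),
    "ClaudeBot": ("Anthropic", "crawler"),
    "Google-Extended": ("Google", "crawler"),
    "Bytespider": ("ByteDance", "crawler"),
    "Amazonbot": ("Amazon", "crawler"),
    "Applebot-Extended": ("Apple", "crawler"),
    "Meta-ExternalAgent": ("Meta", "crawler"),
}


def classify(user_agent: str) -> Optional[tuple[str, str, str]]:
    """Return (token, operator, layer) for the first matching token,
    or None if the declared user agent matches no known AI agent."""
    ua = user_agent.lower()
    for token, (operator, layer) in AI_AGENT_TOKENS.items():
        if token.lower() in ua:
            return token, operator, layer
    return None


# Made-up sample strings for illustration only.
print(classify("Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"))
print(classify("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/125.0"))
```

A declared user agent is trivially spoofable, so a match here only says what the client claims to be. Fields such as IP, ASN, and the JA3 fingerprint are what would let a claim be checked.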
How Presenc AI Helps
The same logging stack is available as a product feature for any brand that wants this view of its own traffic. Presenc AI customers can deploy a similar Cloudflare Worker against their own zone, route logs into a managed store, and read the resulting analytics in the same dashboard as their AI visibility scores. The combination of "what AI says about you" and "which AI bots fetch you" is the full feedback loop most brands are missing.