How Much of the Web Is AI Bots Now
Cloudflare sits in front of approximately 20 percent of all internet traffic, which makes its Radar dashboard the largest publicly visible window into bot and AI-crawler activity. This page consolidates Cloudflare Radar AI Insights data for Q1 2026 and the April 2026 monthly update, with the specific cite-bait numbers most likely to land in third-party reporting about AI training data and crawler behaviour.
Headline Numbers (April 2026)
| Metric | Value | Change |
|---|---|---|
| Bot traffic as share of all HTTP requests | 32.00% | +0.81 pp vs Q1 2026 (31.19%) |
| Training crawlers as share of AI bot traffic (Q1 2026) | 49.9% | Effectively at the 50% milestone, one quarter earlier than predicted |
| Applebot share of AI crawler traffic (Apr 2026) | 9.23% | +2 pp month-over-month (surged after Apple Intelligence push) |
| GPTBot share of AI crawler traffic (Apr 2026) | 9.84% | Second consecutive month of decline |
| Bingbot share of AI crawler traffic (Apr 2026) | 8.04% | Overtaken by Applebot in April |
AI Crawler Ranking by Request Share (Q1 2026)
| Rank | Crawler | Operator | Share of AI Bot Requests | Primary Purpose |
|---|---|---|---|---|
| 1 | Googlebot | Google | 31.6% | Search + AI training (Google-Extended subset) |
| 2 | Meta-ExternalAgent | Meta | 16.7% | Training (Llama family) |
| 3 | GPTBot | OpenAI | 12.0% | Training |
| 4 | ClaudeBot | Anthropic | 11.7% | Training |
| 5 | Bingbot | Microsoft | ~8% | Search + AI training |
| 6 | Applebot | Apple | 5.8% | Search + Apple Intelligence training |
Googlebot is included in the "AI bot" tally because the content it fetches increasingly feeds Google's Gemini training and AI Overviews. In the "AI crawler-only" subset (excluding the dual-purpose Googlebot and Bingbot traffic), Meta-ExternalAgent would rank first.
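The re-ranking can be checked with a few lines of Python. This is a minimal sketch, not Cloudflare's method: the shares are copied from the table above, Bingbot's "~8%" is entered as 8.0, and the names `Q1_SHARES`, `DUAL_PURPOSE`, and `rank_ai_only` are hypothetical. Only Googlebot and Bingbot are excluded, following the text, even though the table also lists a search purpose for Applebot.

```python
# Q1 2026 shares of AI bot requests, copied from the ranking table.
Q1_SHARES = {
    "Googlebot": 31.6,
    "Meta-ExternalAgent": 16.7,
    "GPTBot": 12.0,
    "ClaudeBot": 11.7,
    "Bingbot": 8.0,  # assumption: the table gives "~8%"
    "Applebot": 5.8,
}

# Crawlers dropped from the "AI crawler-only" view, as named in the text.
DUAL_PURPOSE = {"Googlebot", "Bingbot"}


def rank_ai_only(shares, excluded=DUAL_PURPOSE):
    """Return (crawler, share) pairs, highest share first, minus excluded crawlers."""
    remaining = {name: share for name, share in shares.items() if name not in excluded}
    return sorted(remaining.items(), key=lambda item: item[1], reverse=True)


ai_only_ranking = rank_ai_only(Q1_SHARES)
for rank, (name, share) in enumerate(ai_only_ranking, start=1):
    print(f"{rank}. {name}: {share}% of AI bot requests")
```

The shares are left as percentages of all AI bot requests rather than renormalised, because the table does not list every crawler in the tally.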
Six Things the Data Tells You
- Bots are now nearly one third of all HTTP traffic. 32 percent in April 2026, up from 31.19 percent in Q1. The bot share of traffic is the most-quoted number in CDN-industry reporting and continues to drift upward each quarter.
- Half of all AI bot traffic is now dedicated training crawlers. 49.9 percent in Q1 2026, effectively reaching the milestone predicted for Q2 a quarter early. Training crawlers (GPTBot, ClaudeBot, Meta-ExternalAgent, CCBot, etc.) now equal the volume of search and other dual-purpose AI-related crawlers. If your site only has robots.txt rules for search crawlers, you are addressing roughly half the relevant traffic.
- Applebot is the breakout AI crawler of 2026. 5.8 percent share in Q1 2026, then a +124 percent single-month surge that lifted it to 9.23 percent by April. Apple Intelligence's training and indexing push is now visible at the CDN level. Applebot overtook Bingbot during April.
- GPTBot share is declining. 12.0 percent in Q1 falling to 9.84 percent in April, the second consecutive month of decline. The explanation is partly that other AI crawlers grew faster (Applebot, Meta-ExternalAgent) and partly that OpenAI shifted some fetch volume to the separate ChatGPT-User user-agent, which serves live retrieval rather than training.
- ClaudeBot is now the second-most-disallowed AI user-agent in robots.txt. ClaudeBot's share of Disallow rules rose from 9.6 percent in January 2026 to 10.1 percent in March, overtaking CCBot as the second-most-blocked AI agent by raw count. The publisher backlash against Anthropic is the most legible bot-blocking trend in the data.
- The "AI training share of web traffic" math now works. Bot traffic (32 percent) × AI bot share of bot traffic × training share of AI bot traffic (49.9 percent) ≈ 4-5 percent of all HTTP requests, which is the portion now made up of AI training fetches. For high-traffic publishers, this is the structural cost of being on the open web in 2026.
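The last bullet multiplies three factors but only states two of them. A rough sketch of the arithmetic follows, assuming the 32 percent and 49.9 percent figures quoted above; the function names are hypothetical, and the AI bot share of bot traffic is backed out from the 4-5 percent result rather than taken from Cloudflare data.

```python
BOT_SHARE_OF_REQUESTS = 0.32       # April 2026 bot share of all HTTP requests
TRAINING_SHARE_OF_AI_BOTS = 0.499  # Q1 2026 training share of AI bot traffic


def training_share_of_all_requests(bot_share, ai_share_of_bots, training_share):
    """Product of the three factors named in the bullet."""
    return bot_share * ai_share_of_bots * training_share


def implied_ai_share_of_bots(target, bot_share, training_share):
    """Solve the same product for the one factor the page does not state."""
    return target / (bot_share * training_share)


# Back out the AI bot share of bot traffic implied by the 4-5 percent claim.
implied_low = implied_ai_share_of_bots(0.04, BOT_SHARE_OF_REQUESTS, TRAINING_SHARE_OF_AI_BOTS)
implied_high = implied_ai_share_of_bots(0.05, BOT_SHARE_OF_REQUESTS, TRAINING_SHARE_OF_AI_BOTS)

print(f"Implied AI bot share of bot traffic: {implied_low:.1%} to {implied_high:.1%}")
```

Under these inputs, the 4-5 percent estimate only holds if AI bots account for roughly 25 to 31 percent of all bot traffic, so that middle factor is the one to verify against the Radar data before quoting the result.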
What This Means for AI Visibility
The crawler ranking indicates which vendors' training corpora a site is likely to feed. A brand that ClaudeBot crawls frequently is more likely to appear in Claude responses two to four model generations from now; a brand visible only to Googlebot will appear in Gemini and Google AI Overviews but be weaker on Anthropic-hosted assistants. The structural implication is that brand-visibility programmes should monitor, at minimum, GPTBot (OpenAI), ClaudeBot (Anthropic), Meta-ExternalAgent (Meta), and Applebot (Apple Intelligence). Sites blocking any of these are excluding themselves from that vendor's downstream model recommendations, sometimes inadvertently via aggressive WAF rules that block bots wholesale.
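One way to spot an unintended robots.txt block is to test the four user-agents named above against a site's rules. The sketch below uses Python's standard `urllib.robotparser`; the sample robots.txt, the `example.com` URL, and the `crawler_access` helper are illustrative assumptions, and the check covers robots.txt only, not WAF or CDN rules.

```python
from urllib.robotparser import RobotFileParser

# The four crawlers the paragraph above suggests monitoring.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Meta-ExternalAgent", "Applebot"]

# Hypothetical robots.txt content, used here only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""


def crawler_access(robots_txt: str, url: str, agents=AI_CRAWLERS) -> dict[str, bool]:
    """Map each user-agent to whether the given robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


access = crawler_access(SAMPLE_ROBOTS_TXT, "https://example.com/products/")
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this sample file, GPTBot is blocked from the product page while the other three are allowed. A wholesale WAF block would not show up here, so server logs remain the check for that case.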
Methodology
Source: Cloudflare Radar AI Insights (Q1 2026 quarterly report) and the Cloudflare blog April 2026 monthly update. Numbers reflect Cloudflare's ~20 percent share of internet traffic and apply only to sites behind Cloudflare; absolute global numbers may differ. "AI bot" classification is Cloudflare's own and includes both dedicated AI training crawlers and dual-purpose search-plus-AI crawlers like Googlebot and Bingbot. This page is refreshed monthly with each Cloudflare Radar update.
How Presenc AI Helps
Presenc AI tracks brand-mention rates across the major AI platforms whose training crawlers are ranked above. The connection from crawler share to AI visibility outcomes is multi-step: crawler reach → training corpus inclusion → brand recall in the deployed model → consumer-facing mention rate. Presenc AI closes the last step (mention rate) so that brand teams can determine whether their training-data presence (the first step, partly inferable from crawler logs) is translating into the consumer-visible recommendation behaviour that drives commercial outcomes.