arXiv AI Paper Volume 2020-2025

How AI research publication volume on arXiv changed from 2020 to 2025. Annual submission counts by category (cs.AI, cs.CL, cs.LG, with a 2020 reference point for cs.CV), showing the 6x growth in cs.AI submissions and the post-ChatGPT NLP surge.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 2026

How Many AI Papers Are Being Published, and How Fast Is the Number Growing?

arXiv is the de facto preprint server for AI and machine-learning research. Submission counts to its core AI categories are the closest available proxy for the volume of new AI research the world is producing each year. This page tabulates yearly submission counts for the three primary AI categories (cs.AI, cs.CL, cs.LG) from 2020 through 2025, the most recent complete calendar year. The growth curve is steeper than most observers realise.

Yearly Submission Volume by Category, 2020-2025

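| Year | cs.AI | cs.CL (NLP) | cs.LG (ML) | Three-Category Total |
| --- | --- | --- | --- | --- |
| 2020 | 7,417 | 7,125 | 25,889 | 40,431 |
| 2021 | 12,523 | 8,083 | 26,532 | 47,138 |
| 2022 | 14,806 | 8,971 | 28,721 | 52,498 |
| 2023 | 21,856 | 13,601 | 33,021 | 68,478 |
| 2024 | 33,036 | 20,689 | 39,786 | 93,511 |
| 2025 | 45,133 | 23,753 | 46,005 | 114,891 |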

Growth Multiples (2025 vs 2020)

| Category | 2020 Submissions | 2025 Submissions | 5-Year Multiple |
| --- | --- | --- | --- |
| cs.AI (artificial intelligence) | 7,417 | 45,133 | 6.1x |
| cs.CL (computation and language / NLP) | 7,125 | 23,753 | 3.3x |
| cs.LG (machine learning) | 25,889 | 46,005 | 1.8x |
| Combined | 40,431 | 114,891 | 2.8x |
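
The multiples in this table, and the compound annual growth rates quoted elsewhere on this page, follow from the 2020 and 2025 endpoints alone. A minimal Python sketch of that arithmetic is shown here. The function names are illustrative and not part of any published tooling, and the combined row is assumed to be the plain sum of the three categories.

```python
# Endpoint counts copied from the tables on this page.
counts_2020 = {"cs.AI": 7_417, "cs.CL": 7_125, "cs.LG": 25_889}
counts_2025 = {"cs.AI": 45_133, "cs.CL": 23_753, "cs.LG": 46_005}
YEARS = 5  # 2020 -> 2025 spans five annual steps


def growth_multiple(start: int, end: int) -> float:
    """Ratio of the final count to the initial count."""
    return end / start


def cagr(start: int, end: int, years: int = YEARS) -> float:
    """Compound annual growth rate as a fraction (0.12 means 12 percent)."""
    return (end / start) ** (1 / years) - 1


def summarize() -> dict:
    """Map each row name to (multiple, CAGR), including a combined row."""
    rows = {
        cat: (
            growth_multiple(counts_2020[cat], counts_2025[cat]),
            cagr(counts_2020[cat], counts_2025[cat]),
        )
        for cat in counts_2020
    }
    # Assumption: the combined row is the sum of the three categories.
    start_total = sum(counts_2020.values())
    end_total = sum(counts_2025.values())
    rows["Combined"] = (
        growth_multiple(start_total, end_total),
        cagr(start_total, end_total),
    )
    return rows


for name, (multiple, rate) in summarize().items():
    print(f"{name:9s} {multiple:.1f}x  CAGR {rate:.1%}")
```

Raising the multiple to the power 1/5 is what turns a five-year ratio into an equivalent constant yearly rate, which is why a 6.1x multiple corresponds to a rate in the low 40s rather than 6.1 divided by 5.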

Daily Submission Rates (2025 Average)

| Category | Papers per Day |
| --- | --- |
| cs.AI | ~124 |
| cs.CL | ~65 |
| cs.LG | ~126 |
| Three-category total | ~315 |

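An analyst reading every cs.AI + cs.CL + cs.LG submission would need to clear about 315 new papers per day, every day, throughout 2025. Two caveats pull in opposite directions. A paper cross-listed in more than one of the three categories is counted once in each (see Methodology), so the number of unique papers is somewhat lower than 315. Adjacent categories such as cs.CV (vision) are not included, so the broader AI literature is larger still.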
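
The daily rates are the 2025 annual counts divided by the number of days in the year. A short Python sketch of that division, assuming a 365-day year and taking the annual counts as given from the tables on this page (the names are illustrative):

```python
DAYS_IN_2025 = 365  # 2025 is not a leap year

# 2025 annual submission counts from the yearly table.
counts_2025 = {"cs.AI": 45_133, "cs.CL": 23_753, "cs.LG": 46_005}


def papers_per_day(annual_count: int, days: int = DAYS_IN_2025) -> float:
    """Average daily submissions implied by an annual count."""
    return annual_count / days


daily = {cat: round(papers_per_day(n)) for cat, n in counts_2025.items()}
daily["Three-category total"] = round(papers_per_day(sum(counts_2025.values())))

for name, rate in daily.items():
    print(f"{name}: ~{rate} per day")
```

These are calendar-day averages. Actual arXiv announcements cluster on weekdays, so the number of papers appearing on any given announcement day would be higher than the average.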

Six Things the Data Tells You

  1. Core AI submissions (cs.AI) grew 6.1x in five years. From 7,417 in 2020 to 45,133 in 2025. The compound annual growth rate is approximately 43 percent. No other research area in any field has matched this pace over the same window.
  2. The post-ChatGPT NLP surge is visible in the data. cs.CL grew modestly from 2020 to 2022 (7,125 to 8,971, a 26 percent total increase). Then cs.CL grew 52 percent in 2023 alone (to 13,601) and another 52 percent in 2024 (to 20,689). The ChatGPT-era influx of researchers into NLP shows up cleanly in arXiv data.
  3. cs.LG (machine learning) was already huge in 2020. Its 25,889 submissions exceeded cs.AI and cs.CL combined (14,542). The "AI boom" added far more cs.AI papers than cs.LG papers: cs.LG gained about 20,100 submissions over the period against about 37,700 for cs.AI, and grew at a much steadier 12 percent CAGR.
  4. 2024 was the steepest year on the curve. Year-over-year growth across the three categories: 2021/2020 +17 percent, 2022/2021 +11 percent, 2023/2022 +30 percent, 2024/2023 +37 percent, 2025/2024 +23 percent. Growth first accelerated in 2023 and peaked in 2024, consistent with the ChatGPT-era researcher influx fully reaching arXiv submission cadence, before easing in 2025.
  5. cs.AI pulled away from cs.CL in 2021 and never looked back. The two were nearly level in 2020 (7,417 vs 7,125); by 2025, cs.AI submissions are nearly 2x cs.CL submissions (45,133 vs 23,753). The shift reflects work being filed under cs.AI as the umbrella category for agents, reasoning, and multimodal systems that do not fit neatly under language or vision.
  6. The literature is now uncoverable. At 315 new papers per day in three categories alone, a researcher who reads three papers every single day still covers less than 1 percent of new submissions. Survey papers, paper-of-the-week newsletters, and AI-assisted literature search tools are no longer optional; they are the only way to maintain awareness.
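
The year-over-year percentages in the list above can be reproduced from the yearly table. The following Python sketch computes them for the three-category total and for cs.CL; the helper names are illustrative, and rounding to whole percentages is an assumption about how the quoted figures were derived.

```python
# Yearly counts copied from the submission-volume table.
three_cat_total = {
    2020: 40_431, 2021: 47_138, 2022: 52_498,
    2023: 68_478, 2024: 93_511, 2025: 114_891,
}
cs_cl = {
    2020: 7_125, 2021: 8_083, 2022: 8_971,
    2023: 13_601, 2024: 20_689, 2025: 23_753,
}


def yoy_growth(series: dict) -> dict:
    """Fractional growth of each year over the previous year."""
    years = sorted(series)
    return {
        curr: series[curr] / series[prev] - 1
        for prev, curr in zip(years, years[1:])
    }


def as_percent(growth: dict) -> dict:
    """Round fractional growth to whole percentages."""
    return {year: round(rate * 100) for year, rate in growth.items()}


total_pct = as_percent(yoy_growth(three_cat_total))
cl_pct = as_percent(yoy_growth(cs_cl))

print(total_pct)  # {2021: 17, 2022: 11, 2023: 30, 2024: 37, 2025: 23}
print(cl_pct)     # {2021: 13, 2022: 11, 2023: 52, 2024: 52, 2025: 15}
```

Each value is keyed by the later year of the pair, so the entry for 2024 is the growth from 2023 to 2024.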

What This Means for AI Visibility

Two implications matter for brand-visibility programmes. First, training-data exposure decays in relative terms even when absolute brand-content production is constant: if a brand publishes 12 white papers per year, its share of the total annual AI literature (measured against the three-category totals above) dropped roughly 65 percent from 2020 to 2025. Brands that held content production flat have a structurally smaller share of AI training corpora than they did five years ago. Second, the cs.AI explosion is heavily oriented toward agents, tool use, and brand-relevant evaluation work; brand-visibility teams should expect the next two years of AI research to produce many more papers directly about how AI assistants recommend, compare, and reason about brands. Tracking the cs.AI literature for brand-relevant methodology should now be a documented part of any sophisticated GEO (generative engine optimization) strategy.
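
The roughly 65 percent figure can be checked with a few lines of arithmetic. This Python sketch assumes, as the paragraph does, a flat output of 12 papers per year and uses the three-category totals as the denominator; the variable names are illustrative.

```python
BRAND_PAPERS_PER_YEAR = 12  # flat output, as in the example above

# Three-category totals from the yearly table.
total_2020 = 40_431
total_2025 = 114_891


def share(brand_papers: int, total_papers: int) -> float:
    """Brand output as a fraction of total annual submissions."""
    return brand_papers / total_papers


share_2020 = share(BRAND_PAPERS_PER_YEAR, total_2020)
share_2025 = share(BRAND_PAPERS_PER_YEAR, total_2025)

# Relative decline in share; the brand's own count cancels out,
# leaving 1 - total_2020 / total_2025.
relative_decline = 1 - share_2025 / share_2020

print(f"Share in 2020: {share_2020:.4%}")
print(f"Share in 2025: {share_2025:.4%}")
print(f"Relative decline: {relative_decline:.0%}")  # 65%
```

Because the brand's output cancels, the same relative decline applies to any publisher whose volume stayed flat, whether that is 12 papers a year or 1,200.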

Methodology

Submission counts were pulled from the arXiv public API on May 14, 2026. Query format: cat:<category> AND submittedDate:[YYYY01010000 TO YYYY12312359]. A category query matches every paper listed under that category, whether as its primary category or as a cross-list, so the same paper appears once per category it is listed under and the three-category total double-counts papers cross-listed between cs.AI, cs.CL, and cs.LG. cs.CV (computer vision) was also queried, but the API consistently returned zero for years after 2020 within our collection window, so it is excluded; for context, the 2020 cs.CV figure was 15,340 submissions. 2026 year-to-date is excluded because partial-year comparisons distort the trend. The page is refreshed annually after each year-end snapshot.
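
For readers who want to repeat the collection, the query format above can be issued with the Python standard library. The sketch below is not the script used for this page, which is not published here. The endpoint URL and the opensearch:totalResults element are taken from arXiv's public API documentation, and the function names are hypothetical.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint and namespace per arXiv's public API documentation.
API_URL = "http://export.arxiv.org/api/query"
OPENSEARCH_NS = "{http://a9.com/-/spec/opensearch/1.1/}"


def build_query(category: str, year: int) -> str:
    """Search string in the format given in the Methodology."""
    return f"cat:{category} AND submittedDate:[{year}01010000 TO {year}12312359]"


def build_url(category: str, year: int) -> str:
    """Full request URL; one result is enough to get feed-level metadata."""
    params = {
        "search_query": build_query(category, year),
        "start": 0,
        "max_results": 1,
    }
    return f"{API_URL}?{urllib.parse.urlencode(params)}"


def parse_total_results(atom_xml: bytes) -> int:
    """Read the total match count from an Atom response body."""
    root = ET.fromstring(atom_xml)
    node = root.find(f"{OPENSEARCH_NS}totalResults")
    if node is None or node.text is None:
        raise ValueError("feed has no opensearch:totalResults element")
    return int(node.text)


def fetch_count(category: str, year: int, timeout: float = 30.0) -> int:
    """Issue one request and return the reported total for that year."""
    with urllib.request.urlopen(build_url(category, year), timeout=timeout) as resp:
        return parse_total_results(resp.read())
```

Calling fetch_count("cs.AI", 2020) issues a single request. arXiv's API documentation asks clients to pause a few seconds between requests, so a loop over categories and years should sleep between calls. Counts returned today may differ from the figures on this page, since the snapshot here dates from May 14, 2026.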

How Presenc AI Helps

Presenc AI tracks how brand mentions appear inside AI assistant responses, which is the consumer-facing layer downstream of the research literature this page measures. As cs.AI publication volume continues to scale, more of the methods that determine which brand surfaces in an AI answer (retrieval, ranking, citation, agent tool-selection) are originating in research published on arXiv. For sophisticated brand-visibility teams, watching arXiv for papers about brand-comparison methodologies in LLMs is now part of staying ahead of the next cycle of platform changes.

Frequently Asked Questions

How many AI papers were submitted to arXiv in 2025?
Combined across the three core AI categories (cs.AI, cs.CL, cs.LG), 114,891 submissions in 2025. Individually: cs.AI 45,133, cs.CL 23,753, cs.LG 46,005. The combined total is approximately 2.8x the 2020 figure and works out to about 315 new papers per day.

Which arXiv AI category is growing fastest?
cs.AI (artificial intelligence) is growing fastest, at 6.1x from 2020 to 2025 and a CAGR near 43 percent. cs.CL (NLP) grew 3.3x with a sharp post-ChatGPT acceleration in 2023-2024. cs.LG was already large in 2020 and grew slowest among the three, at 1.8x.

Did ChatGPT's launch change AI paper volume?
Yes, particularly in cs.CL. NLP submissions grew 11 to 13 percent per year through 2022, then jumped 52 percent in 2023 and another 52 percent in 2024. Submissions more than doubled in the two years after ChatGPT's launch, the most legible event-driven shift in the data on this page.

Why did cs.LG grow more slowly than cs.AI?
cs.LG (machine learning) was already the largest AI-adjacent category in 2020 with 25,889 submissions, a high baseline that limits multiplicative growth. Most of the post-2022 influx of new researchers filed under cs.AI as the umbrella for agent, reasoning, and multimodal work that did not cleanly fit cs.LG or cs.CL. cs.LG remained the largest of the three in 2025, but its lead over cs.AI narrowed from about 18,500 submissions in 2020 to under 900.

Can a researcher keep up with all new AI papers?
Not in any of the three categories on this page. At 124 papers per day in cs.AI alone, no individual reads everything. Survey papers, weekly summary newsletters, and AI-assisted literature search are now the de facto reading layer. Researchers focus on subareas, watch citation graphs, and rely on filtered feeds rather than comprehensive review.
