How Many AI Papers Are Being Published, and How Fast Is the Number Growing?
arXiv is the de facto preprint server for AI and machine-learning research. Submission counts to its core AI categories are the closest available proxy for the volume of new AI research the world is producing each year. This page tabulates yearly submission counts for the three primary AI categories (cs.AI, cs.CL, cs.LG) from 2020 through 2025, the most recent complete calendar year. The growth curve is steep: the combined count rose 2.8x over the period, from 40,431 to 114,891 submissions.
Yearly Submission Volume by Category, 2020-2025
| Year | cs.AI | cs.CL (NLP) | cs.LG (ML) | Three-Cat Total |
|---|---|---|---|---|
| 2020 | 7,417 | 7,125 | 25,889 | 40,431 |
| 2021 | 12,523 | 8,083 | 26,532 | 47,138 |
| 2022 | 14,806 | 8,971 | 28,721 | 52,498 |
| 2023 | 21,856 | 13,601 | 33,021 | 68,478 |
| 2024 | 33,036 | 20,689 | 39,786 | 93,511 |
| 2025 | 45,133 | 23,753 | 46,005 | 114,891 |
Growth Multiples (2025 vs 2020)
| Category | 2020 Submissions | 2025 Submissions | 5-Year Multiple |
|---|---|---|---|
| cs.AI (artificial intelligence) | 7,417 | 45,133 | 6.1x |
| cs.CL (computation and language / NLP) | 7,125 | 23,753 | 3.3x |
| cs.LG (machine learning) | 25,889 | 46,005 | 1.8x |
| Combined | 40,431 | 114,891 | 2.8x |
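The multiples in this table are simple ratios of the 2025 count to the 2020 count. A minimal sketch of that arithmetic, using the figures from the table; the function and variable names are illustrative and not part of the original data pipeline:

```python
# Yearly counts copied from the tables on this page.
counts_2020 = {"cs.AI": 7417, "cs.CL": 7125, "cs.LG": 25889}
counts_2025 = {"cs.AI": 45133, "cs.CL": 23753, "cs.LG": 46005}


def growth_multiple(start: int, end: int) -> float:
    """Return how many times larger the end value is than the start value."""
    return end / start


# Per-category multiples, plus the combined three-category multiple.
multiples = {
    cat: growth_multiple(counts_2020[cat], counts_2025[cat]) for cat in counts_2020
}
multiples["Combined"] = growth_multiple(
    sum(counts_2020.values()), sum(counts_2025.values())
)

for cat, m in multiples.items():
    print(f"{cat}: {m:.1f}x")
```

Rounded to one decimal place, the results match the 5-Year Multiple column.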
Daily Submission Rates (2025 Average)
| Category | Papers per Day |
|---|---|
| cs.AI | ~124 |
| cs.CL | ~65 |
| cs.LG | ~126 |
| Three-category total | ~315 |
An analyst reading every cs.AI + cs.CL + cs.LG submission would need to clear roughly 315 new listings per day, every day, throughout 2025. As an estimate of total AI output the figure cuts both ways: it overstates the number of unique papers in these three categories, because a cross-listed paper is counted once in each category it appears under (see Methodology), and it omits adjacent categories like cs.CV (vision) entirely.
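The daily rates are the 2025 yearly counts divided by the 365 days of that year. A short sketch of the calculation, with illustrative names that are not taken from the original analysis:

```python
DAYS_IN_2025 = 365  # 2025 is not a leap year

# 2025 counts copied from the yearly table on this page.
counts_2025 = {"cs.AI": 45133, "cs.CL": 23753, "cs.LG": 46005}

# Average submissions per calendar day, per category and in total.
per_day = {cat: n / DAYS_IN_2025 for cat, n in counts_2025.items()}
total_per_day = sum(counts_2025.values()) / DAYS_IN_2025

for cat, rate in per_day.items():
    print(f"{cat}: ~{rate:.0f} per day")
print(f"Three-category total: ~{total_per_day:.0f} per day")
```

These are calendar-day averages; the sketch makes no attempt to model weekday or seasonal variation in submissions.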
Six Things the Data Tells You
- Core AI submissions (cs.AI) grew 6.1x in five years, from 7,417 in 2020 to 45,133 in 2025. The compound annual growth rate is approximately 43 percent, by far the fastest of the three categories tracked here.
- The post-ChatGPT NLP surge is visible in the data. cs.CL grew modestly from 2020 to 2022 (7,125 to 8,971, a 26 percent total increase). Then cs.CL grew 52 percent in 2023 alone (to 13,601) and another 52 percent in 2024 (to 20,689). The ChatGPT-era influx of researchers into NLP shows up cleanly in arXiv data.
- cs.LG (machine learning) was already huge in 2020. 25,889 submissions, larger than cs.AI plus cs.CL combined. The "AI boom" shows up mainly as growth in cs.AI rather than cs.LG: cs.LG's annual volume still rose by about 20,000 papers over the period, but at a much steadier 12 percent CAGR.
- The curve bent upward in 2023 and growth peaked in 2024. Year-over-year growth across the three categories: 2021/2020 +17 percent, 2022/2021 +11 percent, 2023/2022 +30 percent, 2024/2023 +37 percent, 2025/2024 +23 percent. The sharpest acceleration came in 2023, when growth jumped from 11 to 30 percent; 2024 then posted the highest growth rate of the period, consistent with the ChatGPT-era researcher influx fully reaching arXiv submission cadence, before growth eased in 2025.
- cs.AI pulled away from cs.CL from 2021 onward. The two categories were nearly level in 2020 (7,417 vs 7,125, with cs.AI already slightly ahead); by 2025, cs.AI submissions are nearly 2x cs.CL submissions. The shift likely reflects work being filed under cs.AI as the umbrella category for agents, reasoning, and multimodal systems that do not fit neatly under language or vision.
- The literature is now uncoverable. At 315 new papers per day in three categories alone, a researcher who reads three papers every day still covers less than 1 percent of new submissions. Survey papers, paper-of-the-week summary newsletters, and AI-assisted literature search tools are no longer optional; they are the only practical way to maintain awareness.
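The CAGR and year-over-year figures quoted in the list above can be reproduced from the yearly table. The sketch below uses the standard compound-growth formula, (end / start) ** (1 / years) - 1, and also reports which year had the highest growth and which had the largest jump in growth rate. All names are illustrative assumptions, not part of the original analysis:

```python
# Three-category totals copied from the yearly table on this page.
totals = {2020: 40431, 2021: 47138, 2022: 52498, 2023: 68478, 2024: 93511, 2025: 114891}


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.12 means 12 percent)."""
    return (end / start) ** (1 / years) - 1


def yoy_growth(series: dict[int, int]) -> dict[int, float]:
    """Map each year (except the first) to its growth over the prior year."""
    years = sorted(series)
    return {cur: series[cur] / series[prev] - 1 for prev, cur in zip(years, years[1:])}


cs_ai_cagr = cagr(7417, 45133, 5)   # cs.AI, 2020 to 2025
cs_lg_cagr = cagr(25889, 46005, 5)  # cs.LG, 2020 to 2025
yoy = yoy_growth(totals)

# Change in the growth rate versus the previous year (acceleration).
growth_years = sorted(yoy)
acceleration = {
    cur: yoy[cur] - yoy[prev] for prev, cur in zip(growth_years, growth_years[1:])
}

peak_growth_year = max(yoy, key=yoy.get)
peak_acceleration_year = max(acceleration, key=acceleration.get)

print(f"cs.AI CAGR: {cs_ai_cagr:.1%}, cs.LG CAGR: {cs_lg_cagr:.1%}")
for year, g in yoy.items():
    print(f"{year}: {g:+.0%}")
print("highest growth:", peak_growth_year, "| largest acceleration:", peak_acceleration_year)
```

Separating the growth rate from its change makes it easy to distinguish the year with the fastest growth from the year in which growth sped up the most.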
What This Means for AI Visibility
Two implications matter for brand-visibility programmes. First, training data exposure decays in relative terms even when absolute brand-content production is constant: a brand publishing 12 white papers per year saw its share of the annual three-category total fall by roughly 65 percent from 2020 to 2025 (12 in 40,431 versus 12 in 114,891). By this proxy, brands that maintained content production at flat volume have a structurally smaller share of AI training corpora than they did five years ago. Second, the cs.AI explosion is heavily oriented toward agents, tool-use, and brand-relevant evaluation work; brand-visibility teams should expect the next two years of AI research to produce many more papers directly about how AI assistants recommend, compare, and reason about brands. Tracking the cs.AI literature for brand-relevant methodology should now be a documented part of any sophisticated GEO strategy.
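The dilution figure in the first implication follows from holding a publisher's output constant while the total grows. A minimal sketch, assuming the three-category totals stand in for "total annual AI literature"; the function name is hypothetical:

```python
def relative_share_change(own_output: int, total_start: int, total_end: int) -> float:
    """Relative change in share of the total when own output stays flat.

    Returns a fraction: -0.5 means the share was cut in half.
    """
    share_start = own_output / total_start
    share_end = own_output / total_end
    return share_end / share_start - 1


# 12 white papers per year against the 2020 and 2025 three-category totals.
change = relative_share_change(12, 40431, 114891)
print(f"Relative change in share: {change:.1%}")
```

Because the publisher's own output cancels out of the ratio, the result depends only on how much the total grew: any flat-volume publisher sees the same relative decline.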
Methodology
Submission counts were pulled from the arXiv public API on May 14, 2026. Query format: cat:<category> AND submittedDate:[YYYY01010000 TO YYYY12312359]. Counts include both primary and cross-listed submissions: a paper is counted once in every category it is listed under, so the three-category total counts a paper more than once when it is cross-listed between cs.AI / cs.CL / cs.LG. cs.CV (computer vision) was attempted, but the API consistently returned zero for years after 2020 within our collection window, so it is excluded; for context, the 2020 cs.CV figure was 15,340 submissions. 2026 YTD is excluded because partial-year comparisons distort the trend. Refreshed annually after each year-end snapshot.
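The query format above can be turned into a request along the following lines. This is a sketch under stated assumptions, not the script used to collect the data: the endpoint URL, the `search_query` / `start` / `max_results` parameters, and the `opensearch:totalResults` element reflect the arXiv API as commonly documented and should be checked against the current API manual. The function names are hypothetical.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint and OpenSearch namespace; verify against the arXiv API manual.
ARXIV_API = "http://export.arxiv.org/api/query"
OPENSEARCH_NS = "{http://a9.com/-/spec/opensearch/1.1/}"


def build_query(category: str, year: int) -> str:
    """Build the search string in the format given in the Methodology text."""
    return f"cat:{category} AND submittedDate:[{year}01010000 TO {year}12312359]"


def build_url(category: str, year: int) -> str:
    """Return a request URL; one result is enough because only the total is read."""
    params = {"search_query": build_query(category, year), "start": 0, "max_results": 1}
    return ARXIV_API + "?" + urllib.parse.urlencode(params)


def parse_total(atom_xml) -> int:
    """Extract the total hit count from an Atom response (str or bytes)."""
    root = ET.fromstring(atom_xml)
    node = root.find(f"{OPENSEARCH_NS}totalResults")
    if node is None or node.text is None:
        raise ValueError("no totalResults element in response")
    return int(node.text)


def fetch_count(category: str, year: int) -> int:
    """Perform the network request and return the reported total."""
    with urllib.request.urlopen(build_url(category, year), timeout=30) as resp:
        return parse_total(resp.read())


# Example (performs a network request, so it is left commented out):
# print(fetch_count("cs.AI", 2020))
```

A script like this should pause between requests and treat an unexpected zero as a possible API failure rather than a real count, given the cs.CV behaviour described above.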
How Presenc AI Helps
Presenc AI tracks how brand mentions appear inside AI assistant responses, which is the consumer-facing layer downstream of the research literature this page measures. As cs.AI publication volume continues to scale, more of the methods that determine which brand surfaces in an AI answer (retrieval, ranking, citation, agent tool-selection) are originating in research published on arXiv. For sophisticated brand-visibility teams, watching arXiv for papers about brand-comparison methodologies in LLMs is now part of staying ahead of the next cycle of platform changes.