Open Source vs Closed AI Market Share 2026

Open source AI share grew from 1% to 15% in 12 months: DeepSeek V4 at $0.14/M tokens, Qwen 3.5, Llama 4, Kimi K2.6. The closing performance gap and what it means for AI economics.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 2026

Open-weight AI models captured approximately 15 percent of inference market share by January 2026, up from about 1 percent twelve months earlier. DeepSeek V4 is priced at $0.14 per million input tokens, roughly 1/20th of GPT-5’s comparable tier. Qwen 3.5, Llama 4, Kimi K2.6, and GLM-5.1 all reached or approached frontier capability on multiple benchmarks. Eighty-nine percent of enterprises now use at least one open-source model in production. This page consolidates the market share data, pricing comparisons, capability benchmarks, and the strategic implications.

Key Findings

  1. Combined open-weight AI inference market share (DeepSeek, Qwen, Llama, Kimi, GLM, Mistral open releases, others) grew from approximately 1 percent in January 2025 to approximately 15 percent by January 2026.
  2. DeepSeek V4 publishes input token pricing at $0.14 per million tokens, compared to approximately $2.50 to $3.00 per million tokens for GPT-5 standard tier and $3.00 per million for Claude 4.7 Opus.
  3. Approximately 89 percent of enterprises now use at least one open-source AI model in production, up from approximately 32 percent in early 2024 per cross-industry survey data.
  4. The performance gap on standard benchmarks (MMLU, GSM8K, HumanEval) effectively closed in 2026: the leading open-weight model is within 5 percentage points of the leading closed model on most benchmark categories.
  5. Reasoning benchmarks show a persistent closed-model lead: GPT-5.5 and Claude 4.7 Opus retain a lead of roughly 15 to 30 points over the strongest open-weight models on ARC-AGI-2, FrontierMath, and Humanity’s Last Exam.

Inference Market Share by Provider Family (January 2026)

Provider Family | Share of Inference Tokens | YoY Change
OpenAI (GPT-4o, GPT-5, GPT-5.5) | ~33% | Down from ~45%
Anthropic (Claude 3.5, 4.x, 4.7) | ~22% | Up from ~14%
Google (Gemini 1.5, 2.x, 3.x) | ~17% | Up from ~13%
xAI (Grok 2, 3, 4) | ~4% | Up from ~2%
DeepSeek (V3, V4) | ~6% | Up from ~0.5%
Qwen (3, 3.5) | ~5% | Up from ~0.4%
Llama (3.x, 4.x) | ~3% | Up from ~1%
Kimi / Moonshot | ~1% | Up from minimal
GLM (Zhipu) | ~1% | Up from minimal
Mistral open releases | ~0.5% | Up modestly
Other open weight + niche closed | ~7.5% | Various

Pricing Comparison (USD Per Million Tokens, May 2026)

Model | Input Price | Output Price
GPT-5.5 | $3.00 (standard) | $15.00
Claude 4.7 Opus | $3.00 | $15.00
Gemini 3.1 Pro | $1.25 / $2.50 (depending on tier) | $5.00 / $10.00
GPT-5.5 mini | $0.15 | $0.60
Claude Haiku 4.5 | $0.20 | $0.80
Gemini 3.1 Flash | $0.075 | $0.30
DeepSeek V4 (provider price) | $0.14 | $0.28 (off-peak), $0.56 (peak)
Qwen 3.5 (Alibaba Cloud) | $0.20 | $0.50
Llama 4 Maverick (Together AI) | $0.30 | $0.60
Llama 4 (self-hosted, 1k req/day baseline) | ~$0.05 effective | ~$0.10 effective
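
The headline price ratio uses input tokens only, while a real bill also depends on output volume and time of day. The sketch below shows how the list prices above could be turned into a workload cost and a cost ratio. The function names and the example workload of 100 million input and 20 million output tokens are assumptions chosen for illustration, and caching discounts are not modeled.

```python
# USD per million tokens as (input, output), copied from the pricing table.
PRICES = {
    "GPT-5.5": (3.00, 15.00),
    "DeepSeek V4 off-peak": (0.14, 0.28),
    "DeepSeek V4 peak": (0.14, 0.56),
}


def workload_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Return the list-price cost in USD for a workload given in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_price * input_millions + output_price * output_millions


def cost_ratio(baseline: str, alternative: str,
               input_millions: float, output_millions: float) -> float:
    """Return how many times more the baseline costs than the alternative."""
    return (workload_cost(baseline, input_millions, output_millions)
            / workload_cost(alternative, input_millions, output_millions))


if __name__ == "__main__":
    # Hypothetical workload: 100M input tokens and 20M output tokens.
    for alternative in ("DeepSeek V4 off-peak", "DeepSeek V4 peak"):
        ratio = cost_ratio("GPT-5.5", alternative, 100, 20)
        print(f"GPT-5.5 vs {alternative}: {ratio:.1f}x")
```

Under these assumptions the ratio is about 21x on input tokens alone, about 31x off-peak, and about 24x at peak once output tokens are included. The exact multiple therefore depends on the input-to-output mix and on when the workload runs.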

Capability Benchmark Comparison

Benchmark | Best Closed | Best Open | Gap
MMLU | 92% (GPT-5.5) | 90% (DeepSeek V4, Qwen 3.5 Max) | ~2%
GSM8K | 97% (multiple at saturation) | 96% (DeepSeek V4 Reasoning) | ~1%
HumanEval | 97% (Claude 4.7) | 93% (Qwen 3.5) | ~4%
ARC-AGI-2 | 85% (GPT-5.5) | 69% (DeepSeek V4 Reasoning) | ~16%
FrontierMath | 53% (GPT-5.5 with tools) | ~22% (best open) | ~31%
Humanity’s Last Exam | ~38% (GPT-5.5) | ~14% (best open) | ~24%
SWE-Bench Verified | 82% (Claude 4.7 + Code) | ~58% (Qwen 3.5 + tooling) | ~24%
Chatbot Arena (LMSYS Elo) | ~1400 (GPT-5.5) | ~1340 (DeepSeek V4) | ~60 Elo
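
The gap column is the best closed score minus the best open score, in percentage points. The following minimal sketch recomputes those gaps from the table and splits the benchmarks at the 5-point parity threshold used in the key findings. The function name and the decision to leave out the Elo row (which is not a percentage) are choices made for this illustration.

```python
# (best closed, best open) scores in percent, copied from the benchmark table.
SCORES = {
    "MMLU": (92, 90),
    "GSM8K": (97, 96),
    "HumanEval": (97, 93),
    "ARC-AGI-2": (85, 69),
    "FrontierMath": (53, 22),
    "Humanity's Last Exam": (38, 14),
    "SWE-Bench Verified": (82, 58),
}

# "Within 5 percentage points" is the parity threshold stated in the key findings.
PARITY_THRESHOLD = 5


def split_by_gap(scores, threshold=PARITY_THRESHOLD):
    """Return (near_parity, closed_lead) dicts mapping benchmark name to gap in points."""
    near_parity, closed_lead = {}, {}
    for name, (closed, open_score) in scores.items():
        points = closed - open_score
        if points <= threshold:
            near_parity[name] = points
        else:
            closed_lead[name] = points
    return near_parity, closed_lead


if __name__ == "__main__":
    near_parity, closed_lead = split_by_gap(SCORES)
    print("Near parity:", near_parity)
    print("Closed-model lead:", closed_lead)
```

With the table's figures, MMLU, GSM8K, and HumanEval fall in the near-parity group, while ARC-AGI-2, FrontierMath, Humanity's Last Exam, and SWE-Bench Verified show gaps of 16 to 31 points.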

Enterprise Open-Source Adoption

Use Case | Share Using Open Model
Internal knowledge assistants and RAG | ~78%
Code completion and review | ~62%
Customer-facing chatbots | ~38%
Sensitive-data workflows (legal, finance, healthcare) | ~71%
Edge or on-device inference | ~94%
Cost-sensitive high-volume inference | ~85%
Reasoning-heavy production workloads | ~28%

Strategic Context

Three structural patterns define the 2026 open versus closed dynamic. First, the capability plateau on standard benchmarks: open-weight models are at or near closed-model parity on most non-reasoning benchmarks, removing a key historical justification for premium closed-model pricing in routine workloads. Second, the reasoning gap persists: closed models retain a 15 to 30 percentage point lead on the hardest reasoning benchmarks, justifying premium pricing for high-judgment workloads. Third, the deployment-mode bifurcation: enterprises increasingly run a tiered stack with open-weight models for high-volume routine inference and premium closed-model APIs for reasoning-heavy or sensitive workflows.
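
The tiered stack in the third pattern can be pictured as a simple routing rule. This is a minimal sketch of that idea, not a description of any specific enterprise system: the `Request` fields, the `choose_tier` function, and the tier labels are hypothetical names introduced for illustration, and a production router would likely also weigh latency, cost budgets, and compliance requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A hypothetical inference request with the two flags the rule needs."""
    task: str
    reasoning_heavy: bool = False
    sensitive: bool = False


def choose_tier(request: Request) -> str:
    """Route per the pattern in the text: closed-model API for reasoning-heavy
    or sensitive work, open-weight model for routine high-volume inference."""
    if request.reasoning_heavy or request.sensitive:
        return "closed-api"
    return "open-weight"


if __name__ == "__main__":
    examples = [
        Request("summarize support ticket"),
        Request("multi-step proof check", reasoning_heavy=True),
        Request("contract review", sensitive=True),
    ]
    for request in examples:
        print(f"{request.task} -> {choose_tier(request)}")
```

Keeping the rule in one small function makes the policy easy to audit and to change as the capability gap or pricing shifts.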

Brand Visibility Implications

The open versus closed AI debate is one of the highest-traffic categories in enterprise AI journalism. Queries to AI assistants about model selection, open-source AI economics, GPU self-hosting, inference cost, and adjacent topics drive sustained procurement-research traffic. Brands selling inference infrastructure, model serving, fine-tuning services, RAG tooling, and adjacent products therefore have a large AI-mediated discovery surface in this category.

Methodology

Market share figures are aggregated from OpenRouter public usage data, Artificial Analysis benchmarks, Together AI and Anyscale inference platform disclosures, and provider API revenue estimates. Enterprise adoption figures are drawn from cross-industry survey data through Q1 2026. The page is updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on open vs closed AI queries across ChatGPT, Claude, Gemini, and Perplexity. For inference infrastructure providers, model serving platforms, and fine-tuning service brands, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.

Frequently Asked Questions

What share of AI inference do open-weight models hold in 2026?
Approximately 15 percent of inference token volume in January 2026, up from approximately 1 percent twelve months earlier. The growth was driven by DeepSeek V3 and V4, Qwen 3 and 3.5, Llama 4, Kimi K2.6, and GLM-5.1 releases that reached or approached frontier capability on multiple benchmarks.

Is DeepSeek V4 really about 20 times cheaper than GPT-5?
Yes for input tokens at provider list pricing. DeepSeek V4 is $0.14 per million input tokens versus approximately $3.00 for GPT-5 standard tier. The effective comparison depends on workload, off-peak pricing, and use of caching; the headline price ratio is real but production economics vary.

Have open-weight models closed the performance gap with closed models?
Partially. On routine benchmarks (MMLU, GSM8K, HumanEval) the gap is approximately 1 to 5 percentage points and effectively closed for production purposes. On reasoning benchmarks (ARC-AGI-2, FrontierMath, Humanity’s Last Exam, SWE-Bench Verified) closed models retain a 15 to 30 percentage point lead.

How many enterprises use open-source AI models in production?
Approximately 89 percent of enterprises use at least one open-source AI model in production as of May 2026, up from approximately 32 percent in early 2024. The highest open-model penetration is in edge / on-device inference (~94 percent), high-volume cost-sensitive workloads (~85 percent), and sensitive-data workflows (~71 percent).

Do enterprises standardize on only open or only closed models?
Most do not. The dominant pattern is a tiered stack: open-weight for high-volume routine inference, closed-model APIs for reasoning-heavy or sensitive workflows. The economics support both being present in most enterprise AI architectures.
