LLM Context Window Comparison June 2026

Current spec snapshot of every major frontier LLM context window in June 2026. Gemini 3.2 Pro at 2M, Llama 4.5 Scout at 10M, DeepSeek V4.1 and Qwen 3.7 at 1M.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: June 2026

This page snapshots the current context-window specification for every major frontier LLM as of June 2026, ordered from largest to smallest maximum advertised context size.

Context Window Snapshot

Model | Vendor | Max Context | Type
Llama 4.5 Scout | Meta | 10,000,000 tokens | Open-weight
Gemini 3.2 Pro | Google | 2,000,000 tokens | Closed
Gemini 3.2 Flash | Google | 2,000,000 tokens | Closed
Llama 4.5 Maverick | Meta | 1,000,000 tokens | Open-weight
Claude Opus 4.7 (1M variant) | Anthropic | 1,000,000 tokens | Closed
Claude Sonnet 4.6 (1M variant) | Anthropic | 1,000,000 tokens | Closed
DeepSeek V4.1 Flash | DeepSeek | 1,000,000 tokens | Open + closed
DeepSeek V4.1 Pro | DeepSeek | 1,000,000 tokens | Closed
Qwen 3.7 (flagship) | Alibaba | 1,000,000 tokens | Open + closed
Hunyuan Large 3 | Tencent | 512,000 tokens | Closed + partial open
GPT-5.6 / Pro | OpenAI | 256,000 tokens | Closed
Mistral Medium 3 | Mistral AI | 256,000 tokens | Closed + self-host
ERNIE 5.1 | Baidu | 256,000 tokens | Closed
Doubao Pro (June 2026) | ByteDance | 256,000 tokens | Closed
GLM-6 | Zhipu AI | 256,000 tokens | Open
Claude Opus 4.7 (standard) | Anthropic | 200,000 tokens | Closed
Claude Sonnet 4.6 (standard) | Anthropic | 200,000 tokens | Closed
Claude Mythos 5 | Anthropic | 200,000 tokens | Closed
Claude Haiku 4.5 | Anthropic | 200,000 tokens | Closed

Key Takeaways

  • Llama 4.5 Scout at 10M tokens has the largest advertised window, five times the 2M of the next-largest models (Gemini 3.2 Pro and Flash).
  • Gemini 3.2 Pro at 2M leads among closed models, with retrieval quality refined in the 3.2 release.
  • 1M context is now the open-weight frontier baseline (DeepSeek V4.1, Qwen 3.7, Llama 4.5 Maverick).
  • 256K is the modal context size for the Chinese consumer-anchored frontier (ERNIE, Doubao, GLM); Hunyuan Large 3 is the exception at 512K.
  • Maximum advertised context does not equal usable context; retrieval quality at the upper end varies materially by model.
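The size comparisons in these takeaways can be checked directly against the snapshot table. The following is a minimal sketch in Python, assuming the table values are copied into a dictionary; the names MAX_CONTEXT and CHINESE_GROUP are illustrative and do not come from any vendor API.

```python
from collections import Counter

# Maximum advertised context in tokens, copied from the snapshot table.
MAX_CONTEXT = {
    "Llama 4.5 Scout": 10_000_000,
    "Gemini 3.2 Pro": 2_000_000,
    "Gemini 3.2 Flash": 2_000_000,
    "Llama 4.5 Maverick": 1_000_000,
    "Claude Opus 4.7 (1M variant)": 1_000_000,
    "Claude Sonnet 4.6 (1M variant)": 1_000_000,
    "DeepSeek V4.1 Flash": 1_000_000,
    "DeepSeek V4.1 Pro": 1_000_000,
    "Qwen 3.7 (flagship)": 1_000_000,
    "Hunyuan Large 3": 512_000,
    "GPT-5.6 / Pro": 256_000,
    "Mistral Medium 3": 256_000,
    "ERNIE 5.1": 256_000,
    "Doubao Pro (June 2026)": 256_000,
    "GLM-6": 256_000,
    "Claude Opus 4.7 (standard)": 200_000,
    "Claude Sonnet 4.6 (standard)": 200_000,
    "Claude Mythos 5": 200_000,
    "Claude Haiku 4.5": 200_000,
}

# Hypothetical grouping used only for this illustration.
CHINESE_GROUP = ["ERNIE 5.1", "Doubao Pro (June 2026)", "GLM-6", "Hunyuan Large 3"]

# Rank models from largest to smallest advertised window.
ranked = sorted(MAX_CONTEXT.items(), key=lambda item: item[1], reverse=True)
largest_name, largest_size = ranked[0]

# The next-largest distinct window size below the leader.
runner_up_size = next(size for _, size in ranked if size < largest_size)
lead_ratio = largest_size / runner_up_size

# How many models share each advertised window size.
size_counts = Counter(MAX_CONTEXT.values())

# Most common window size within the illustrative group.
group_mode = Counter(MAX_CONTEXT[name] for name in CHINESE_GROUP).most_common(1)[0][0]

print(f"{largest_name} leads the next-largest window by {lead_ratio:.0f}x")
print(f"Models at 1M tokens: {size_counts[1_000_000]}")
print(f"Modal size in group: {group_mode:,} tokens")
```

On the table as given, the lead works out to 5x, and the modal size in the four-model group is 256,000 tokens because three of the four sit at that size.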

Methodology

Specs are taken from vendor disclosures as of June 2026. Maximum context is the advertised ceiling; effective context (the length beyond which retrieval quality degrades meaningfully) is typically 50 to 80% of that maximum. Updated monthly.
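As a rough illustration of the 50 to 80% rule of thumb, the sketch below converts an advertised maximum into an estimated effective range. The function name effective_context_range and its parameters are hypothetical, and the percentages are this page's typical range, not a measured value for any particular model.

```python
def effective_context_range(max_tokens: int, low_pct: int = 50, high_pct: int = 80) -> tuple[int, int]:
    """Estimate the effective context range (in tokens) from an advertised maximum.

    Uses integer percentages so the arithmetic stays exact.
    The 50 and 80 defaults are a heuristic, not a per-model benchmark result.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 < low_pct <= high_pct <= 100:
        raise ValueError("percentages must satisfy 0 < low_pct <= high_pct <= 100")
    return max_tokens * low_pct // 100, max_tokens * high_pct // 100


# A 2,000,000-token advertised window under the default heuristic.
low, high = effective_context_range(2_000_000)
print(f"Estimated effective context: {low:,} to {high:,} tokens")
```

For a 2M window this gives an estimate of 1,000,000 to 1,600,000 tokens. Measured results from long-context evaluations should take precedence over this estimate where they are available.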

How Presenc AI Helps

Presenc AI tracks how long-context capability shapes brand-visibility behavior. Long-context models reward authoritative long-form content and penalize thin marketing pages at the synthesis step.

Frequently Asked Questions

Llama 4.5 Scout from Meta has the largest advertised context window at 10,000,000 tokens, five times the next-largest.
Among closed models, Gemini 3.2 Pro and Flash from Google have the largest advertised window at 2,000,000 tokens.
Advertised maximum context is not the same as usable context. Effective context (the length beyond which retrieval quality degrades meaningfully) is typically 50 to 80% of maximum. Long-context evaluations like RULER and LongBench measure usable context separately from advertised maximum.
Models do not retrieve consistently well across the full advertised window. Retrieval quality at the upper end varies materially by model. Gemini 3.2 specifically addressed the documented long-context retrieval degradation in 3.1 at the 2M ceiling.
