Q2 2026 is the quarter open-weight LLMs stopped being the budget alternative and became genuine frontier competitors. Six labs shipped models in April alone that match closed-frontier performance on the workloads enterprises actually run. Meta confirmed plans to keep Llama 5 open-weight at frontier capability. The cost crossover and the capability crossover are happening simultaneously.
Our existing open-source LLM landscape covers the broad market. This page is a Q2-specific delta: what changed in the last 90 days, and what brand teams should do differently as a result.
The Q2 2026 Open-Weight Wave
| Model | Lab | License | Active params / Total | Headline capability |
|---|---|---|---|---|
| Gemma 4 31B Dense | Google | Apache 2.0 | 31B / 31B | Beats 20x-larger models on MMLU and GPQA |
| Qwen 3.6-27B Dense | Alibaba | Apache 2.0 | 27B / 27B | Fits in 18GB RAM; SWE-bench Verified 77.2% |
| Kimi K2.6 | Moonshot AI | MIT-mod | 32B / 1T MoE | BrowseComp 83.2%, $0.95/M input tokens |
| GLM-5.1 | Zhipu AI | MIT | 40B / 744B MoE | SWE-bench Pro tops Claude Opus 4.6 and GPT-5.4 |
| Llama 4 Scout / Maverick | Meta | Llama community | Scout: dense; Maverick: 400B MoE | 10M context (Scout); frontier-tier (Maverick) |
| DeepSeek V4 Flash | DeepSeek | DeepSeek-permissive | Frontier MoE | 1M context, frontier coding parity |
| Nemotron 3 Nano Omni | NVIDIA | NVIDIA-permissive | Multimodal | 9x more efficient agentic inference |
Two patterns matter for brand teams. First, the licenses are converging on Apache 2.0 or near-equivalents (Kimi MIT-mod, GLM MIT, NVIDIA-permissive). The commercial-use friction that historically slowed Llama enterprise adoption is gone. Second, the capability spread between the strongest open-weight models and the strongest closed-weight models on real workloads is narrower than at any point in LLM history.
The Brand-Visibility Implication: Surface Expansion You Cannot Monitor
Closed-weight model visibility is monitorable because the deployment surface is finite. ChatGPT, Claude.ai, Gemini, Microsoft Copilot, Perplexity, and a known set of API-based wrappers cover almost all production exposure. The open-weight visibility surface expands into places you cannot directly observe:
- IDE and developer tools. Cursor, Continue, Aider, Cline, Codeium, and dozens of lesser-known tools route to whichever open-weight model has the best price-performance for the task. DeepSeek V4 Flash and Qwen 3.6-27B are now defaults in many.
- Self-hosted enterprise RAG. Banks, healthcare systems, defense contractors, and other regulated organizations that cannot send data to OpenAI now run Gemma 4 or Llama 4 internally. Your brand visibility inside their procurement workflows depends on what made it into Gemma 4's and Llama 4's training corpora.
- Consumer apps and browser extensions. A growing class of consumer products embed Qwen 3.6-27B or Kimi K2.6 directly via Hugging Face transformers or llama.cpp. Most do not advertise their model choice. Your brand can be misrepresented in places you cannot easily audit.
- On-device deployment. Gemma 4 2B and 9B variants, Qwen 3.6 small variants, and quantized DeepSeek V4 Flash run on consumer hardware. The on-device deployment surface inside Apple Intelligence, Pixel AI, and Galaxy AI ecosystems is now meaningfully open-weight.
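The routing behavior in the first bullet is worth making concrete. The sketch below is a minimal, hypothetical price-performance router: given a catalog of models with per-token cost and a task benchmark score, it picks the cheapest model that clears a quality bar. Only Kimi K2.6's $0.95/M price and 83.2 BrowseComp score come from the table above; the other figures are placeholders, and the model names are illustrative IDs, not real API identifiers.

```python
from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str
    cost_per_m_input: float  # USD per million input tokens
    score: float             # task benchmark score, 0-100

def cheapest_adequate(options, min_score):
    """Return the cheapest model whose task score clears min_score, or None."""
    adequate = [o for o in options if o.score >= min_score]
    if not adequate:
        return None
    return min(adequate, key=lambda o: o.cost_per_m_input)

# Kimi's price/score are from the table above; the rest are assumed figures.
CATALOG = [
    ModelOption("kimi-k2.6", 0.95, 83.2),
    ModelOption("deepseek-v4-flash", 0.50, 80.0),   # assumed
    ModelOption("qwen-3.6-27b-local", 0.10, 77.2),  # assumed
]

choice = cheapest_adequate(CATALOG, min_score=79.0)
print(choice.name)  # deepseek-v4-flash under these assumed numbers
```

Real routers in IDE tools refresh pricing and eval scores continuously, but the decision rule is this simple, which is why the cheapest adequate open-weight model wins the default slot.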
What Open-Weight Training Corpora Reward
Training corpora for open-weight models are documented in ways closed labs do not match. The published training mixes for Llama 4, Qwen 3, Kimi K2, GLM-5, and Gemma 4 share five overweighted source categories: GitHub code commons, Common Crawl with quality filtering, Wikipedia in 25+ languages, peer-reviewed scientific papers (arXiv, PubMed), and open-license textbooks.
Brands strong on those sources earn disproportionate open-weight recall. Brands relying on press coverage and marketing-domain content (which appears in closed-lab training but is filtered more aggressively in open-weight mixes) underperform.
The actionable list:
- Audit your GitHub presence. Public repos with strong README files, technical documentation, and example code show up in code-commons training disproportionately. For technical brands this is the highest-leverage open-weight optimization.
- Strengthen your Wikipedia presence. Wikipedia is consistently overweighted across open-weight training mixes. Wikipedia effect research applies more strongly to open-weight than to closed-weight models.
- Get into peer-reviewed or arXiv-indexed work where credible. Open-weight training mixes overweight scientific corpora. Brands that contribute primary research, white papers, or technical reports to these channels earn outsized recall.
- Diversify off marketing-domain content. Pages that read like marketing get filtered more aggressively in open-weight quality filtering than in closed-weight equivalents.
- Monitor on at least two open-weight models in your AI visibility program. DeepSeek V4 Flash and Llama 4 Maverick or Gemma 4 31B Dense are reasonable picks for global English-speaking markets. Add Qwen 3.6 or Kimi K2.6 for Asia-facing brands.
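The last bullet can be sketched with stdlib Python alone. The snippet below queries self-hosted open-weight models through an OpenAI-compatible endpoint (the format exposed by llama.cpp's server and vLLM) and counts whole-word brand mentions in the responses. The endpoint URLs, model IDs, and prompt are hypothetical placeholders; point them at your own deployments.

```python
import json
import re
import urllib.request

def ask_model(base_url, model, prompt):
    """Query an OpenAI-compatible chat endpoint (e.g. llama.cpp server, vLLM)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def brand_mentions(text, brand):
    """Count whole-word, case-insensitive mentions of the brand."""
    return len(re.findall(rf"\b{re.escape(brand)}\b", text, re.IGNORECASE))

# Hypothetical local deployments of the two recommended model families.
MODELS = [
    ("http://localhost:8080", "deepseek-v4-flash"),
    ("http://localhost:8081", "llama-4-maverick"),
]

# Usage (requires running endpoints):
# for base_url, model in MODELS:
#     answer = ask_model(base_url, model, "What vendors lead in your category?")
#     print(model, brand_mentions(answer, "YourBrand"))
```

Running the same category prompts against two or more open-weight models on a schedule gives you a recall baseline you can trend quarter over quarter, which is the minimum viable version of the monitoring the bullet recommends.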
The Geographic Asymmetry
Open-weight crossover is not uniform across markets. Chinese-market AI assistants increasingly default to Qwen, DeepSeek, and Kimi. Western enterprise deployments split between Llama and Gemma. UAE and MENA deployments split between Falcon and Llama. Korean ecosystems lean on HyperCLOVA alongside Western open-weight models. Our Falcon and Jais research covers the Arabic-language case; our Chinese open-source LLM comparison covers the China case.