The LMSYS Chatbot Arena Elo leaderboard is the most widely cited blind A/B evaluation of frontier LLMs. This page snapshots the public leaderboard as of June 2026 across the headline categories.
Arena Hard Top 15 (June 2026)
| Rank | Model | Vendor | Elo |
|---|---|---|---|
| 1 | GPT-5.6 Pro | OpenAI | ~1465 |
| 2 | Claude Mythos 5 | Anthropic | ~1458 |
| 3 | Claude Opus 4.7 | Anthropic | ~1452 |
| 4 | Gemini 3.2 Pro | Google | ~1448 |
| 5 | GPT-5.6 | OpenAI | ~1440 |
| 6 | Claude Sonnet 4.6 | Anthropic | ~1428 |
| 7 | Gemini 3.2 Flash | Google | ~1418 |
| 8 | DeepSeek V4.1 Pro | DeepSeek | ~1410 |
| 9 | Qwen 3.7 | Alibaba | ~1400 |
| 10 | GPT-5.6 mini | OpenAI | ~1392 |
| 11 | Grok 4 | xAI | ~1385 |
| 12 | Llama 4.5 Maverick | Meta | ~1370 |
| 13 | GLM-6 | Zhipu AI | ~1360 |
| 14 | Mistral Large 3 | Mistral AI | ~1352 |
| 15 | Kimi K2.6 | Moonshot AI | ~1345 |
Key Takeaways
- GPT-5.6 Pro overtook Claude Mythos 5 in Arena Elo within two weeks of release.
- The top eight models are clustered within ~55 Elo points, the tightest spread on record.
- DeepSeek V4.1 Pro is the highest open-weight entry, within ~55 points of the top closed model.
- Specialized variants (Pro, Mythos) outscore their base counterparts by roughly 6 to 25 Elo on Arena Hard (Claude Mythos 5 vs. Claude Opus 4.7, GPT-5.6 Pro vs. GPT-5.6), but the gap is narrower on default Arena.
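To put the spread figures in perspective, an Elo gap can be converted into an expected head-to-head win rate. The sketch below assumes the standard Elo logistic curve with a scale of 400; this page does not describe how the leaderboard fits its ratings, so treat the conversion as an interpretation aid and not as the leaderboard's own calculation. The scores are the approximate values from the table, and the names `top_eight` and `expected_win_rate` are illustrative.

```python
# Approximate Arena Hard scores copied from the table above (top eight rows).
top_eight = {
    "GPT-5.6 Pro": 1465,
    "Claude Mythos 5": 1458,
    "Claude Opus 4.7": 1452,
    "Gemini 3.2 Pro": 1448,
    "GPT-5.6": 1440,
    "Claude Sonnet 4.6": 1428,
    "Gemini 3.2 Flash": 1418,
    "DeepSeek V4.1 Pro": 1410,
}


def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve.

    Assumption: base-10 logistic with scale 400, as in classic Elo.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Spread between the highest and lowest score among the top eight.
spread = max(top_eight.values()) - min(top_eight.values())

# Expected win rate of the top-ranked model against the eighth-ranked one.
p_top_vs_eighth = expected_win_rate(
    top_eight["GPT-5.6 Pro"], top_eight["DeepSeek V4.1 Pro"]
)

print(f"Top-eight spread: {spread} Elo")
print(f"Expected win rate, rank 1 vs rank 8: {p_top_vs_eighth:.1%}")
```

Under that assumption, a 55-point gap corresponds to an expected win rate of roughly 58% for the higher-rated model, which is why a cluster this tight leaves individual matchups close to a coin flip.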
Methodology
Scores are taken from the lmarena.ai public leaderboard; Arena Hard, the more discriminating variant, is the one used here. Elo numbers are approximate and shift daily as new votes accumulate. This page is updated monthly.
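As a rough illustration of why scores drift as votes accumulate, here is a minimal sketch of the classic online Elo update for a single decisive vote. It is not the leaderboard's actual fitting procedure, which this page does not describe; the function name `elo_update` and the K-factor of 4 are assumptions chosen for the example.

```python
def elo_update(winner: float, loser: float, k: float = 4.0) -> tuple[float, float]:
    """Return (new_winner, new_loser) after one decisive vote.

    Classic online Elo: the winner gains k * (1 - expected score) and the
    loser gives up the same amount. The K-factor here is a hypothetical value.
    """
    expected_winner = 1.0 / (1.0 + 10.0 ** ((loser - winner) / 400.0))
    delta = k * (1.0 - expected_winner)
    return winner + delta, loser - delta


# Hypothetical vote: a model rated 1410 beats a model rated 1465.
new_low, new_high = elo_update(winner=1410.0, loser=1465.0)
print(f"Lower-rated model: 1410 -> {new_low:.2f}")
print(f"Higher-rated model: 1465 -> {new_high:.2f}")
```

An upset moves both ratings by more than an expected result would, so a steady stream of votes keeps nudging closely ranked models past one another.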
How Presenc AI Helps
Presenc AI tracks brand visibility on the top-Elo models continuously so brand teams see citation patterns shift in step with frontier-model competitive dynamics.