Chatbot Arena Elo Leaderboard June 2026

A snapshot of the LMSYS Chatbot Arena Elo leaderboard for June 2026. GPT-5.6 Pro, Claude Mythos 5, Claude Opus 4.7, and Gemini 3.2 Pro lead the frontier; DeepSeek V4.1 Pro holds the top open-weight slot.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: June 2026

The LMSYS Chatbot Arena Elo leaderboard is the most widely cited blind-A/B evaluation of frontier LLMs. This page snapshots the public leaderboard as of June 2026 across the headline categories.

Arena Hard Top 15 (June 2026)

Rank	Model	Vendor	Elo
1	GPT-5.6 Pro	OpenAI	~1465
2	Claude Mythos 5	Anthropic	~1458
3	Claude Opus 4.7	Anthropic	~1452
4	Gemini 3.2 Pro	Google	~1448
5	GPT-5.6	OpenAI	~1440
6	Claude Sonnet 4.6	Anthropic	~1428
7	Gemini 3.2 Flash	Google	~1418
8	DeepSeek V4.1 Pro	DeepSeek	~1410
9	Qwen 3.7	Alibaba	~1400
10	GPT-5.6 mini	OpenAI	~1392
11	Grok 4	xAI	~1385
12	Llama 4.5 Maverick	Meta	~1370
13	GLM-6	Zhipu AI	~1360
14	Mistral Large 3	Mistral AI	~1352
15	Kimi K2.6	Moonshot AI	~1345

Key Takeaways

GPT-5.6 Pro overtook Claude Mythos 5 in Arena Elo within two weeks of release.
The top eight models are clustered within ~55 Elo points, the tightest spread on record.
DeepSeek V4.1 Pro is the highest open-weight entry, within ~55 points of the top closed model.
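Specialized variants (Pro, Mythos) outscore their base counterparts by roughly 6 to 25 Elo on Arena Hard (Claude Mythos 5 over Claude Opus 4.7 at the low end, GPT-5.6 Pro over GPT-5.6 at the high end); the gap is narrower on the default Arena.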
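
The spread figures in these takeaways can be re-derived from the table. The following is a minimal sketch, assuming the approximate Elo values for the top eight rows are copied by hand into a dictionary; the names `ELO` and `gap` are illustrative and not part of any leaderboard API.

```python
# Approximate Arena Hard Elo values copied from the top eight rows of the
# table above (June 2026 snapshot). Hand-entered, not fetched from any API.
ELO = {
    "GPT-5.6 Pro": 1465,
    "Claude Mythos 5": 1458,
    "Claude Opus 4.7": 1452,
    "Gemini 3.2 Pro": 1448,
    "GPT-5.6": 1440,
    "Claude Sonnet 4.6": 1428,
    "Gemini 3.2 Flash": 1418,
    "DeepSeek V4.1 Pro": 1410,
}


def gap(higher: str, lower: str) -> int:
    """Elo difference between two models listed in ELO."""
    return ELO[higher] - ELO[lower]


# Spread across the top eight entries (highest minus lowest rating).
top8_spread = max(ELO.values()) - min(ELO.values())

print(top8_spread)                              # 55
print(gap("GPT-5.6 Pro", "DeepSeek V4.1 Pro"))  # 55
print(gap("GPT-5.6 Pro", "Claude Mythos 5"))    # 7
```

Because the published scores are approximate and shift daily, differences of a few points computed this way should be read as rough indications rather than stable margins.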

Methodology

Scores are taken from the public lmarena.ai leaderboard. This page uses Arena Hard, the more discriminating of the two variants. Elo numbers are approximate and shift daily as new votes accumulate; the page is updated monthly.

How Presenc AI Helps

Presenc AI continuously tracks brand visibility on the top-Elo models, so brand teams can see how citation patterns shift as the frontier-model rankings change.

Frequently Asked Questions

Chatbot Arena is a blind A/B ranking system in which users vote on which of two anonymous model responses is better. Elo scores update continuously as votes accumulate.
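
To make the voting mechanics concrete, here is a minimal sketch of the classic online Elo update applied to a single vote. It assumes the standard logistic expected-score formula with a 400-point scale and a hypothetical K-factor of 4; the function names are illustrative. The public leaderboard may compute its ratings differently (for example, by fitting all votes jointly), so this illustrates the idea and is not the site's implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 4.0):
    """Return the new (A, B) ratings after one vote.

    score_a is 1.0 if A won the vote, 0.0 if B won, and 0.5 for a tie.
    k is an assumed step size, not a value taken from the leaderboard.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two models 7 points apart: the higher-rated one is expected to win
# only slightly more than half of head-to-head votes.
p = expected_score(1465, 1458)
print(round(p, 3))  # 0.51

# A single vote for the lower-rated model nudges the ratings toward each other.
new_a, new_b = update(1465, 1458, score_a=0.0)
print(round(new_a, 2), round(new_b, 2))
```

Under this model a 7-point lead corresponds to an expected win rate of about 51%, which is why small gaps at the top of the table can reverse as new votes arrive.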

Arena Hard is a curated harder prompt subset that better discriminates between top-tier models. Score spreads are wider on Arena Hard than on the default mixed-difficulty Arena.

The current leader is GPT-5.6 Pro from OpenAI at approximately 1465, narrowly ahead of Claude Mythos 5 at approximately 1458.

Top-of-leaderboard positions typically hold for four to six weeks before a competitor release reshuffles the top spots.
