Research

LLM API Pricing Snapshot June 2026

Per-token API pricing snapshot for every major frontier LLM in June 2026. DeepSeek V4.1 Flash leads on cost; Claude Mythos 5 commands the highest premium.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: June 2026

This page snapshots per-token API pricing for every major frontier LLM as of June 2026. Pricing in USD per million tokens; verify vendor pricing pages before quoting in production estimates.

Frontier LLM Pricing (USD per 1M tokens)

Model	Vendor	Input	Output
Claude Mythos 5	Anthropic	~$22	~$110
Claude Opus 4.7	Anthropic	~$15	~$75
GPT-5.6 Pro	OpenAI	~$15	~$60
GPT-5.6	OpenAI	~$5	~$15
Gemini 3.2 Pro	Google	~$3.50	~$10.50
Claude Sonnet 4.6	Anthropic	~$3	~$15
Mistral Medium 3	Mistral AI	~$2	~$6
Gemini 3.2 Flash	Google	~$0.40	~$1.20
GPT-5.6 mini	OpenAI	~$0.30	~$1.20
Claude Haiku 4.5	Anthropic	~$0.80	~$4
Qwen 3.7 (flagship via Alibaba)	Alibaba	~$0.40	~$1.20
DeepSeek V4.1 Pro	DeepSeek	~$0.55	~$2.20
DeepSeek V4.1 Flash	DeepSeek	~$0.12	~$0.32
GLM-6 (Zhipu)	Zhipu AI	~$0.30	~$0.90
Kimi K2.6	Moonshot AI	~$0.95	~$3.80

Key Takeaways

DeepSeek V4.1 Flash at ~$0.12 input is the cheapest frontier-class model on the market.
Claude Mythos 5 at ~$22 input commands the highest per-token pricing in the frontier set.
The spread between cheapest and most-expensive frontier-class models is approximately 180x on input tokens.
Chinese frontier models cluster at $0.12 to $0.55 per million input tokens, well below Western peers.
Mini and Flash variants from Western labs (GPT-5.6 mini, Gemini 3.2 Flash) compete in the same price band as Chinese frontier.

Methodology

Pricing from vendor public pricing pages and reported enterprise tiers as of June 2026. Many vendors offer volume discounts, prompt caching, and reserved capacity that materially reduce effective pricing. Updated monthly.

How Presenc AI Helps

Presenc AI tracks how price-performance shifts shape model adoption inside developer tools and enterprise deployments where brand visibility propagates into derivative outputs.

Frequently Asked Questions

DeepSeek V4.1 Flash at approximately $0.12 per million input tokens. The next-cheapest Western-origin frontier-class model is Gemini 3.2 Flash at approximately $0.40.

Claude Mythos 5 at approximately $22 per million input tokens and $110 per million output tokens. The premium reflects the cybersecurity-aligned reasoning capability.

Chinese frontier models (DeepSeek V4.1, Qwen 3.7, GLM-6, ERNIE 5.1) cluster at $0.12 to $0.55 per million input tokens, well below Western peers except for Mini and Flash variants.

Yes, materially. Anthropic prompt caching can reduce input cost by 90% on repeated prompt prefixes. Volume tier and reserved-capacity discounts typically reduce effective pricing 30 to 60% for enterprise customers.

Track Your AI Visibility

See how your brand appears across ChatGPT, Claude, Perplexity, and other AI platforms. Start monitoring today.