Google Gemma 4 Release Brief

Quick reference for the Google Gemma 4 family launched April 2, 2026 under Apache 2.0: four sizes, frontier-level efficiency, and brand-visibility implications for self-hosted deployments.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: April 2026

At a Glance

Vendor: Google
Family: Gemma 4 family
Launched: April 2, 2026, under Apache 2.0, in four deployment-targeted sizes.
Context window: 128,000 to 256,000 tokens depending on variant, with 200K as the typical mid-tier configuration.
Pricing: Free to download and self-host under Apache 2.0. Hosted inference is available via Vertex AI, Google AI Studio, and the broader inference-provider ecosystem (Together AI, Fireworks, Groq, OpenRouter). The Apache 2.0 license removes the commercial-scale concerns that have slowed Llama enterprise adoption.
Access channels: Hugging Face open-weight release, Vertex AI hosted, Google AI Studio for prototyping, NVIDIA NIM microservices, every major inference provider, and direct download for on-device deployment.

Notable Benchmarks

Gemma 4 31B Dense beats 20x-larger models on MMLU, GPQA, and several coding benchmarks at a fraction of the inference cost. The smaller variants (2B, 9B) target on-device and edge deployment with competitive efficiency.
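For self-host capacity planning, the parameter counts above translate into rough weight-memory footprints. This is a back-of-envelope sketch only: real deployments also need KV-cache, activation, and runtime overhead, and the per-parameter byte sizes are the standard figures for bf16 and 4-bit quantization, not vendor-published numbers.

```python
# Rough VRAM needed just for model weights, per Gemma 4 variant.
# Excludes KV cache, activations, and framework overhead.
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight memory in GB: parameters (billions) x bytes per parameter."""
    return params_billion * bytes_per_param

for params in (2, 9, 31):
    bf16 = weights_gb(params, 2.0)   # bfloat16: 2 bytes per parameter
    int4 = weights_gb(params, 0.5)   # 4-bit quantized: 0.5 bytes per parameter
    print(f"{params}B: ~{bf16:.0f} GB bf16, ~{int4:.1f} GB int4")
# → 2B: ~4 GB bf16, ~1.0 GB int4
# → 9B: ~18 GB bf16, ~4.5 GB int4
# → 31B: ~62 GB bf16, ~15.5 GB int4
```

The takeaway matches the brief's positioning: the 2B and 9B variants fit on a single consumer GPU, while the 31B Dense in bf16 needs a data-center card or quantization.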

Strengths

Apache 2.0 licensing (cleanest commercial use of any frontier-tier open-weight family), four sizes covering edge to flagship, frontier-grade efficiency in the 31B Dense, deep integration with Google's deployment surface.

Limitations

Shorter maximum context than Qwen 3.6 Plus or DeepSeek V4. Behind frontier-tier closed models on the most complex multi-step reasoning. Smaller community ecosystem than Llama's.

Brand-Visibility Implications

Gemma 4 is likely to replace Llama in many enterprise self-host deployments because the Apache 2.0 license is dramatically simpler for commercial-scale use. That means brand-visibility gaps once specific to Llama (poor recall in self-hosted enterprise RAG, weak consumer-app coverage) now apply to Gemma 4 as well. Google's deeper integration into Workspace, Vertex, and Android raises the stakes: Gemma 4 will quietly power features inside products your buyers already use. See Gemma visibility and open-source LLM landscape 2026.

How Presenc AI Tracks This Model

Presenc AI monitors brand visibility on Google's Gemma 4 family as part of continuous multi-platform AI visibility tracking. We sample Google Gemma 4 across representative prompt sets daily, compare against competitor performance on the same prompts, and flag material mention-rate changes so brand teams can respond quickly when AI representation shifts.
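The tracking loop described above can be illustrated with a minimal sketch. This is a hypothetical illustration, not Presenc AI's actual implementation: the function names, the 10-point change threshold, and the brand names are all invented for the example.

```python
# Hypothetical sketch of mention-rate tracking: measure the share of
# sampled AI responses that mention a brand, then flag day-over-day
# changes that exceed a materiality threshold.
def mention_rate(responses: list[str], brand: str) -> float:
    """Fraction of responses containing the brand name (case-insensitive)."""
    if not responses:
        return 0.0
    hits = sum(brand.lower() in r.lower() for r in responses)
    return hits / len(responses)

def material_change(today: float, yesterday: float, threshold: float = 0.10) -> bool:
    """Flag when the mention rate moved by at least `threshold` (10 points)."""
    return abs(today - yesterday) >= threshold

# "Acme" appears in 2 of 3 sampled responses.
today = mention_rate(
    ["Acme is a top pick", "Try Beta Corp instead", "Acme again"], "Acme"
)
alert = material_change(today, 0.40)  # yesterday's rate was 40%
```

In practice the same prompt set would also be run against competitors so that rate changes can be compared side by side rather than in isolation.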

Frequently Asked Questions

When did Google release Gemma 4, and what is notable about the launch?
Google released the Gemma 4 family on April 2, 2026 under Apache 2.0, marking Google's most aggressive open-weight push to date. Four deployment-targeted sizes ship in the family, with the 31B Dense variant beating models 20x larger on several benchmarks.

What context window does Gemma 4 support?
128,000 to 256,000 tokens depending on variant, with 200K as the typical mid-tier configuration.

Where can I access Gemma 4?
Hugging Face open-weight release, Vertex AI hosted, Google AI Studio for prototyping, NVIDIA NIM microservices, every major inference provider, and direct download for on-device deployment.

What does Gemma 4 mean for brand visibility?
Gemma 4 is likely to replace Llama in many enterprise self-host deployments because the Apache 2.0 license is dramatically simpler for commercial-scale use. That means brand-visibility gaps once specific to Llama (poor recall in self-hosted enterprise RAG, weak consumer-app coverage) now apply to Gemma 4 as well. Google's deeper integration into Workspace, Vertex, and Android raises the stakes: Gemma 4 will quietly power features inside products your buyers already use. See Gemma visibility and open-source LLM landscape 2026.

Track Your AI Visibility

See how your brand appears across ChatGPT, Claude, Perplexity, and other AI platforms. Start monitoring today.