Hyperscaler in-house AI silicon programmes are the "second AI chip war" alongside NVIDIA vs AMD. Google's TPU v7, Amazon's Trainium 3, Microsoft's Maia 2, and Meta's MTIA 2 are all sampling, ramping, or in volume production during 2025-2026. Combined hyperscaler custom silicon deployment is estimated at approximately 1.9 million accelerators in 2026, gradually reducing NVIDIA's share of total data centre AI accelerator deployment. This page consolidates the disclosed product specs, deployment scale, and strategic context.
Key Findings
- Google TPU v6 (Trillium) is in volume deployment across GCP regions; TPU v7 is ramping in 2026 with significant performance and efficiency improvements over v6.
- AWS Trainium 2 is in volume production; Trainium 3 is reported in customer sampling with general availability targeted late 2026. Anthropic remains the anchor Trainium customer.
- Microsoft Maia 1 launched in late 2024; Maia 2 is in volume production in 2026 with broader Azure deployment.
- Meta MTIA 1 was deployed in 2024; MTIA 2 is the volume in-house silicon for 2026. Meta is also slated to be the first commercial gigawatt-scale AMD MI450 deployment in H2 2026.
- Combined hyperscaler custom silicon deployment is estimated at approximately 1.9 million accelerators in 2026 (~900k Google TPU, ~600k AWS Trainium and Inferentia, ~250k Microsoft Maia, ~180k Meta MTIA), gradually reducing NVIDIA's share of total data centre AI accelerator units.
Hyperscaler Custom Silicon Comparison (May 2026)
| Product | Vendor | Foundry Node | Volume Status |
|---|---|---|---|
| TPU v6 (Trillium) | Google | TSMC N5 | Volume; broad GCP deployment |
| TPU v7 | Google | TSMC N3 | Ramping production 2026 |
| Trainium 2 | AWS | TSMC N5 | Volume |
| Trainium 3 | AWS | TSMC N3 (reported) | Customer sampling; GA target late 2026 |
| Inferentia 3 | AWS | TSMC N5 | Volume |
| Maia 1 | Microsoft | TSMC N5 | Late-2024 deployment |
| Maia 2 | Microsoft | TSMC N5 | Volume production 2026 |
| MTIA 1 | Meta | TSMC N7 | 2024 deployment |
| MTIA 2 | Meta | TSMC N5 | Volume production 2026 |
| Tesla Dojo D1 | Tesla | TSMC N7 | Limited deployment |
| Tesla D2 | Tesla | TSMC N5 | Ramp delayed |
| Apple Neural Engine (M6) | Apple | TSMC N2 | Volume in M6 Mac silicon |
Custom Silicon Deployment Volume (Estimated 2026 Units)
| Vendor | Estimated 2026 Units |
|---|---|
| Google TPU (v6 + v7) | ~900k |
| AWS Trainium + Inferentia | ~600k |
| Microsoft Maia | ~250k |
| Meta MTIA | ~180k |
| Tesla custom (D1, D2) | ~25k |
| Combined hyperscaler custom silicon | ~1.9M+ accelerators |
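The combined figure can be checked against the per-vendor rows. The following is a minimal Python sketch that uses only the estimates from the table above; the variable names are illustrative and not part of any published dataset.

```python
# Estimated 2026 units per programme, in thousands, copied from the table above.
estimated_units_k = {
    "Google TPU (v6 + v7)": 900,
    "AWS Trainium + Inferentia": 600,
    "Microsoft Maia": 250,
    "Meta MTIA": 180,
    "Tesla custom (D1, D2)": 25,
}

# Total across every listed row, and the total for the four hyperscalers only.
total_k = sum(estimated_units_k.values())
hyperscaler_k = total_k - estimated_units_k["Tesla custom (D1, D2)"]

# Each programme's share of the listed custom-silicon units.
shares = {name: units / total_k for name, units in estimated_units_k.items()}

print(f"Four hyperscalers: ~{hyperscaler_k}k; including Tesla: ~{total_k}k")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The four hyperscaler rows sum to roughly 1.93 million units, and the Tesla row brings the listed total to roughly 1.96 million, which is consistent with the rounded "~1.9M+" figure.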
Use Case Specialisation
| Custom Chip | Primary Workload Focus |
|---|---|
| Google TPU v6 and v7 | Training and inference; broad workload coverage; Gemini training |
| AWS Trainium 2 and 3 | Training (Anthropic anchor); large model training |
| AWS Inferentia 3 | Inference; cost-optimised serving |
| Microsoft Maia 2 | OpenAI inference on Azure; internal Microsoft workloads |
| Meta MTIA 2 | Llama family inference and training, recommendation |
| Tesla Dojo D1 / D2 | FSD training |
NVIDIA Share Impact
| Year | Estimated NVIDIA Share of Data Centre AI Accelerator Units |
|---|---|
| 2022 | ~95% |
| 2023 | ~92% |
| 2024 | ~84% |
| 2025 | ~73% |
| 2026 (estimated) | ~62-66% |
NVIDIA's share decline is steeper by unit count than by revenue, because hyperscaler custom silicon is typically cheaper per accelerator than premium NVIDIA SKUs. NVIDIA continues to capture the majority of revenue dollars even as unit-count share migrates to custom silicon.
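The gap between unit share and revenue share can be illustrated with a short calculation. This sketch makes two simplifying assumptions: the market is split into NVIDIA and "everything else", and non-NVIDIA accelerators carry an average price (or cost) equal to a fixed fraction of NVIDIA's. The 0.4 ratio is hypothetical and chosen only for illustration; it does not come from the sources behind this page.

```python
def revenue_share(unit_share: float, price_ratio: float) -> float:
    """Revenue share of a vendor holding `unit_share` of units when
    competitors' average price is `price_ratio` times that vendor's price."""
    own = unit_share * 1.0
    others = (1.0 - unit_share) * price_ratio
    return own / (own + others)

# Unit shares from the NVIDIA Share Impact table (2026 uses the midpoint of 62-66%).
nvidia_unit_share = {2022: 0.95, 2023: 0.92, 2024: 0.84, 2025: 0.73, 2026: 0.64}

HYPOTHETICAL_PRICE_RATIO = 0.4  # assumption for illustration only, not sourced data

for year, units in nvidia_unit_share.items():
    rev = revenue_share(units, HYPOTHETICAL_PRICE_RATIO)
    print(f"{year}: unit share {units:.0%}, implied revenue share {rev:.0%}")
```

Under that assumed ratio, a 64% unit share corresponds to roughly 82% of revenue. The real gap depends on actual average selling prices, which this page does not state.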
Strategic Context
Three patterns define the 2026 hyperscaler custom silicon landscape. First, the workload separation is real: hyperscalers retain NVIDIA for general-purpose, third-party, and CUDA-dependent workloads, while routing internal workloads with strong inference or training characteristics through custom silicon. Second, the development moat is large: TPU reflects roughly ten years of compounding engineering investment and Trainium approximately seven, while Maia and MTIA are catching up from earlier stages of maturity. Third, external customer pull remains weak: hyperscaler custom silicon is overwhelmingly used for internal workloads, with limited third-party adoption (Anthropic on Trainium is the most prominent exception).
Brand Visibility Implications
Hyperscaler custom silicon coverage drives a focused but high-value AI assistant query stream from semiconductor, AI infrastructure, and procurement audiences. Brands selling adjacent products (EDA tools, IP cores, advanced packaging, custom silicon design services, AI inference cloud) face a large AI-mediated discovery surface for queries about TPU vs NVIDIA, custom silicon strategy, hyperscaler chip programmes, and related topics.
Methodology
Product data and unit estimates are compiled from hyperscaler engineering blog posts, conference presentations, foundry capacity disclosures, and analyst reports from TrendForce, Mercury Research, and Bernstein. Some figures are estimates where official disclosures are absent. Updated quarterly.
How Presenc AI Helps
Presenc AI monitors brand visibility on hyperscaler custom silicon queries across ChatGPT, Claude, Gemini, and Perplexity. For EDA brands, IP providers, advanced-packaging vendors, and custom silicon design service firms, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.