Hyperscaler Custom Silicon Tracker 2026

Hyperscaler in-house AI silicon in 2026: Google TPU v7, AWS Trainium 3, Microsoft Maia 2, Meta MTIA 2. The second AI chip war and its impact on NVIDIA share.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 2026

Hyperscaler in-house AI silicon programmes are the "second AI chip war" alongside NVIDIA vs AMD. Google’s TPU v7, Amazon’s Trainium 3, Microsoft’s Maia 2, and Meta’s MTIA 2 are all in production ramp or customer sampling in 2025-2026. Combined hyperscaler custom silicon deployment is estimated at approximately 1.9 million accelerators in 2026, gradually reducing NVIDIA’s share of total data centre AI accelerator deployment. This page consolidates the disclosed product specs, deployment scale, and strategic context.

Key Findings

  1. Google TPU v6 (Trillium) is in volume deployment across GCP regions; TPU v7 is ramping in 2026 with significant performance and efficiency improvements over v6.
  2. AWS Trainium 2 is in volume production; Trainium 3 is reported to be in customer sampling, with general availability targeted for late 2026. Anthropic remains the anchor Trainium customer.
  3. Microsoft Maia 1 launched in late 2024; Maia 2 is in volume production in 2026 with broader Azure deployment.
  4. Meta MTIA 1 was deployed in 2024; MTIA 2 is the volume in-house silicon for 2026. Meta is also slated to host the first commercial gigawatt-scale AMD MI450 deployment in H2 2026.
  5. Combined hyperscaler custom silicon deployment is estimated at approximately 1.9 million accelerators in 2026 (~900k Google TPU, ~600k AWS Trainium and Inferentia, ~250k Microsoft Maia, ~180k Meta MTIA), gradually reducing NVIDIA’s share of total data centre AI accelerator units.

Hyperscaler Custom Silicon Comparison (May 2026)

Product | Vendor | Foundry Node | Volume Status
TPU v6 (Trillium) | Google | TSMC N5 | Volume; broad GCP deployment
TPU v7 | Google | TSMC N3 | Ramping production 2026
Trainium 2 | AWS | TSMC N5 | Volume
Trainium 3 | AWS | TSMC N3 (reported) | Customer sampling; GA target late 2026
Inferentia 3 | AWS | TSMC N5 | Volume
Maia 1 | Microsoft | TSMC N5 | Late-2024 deployment
Maia 2 | Microsoft | TSMC N5 | Volume production 2026
MTIA 1 | Meta | TSMC N7 | 2024 deployment
MTIA 2 | Meta | TSMC N5 | Volume production 2026
Tesla Dojo D1 | Tesla | TSMC N7 | Limited deployment
Tesla D2 | Tesla | TSMC N5 | Ramp delayed
Apple Neural Engine (M6) | Apple | TSMC N2 | Volume in M6 Mac silicon

Custom Silicon Deployment Volume (Estimated 2026 Units)

Vendor | Estimated 2026 Units
Google TPU (v6 + v7) | ~900k
AWS Trainium + Inferentia | ~600k
Microsoft Maia | ~250k
Meta MTIA | ~180k
Tesla custom (D1, D2) | ~25k
Combined hyperscaler custom silicon | ~1.9M+ accelerators
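
The combined figure can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, not part of the tracker's methodology: it sums the per-vendor estimates exactly as listed in the table. The dictionary and function names are assumptions made for this example.

```python
# Estimated 2026 units in thousands, copied from the table above.
ESTIMATED_UNITS_K = {
    "Google TPU (v6 + v7)": 900,
    "AWS Trainium + Inferentia": 600,
    "Microsoft Maia": 250,
    "Meta MTIA": 180,
    "Tesla custom (D1, D2)": 25,
}


def combined_units_k(units, exclude=()):
    """Sum unit estimates (in thousands), skipping vendors named in `exclude`."""
    return sum(count for vendor, count in units.items() if vendor not in exclude)


def unit_mix(units):
    """Return each vendor's fraction of the combined listed total."""
    total = combined_units_k(units)
    return {vendor: count / total for vendor, count in units.items()}


# Total for the four hyperscaler rows only, then for every listed row.
four_hyperscalers_k = combined_units_k(
    ESTIMATED_UNITS_K, exclude=("Tesla custom (D1, D2)",)
)
all_listed_k = combined_units_k(ESTIMATED_UNITS_K)

print(f"Four hyperscalers: {four_hyperscalers_k}k")
print(f"Including Tesla:   {all_listed_k}k")
for vendor, share in unit_mix(ESTIMATED_UNITS_K).items():
    print(f"{vendor}: {share:.1%}")
```

The four hyperscaler rows sum to 1,930k and the Tesla row brings the listed total to 1,955k, both consistent with the rounded "~1.9M+" figure.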

Use Case Specialisation

Custom Chip | Primary Workload Focus
Google TPU v6 and v7 | Training and inference; broad workload coverage; Gemini training
AWS Trainium 2 and 3 | Training (Anthropic anchor); large model training
AWS Inferentia 3 | Inference; cost-optimised serving
Microsoft Maia 2 | OpenAI inference on Azure; internal Microsoft workloads
Meta MTIA 2 | Llama family inference and training, recommendation
Tesla Dojo D1 / D2 | FSD training

NVIDIA Share Impact

Year | Estimated NVIDIA Share of Data Centre AI Accelerator Units
2022 | ~95%
2023 | ~92%
2024 | ~84%
2025 | ~73%
2026 (estimated) | ~62-66%

The NVIDIA share decline by unit count is more dramatic than the share decline by revenue, because hyperscaler custom silicon is typically cheaper per accelerator than premium NVIDIA SKUs. NVIDIA continues to capture the majority of revenue dollars even as the unit-count share migrates.
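
The gap between unit share and revenue share can be illustrated with a simple two-bucket model. The sketch below is a rough illustration under stated assumptions, not a reconstruction of the figures behind this page. It treats the market as NVIDIA versus everything else and takes a single average price ratio as input. The ratio of 3.0 is purely hypothetical and does not come from any source cited here, and the 2026 unit share uses 64%, the midpoint of the 62-66% range above.

```python
def revenue_share(unit_share, price_ratio):
    """Estimate NVIDIA revenue share from its unit share.

    `price_ratio` is the average NVIDIA price per accelerator divided by the
    average price of all other accelerators (a hypothetical input).
    """
    if not 0.0 <= unit_share <= 1.0:
        raise ValueError("unit_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    nvidia_revenue = unit_share * price_ratio  # in units of the "other" price
    other_revenue = 1.0 - unit_share
    return nvidia_revenue / (nvidia_revenue + other_revenue)


# Unit shares from the table above; 2026 is the midpoint of the 62-66% range.
UNIT_SHARE_BY_YEAR = {2022: 0.95, 2023: 0.92, 2024: 0.84, 2025: 0.73, 2026: 0.64}

# HYPOTHETICAL: assumes NVIDIA parts average 3x the price of the alternatives.
ASSUMED_PRICE_RATIO = 3.0

for year, unit_share in UNIT_SHARE_BY_YEAR.items():
    rev = revenue_share(unit_share, ASSUMED_PRICE_RATIO)
    print(f"{year}: unit share {unit_share:.0%}, modelled revenue share {rev:.1%}")
```

Under this assumed ratio, a 64% unit share corresponds to a revenue share of about 84%, which shows the direction of the effect described above. The actual magnitude depends on real pricing, which this page does not disclose.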

Strategic Context

Three patterns define the 2026 hyperscaler custom silicon landscape. First, the workload separation is real: hyperscalers retain NVIDIA for general-purpose, third-party, and CUDA-dependent workloads, while routing internal workloads with clear cost-of-inference or training optimisation potential through custom silicon. Second, the development moat is large: TPU represents ten years of compounding engineering investment and Trainium approximately seven, while Maia and MTIA are catching up but sit at earlier stages of maturity. Third, external customer pull remains weak: hyperscaler custom silicon is overwhelmingly used for internal workloads, with limited third-party customer adoption (Anthropic on Trainium is the most prominent exception).

Brand Visibility Implications

Hyperscaler custom silicon coverage drives a focused but high-value stream of AI assistant queries from semiconductor, AI infrastructure, and procurement audiences. Brands selling adjacent products (EDA tools, IP cores, advanced packaging, custom silicon design services, AI inference cloud) face a large AI-mediated discovery surface for queries about TPU vs NVIDIA, custom silicon strategy, hyperscaler chip programmes, and related topics.

Methodology

Product data and unit estimates are compiled from hyperscaler engineering blog posts, conference presentations, foundry capacity disclosures, and analyst reports from TrendForce, Mercury Research, and Bernstein. Some figures are estimated where official disclosures are absent. The page is updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on hyperscaler custom silicon queries across ChatGPT, Claude, Gemini, and Perplexity. For EDA brands, IP providers, advanced-packaging vendors, and custom silicon design service firms, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.

Frequently Asked Questions

Approximately 1.9 million accelerators in 2026 across Google TPU (~900k), AWS Trainium and Inferentia (~600k), Microsoft Maia (~250k), and Meta MTIA (~180k). Tesla custom silicon adds a smaller number. The combined custom silicon volume materially reduces NVIDIA share of total data centre AI accelerator units.
In hyperscaler internal workloads, yes, meaningfully. TPU v7 is competitive with NVIDIA Blackwell on Google’s own benchmarks. AWS Trainium 3 closes the gap further. The CUDA software moat keeps NVIDIA dominant for third-party and general-purpose workloads even as hyperscalers route internal workloads to custom silicon.
Custom silicon is typically cheaper per accelerator than premium NVIDIA SKUs. The unit-count share migration toward hyperscaler custom silicon does not translate proportionally into revenue migration. NVIDIA continues to capture the majority of total revenue dollars even as the unit share migrates.
Custom silicon is concentrated in workloads with clear cost-of-inference or training optimisation: Anthropic on Trainium for training, OpenAI on Microsoft Maia for inference, Meta Llama family on MTIA, Google Gemini on TPU. General-purpose, third-party, and CUDA-dependent workloads continue to run on NVIDIA.
Marginally. Google TPU is accessible only through GCP. AWS Trainium and Inferentia are accessible only through AWS. Microsoft Maia is accessible only through Azure. Meta MTIA is internal-only. Apple Neural Engine is in Apple silicon. The hyperscaler custom silicon programmes are explicitly cloud-distribution-only, which limits broader market impact.
