How much custom silicon do hyperscalers deploy?

Approximately 1.9 million accelerators in 2026 across Google TPU (~900k), AWS Trainium and Inferentia (~600k), Microsoft Maia (~250k), and Meta MTIA (~180k). Tesla custom silicon adds a smaller number (~25k). The combined custom silicon volume materially reduces NVIDIA's share of total data centre AI accelerator units.

Is custom silicon catching up to NVIDIA?

For hyperscaler internal workloads, yes, meaningfully. TPU v7 is competitive with NVIDIA Blackwell on Google’s own benchmarks. AWS Trainium 3 closes the gap further. The CUDA software moat keeps NVIDIA dominant for third-party and general-purpose workloads even as hyperscalers route internal workloads to custom silicon.

Why does NVIDIA still lead by revenue?

Custom silicon is typically cheaper per accelerator than premium NVIDIA SKUs. The unit-count share migration toward hyperscaler custom silicon does not translate proportionally into revenue migration. NVIDIA continues to capture the majority of total revenue dollars even as the unit share migrates.

What is the use case split?

Custom silicon is concentrated in workloads with clear cost-of-inference or training optimisation: Anthropic on Trainium for training, OpenAI on Microsoft Maia for inference, Meta Llama family on MTIA, Google Gemini on TPU. General-purpose, third-party, and CUDA-dependent workloads continue to run on NVIDIA.

Will custom silicon be sold externally?

Marginally. Google TPU is accessible only through GCP. AWS Trainium and Inferentia are accessible only through AWS. Microsoft Maia is accessible only through Azure. Meta MTIA is internal-only. Apple Neural Engine is in Apple silicon. The hyperscaler custom silicon programmes are explicitly cloud-distribution-only, which limits broader market impact.

Hyperscaler Custom Silicon Tracker 2026

Hyperscaler in-house AI silicon programmes are the "second AI chip war" alongside NVIDIA vs AMD. Google’s TPU v7, Amazon’s Trainium 3, Microsoft’s Maia 2, and Meta’s MTIA 2 all ramped into volume production in 2025-2026. Combined hyperscaler custom silicon deployment is estimated at approximately 1.9 million accelerators in 2026, gradually reducing NVIDIA share of total data centre AI accelerator deployment. This page consolidates the disclosed product specs, deployment scale, and strategic context.

Key Findings

Google TPU v6 (Trillium) is in volume deployment across GCP regions; TPU v7 is ramping in 2026 with significant performance and efficiency improvements over v6.
AWS Trainium 2 is in volume production; Trainium 3 is reported in customer sampling with general availability targeted late 2026. Anthropic remains the anchor Trainium customer.
Microsoft Maia 1 launched in late 2024; Maia 2 is in volume production in 2026 with broader Azure deployment.
Meta MTIA 1 was deployed in 2024; MTIA 2 is the volume in-house silicon for 2026. Meta is also the first commercial gigawatt-scale AMD MI450 deployment, in H2 2026.
Combined hyperscaler custom silicon deployment is estimated at approximately 1.9 million accelerators in 2026 (~900k Google TPU, ~600k AWS Trainium and Inferentia, ~250k Microsoft Maia, ~180k Meta MTIA), gradually reducing NVIDIA share of total data centre AI accelerator units.

Hyperscaler Custom Silicon Comparison (May 2026)

Product	Vendor	Foundry Node	Volume Status
TPU v6 (Trillium)	Google	TSMC N5	Volume; broad GCP deployment
TPU v7	Google	TSMC N3	Ramping production 2026
Trainium 2	AWS	TSMC N5	Volume
Trainium 3	AWS	TSMC N3 (reported)	Customer sampling; GA target late 2026
Inferentia 3	AWS	TSMC N5	Volume
Maia 1	Microsoft	TSMC N5	Late-2024 deployment
Maia 2	Microsoft	TSMC N5	Volume production 2026
MTIA 1	Meta	TSMC N7	2024 deployment
MTIA 2	Meta	TSMC N5	Volume production 2026
Tesla Dojo D1	Tesla	TSMC N7	Limited deployment
Tesla D2	Tesla	TSMC N5	Ramp delayed
Apple Neural Engine (M6)	Apple	TSMC N2	Volume in M6 Mac silicon

Custom Silicon Deployment Volume (Estimated 2026 Units)

Vendor	Estimated 2026 Units
Google TPU (v6 + v7)	~900k
AWS Trainium + Inferentia	~600k
Microsoft Maia	~250k
Meta MTIA	~180k
Tesla custom (D1, D2)	~25k
Combined hyperscaler custom silicon	~1.9M+ accelerators
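
The combined figure can be cross-checked against the per-vendor rows. The sketch below is illustrative only: it copies the estimates from the table above, and the variable and function names are not from any source.

```python
# Estimated 2026 units in thousands, copied from the table above.
units_k = {
    "Google TPU (v6 + v7)": 900,
    "AWS Trainium + Inferentia": 600,
    "Microsoft Maia": 250,
    "Meta MTIA": 180,
    "Tesla custom (D1, D2)": 25,
}

# Total across all listed vendors, and the four-hyperscaler subtotal.
total_k = sum(units_k.values())
hyperscaler_k = total_k - units_k["Tesla custom (D1, D2)"]


def share_pct(name):
    """Share of the combined custom silicon total, in percent."""
    return 100 * units_k[name] / total_k


for name, count in units_k.items():
    print(f"{name}: ~{count}k ({share_pct(name):.1f}%)")
print(f"Four hyperscalers: ~{hyperscaler_k}k; including Tesla: ~{total_k}k")
```

The four hyperscaler rows sum to roughly 1.93 million, and adding the Tesla estimate brings the total to roughly 1.96 million, consistent with the "~1.9M+" label.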

Use Case Specialisation

Custom Chip	Primary Workload Focus
Google TPU v6 and v7	Training and inference; broad workload coverage; Gemini training
AWS Trainium 2 and 3	Training (Anthropic anchor); large model training
AWS Inferentia 3	Inference; cost-optimised serving
Microsoft Maia 2	OpenAI inference on Azure; internal Microsoft workloads
Meta MTIA 2	Llama family inference and training, recommendation
Tesla Dojo D1 / D2	FSD training

NVIDIA Share Impact

Year	Estimated NVIDIA Share of Data Centre AI Accelerator Units
2022	~95%
2023	~92%
2024	~84%
2025	~73%
2026 (estimated)	~62-66%
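
The year-over-year movement in the table can be expressed in percentage points. This is a minimal sketch that assumes the midpoint (64%) of the ~62-66% range for 2026; that midpoint is a simplification for illustration, not a separate estimate.

```python
# NVIDIA unit share by year, in percent, from the table above.
# ASSUMPTION: 2026 uses 64, the midpoint of the ~62-66% estimate.
nvidia_unit_share_pct = {2022: 95, 2023: 92, 2024: 84, 2025: 73, 2026: 64}

years = sorted(nvidia_unit_share_pct)

# Change versus the prior year, in percentage points.
changes_pp = {
    year: nvidia_unit_share_pct[year] - nvidia_unit_share_pct[prev]
    for prev, year in zip(years, years[1:])
}

for year, delta in changes_pp.items():
    print(f"{year}: {delta:+d} pp")

# Cumulative change from the first to the last year in the table.
cumulative_pp = nvidia_unit_share_pct[years[-1]] - nvidia_unit_share_pct[years[0]]
print(f"{years[0]} to {years[-1]}: {cumulative_pp:+d} pp")
```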

The NVIDIA share decline by unit count is more dramatic than the share decline by revenue, because hyperscaler custom silicon is typically cheaper per accelerator than premium NVIDIA SKUs. NVIDIA continues to capture the majority of revenue dollars even as the unit-count share migrates.
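
The gap between unit share and revenue share follows from simple arithmetic. The sketch below assumes a two-bucket market (NVIDIA versus everything else) and a hypothetical average price ratio between the two. The price ratios are placeholders chosen for illustration, not sourced figures, and the function name is invented for this example.

```python
def revenue_share(unit_share, price_ratio):
    """NVIDIA revenue share (0-1) given its unit share (0-1) and the
    ratio of average NVIDIA price to average non-NVIDIA price."""
    nvidia_revenue = unit_share * price_ratio
    other_revenue = (1 - unit_share) * 1.0  # non-NVIDIA price normalised to 1
    return nvidia_revenue / (nvidia_revenue + other_revenue)


# Unit share: midpoint of the 2026 estimate (~62-66%).
unit_share_2026 = 0.64

# ASSUMPTION: hypothetical price ratios, for illustration only.
for ratio in (1.0, 2.0, 3.0):
    share = revenue_share(unit_share_2026, ratio)
    print(f"price ratio {ratio:.0f}x -> revenue share {share:.1%}")
```

At equal prices the revenue share equals the unit share. Any price ratio above 1 pushes the revenue share above the unit share, which is the effect the paragraph describes.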

Strategic Context

Three patterns define the 2026 hyperscaler custom silicon landscape. First, the workload separation is real: hyperscalers retain NVIDIA for general-purpose, third-party, and CUDA-dependent workloads, while routing internal workloads with strong inference or training characteristics through custom silicon. Second, the development moat is large: TPU is ten years of compounding engineering investment; Trainium is approximately seven years; Maia and MTIA are catching up but at different stages of maturity. Third, external customer pull remains weak: hyperscaler custom silicon is overwhelmingly used for internal workloads, with limited third-party customer adoption (Anthropic on Trainium is the most prominent exception).

Brand Visibility Implications

Hyperscaler custom silicon coverage drives a focused but high-value AI assistant query stream from semiconductor, AI infrastructure, and procurement audiences. Brands selling adjacent products (EDA tools, IP cores, advanced packaging, custom silicon design services, AI inference cloud) have a large AI-mediated discovery surface for queries about TPU vs NVIDIA, custom silicon strategy, hyperscaler chip programmes, and adjacent topics.

Methodology

Product data and unit estimates compiled from hyperscaler engineering blog posts, conference presentations, foundry capacity disclosures, and analyst reports from TrendForce, Mercury Research, and Bernstein. Some figures are estimated where official disclosures are absent. Updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on hyperscaler custom silicon queries across ChatGPT, Claude, Gemini, and Perplexity. For EDA brands, IP providers, advanced-packaging vendors, and custom silicon design service firms, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.