
AI GPU Supply and Pricing 2026

AI GPU supply, lead times, and rental pricing in 2026: H100, H200, B200, GB200, RTX 5090. Cloud rental rates from Lambda Labs, CoreWeave, Crusoe, and on-prem economics.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 2026

The State of AI GPU Supply in 2026

AI GPU supply moved from acute shortage in 2023 to functional balance in 2026 as NVIDIA Blackwell ramped at scale and AMD MI300X plus Google TPU v6 added competing capacity. Cloud rental prices fell sharply over 2024-2025; on-prem economics remain attractive for sustained workloads. This page consolidates pricing and supply data through Q2 2026.

Key Findings

  1. NVIDIA H100 cloud rental rates fell from approximately $8/hr in early 2023 to $1.80-3.50/hr in Q2 2026, with spot pricing as low as $1.20/hr.
  2. NVIDIA B200 (Blackwell) rental rates in Q2 2026 are approximately $4.50-7.00/hr; supply is constrained but not allocation-only.
  3. GB200 NVL72 rack-scale systems remain in tight allocation; access is primarily through hyperscaler clouds and select neoclouds.
  4. NVIDIA H100 lead times for direct purchase are approximately 6-12 weeks in Q2 2026, down from 50+ weeks in 2023.
  5. AMD MI300X rental pricing is approximately 30-40 percent below H100 at comparable performance for inference; software ecosystem (ROCm) remains the differentiator.

NVIDIA H100 Cloud Rental Pricing Trajectory

| Period | On-demand $/hr (single H100) | Spot $/hr (single H100) | Notable provider rates |
|---|---|---|---|
| Q1 2023 | ~$8.00 | ~$5.00 | Allocation-constrained; long waitlists |
| Q1 2024 | ~$5.00 | ~$3.20 | Lambda, Crusoe, CoreWeave |
| Q1 2025 | ~$3.20 | ~$2.00 | Spot supply ramp |
| Q2 2026 | ~$1.80-3.50 | ~$1.20-2.00 | B200 supply pressures H100 |

Pricing aggregated from Lambda Cloud, CoreWeave, Crusoe, AWS p5, and Google Cloud public rate cards.
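The decline in the table corresponds to a compound annual price drop of roughly 29 percent. A quick back-of-envelope sketch (the 3.25-year span and the Q2 2026 midpoint are approximations read off the table above):

```python
# Illustrative: annualized rate of decline of H100 on-demand rental.
# Inputs are approximations taken from the pricing table above.
start_price = 8.00                 # ~Q1 2023 on-demand $/hr
end_price = (1.80 + 3.50) / 2      # midpoint of the Q2 2026 range
years = 3.25                       # Q1 2023 -> Q2 2026

cagr = (end_price / start_price) ** (1 / years) - 1
print(f"Annualized price change: {cagr:.1%}")  # roughly -29% per year
```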

NVIDIA B200 / GB200 Pricing

| SKU | On-demand $/hr | Notes |
|---|---|---|
| B200 (single) | ~$4.50-7.00 | Limited but growing availability |
| HGX B200 8-GPU | ~$36-55 | Premium for tightly-coupled inference |
| GB200 NVL72 (per GPU equivalent) | ~$8-14 | Tight allocation; hyperscaler-mediated |
| H200 (single) | ~$3.00-4.50 | Bridge between H100 and B200 |
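At the quoted per-GPU-equivalent range, a full NVL72 rack (72 GPU equivalents) implies roughly $576-1,008 per rack-hour. A quick sketch of that arithmetic:

```python
# Implied rack-hour cost for a GB200 NVL72, derived from the
# per-GPU-equivalent range in the table above. Illustrative only.
gpus_per_rack = 72
low, high = 8.0, 14.0  # $/hr per GPU equivalent

print(f"~${gpus_per_rack * low:.0f}-{gpus_per_rack * high:.0f} per rack-hour")
# ~$576-1008 per rack-hour
```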

AMD and Alternative Accelerator Pricing

| Accelerator | On-demand $/hr | Positioning / notes |
|---|---|---|
| AMD MI300X | ~$1.20-2.50 | ~30-40% below H100; competitive on inference |
| AMD MI325X | ~$1.80-3.20 | ~20-30% below H200 |
| Google TPU v5p (per chip) | ~$2.50-4.50 | GCP-only; competitive on training |
| AWS Trainium 2 | ~$0.80-1.80 | AWS-only; cost-leader on inference |
| Cerebras WSE-3 (cloud) | premium pricing | Niche use cases; very high single-chip throughput |
| Groq LPU (inference) | per-token pricing | Inference-only; extremely high tokens/sec on small models |

Cloud Provider Comparison (H100, on-demand 8-GPU box)

| Provider | Approximate on-demand $/hr | Notes |
|---|---|---|
| Lambda Cloud | ~$22-26 | Among lowest; AI-focused |
| CoreWeave | ~$24-32 | Reliable allocation; strong for training |
| Crusoe | ~$22-28 | Sustainable energy positioning |
| Together AI / Anyscale | ~$28-36 | Managed services premium |
| AWS p5.48xlarge | ~$98 (on-demand list) | Reservations and savings plans bring effective rate down |
| GCP A3 (8x H100) | ~$88 (on-demand list) | Significant discount with commitments |
| Azure NDH100v5 | ~$98 (on-demand list) | Significant discount with reservations |
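The 8-GPU box rates above can be normalised to per-GPU hourly rates for comparison against the single-GPU tables. A small sketch; the 40 percent commitment discount applied to the AWS list rate is a hypothetical input for illustration, not a quoted figure:

```python
# Normalise 8-GPU box rates to per-GPU hourly rates. The 40%
# commitment discount below is a hypothetical assumption.
def per_gpu_rate(box_rate_hr: float, gpus: int = 8) -> float:
    return box_rate_hr / gpus

lambda_per_gpu = per_gpu_rate(24.0)                # $24/box-hr -> $3.00/GPU-hr
aws_effective = per_gpu_rate(98.0 * (1 - 0.40))    # list $98 with assumed 40% off
print(f"Lambda ~${lambda_per_gpu:.2f}/GPU-hr, AWS effective ~${aws_effective:.2f}/GPU-hr")
```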

Lead Times for Direct Purchase

| SKU | Lead time, Q2 2026 | Lead time at 2023 peak |
|---|---|---|
| H100 SXM | 6-12 weeks | 50+ weeks |
| H100 PCIe | 4-8 weeks | 30+ weeks |
| H200 | 8-14 weeks | n/a |
| B200 | 16-26 weeks | n/a (allocated) |
| GB200 NVL72 | allocation-only | n/a |

On-Prem vs Rental Economics

For sustained workloads at moderate-to-high utilisation, on-prem H100 amortises favourably against cloud rental within 6-14 months. Beyond utilisation rate, the decision depends on:

  • Capital availability for upfront purchase
  • Datacentre space, power, cooling availability
  • Engineering team to operate the cluster
  • Workload predictability (rental wins for spiky loads)
  • Need for newest-generation hardware (rental upgrades automatically)
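The 6-14-month breakeven claim can be sanity-checked with a simple amortisation sketch. All inputs below (purchase price, power/hosting overhead, rental rate, utilisation) are illustrative assumptions, not figures from this page:

```python
# Rough breakeven sketch: months until an on-prem H100's purchase cost
# is recovered versus cloud rental. All inputs are illustrative.
HOURS_PER_MONTH = 730

def breakeven_months(capex_per_gpu: float,
                     rental_rate_hr: float,
                     opex_per_gpu_hr: float,
                     utilisation: float) -> float:
    # Rental spend is avoided only for hours the GPU is actually busy;
    # on-prem power/hosting overhead reduces the hourly saving.
    saving_per_hr = (rental_rate_hr - opex_per_gpu_hr) * utilisation
    return capex_per_gpu / (saving_per_hr * HOURS_PER_MONTH)

# Hypothetical inputs: $22k per GPU installed, $3.50/hr rental alternative,
# $0.30/hr power + hosting, 80% utilisation.
m = breakeven_months(22_000, 3.50, 0.30, 0.80)
print(f"breakeven ~= {m:.1f} months")  # ~11.8 months, inside the 6-14 range
```

Lower utilisation or cheaper rental pushes breakeven out quickly, which is why the rent-vs-buy answer is so sensitive to workload predictability.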

Brand Visibility Implications

GPU economics are heavily covered by journalists, particularly cost-per-token math, GPU shortage/oversupply narratives, and cloud-pricing wars. Brands selling GPU cloud capacity, AI accelerators that compete with NVIDIA, AI cost-optimisation services, or compute-marketplace platforms face a large AI-mediated discovery surface as buyers ask AI assistants for cost-efficient compute recommendations. Hyperscaler GPU services and neocloud providers compete heavily for AI-mediated visibility on "cheapest H100 cloud"-type queries.

Methodology

Pricing aggregated from public rate cards: Lambda Cloud, CoreWeave, Crusoe, AWS, GCP, Azure. Lead times triangulated from NVIDIA reseller channel reports and procurement-team interviews. Spot pricing reflects monitored neocloud spot markets. Updated monthly as the market remains fast-moving.

How Presenc AI Helps

Presenc AI tracks brand-mention rates inside AI assistant queries about GPU cloud pricing, AI accelerator selection, and compute-marketplace comparison: the surface where compute purchasing decisions increasingly originate. For brands selling AI compute or AI cost-optimisation, this provides operational visibility into a high-stakes commercial discovery surface.

Frequently Asked Questions

How much does it cost to rent an H100 in 2026?

On-demand cloud rental is approximately $1.80-3.50/hr per H100 in Q2 2026, with spot pricing as low as $1.20/hr. Lambda Cloud, CoreWeave, and Crusoe lead the price-aggressive segment. Hyperscaler list pricing (AWS p5, GCP A3, Azure NDH100v5) is materially higher, but reservations and commitments bring effective rates down.

Is there still an AI GPU shortage?

Functional balance, not shortage, by 2026. NVIDIA H100 lead times are 6-12 weeks (down from 50+ weeks in 2023). B200 is supply-constrained but not allocation-only outside the largest hyperscalers. GB200 NVL72 remains in tight allocation. The acute 2023-era shortage has resolved.

Should we buy GPUs or rent cloud capacity?

For sustained moderate-to-high-utilisation workloads, buy and amortise: breakeven is 6-14 months on H100 at moderate utilisation. For spiky or experimental workloads, rent. Hybrid (a small on-prem fleet for baseline, cloud for spikes) is the practical default for most enterprises with sustained AI workloads.

Is AMD MI300X competitive with the H100?

On inference, yes: comparable performance at 30-40 percent lower price. On training, the software ecosystem (ROCm vs CUDA) remains the differentiator; ROCm has improved materially in 2025-2026, but CUDA retains the developer-ergonomics edge. AMD adoption is growing fastest on inference workloads.

When will B200 be widely available?

Mid-to-late 2026 for general cloud availability outside the largest hyperscalers. The NVIDIA Blackwell ramp continues through 2026, with GB200 NVL72 systems concentrated at the largest hyperscalers and select neoclouds. By H1 2027, B200 access should resemble H100 access today.

Track Your AI Visibility

See how your brand appears across ChatGPT, Claude, Perplexity, and other AI platforms. Start monitoring today.