The Quietly Large Air-Gapped AI Surface
Air-gapped LLM deployments (no internet egress, no cloud API connection, often no telemetry of any kind) are growing fast in regulated industries. They are invisible to every cloud-AI visibility tool by design. This page consolidates the public data on adoption, sector mix, and deployment patterns. Numbers are estimates triangulated from vendor disclosures, industry surveys, and procurement data; air-gap deployments do not publish telemetry, so figures are directional rather than precise.
Key Findings
- An estimated 14-19 percent of enterprise LLM deployments at companies with over 5,000 employees were operating air-gapped or fully on-prem in Q1 2026, up from approximately 7 percent in Q1 2025.
- Defence, intelligence, and government deployments account for the largest share by spend (approximately 38 percent), followed by healthcare (19 percent), regulated finance (17 percent), and legal (9 percent).
- The most common deployment shape is on-prem GPU server (2x-8x H100) running open-weight base models (Llama 4, Qwen 3, Mistral) with internal fine-tunes; cloud-API access is forbidden by policy.
- The average procurement cycle for an air-gapped LLM deployment runs 9-14 months from initial vendor evaluation to production, materially longer than typical cloud-API adoption timelines.
- Roughly 60 percent of air-gapped deployments use a single base model family across the organisation, not multi-model orchestration; complexity is a deal-breaker in compliance-driven environments.
Adoption by Sector (estimated share of enterprise LLM deployments at 5,000+ employee companies operating air-gapped, Q1 2026)
| Sector | Estimated air-gap share | Primary compliance driver |
|---|---|---|
| Defence and intelligence | ~78% | Classification, ITAR, FedRAMP High, IL5/IL6 |
| Healthcare (large hospital systems) | ~22% | HIPAA, BAA constraints, PHI residency |
| Banking and capital markets | ~28% | Internal data residency, regulator audit, model governance |
| Insurance | ~18% | PHI/PII, customer data residency |
| Pharma R&D | ~31% | Trade-secret protection, IP firewalls |
| Legal (top 100 firms) | ~24% | Privilege protection, client confidentiality |
| Energy and utilities | ~11% | Critical infrastructure protection |
| Manufacturing | ~6% | Trade secrets, CAD/IP protection |
| Retail / consumer | ~3% | Limited; mostly cloud-permitted |
Hardware Mix in Air-Gapped Deployments (estimated share by deployment count)
| Hardware tier | Share | Typical model class |
|---|---|---|
| Single workstation (DGX Spark, Mac Studio) | ~22% | Small team, 7B-70B models |
| 2x H100 / A100 PCIe server | ~31% | Department-level, 70B-class generators |
| 4x-8x H100 cluster | ~28% | Enterprise central serving, multi-team |
| 16x+ H100 / B200 cluster | ~11% | Large-bank, large-defence-prime, training-capable |
| Specialist accelerators (Cerebras, Groq on-prem) | ~3% | Niche high-throughput inference |
| Edge devices (Jetson, custom) | ~5% | Field deployments, robotics |
Base Model Selection Patterns
Air-gapped enterprises overwhelmingly select open-weight base models and fine-tune them internally.
- Llama 4 family: ~38% (Meta's permissive licence and broad ecosystem dominate)
- Qwen 3 family: ~21% (Chinese enterprises plus selective non-Chinese adoption)
- Mistral family (Mixtral, Mistral Large open releases): ~12%
- gpt-oss family: ~9% (OpenAI's open releases)
- Gemma family: ~6%
- Custom from-scratch models: ~3% (mostly defence primes)
- Other: ~11%
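Selecting an open-weight base model usually goes hand in hand with a hard no-egress policy at load time. As a minimal sketch of that policy step (the helper name `enforce_airgap_env` is ours; `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are the standard Hugging Face offline-mode environment flags), a deployment script might pin the process to offline mode and strip any egress path before any model library is imported:

```python
import os

def enforce_airgap_env() -> None:
    """Pin this process to offline mode before any model library loads.

    HF_HUB_OFFLINE / TRANSFORMERS_OFFLINE make Hugging Face loaders read
    only from the local weight cache; proxy variables are removed because
    a configured proxy implies an egress path that air-gap policy forbids.
    """
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"
    for var in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ.pop(var, None)

enforce_airgap_env()
# Model loading (e.g. a local Llama checkpoint from the on-prem cache
# directory) would follow here; no network call can succeed past this point.
```

Real deployments typically enforce the same property again at the network layer (firewall rules, no default route), so the process-level flags are a belt-and-braces check rather than the sole control.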
Deployment Architecture Patterns
Three patterns dominate air-gapped deployments:
- Centralised inference cluster: single multi-GPU server runs all models; departments connect via internal API. Most common pattern (~55% of deployments).
- Federated workstations: individual analysts run local LLMs on workstations; no central coordination. Common in legal and pharma R&D (~25%).
- Edge-and-central hybrid: small local models for low-latency tasks, central cluster for frontier reasoning. Common in defence (~20%).
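The edge-and-central hybrid above is typically implemented as a thin routing layer in front of two inference endpoints. A minimal sketch, assuming a hypothetical per-task latency budget (the endpoint URLs, the `Task` shape, and the 1200 ms central-cluster latency figure are illustrative assumptions, not from any specific deployment):

```python
from dataclasses import dataclass

# Illustrative endpoints: a small local model on the device, and the
# central multi-GPU cluster reachable only over the internal network.
EDGE_ENDPOINT = "http://localhost:8001/v1"        # hypothetical
CENTRAL_ENDPOINT = "http://llm.internal:8000/v1"  # hypothetical

@dataclass
class Task:
    prompt: str
    max_latency_ms: int   # caller's latency budget for this request
    needs_frontier: bool  # True if the task needs the large central model

def route(task: Task, central_latency_ms: int = 1200) -> str:
    """Route frontier-reasoning work to the central cluster; send
    anything whose latency budget the cluster cannot meet to the edge."""
    if task.needs_frontier:
        return CENTRAL_ENDPOINT
    if task.max_latency_ms < central_latency_ms:
        return EDGE_ENDPOINT  # budget too tight for the central round-trip
    return CENTRAL_ENDPOINT
```

The design choice is that quality wins ties: anything flagged as needing frontier reasoning goes central even if that blows the latency budget, which matches how defence deployments described above prioritise the central cluster.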
Why Adoption Is Accelerating
Three drivers are compounding in 2026:
- Open-weight models reached frontier-class quality (Llama 4 70B, Qwen 3 32B competitive with GPT-4o-class on most benchmarks).
- Hardware costs fell to single-workstation reach for 70B-class deployment (DGX Spark, Mac M5 Max).
- Regulatory and compliance pressure increased: HIPAA AI guidance updates, EU AI Act enforcement Q1 2026, increased financial-regulator focus on AI model governance.
Brand Visibility Implications
This is the largest and least-measured AI brand-visibility surface in 2026. Conservative estimate: 200,000-400,000 daily LLM-driven brand-relevant interactions in air-gapped deployments at large enterprises in regulated sectors, all invisible to every cloud-AI visibility platform. The implication: brands targeting regulated-industry buyers (selling to defence, healthcare, finance, legal, pharma) cannot rely on cloud-AI visibility as a complete signal; air-gap visibility is a separate operational concern. See our local LLM blind spot page for the operational answer.
Methodology
Adoption figures are triangulated estimates from vendor disclosures (NVIDIA DGX customer testimonials; Anthropic and OpenAI enterprise programmes excluded by definition), IDC AI infrastructure surveys (where summary data is public), Gartner AI deployment guidance, public defence procurement records (FedRAMP, IL5/IL6 authorisations), and HIPAA-compliant AI vendor disclosures. Air-gap deployments by definition do not report telemetry; figures are directional, with an uncertainty of roughly ±25 percent. Updated quarterly.
How Presenc AI Helps
Presenc AI partners with regulated-industry buyers to deploy brand-visibility instrumentation inside air-gapped LLM environments. Our deployment-side measurement runs entirely within the customer's isolated network, producing brand-mention and recommendation-rate data without violating air-gap policy. For brands selling into regulated sectors, this is the only operational visibility into a meaningful and growing surface.
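Deployment-side measurement of this kind reduces to scanning the deployment's own response logs for brand mentions and recommendation language, entirely inside the isolated network, with only aggregate rates ever leaving the log store. A minimal sketch of the idea (the function, the keyword heuristic, and the log format are hypothetical illustrations, not Presenc AI's actual instrumentation):

```python
import re
from collections import Counter

def brand_mention_stats(responses: list[str], brands: list[str]) -> dict:
    """Per brand, compute what fraction of logged LLM responses mention
    it, and what fraction both mention it and contain recommendation
    language (a crude keyword heuristic, for illustration only)."""
    recommend = re.compile(r"\b(recommend|suggest|best choice)\b", re.IGNORECASE)
    mentions, recs = Counter(), Counter()
    for text in responses:
        is_rec = bool(recommend.search(text))
        for brand in brands:
            if re.search(rf"\b{re.escape(brand)}\b", text, re.IGNORECASE):
                mentions[brand] += 1
                if is_rec:
                    recs[brand] += 1
    n = len(responses) or 1  # avoid division by zero on an empty log
    return {
        b: {"mention_rate": mentions[b] / n, "recommendation_rate": recs[b] / n}
        for b in brands
    }
```

Production instrumentation would use a proper recommendation classifier rather than keywords, but the privacy property is the same: raw prompts and responses never cross the air gap, only the computed rates do.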