Why an AI-Specific Incident Response Plan
AI-specific incidents (hallucinations causing harm, bias surfacing in production, prompt injection, model degradation, training-data leaks) don't fit neatly into a traditional security incident response plan. They have a different blast radius, different remediation paths, and different communication requirements. This template gives you a standalone playbook for them.
Section 1: Roles and Responsibilities
| Role | Owner | Responsibilities |
| --- | --- | --- |
| Incident Commander | On-call engineer | Triages, declares severity, coordinates response |
| AI Lead | ML / Applied AI lead | Diagnoses model / prompt / data layer; proposes fixes |
| Comms Lead | PR / Communications | External communications, customer messaging |
| Legal Lead | In-house counsel | Regulatory disclosure, customer notification, liability |
| Customer Trust | CS leadership | Direct customer outreach, escalation handling |
| Executive Sponsor | VP-level or above | Approves customer comms, business decisions |
Section 2: Incident Categories
- Hallucination Harm: AI output causes a customer or third party material harm (financial, medical, reputational).
- Bias Incident: Disparate impact surfaces in production AI decisions or outputs.
- Data Leak: Training data, internal prompts, or customer data exposed via AI output or model.
- Prompt Injection: External input manipulates the AI system into harmful behaviour.
- Model Degradation: Upstream model version change causes silent quality drop.
- Vendor Outage: AI vendor outage cascades into our production systems.
Section 3: Severity Levels
| Severity | Definition | Response time | Notify |
| --- | --- | --- | --- |
| SEV-1 | Active harm to customers or material legal exposure | Immediate | Full team, exec, legal, board if material |
| SEV-2 | Significant degradation, no active harm yet | Within 1 hour | Full team, exec sponsor |
| SEV-3 | Limited impact, contained | Within 4 hours | Owning team, exec sponsor |
| SEV-4 | Internal-only, no customer impact | Next business day | Owning team |
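The severity table can also be kept as data so that paging or ticketing tooling reads the same values the team does. The following is a minimal sketch, not a prescribed implementation: the names `SeverityPolicy`, `SEVERITY_POLICIES`, `next_business_day`, and `response_deadline` are hypothetical, "Immediate" is modelled as a zero-length window, and "Next business day" is assumed to mean 09:00 on the next weekday with public holidays ignored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class SeverityPolicy:
    definition: str
    response_within: Optional[timedelta]  # None stands for "next business day"
    notify: Tuple[str, ...]


# Values copied from the severity table; the data structure is an assumption.
SEVERITY_POLICIES = {
    "SEV-1": SeverityPolicy(
        "Active harm to customers or material legal exposure",
        timedelta(0),  # "Immediate"
        ("full team", "exec", "legal", "board if material"),
    ),
    "SEV-2": SeverityPolicy(
        "Significant degradation, no active harm yet",
        timedelta(hours=1),
        ("full team", "exec sponsor"),
    ),
    "SEV-3": SeverityPolicy(
        "Limited impact, contained",
        timedelta(hours=4),
        ("owning team", "exec sponsor"),
    ),
    "SEV-4": SeverityPolicy(
        "Internal-only, no customer impact",
        None,
        ("owning team",),
    ),
}


def next_business_day(moment: datetime) -> datetime:
    """Return 09:00 on the next weekday. The 09:00 start is an assumption."""
    day = moment + timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day.replace(hour=9, minute=0, second=0, microsecond=0)


def response_deadline(severity: str, declared_at: datetime) -> datetime:
    """Latest time by which the response should have started."""
    policy = SEVERITY_POLICIES[severity]
    if policy.response_within is None:
        return next_business_day(declared_at)
    return declared_at + policy.response_within
```

Calling `response_deadline("SEV-2", declared_at)` returns a timestamp one hour after declaration, and `SEVERITY_POLICIES["SEV-2"].notify` gives the groups to page. Adjust the business-day rule to match your own on-call calendar.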
Section 4: Response Workflow
- Detect. A customer report, monitoring alert, or internal observation triggers the opening of an incident channel.
- Triage. Incident Commander assigns severity and starts the incident log.
- Contain. Stop the bleeding: roll back, throttle, or disable the affected AI feature.
- Diagnose. AI Lead identifies root cause (model, prompt, data, integration).
- Remediate. Apply fix, re-test, restore service.
- Communicate. Send internal and external comms per the communications matrix in Section 5.
- Document. Full incident report within 5 business days.
- Learn. Post-incident review within 10 business days. Findings drive policy and monitoring updates.
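The workflow above can be tracked with a small incident log that shows which stages are still open and when the follow-up documents are due. This is a sketch under stated assumptions: `STAGES`, `IncidentLog`, `add_business_days`, and `follow_up_deadlines` are hypothetical names, business days are taken to be Monday to Friday with holidays ignored, and because the plan does not say which event starts the 5-day and 10-day clocks, the caller passes the start date explicitly.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, timedelta
from typing import Dict, List, Tuple

# Stage names mirror the workflow list; the identifiers are illustrative.
STAGES = ("detect", "triage", "contain", "diagnose",
          "remediate", "communicate", "document", "learn")


def add_business_days(start: date, days: int) -> date:
    """Advance `days` weekdays past `start`, skipping Saturdays and Sundays."""
    current, remaining = start, days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def follow_up_deadlines(clock_start: date) -> Dict[str, date]:
    """Due dates for the incident report (5 days) and review (10 days)."""
    return {
        "incident_report_due": add_business_days(clock_start, 5),
        "post_incident_review_due": add_business_days(clock_start, 10),
    }


@dataclass
class IncidentLog:
    title: str
    entries: List[Tuple[str, datetime, str]] = field(default_factory=list)

    def record(self, stage: str, when: datetime, note: str) -> None:
        """Append a timestamped entry for one workflow stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.entries.append((stage, when, note))

    def pending_stages(self) -> List[str]:
        """Stages with no entry yet, in workflow order."""
        done = {stage for stage, _, _ in self.entries}
        return [stage for stage in STAGES if stage not in done]
```

The log deliberately does not enforce a strict stage order, since communication often runs in parallel with diagnosis and remediation; `pending_stages()` only reports what has not been recorded yet.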
Section 5: Communications Matrix
| Severity | Customers | Regulators | Public | Internal |
| --- | --- | --- | --- | --- |
| SEV-1 | Within 24h or per law | Per applicable timeline | If material; coordinate with legal | Real-time updates |
| SEV-2 | Within 72h if affected | If applicable | Optional | Daily updates |
| SEV-3 | Optional, if affected | Usually no | No | End-of-incident summary |
| SEV-4 | No | No | No | Owning team only |
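To keep the Comms Lead's checklist in sync with the matrix, the table can be stored as a lookup and turned into a per-incident list. The sketch below copies the cell text verbatim; `COMMS_MATRIX` and `comms_checklist` are hypothetical names, and the rule that only a flat "No" is dropped (so "Usually no" and "Optional" stay on the list as judgment calls) is an assumption.

```python
from typing import Dict, List

# Cell text copied verbatim from the communications matrix.
COMMS_MATRIX: Dict[str, Dict[str, str]] = {
    "SEV-1": {
        "customers": "Within 24h or per law",
        "regulators": "Per applicable timeline",
        "public": "If material; coordinate with legal",
        "internal": "Real-time updates",
    },
    "SEV-2": {
        "customers": "Within 72h if affected",
        "regulators": "If applicable",
        "public": "Optional",
        "internal": "Daily updates",
    },
    "SEV-3": {
        "customers": "Optional, if affected",
        "regulators": "Usually no",
        "public": "No",
        "internal": "End-of-incident summary",
    },
    "SEV-4": {
        "customers": "No",
        "regulators": "No",
        "public": "No",
        "internal": "Owning team only",
    },
}


def comms_checklist(severity: str) -> List[str]:
    """One line per audience whose rule is anything other than a flat 'No'."""
    row = COMMS_MATRIX[severity]
    return [f"{audience}: {rule}" for audience, rule in row.items() if rule != "No"]
```

For a SEV-3 incident this yields entries for customers, regulators, and internal updates, leaving the conditional decisions ("if affected", "usually no") to the Comms and Legal Leads.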
Section 6: Customer Notification Template
Subject: [Incident notification] [Brief description] — [Date]
We are writing to inform you of an incident that may have affected your account.
What happened:
- [Plain-language description in 2-3 sentences]
When:
- [Start time]
- [End time / "ongoing"]
What we have done:
- [Containment steps]
- [Remediation steps]
- [What is fixed; what is in progress]
What you should do:
- [Specific actions if any]
- [Where to escalate or seek support]
We are continuing to investigate and will share updates at [URL or schedule]. If you have questions, contact [support contact].
Thank you for your patience.
[Signed by accountable executive]
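If notifications are sent through tooling, the template above can be filled programmatically so that no bracketed placeholder reaches a customer unfilled. This is a minimal sketch using Python's standard `string.Template`; the placeholder names (`brief`, `what_happened`, `updates`, and so on) and the helper `render_notice` are assumptions, and the subject-line separator is simplified to a hyphen.

```python
from string import Template

# Wording follows the customer notification template; $names are hypothetical.
NOTICE = Template(
    "Subject: [Incident notification] $brief - $date\n\n"
    "We are writing to inform you of an incident that may have affected "
    "your account.\n\n"
    "What happened:\n$what_happened\n\n"
    "When:\n$when\n\n"
    "What we have done:\n$actions_taken\n\n"
    "What you should do:\n$customer_actions\n\n"
    "We are continuing to investigate and will share updates at $updates. "
    "If you have questions, contact $support.\n\n"
    "Thank you for your patience.\n$signed_by\n"
)


def bullets(items) -> str:
    """Format a sequence of strings as a dash-prefixed list."""
    return "\n".join(f"- {item}" for item in items)


def render_notice(**fields) -> str:
    """Fill the template; lists and tuples become bullet blocks."""
    prepared = {
        key: bullets(value) if isinstance(value, (list, tuple)) else value
        for key, value in fields.items()
    }
    # substitute() raises KeyError if any placeholder is missing,
    # which prevents sending a half-filled notice.
    return NOTICE.substitute(prepared)
```

Using `substitute()` rather than `safe_substitute()` is a deliberate choice here: a missing field fails loudly before the message is sent instead of leaving a literal `$placeholder` in the customer's inbox.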
Section 7: Post-Incident Report Template
# Incident Report — [Title]
## Summary
[2-3 sentences]
## Timeline
- [Detect time]: [Event]
- [Containment time]: [Event]
- [Resolution time]: [Event]
## Root Cause
[Technical detail]
## Impact
[Customer / regulatory / brand impact]
## Response
[Actions taken and by whom]
## What Went Well
[Positive findings]
## What Went Wrong
[Negative findings, blameless]
## Action Items
[Owner-tagged remediation list with due dates]
## Lessons Learned
[Policy or monitoring updates triggered]