Incident Response Playbook for Security Breaches (2025)
A practiced incident response saves time, money, and reputation. This playbook provides concrete steps and templates.
Roles and responsibilities
- Incident commander, comms lead, forensics, legal/privacy, partners
Phases
- Detect → Triage → Contain → Eradicate → Recover → Lessons learned
Runbooks
- Ransomware containment; credential compromise; web app breach; insider threat
Forensics
- Evidence handling, chain of custody, immutable storage, timeline reconstruction
Communications
- Internal and external templates; regulator/customer notifications; status cadence
Tabletop exercises
- Quarterly, scenario rotation, measurable objectives
Post-incident
- RCAs with action items/owners/dates; control improvements; retests
FAQ
Q: How soon to notify customers?
A: Follow legal requirements and contracts; communicate early with known facts and next steps.
Related posts
- API Security OWASP: /blog/api-security-owasp-top-10-prevention-guide-2025
- Zero Trust Architecture: /blog/zero-trust-architecture-implementation-guide-2025
- Compliance Automation: /blog/compliance-automation-sox-hipaa-gdpr-devops-2025
- Supply Chain Security: /blog/supply-chain-security-sbom-slsa-sigstore-2025
- LLM Security: /blog/llm-security-prompt-injection-jailbreaking-prevention
Call to action
Need an IR readiness review? Schedule a tabletop exercise.
Contact: /contact • Newsletter: /newsletter
Incident Response Playbook (Security Breaches, 2025)
A practical, battle-tested playbook to detect, triage, contain, eradicate, recover, and learn from security incidents. Designed for cloud-native, SaaS, and hybrid enterprises.
1) Principles
- Minimize harm, protect customers, and preserve evidence
- Act fast, communicate clearly, and document everything
- Contain first; eradicate with validated hypotheses; recover safely
2) Incident Lifecycle
1. Detect → 2. Triage → 3. Declare → 4. Contain → 5. Eradicate → 6. Recover → 7. Lessons Learned
3) Severity Matrix (SEV)
- SEV1: Active breach with customer impact or regulatory exposure
- SEV2: Confirmed compromise with limited blast radius
- SEV3: Suspicious activity, potential compromise
- SEV4: Benign anomaly or false positive requiring tuning
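The matrix above can be encoded as a small decision function so that triage tooling applies it the same way every time. This is a minimal sketch of one possible reading of the rubric; the `Signal` class and its field names are illustrative assumptions, not part of any specific product.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """Triage facts for one alert. Field names are hypothetical."""
    active: bool = False
    customer_impact: bool = False
    regulatory_exposure: bool = False
    confirmed_compromise: bool = False
    suspicious: bool = False


def classify_sev(s: Signal) -> int:
    """Map triage facts to SEV1-SEV4 following the matrix above."""
    if s.active and (s.customer_impact or s.regulatory_exposure):
        return 1  # active breach with customer impact or regulatory exposure
    if s.confirmed_compromise:
        return 2  # confirmed compromise, limited blast radius
    if s.suspicious:
        return 3  # suspicious activity, potential compromise
    return 4      # benign anomaly or false positive


if __name__ == "__main__":
    print(classify_sev(Signal(active=True, customer_impact=True)))
```

Checking the most severe condition first means an incident that matches several rows always lands on the highest applicable SEV.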
4) RACI
- Incident Commander (IC): overall lead
- Comms Lead: internal/external updates
- Forensics Lead: evidence, chain of custody
- Legal/Privacy: regulatory, notification
- Security Engineering: containment/eradication actions
- SRE/App Owners: systems recovery
5) Communications Plan
- Cadence: every 30–60 min for SEV1/2, 2–4h for SEV3
- Channels: incident room (chat), bridges, status page, exec updates, customer comms
- Single source of truth: incident doc with timestamps, owners, actions
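The cadence above is easy to miss under pressure, so it can help to compute the next deadline mechanically. The sketch below assumes the upper bound of each range is the latest acceptable time; the function and constant names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Longest interval allowed by the cadence above, in minutes.
# SEV4 has no stated cadence, so it is intentionally absent.
MAX_INTERVAL_MIN = {1: 60, 2: 60, 3: 240}


def next_update_due(sev: int, last_update: datetime) -> Optional[datetime]:
    """Return the latest acceptable time for the next status update."""
    minutes = MAX_INTERVAL_MIN.get(sev)
    if minutes is None:
        return None  # no cadence defined for this severity
    return last_update + timedelta(minutes=minutes)


if __name__ == "__main__":
    last = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(next_update_due(1, last))
```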
6) Evidence Handling and Chain of Custody
- Capture: logs, disk images, memory dumps, configs, IAM changes
- Preserve with hashes; store in evidence bucket (WORM)
- Document who collected, when, how; sign evidence manifests
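As an illustration of "preserve with hashes" and "document who collected, when, how", the following sketch builds a manifest for a directory of collected evidence. The manifest layout and function names are assumptions; signing the manifest and uploading it to WORM storage are left out because they depend on your tooling.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: Path, collector: str, method: str) -> dict:
    """Record who collected what, when, and how, with one hash per file."""
    files = sorted(p for p in evidence_dir.rglob("*") if p.is_file())
    return {
        "collected_by": collector,
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "method": method,
        "items": [
            {
                "path": p.relative_to(evidence_dir).as_posix(),
                "sha256": sha256_file(p),
                "bytes": p.stat().st_size,
            }
            for p in files
        ],
    }
```

Re-running `sha256_file` on a stored copy and comparing it with the manifest entry is a simple way to show the evidence has not changed since collection.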
7) Detection Sources
- SIEM: auth anomalies, data egress, malware indicators
- EDR: process, persistence, lateral movement
- IAM: risky sign-ins, token misuse, consent grants
- Cloud: CloudWatch, Google Cloud Logging/Monitoring (formerly Stackdriver), Azure Monitor, flow logs, config drift
- App: WAF, RASP, rate limits, error spikes
8) Triage Checklist
- Validate alert fidelity; correlate across sources
- Identify impacted identities, hosts, regions, data classes
- Establish preliminary scope and blast radius
- Decide SEV and declare or close as non-incident with evidence
9) Declaration Template
Title: [Type] – [System] – [Date]
SEV: [1–4]
IC: [Name]
Scope: [assets, accounts, data]
First Seen: [timestamp]
Current Status: [ongoing/contained]
Next Update: [timestamp]
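If declarations are posted by a bot or a form, the template above can be rendered from structured fields so that nothing is omitted. This is a sketch only; the `Declaration` class and its attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Declaration:
    """Fields mirror the template above; attribute names are illustrative."""
    incident_type: str
    system: str
    date: str
    sev: int
    ic: str
    scope: str
    first_seen: str
    status: str
    next_update: str

    def render(self) -> str:
        """Return the declaration text, rejecting values the template excludes."""
        if self.sev not in (1, 2, 3, 4):
            raise ValueError("SEV must be between 1 and 4")
        if self.status not in ("ongoing", "contained"):
            raise ValueError("status must be 'ongoing' or 'contained'")
        return "\n".join([
            f"Title: {self.incident_type} – {self.system} – {self.date}",
            f"SEV: {self.sev}",
            f"IC: {self.ic}",
            f"Scope: {self.scope}",
            f"First Seen: {self.first_seen}",
            f"Current Status: {self.status}",
            f"Next Update: {self.next_update}",
        ])
```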
10) Containment Strategies
- Account: password reset, token revoke, step-up auth, session kill
- Network: block indicators, isolate subnets, revoke peering
- Host: EDR isolation, kill processes, disable persistence
- App: rotate secrets, disable integrations, rate-limit risky APIs
- Data: suspend exports, restrict S3/Blob read, snapshot affected stores
11) Forensics Playbook
- Host: memory (Volatility), disk (dd/EWF), timeline (plaso)
- Cloud: CloudTrail/Activity Logs, IAM change sets, config history
- App: request traces, auth logs, admin actions, feature flags
- Network: VPC flow logs, firewall logs, proxy logs
12) Eradication
- Remove persistence (autoruns, cron, startup tasks)
- Patch vulnerabilities; rotate credentials and keys
- Re-image or re-deploy immutable infrastructure
- Validate indicators no longer present
13) Recovery
- Restore services gradually; canary; monitor error/security metrics
- Validate data integrity; run consistency checks; re-index if needed
- Communicate recovery steps and timelines to stakeholders
14) Lessons Learned / Postmortem
- Facts-only timeline; root cause(s); contributing factors
- What worked, what didn’t; action items with owners and due dates
- Preventive controls, detective controls, response improvements
15) Legal and Regulatory
- Assess reportability (GDPR, HIPAA, SOX, state breach laws)
- Regulator notifications within statutory timelines
- Customer/partner notifications: scope, data types, mitigation steps
16) Privacy and Data Classification
- Classification: Public, Internal, Confidential, PII/PHI/PCI
- Map impacted data; evidence labeling; minimization by default
17) Stakeholder Updates Templates
- Exec Brief: impact, timeline, risk, decisions needed
- Customer Update: what happened, what we did, what you can do, next steps
- Regulator Report: facts, scope, affected data, remediation, contact
18) Metrics and SLOs
- MTTD (mean time to detect), MTTR (mean time to respond/recover)
- Containment time, eradication time, comms cadence adherence
- False positive rate, detection coverage, exercise frequency
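To make MTTD and MTTR concrete, here is a minimal sketch that computes both from per-incident timestamps. It assumes MTTD is measured from the earliest known attacker activity to detection and MTTR from detection to recovery; teams define these boundaries differently, so treat the field names and definitions as assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Iterable


@dataclass
class IncidentTimes:
    started: datetime    # earliest known attacker activity
    detected: datetime   # first alert or report
    recovered: datetime  # services restored


def mttd_minutes(incidents: Iterable[IncidentTimes]) -> float:
    """Mean minutes from start of activity to detection."""
    return mean((i.detected - i.started).total_seconds() for i in incidents) / 60


def mttr_minutes(incidents: Iterable[IncidentTimes]) -> float:
    """Mean minutes from detection to recovery."""
    return mean((i.recovered - i.detected).total_seconds() for i in incidents) / 60
```

Both functions raise `statistics.StatisticsError` on an empty input, which is usually preferable to reporting a misleading zero.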
19) Alerts and Detections (Examples)
- Suspicious admin grant
- Impossible travel
- Data egress spike to unfamiliar ASN
- OAuth consent grant to untrusted app
- Mass token refresh failures
20) Dashboards (Sketch)
{
  "title": "Security Incident Overview",
  "panels": [
    {"type": "stat", "title": "Open Incidents by SEV"},
    {"type": "graph", "title": "MTTD/MTTR"},
    {"type": "table", "title": "Top Alerts"}
  ]
}
21) Tabletop Exercises (Scenarios)
- Ransomware in prod cluster
- OAuth app takeover and data exfil
- Supply chain malicious dependency
- S3 bucket exfil via access key leak
- Zero-day exploitation of edge service
22) Runbook: Ransomware
- Detect: EDR encryptor patterns, file rename rates
- Contain: isolate hosts, suspend shares, disable service accounts
- Eradicate: wipe/re-image, validate backups, patch vectors
- Recover: staged restore, integrity scans, user comms
23) Runbook: Data Exfiltration
- Detect: egress anomalies, object access logs, API spikes
- Contain: block egress to indicators, suspend export jobs
- Eradicate: rotate keys, revoke tokens, patch sources
- Recover: notify impacted parties, legal steps, tighten DLP
24) Runbook: Account Takeover (ATO)
- Detect: anomalous sign-ins, device posture, geo-velocity
- Contain: force reset, session revoke, step-up MFA
- Eradicate: remove malicious apps, reset recovery methods
- Recover: notify user, restore access, monitor
25) Runbook: Supply Chain (Dependency)
- Detect: SBOM diff, SCA alerts, unusual behavior after deploy
- Contain: blocklist package, freeze releases, revert build
- Eradicate: pin safe versions, rebuild artifacts with attestations
- Recover: communicate, rotate related secrets, audits
26) Runbook: DDoS
- Detect: traffic surge, SYN floods, application errors
- Contain: enable WAF/DDoS protection, rate limits, geo-block
- Eradicate: upstream filters, ISP coordination
- Recover: scale back; report IoCs; tune thresholds
27) Runbook: Insider Threat
- Detect: anomalous access, unusual exports, off-hours activity
- Contain: suspend accounts, restrict access promptly
- Eradicate: rotate credentials, audit shares and links
- Recover: HR/legal coordination, aftercare for teams
28) Cloud Specific — AWS
- Logs: CloudTrail, GuardDuty, VPC Flow, ELB, S3 Access
- Contain: SCPs, IAM deny, security groups, NACLs
- Tools: Detective, Macie, SSM for isolation
29) Cloud Specific — Azure
- Logs: Azure Activity, Sign-in, Defender for Cloud
- Contain: NSGs, Conditional Access, Privileged Identity Management
- Tools: Sentinel, Lighthouse
30) Cloud Specific — GCP
- Logs: Admin Activity, Data Access, VPC Flow, Cloud Armor
- Contain: Org policies, FW rules, service perimeter
- Tools: Security Command Center, Forseti
31) Identity Incidents (IdP)
- SSO config tampering, SCIM abuse, illicit app consent
- Actions: rotate SAML/OIDC secrets, invalidate sessions, review policies
32) SaaS Incidents
- CRM, code hosting, docs; data exposure via sharing links
- Actions: revoke tokens, audit shares, enforce DLP, vendor tickets
33) Vendor and Third-Party Incidents
- Confirm scope, data flows; require vendor IR evidence
- Compensating controls; temporary blocks; customer comms
34) BCDR and DR
- RTO/RPO definitions; tested runbooks; alternate sites
- Regular full restore rehearsals; evidence packs
35) DR Drills and Evidence
- Quarterly drills: region failover, key rotations, restore tests
- Evidence: timings, success criteria, gaps, owners, follow-ups
36) Policy Library
- Access control, data classification, encryption, logging, retention, vendor risk
- Change management, exception handling, acceptable use
37) SOPs and Checklists
- On-call handover, escalation, comms cadence, incident room etiquette
- Evidence collection kits, isolation procedures, containment checklists
38) Templates
- Incident doc, comms briefs, customer notices, regulator reports, postmortems
39) Metrics and Reporting
- Monthly: MTTD, MTTR, detection coverage, drill scores
- Quarterly: control effectiveness, audit findings, risk register deltas
40) Postmortem Template
- Summary, impact, timeline, detection, response, root cause
- What helped/hurt, action items (prevent/detect/respond), owners, dates
41) Mega FAQ (1–400)
Q: When to declare?
A: As soon as credible harm is possible; bias to declare and downgrade later.
Q: Should we shut down systems?
A: Contain surgically; avoid unnecessary shutdowns that cause downtime and destroy volatile evidence.
Q: Who talks to customers?
A: Comms lead with Legal approval; avoid speculation.
Q: Do we pay ransom?
A: Follow policy and law enforcement guidance; prioritize restoration.
...
JSON-LD
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Incident Response Playbook: Security Breaches (2025)",
"description": "End-to-end incident response guide: detection, triage, containment, forensics, eradication, recovery, communication, and compliance.",
"datePublished": "2025-10-28",
"dateModified": "2025-10-28",
"author": {"@type":"Person","name":"Elysiate"}
}
</script>
Related Posts
- Supply Chain Security: SBOM, SLSA, Sigstore (2025)
- Zero Trust Architecture: Implementation Guide (2025)
CTA
Need a ready-to-run incident response program? We build detection, runbooks, drills, and governance tailored to your stack.
Appendix A — Roles and On-Call
- IC rotation with backups; shadow program; escalation tree
- Deputies: Comms, Forensics, SecEng, SRE, Legal/Privacy, HR, Product
- Paging policies; silent hours; handover SOP
Appendix B — Severity and Escalation
- SEV rubric with examples; automatic page for SEV1/2
- 15-min initial status; 60-min executive brief for SEV1
Appendix C — Evidence Kit
- Scripts: log collectors, disk/memory imaging, config snapshotters
- Hashing: sha256; manifests; WORM bucket pathing
- Access: least privilege, break-glass with approvals and audit
Appendix D — Logging Schema
{
  "ts": "ISO-8601",
  "sev": "INFO|WARN|ERROR|SECURITY",
  "actor": {"type": "user|svc|ip", "id": "..."},
  "action": "...",
  "target": "...",
  "requestId": "...",
  "auth": {"mfa": true, "device": "..."},
  "pii": false
}
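One way to keep emitters honest about the schema above is to build events through a single helper that rejects unknown enum values. The sketch below assumes the `|`-separated strings in the schema are enumerations; the helper name and its parameters are illustrative.

```python
import json
from datetime import datetime, timezone

ALLOWED_SEV = {"INFO", "WARN", "ERROR", "SECURITY"}
ALLOWED_ACTOR_TYPES = {"user", "svc", "ip"}


def make_event(sev, actor_type, actor_id, action, target,
               request_id, mfa, device, pii=False) -> str:
    """Build one JSON log line matching the schema above."""
    if sev not in ALLOWED_SEV:
        raise ValueError(f"unknown sev: {sev}")
    if actor_type not in ALLOWED_ACTOR_TYPES:
        raise ValueError(f"unknown actor type: {actor_type}")
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),  # ISO-8601, UTC
        "sev": sev,
        "actor": {"type": actor_type, "id": actor_id},
        "action": action,
        "target": target,
        "requestId": request_id,
        "auth": {"mfa": bool(mfa), "device": device},
        "pii": bool(pii),
    }
    return json.dumps(event)
```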
Appendix E — Cloud Containment Playbook
- AWS: SCP deny, IAM quarantine, SG lockdown, VPC isolation
- Azure: Conditional Access, NSG lockdown, PIM revoke
- GCP: Org policy lock, FW deny, perimeter tighten
Appendix F — Identity and Access
- Rotate SSO secrets; revoke refresh tokens; invalidate sessions
- Review privilege escalations; PIM approvals; risky sign-in checks
Appendix G — Data Handling
- Snapshot before change; checksum; diff later
- Encrypt exports; access tokens with TTL; DLP rules for transfers
Appendix H — Communication Guardrails
- No speculation; facts only; time-bounded unknowns
- Coordinate with legal; embargo until verified
Appendix I — Regulatory Map
- GDPR 72h; state breach laws; sectoral (HIPAA, GLBA, PCI)
- Data categories mapping; thresholds; notification templates
Appendix J — Vendor Management
- Contact matrix; security contacts; ticketing integration
- Require incident reports and evidence; compensating controls
Appendix K — BCDR Integration
- Declare DR if RTO at risk; failover runbook; comms alignment
- Data integrity gates before resuming writes
Appendix L — Metrics and KPIs
- Coverage: % critical detections deployed
- Quality: FP rate, alert fidelity, MTTA/MTTD/MTTR
- Program: exercise cadence, action item closure rate
Appendix M — Tools Catalog
- SIEM/SOAR, EDR, NDR, CSPM, CWPP, IAM, DLP, SCA/SAST/DAST, SBOM, ticketing
- Evidence vault: immutable storage with retention and legal holds
Extended Runbooks — Malware/EDR
- Contain via EDR isolate; triage persistence keys; memory capture
- IOC sweep across fleet; patch vector; reimage at scale
Extended Runbooks — OAuth Consent Abuse
- Revoke app consent; tenant-wide block; review sign-in risk
- Search audit logs for token exchange; notify impacted users
Extended Runbooks — Build Pipeline Compromise
- Freeze deploys; revoke CI secrets; rebuild from clean base
- Verify provenance; rotate signing keys; audit artifact stores
Extended Runbooks — Database Leak
- Block exfil path; rotate DB creds; rotate app secrets
- Assess leaked fields; customer notification decision; monitoring
Tabletop Injects
- Inject 1: SMS from attacker claiming data theft
- Inject 2: Journalist inquiry; embargo request
- Inject 3: Regulator asks for status within 24h
- Inject 4: Customer demands deletion confirmation
Policy Checklists
- Access review within 24h of incident
- Exceptions documented with expiry
- Evidence retained min 1 year or per regulation
Evidence Pack Template
- Incident ID, timestamps, participants
- Evidence links (hashes), actions timeline, decision log
- Metrics, screenshots, logs, forensics reports
Detection Library (Sketch)
- name: impossible-travel
  query: geo_distance(user, prev_user) / delta_t > threshold
- name: mass-download
  query: sum(bytes by user over 1h) > baseline * 5
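The two pseudo-queries above can be made concrete as follows. This is a sketch, not a production detection: it assumes sign-ins arrive as `(lat, lon, unix_seconds)` tuples, uses the haversine formula for distance, and picks 900 km/h (roughly airliner speed) as a default threshold. All names and defaults are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def impossible_travel(prev, curr, max_kmh: float = 900.0) -> bool:
    """Flag when the implied speed between two sign-ins exceeds max_kmh."""
    dist = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]) / 3600
    if hours <= 0:
        # Simultaneous or out-of-order events: any distance is suspicious.
        return dist > 0
    return dist / hours > max_kmh


def mass_download(bytes_last_hour: float, baseline_bytes: float,
                  factor: float = 5.0) -> bool:
    """Flag when hourly volume exceeds the baseline by the given factor."""
    return bytes_last_hour > baseline_bytes * factor
```

In practice IP geolocation is noisy, so a minimum distance or a VPN allowlist is usually needed before paging on this rule.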
Alert Routing
- Sev-based channels; paging trees; backup contacts
- Throttle noisy rules; deduplicate; attach runbook links
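The routing rules above (deduplicate, throttle, route by severity, attach a runbook) might be sketched like this. The channel names and the `/runbooks/<rule>` path pattern are hypothetical placeholders, and alerts are assumed to be dicts with `ts`, `rule`, `entity`, and `sev` keys.

```python
from typing import Dict, List

# Hypothetical channel names keyed by SEV; replace with real paging targets.
CHANNEL_BY_SEV = {1: "page-primary", 2: "page-primary", 3: "chat-secops", 4: "ticket-queue"}


def dedupe(alerts: List[Dict], window_s: int = 300) -> List[Dict]:
    """Keep the first alert per (rule, entity); drop repeats inside the window."""
    last_kept: Dict[tuple, float] = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = (alert["rule"], alert["entity"])
        if key in last_kept and alert["ts"] - last_kept[key] < window_s:
            continue  # duplicate inside the throttle window
        last_kept[key] = alert["ts"]
        kept.append(alert)
    return kept


def route(alert: Dict) -> Dict:
    """Attach a channel and a runbook link (the path pattern is an assumption)."""
    return {
        **alert,
        "channel": CHANNEL_BY_SEV[alert["sev"]],
        "runbook": f"/runbooks/{alert['rule']}",
    }
```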
Aftercare
- Support load; burnout prevention; debriefs; counseling resources
Mega FAQ (401–800)
Q: When do we involve law enforcement?
A: In cases of extortion or significant data theft, or when mandated by regulation.
Q: Can we reset all user passwords?
A: Yes, when tokens are compromised or credential stuffing is under way; plan comms and support.
Q: Evidence too big to store quickly?
A: Prioritize logs and volatile memory; schedule full images post-containment.
Q: Do we notify before complete facts?
A: Notify within legal windows with clear known/unknown statements.
Q: How to handle insider incidents?
A: Involve HR and legal; preserve privacy; minimize bias.
Q: What if backups are infected?
A: Test restores; pick clean point-in-time; harden restore pipeline.
Q: Third-party refuses evidence?
A: Escalate contractually; apply compensating controls and customer comms.
Q: What if detection is wrong?
A: Document false positive; tune rule; share learnings; avoid blame.
Q: How to coordinate globally?
A: Follow-the-sun on-call; regional privacy counsel; local comms norms.
Final: act fast, keep evidence clean, communicate clearly.
Appendix N — Organization Models
- Centralized IR with embedded liaisons in product teams
- Federated IR with standards and shared tooling
- Hybrid: central command, local executors
Appendix O — Threat Landscape 2025
- OAuth consent phishing and token theft
- Supply chain: package typosquatting, CI secret exfil
- SaaS data exposure via link sharing and misconfigured apps
- Ransomware in cloud with exfil and extortion
Appendix P — Exposure Assessment
- Identify data classes affected; estimate records; confidence bands
- Determine encryption at rest/in transit and key exposure
- Assess regulatory thresholds per region and vertical
Appendix Q — Scope Expansion Heuristics
- Lateral movement indicators; anomalous privilege use; cross-account signs
- IoCs: domains, IPs, hashes; replay across logs and EDR
Appendix R — Credential Hygiene
- Rotate all touched credentials; prioritize long-lived tokens and keys
- Enforce MFA; review break-glass accounts and vault audit trails
Appendix S — Customer Protection
- Forced password resets; token invalidation; app-specific revoke
- Advisory: phishing awareness, 2FA enablement, credit monitoring (if relevant)
Appendix T — Executive Engagement
- Decision points: disclosure, law enforcement, downtime, compensation
- Keep execs informed, time-bounded unknowns, avoid technical minutiae
Appendix U — Insurance and Counsel
- Cyber insurance notification timing; panel vendors; counsel privilege
Appendix V — Vendor Contracts
- Security addenda obligations; notification SLAs; audit rights
- Data processing agreements; subprocessor listings
Appendix W — Lessons Database
- Tag actions by prevent/detect/respond; track completion and efficacy
Appendix X — Training and Drills
- Quarterly tabletops; semiannual live-fire with red team
- Cross-functional drills: legal, HR, support, PR, product
Appendix Y — Tooling Integrations
- SIEM↔SOAR playbooks; ticketing auto-populate; evidence vault API
Appendix Z — Budget and Roadmap
- Prioritize high-efficacy controls from past incidents; sunset low-value alerts
Extended Scenarios
- Key compromise in CI/CD: rotating OIDC/IAM roles, re-sealing pipeline
- MFA fatigue attack: device revocation, push frequency caps
- Public bucket with PII: immediate block, access review, notification decision
Regulatory Workflows
- GDPR: DPO assessment; notify within 72h; cross-border coordination
- HIPAA: breach risk assessment; HHS reporting; affected individuals
- SOX: materiality assessment; audit committee brief
Communication Templates (Samples)
Internal Update (Hourly)
- What we know, what we don’t, what’s next, who owns what, next update time
Customer Notice (Draft)
- Incident summary, data involved, timeline, actions taken, recommended steps
Regulator Submission
- Nature, scope, number of individuals, mitigation, contact, attachments
Metrics Dashboards (Sketch JSON)
{
  "title": "Incident Program KPIs",
  "panels": [
    {"type": "graph", "title": "MTTD by Quarter"},
    {"type": "graph", "title": "MTTR by Severity"},
    {"type": "table", "title": "Top Repeated Root Causes"}
  ]
}
Mega FAQ (801–1100)
Q: Who can declare an incident?
A: Any on-call security engineer or higher; IC confirms and sets SEV.
Q: Do we notify customers before we fully understand scope?
A: As required by law; state knowns/unknowns; avoid speculation.
Q: How to manage shadow comms?
A: Centralize in incident room; discourage side channels; summarize often.
Q: Can engineers self-isolate hosts?
A: Yes via runbook; document; coordinate with IC to avoid evidence loss.
Q: When to rotate all tokens?
A: When issuer or signing keys are compromised, or auth logs indicate mass abuse.
Q: Should we take systems offline?
A: Only if containment cannot be achieved online or harm is escalating rapidly.
Q: When to engage DFIR vendor?
A: SEV1/complex cases or capacity shortfall; get SOW and SLAs in place.
Q: How do we handle press inquiries?
A: Through Comms with Legal oversight; no off-the-record technical details.
Q: Can we pay for deletion guarantees?
A: Assume adversaries lie; focus on containment and remediation.
Final: protect customers, preserve evidence, communicate clearly.
Mega FAQ (1101–1400)
Q: What if backups include malware?
A: Restore from clean point, scan, and patch vectors before going live.
Q: Should we ever edit logs?
A: Never; append-only; redact via views; preserve originals in WORM.
Q: When to reset all employees’ passwords?
A: When IdP risk signals indicate potential compromise, or after a password-reuse incident.
Q: Do we ever withhold details?
A: Share necessary info by audience; avoid enabling attackers; meet legal duties.
Q: How to avoid analyst burnout?
A: Automate triage, rotate shifts, cap alert volumes, enforce downtime.
Final: build muscle with drills; reduce toil; iterate on lessons.
Appendix AA — Detection Engineering
- Pipeline: hypothesize → simulate → implement → validate → tune → attest
- Sources: auth, endpoint, network, cloud, app, identity, SaaS
- Tuning: suppress benign, contextual allowlists with expiry, FP review
- Quality: precision/recall, mean time to triage, rule ownership
Appendix AB — SOAR Playbooks
- Auto-enrich indicators; ticket creation with context; notify IC
- Guardrails: auto-actions only for low-risk steps (disable token, block IP)
- Human-in-the-loop approvals for destructive containment
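The guardrail described above (automate only low-risk steps, require a human for anything destructive) reduces to a small gate in front of the SOAR action executor. This sketch uses the two low-risk examples from the text; the action identifiers and return values are assumptions.

```python
from typing import Optional

# Actions the text above treats as low-risk; everything else needs a human.
LOW_RISK_ACTIONS = {"disable_token", "block_ip"}


def decide(action: str, approved_by: Optional[str] = None) -> str:
    """Return 'auto', 'approved', or 'needs_approval' for a proposed action."""
    if action in LOW_RISK_ACTIONS:
        return "auto"            # safe to run without waiting
    if approved_by:
        return "approved"        # a named human signed off
    return "needs_approval"      # hold until someone approves


if __name__ == "__main__":
    print(decide("block_ip"))
    print(decide("wipe_host"))
```

Using an allowlist rather than a blocklist means newly added actions default to requiring approval.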
Appendix AC — Threat Intel
- Feeds: commercial, ISACs, open source; normalize and score
- Campaign tracking: tag incidents with families and TTPs (MITRE ATT&CK)
- Sharing: STIX/TAXII with partners under policy
Appendix AD — MITRE ATT&CK Mapping
- Map detections and incidents to ATT&CK; identify coverage gaps
- Prioritize high-impact TTPs; build simulations for gaps
Appendix AE — Purple Teaming
- Collaborate red/blue to test controls; track hypotheses and outcomes
- Close loop: fix gaps; verify; document learning
Appendix AF — Crisis Management
- SEV1 activation: exec bridge, legal/comms escalation, regulator prep
- Decision rights: downtime, disclosure, ransom, third-party engagement
Appendix AG — Business Impact Analysis (BIA)
- RTO/RPO per service; critical dependencies; data classifications
- Incident prioritization uses BIA to focus response
Appendix AH — Legal Holds
- Trigger: litigation/regulatory; freeze evidence retention
- Process: counsel approval; index scope; notify custodians
Appendix AI — Privacy Review Board
- Cross-functional approvals for high-risk data use during IR
- Document necessity, proportionality, and retention limits
Appendix AJ — Redaction and Minimization
- Default to redact PII in comms and logs; produce unredacted only under privilege
Appendix AK — Data Subject Requests During IR
- Coordinate DSAR with IR; avoid tipping; lawful exemptions tracked
Appendix AL — HR/Insider Coordination
- Maintain confidentiality; need-to-know; preserve employee rights
Appendix AM — Finance and Procurement
- Emergency spend approvals for DFIR/tools; vendor MSA for rapid engagement
Appendix AN — Access Control Freeze
- Freeze privilege escalations; break-glass logged; approvals required
Appendix AO — Change Management During IR
- Risky changes paused; only containment/eradication allowed with IC approval
Appendix AP — Customer Success and Support
- Macro responses; escalation paths; tooling for account flags
Appendix AQ — Red Team Safety
- Pause adversarial testing during active SEV1/2 unless coordinated
Appendix AR — Vendor DFIR Engagement
- Pre-negotiated hourly rates; SLAs; data handling terms; evidence return
Appendix AS — Evidence Storage
- Immutable object storage with legal holds; dual-region; access logs
Appendix AT — Key/Secret Rotation Ladder
- Order: SSO, infra creds, app secrets, DB creds, API keys; validate after each
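The "validate after each" rule is the important part of the ladder: a failed validation should stop the sequence before later credentials are touched. The sketch below takes the rotation and validation steps as callables because they are environment-specific; the step identifiers are illustrative.

```python
from typing import Callable, List

# Order taken from the ladder above; identifiers are illustrative.
LADDER = ["sso", "infra_creds", "app_secrets", "db_creds", "api_keys"]


def run_ladder(rotate: Callable[[str], None],
               validate: Callable[[str], bool]) -> List[str]:
    """Rotate each rung in order, stopping at the first failed validation."""
    done: List[str] = []
    for step in LADDER:
        rotate(step)
        if not validate(step):
            raise RuntimeError(
                f"validation failed after rotating {step}; completed: {done}"
            )
        done.append(step)
    return done
```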
Appendix AU — OAuth/OIDC Remediation
- Revoke consents; rotate client secrets; re-issue signing keys; JWKS publish
Appendix AV — Public Policy and Press
- Coordinate statements; avoid speculative attributions; commit to updates
Appendix AW — Education and Anti-Phish
- Targeted training after relevant incidents; measure click rates; improve
Appendix AX — Attack Surface Reduction
- Close unused ports; remove stale users; rotate dormant keys; enforce MFA
Appendix AY — Configuration Baselines
- CIS benchmarks; drift detection; golden images; policy as code
Appendix AZ — Compliance Traceability
- Map incidents and remediation to control catalogs (ISO 27001, SOC2)
Appendix BA — SaaS Hardening
- SSO-only, SCIM, app allowlists, sharing restrictions, DLP scanners
Appendix BB — Ticketing and SLA
- Incident tickets per workstream; SLA for updates; linkage to evidence
Appendix BC — Shadow IT Intake
- Intake form; security review; migration plans
Appendix BD — Service Ownership in IR
- Owner on-call joins; SLO context; rollback and feature flags
Appendix BE — Feature Flags for Containment
- Kill switches; traffic shaping; geoblocking; safe mode UIs
Appendix BF — Mobile and Desktop Apps
- Revoke tokens, push forced updates, disable outdated versions
Appendix BG — API Abuse and Bot Mitigation
- Rate limits, device fingerprints, behavioral signals, CAPTCHA challenges
Appendix BH — Payments and PCI
- Scope assessment; card data exposure handling; acquirer notifications
Appendix BI — Physical Security
- Badge logs; camera footage; visitor records; device inventory
Appendix BJ — OT/IoT Considerations
- Isolation; firmware integrity; network segmentation; safe shutdown
Appendix BK — Privacy Enhancing Tech
- Pseudonymization; tokenization; field-level encryption during IR
Appendix BL — Backup Integrity
- Regular test restores; malware scanning; immutable backups
Appendix BM — Key Metrics Library
- Detections by source; dwell time; action item closure; drill pass rate
Appendix BN — Audit Readiness
- Evidence indexes; control mappings; trail for every decision
Appendix BO — Customer Trust Center
- Central hub for incidents, policies, status, and remediation updates
Appendix BP — Translation and Localization
- Pre-translated templates; region-specific regulations and expectations
Appendix BQ — Accessibility of Comms
- Accessible formats; alt text; clear language; multiple channels
Appendix BR — Social Engineering Countermeasures
- Callback verification; no link-based resets; out-of-band confirmations
Appendix BS — Threat Hunting During IR
- Parallel hunts for adjacent compromises; track hypotheses; close or escalate
Appendix BT — Risk Register Integration
- Update likelihood/impact; link actions; report to governance
Appendix BU — Third-Party Risk Tiering
- Tier vendors; require attestations; monitor security posture changes
Appendix BV — Cost Tracking
- Track IR cost: downtime, services, labor, credits, legal
Appendix BW — Continuous Improvement Cadence
- Monthly IR council; review incidents, actions, metrics
Appendix BX — Program Roadmap
- Next 4 quarters: detections, training, automation, tabletop plans
Appendix BY — Checklists (Quick)
- Declare fast; contain surgically; preserve evidence; communicate clearly
Appendix BZ — Final Guidance
- People + process + tooling; rehearse until boring; measure relentlessly
Extended Dashboards (PromQL Sketch)
# Active incidents by SEV
count by (sev) (security_incident_active{org="elysiate"})
# Median time to detect (p50); for a true mean, divide the _sum rate by the _count rate
histogram_quantile(0.5, sum(rate(incident_detect_seconds_bucket[1w])) by (le))
Mega FAQ (1401–1800)
Q: How to handle executive pressure to ship during IR?
A: Use risk-based decisions; IC authority; document and obtain sign-off.
Q: Should we keep detection rules quiet?
A: Avoid tipping attackers but share with partners under NDA when helpful.
Q: What if we cannot prove exfiltration?
A: Report with confidence bounds; disclose limits; improve telemetry.
Q: Can we delete attacker data?
A: Isolate and retain for forensics; follow legal holds.
Q: Who owns customer comms?
A: Comms with Legal; IC reviews technical accuracy.
...
Mega FAQ (1801–2200)
Q: When to rotate all org tokens?
A: IdP compromise, signing key leak, or widespread suspicious use.
Q: Is secrecy acceptable?
A: Be transparent with customers and regulators while protecting investigations.
Q: Can we run chaos IR?
A: Yes in controlled envs; never during active SEV1 without IC approval.
Q: What if DFIR and Legal disagree?
A: Escalate to exec sponsor; document options and risks.
...
Mega FAQ (2201–2400)
Q: How to reduce false positives?
A: Contextual signals, better baselines, feedback loops, purple teaming.
Q: When to sunset a detection?
A: When it is noisy and low-value; replace with higher-fidelity variants.
Final: speed, evidence, empathy. Protect users and learn fast.
Appendix CA — Data Retention and Disposal
- Retention by data class; legal holds override
- Secure deletion workflows; verify destruction; audit trails
Appendix CB — Cross-Border Coordination
- Regional incident leads; regulator mapping; translation workflows
- Respect data residency and transfer restrictions during IR
Appendix CC — Red Team Debrief Integration
- Feed red team findings into IR training and detections backlog
- Prioritize TTPs seen in the wild and in production incidents
Appendix CD — Access Reviews During IR
- 24h review for privileged roles; revoke stale and risky grants
- Temporary scopes only; expiry enforced; audit approvals
Appendix CE — Shadow Access and Backdoors
- Scan for unmanaged IdPs, SSH keys, hardcoded tokens; eradicate
- Policy: no shared accounts; hardware keys for admins
Appendix CF — Data Integrity Validation
- Hash checks, row counts, referential integrity; sampling
- Business invariants validation post-restore
Appendix CG — Customer Success Playbooks
- Handling password reset waves; credit monitoring requests; SLAs
- Escalation to security on suspicious replies and phishing
Appendix CH — Partner Management During IR
- Security contacts list; bilateral evidence exchange; joint statements
- Temporary scopes and allowlists; revoke after resolution
Appendix CI — Mobile Push and App Stores
- Force updates; deprecate compromised versions; store communications
Appendix CJ — Hardware Token Logistics
- Emergency stock; lost token process; remote shipment; backup auth
Appendix CK — Physical Office IR
- Badge disable; lock zones; device collection; escort policies
Appendix CL — Vendor DFIR Evidence Expectations
- Logs, scope, timelines, actions, indicators, data involvement
- SLA for updates; final report with root cause and improvements
Appendix CM — Cryptographic Incidents
- Compromised keys: revoke, rotate, re-issue; CRL/OCSP updates
- Protocol downgrade attacks: disable legacy; enforce TLS modern profiles
Appendix CN — Certificate Transparency and Monitoring
- Watch rogue certs; automated alerts; takedown workflows
Appendix CO — Brand and Anti-Phish Ops
- Takedowns for phishing domains; DMARC enforcement; BIMI where supported
Appendix CP — Public Cloud Guardrails
- SCP/org policies; config rules; drift blockers during IR
Appendix CQ — Legal Strategy
- Privilege use; counsel engagement; regulator dialogue; settlement posture
Appendix CR — Workforce Safety
- Threats and harassment playbook; liaison with authorities
Appendix CS — Social Media Monitoring
- Track narratives; correct misinformation; avoid fueling attackers
Appendix CT — Academic and Research Coordination
- Coordinated disclosure with researchers; CVE processes; timelines
Appendix CU — Open Source Maintainers
- Report upstream issues; contribute patches; disclose responsibly
Appendix CV — AI/ML System Incidents
- Prompt injection, data leakage, model theft; red-teaming; guardrails
Appendix CW — Data Lake and Analytics Incidents
- Query audit; export controls; downscoping service accounts
Appendix CX — Edge/CDN Incidents
- Rules abuse; config drift; purge workflows; cache poisoning defense
Appendix CY — IoT and Device Fleets
- Fleet isolation; OTA rollback; secure boot attestation
Appendix CZ — Final Program Operating Principles
- Bias to declare; act with care; measure and learn; protect users first
Runbooks (Extended Set)
- OAuth token theft at scale: issuer rotation, consent revoke, client secrets
- Git credential leak: rotate PATs, audit commits, re-issue deploy keys
- Terraform state exposure: rotate provider credentials, rotate the state encryption key, audit the tenant
- KMS key compromise: re-encrypt data, rotate KEKs, audit usage
- Container registry leakage: revoke tokens, rotate images, provenance checks
Dashboards (Program)
{
"title": "IR Program Overview",
"panels": [
{"type":"graph","title":"Incidents per Month by SEV"},
{"type":"graph","title":"Containment Time (p50/p90)"},
{"type":"table","title":"Open Action Items >30d"}
]
}
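The "Containment Time (p50/p90)" panel needs a number behind it. The following is a minimal sketch of one way to compute it, assuming each incident records ISO 8601 `declared` and `contained` timestamps; the function names and the choice of inclusive deciles are illustrative assumptions, not part of any dashboard product.

```python
from datetime import datetime
from statistics import quantiles


def containment_minutes(declared: str, contained: str) -> float:
    """Minutes between declaration and containment (ISO 8601 timestamps)."""
    delta = datetime.fromisoformat(contained) - datetime.fromisoformat(declared)
    return delta.total_seconds() / 60


def p50_p90(durations: list[float]) -> tuple[float, float]:
    """Return (p50, p90) from inclusive deciles; needs at least two samples."""
    deciles = quantiles(durations, n=10, method="inclusive")
    # deciles holds 9 cut points: index 4 is the median, index 8 is p90.
    return deciles[4], deciles[8]
```

With only a handful of incidents per month the p90 is noisy, so it may be worth reporting the sample count next to it.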
Policies (Quick Index)
- Incident Declaration, Evidence Handling, Communications, Vendor IR, BCDR, DR Drills
- Key Rotation, Access Freeze, Privileged Access, Data Classification, Retention, Legal Holds
Drills Catalog
- Quarterly: ransomware, data exfil, IdP compromise, supply chain
- Annual: region failover, full restore, regulator mock audit
Mega FAQ (2401–2800)
Q: What if the attacker claims a larger scope than we see?
A: State the facts; keep investigating; do not negotiate on the basis of the attacker's claims.
Q: Can we use honeypots?
A: Yes, with legal review; keep them separate from production data.
Q: Should we rebuild from scratch after a major breach?
A: Consider it if persistence is likely; weigh downtime against risk.
Q: When do we inform the board?
A: At SEV1 or material risk; present clear decisions and options.
Mega FAQ (2801–3000)
Q: What if we suspect insider collusion?
A: Engage HR and Legal; minimize bias; increase logging; limit access.
Q: Do we disclose exact IoCs?
A: Share with partners where useful; avoid public detail that would enable copycat attacks.
Q: Can we delete all logs to save cost?
A: Never; adjust retention policies with risk in mind, and move logs to cold storage if needed.
Q: Is paying for data deletion ever wise?
A: Assume not; focus on containment and customer protection.
Mega FAQ (3001–3200)
Q: What’s the fastest way to lose trust?
A: Silence, speculation, and broken promises. Be clear, timely, and accurate.
Q: How do we prioritize fixes post-incident?
A: Balance exploitability, impact, and coverage; track to completion with owners.
Q: Can we automate everything?
A: Automate toil; keep humans for judgment. Put guardrails around automation.
Final: stay humble, prepare well, and execute with care.
Appendix EL — Shadow IT Containment
- Discovery scans; intake process; migrate to managed apps; revoke risky access
Appendix EM — Procurement Fast-Track
- Pre-approved vendors for DFIR, EDR surge, takedown services
Appendix EN — Access Freeze Implementation
- IAM policy layers to deny escalations; emergency exceptions logged
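One deny layer can be sketched as a generated policy document. This is a minimal sketch, assuming AWS-style IAM/SCP policy JSON; the action list is an illustrative subset rather than a complete map of escalation paths, and the break-glass role ARN is hypothetical.

```python
import json

# Illustrative subset of IAM actions commonly involved in privilege escalation.
ESCALATION_ACTIONS = [
    "iam:AttachUserPolicy",
    "iam:AttachRolePolicy",
    "iam:PutUserPolicy",
    "iam:PutRolePolicy",
    "iam:CreateAccessKey",
    "iam:CreatePolicyVersion",
    "iam:UpdateAssumeRolePolicy",
]


def access_freeze_policy(exception_arns):
    """Build a deny statement that exempts only the listed break-glass principals."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "IncidentAccessFreeze",
                "Effect": "Deny",
                "Action": ESCALATION_ACTIONS,
                "Resource": "*",
                "Condition": {"ArnNotLike": {"aws:PrincipalArn": exception_arns}},
            }
        ],
    }


# Hypothetical break-glass role; replace with the roles your exception process approves.
policy = access_freeze_policy(["arn:aws:iam::*:role/ir-breakglass"])
print(json.dumps(policy, indent=2))
```

The policy only expresses the deny and its exemptions. Logging of exception use would come from the platform's audit trail and the incident doc, which this sketch does not cover.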
Appendix EO — Exec War Room SOP
- Agenda: impact, decisions, blockers, next steps; strict timeboxes
Appendix EP — Evidence Review Board
- Approves release of sensitive evidence; maintains inventory and retention
Appendix EQ — Customer Advisory Board Feedback Loop
- Solicit feedback on comms and remediation; integrate improvements
Appendix ER — Risk Acceptance and Exceptions
- Track exceptions with expiry; review monthly until remediated
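Expiry tracking and the monthly review can be driven from a small register. The sketch below assumes a hypothetical `RiskException` record and a 30-day review interval; field names are illustrative.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class RiskException:
    exception_id: str
    owner: str
    expires: date
    last_reviewed: date
    remediated: bool = False


def review_queue(exceptions, today, review_every_days=30):
    """Return (exception_id, reason) for open exceptions that are expired or due for review."""
    queue = []
    for exc in exceptions:
        if exc.remediated:
            continue  # closed exceptions drop out of the review cycle
        if exc.expires <= today:
            queue.append((exc.exception_id, "expired"))
        elif today - exc.last_reviewed >= timedelta(days=review_every_days):
            queue.append((exc.exception_id, "review due"))
    return queue
```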
Appendix ES — Recovery Quality Gates
- Pass security checks, integrity tests, and performance SLOs before GA
Appendix ET — Cross-Company Exercises
- Joint drills with critical vendors and partners; shared objectives
Appendix EU — Ethics and Equity
- Fair treatment of employees and customers; avoid bias in decisions
Appendix EV — Budgeting for IR
- Reserve for DFIR, credits, comms, customer care, legal
Appendix EW — Documents and Records
- Central repository with access controls; audit trails; versioning
Appendix EX — Program Charter
- Mandate, authority, stakeholders, cadence, KPIs
Appendix EY — External Advisors
- Panel of advisors for law, PR, DFIR, compliance; retainer agreements
Appendix EZ — Final Principles
- Protect users; preserve evidence; communicate with integrity; improve continuously
Mega FAQ (3201–3600)
Q: When to declare victory?
A: When containment is confirmed, eradication is validated, and recovery gates pass.
Q: Who approves disclosure wording?
A: Legal and Comms; the IC checks technical accuracy.
Q: How to align security and product priorities post-incident?
A: Joint roadmap; executive backing; explicit tradeoffs.
Q: Should we publish the root cause?
A: Share at the right depth; focus on remediation and prevention.
Final: readiness beats heroics—practice, measure, and care.
Additional Runbooks
- Browser extension compromise: revoke via enterprise policy, audit extensions, notify users
- Infrastructure as Code drift exploit: lock pipelines, review PRs, restore baseline
- Certificate mis-issuance: revoke, re-issue, reinforce HSTS, monitor CT logs
- Password spraying: IP blocks, 2FA prompts, user alerts, telemetry enhancements
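For the password spraying runbook, the telemetry signal is usually one source failing against many distinct accounts, as opposed to many failures against one account. A minimal sketch, assuming failed logins are available as `(source_ip, username)` pairs; the threshold of 10 accounts is an arbitrary starting point to tune.

```python
from collections import defaultdict


def spray_suspects(failed_logins, min_accounts=10):
    """Map each source IP to its count of distinct failed usernames, if at or above the threshold."""
    accounts_by_ip = defaultdict(set)
    for source_ip, username in failed_logins:
        accounts_by_ip[source_ip].add(username.lower())
    return {
        ip: len(users)
        for ip, users in accounts_by_ip.items()
        if len(users) >= min_accounts
    }
```

Sources flagged here are candidates for the IP blocks and user alerts listed in the runbook. Shared egress addresses (offices, VPNs) can trip the same threshold, so review before blocking.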
Mega FAQ (3601–4000)
Q: What if a zero-day is actively exploited?
A: Mitigate with compensating controls; segment; work with vendors; communicate risks.
Q: How to handle conflicting stakeholder goals?
A: The IC facilitates; the exec sponsor decides; document and proceed.
Q: Do we ever ignore ransom deadlines?
A: Follow policy; prioritize restoration; engage law enforcement.
Q: How to ensure lessons are implemented?
A: Owners, due dates, dashboards, and exec reviews; for critical items, block merges until they are closed.
Final: calm execution, clear comms, and relentless follow-through.
Appendix FA — Incident Taxonomy
- Categories: Auth, Data, Malware, Infra, Network, App, SaaS, Third-Party, Physical
- Tags: vector, impact, data-class, region, service, TTPs
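A taxonomy only helps reporting if it is applied consistently. The following sketch validates a classification against the categories and tag keys listed above; the function name and the shape of the `tags` mapping are assumptions.

```python
CATEGORIES = {
    "Auth", "Data", "Malware", "Infra", "Network",
    "App", "SaaS", "Third-Party", "Physical",
}
TAG_KEYS = {"vector", "impact", "data-class", "region", "service", "TTPs"}


def validate_classification(category, tags):
    """Return a list of problems; an empty list means the classification is valid."""
    problems = []
    if category not in CATEGORIES:
        problems.append(f"unknown category: {category}")
    for key in sorted(set(tags) - TAG_KEYS):
        problems.append(f"unknown tag: {key}")
    return problems
```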
Appendix FB — Playbook Versioning
- Semantic versions; change log; owner; last drill date; next review
Appendix FC — Decision Logs
- Template: decision, options, criteria, approver, timestamp, outcomes
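The template maps naturally onto a structured record that can be appended to a log as one JSON line per decision. A minimal sketch; the class name and the JSON-lines storage format are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    decision: str
    options: list
    criteria: str
    approver: str
    outcomes: str = ""
    # Defaults to the current UTC time when the record is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record):
    """Serialize one decision as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```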
Appendix FD — Containment Patterns by Vector
- OAuth: revoke consents, rotate client secrets, rotate signing keys and republish JWKS
- SSH: rotate host/user keys, disable password auth, restrict bastions
- API Keys: revoke, rotate, scope down, monitor use
Appendix FE — Eradication Verification
- IOC sweeps, behavior baselines, purple-team validation, post-change monitoring
Appendix FF — Recovery Gates Matrix
- Security: zero IoC hits, detections enabled, secrets rotated
- Reliability: error budgets normal, performance SLOs met
- Product: core flows pass; support capacity ready
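The matrix above can be evaluated mechanically before a recovery sign-off. This sketch assumes each gate is reported as a boolean under a hypothetical check name; a missing check counts as failing.

```python
GATES = {
    "security": ["zero_ioc_hits", "detections_enabled", "secrets_rotated"],
    "reliability": ["error_budgets_normal", "performance_slos_met"],
    "product": ["core_flows_pass", "support_capacity_ready"],
}


def failing_gates(status):
    """Return {area: [failed checks]}; checks absent from status count as failed."""
    failures = {}
    for area, checks in GATES.items():
        failed = [check for check in checks if not status.get(check, False)]
        if failed:
            failures[area] = failed
    return failures


def ready_to_recover(status):
    """True only when every gate in every area passes."""
    return not failing_gates(status)
```

Treating unknown as failed is a deliberate choice: a gate nobody measured should block recovery rather than pass silently.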
Appendix FG — Social Engineering Resilience
- Callback policies, no link-based resets, escalation codewords
Appendix FH — External Coordination SOPs
- ISACs, CERTs, law enforcement, cloud providers, registrars, CDNs
Appendix FI — Evidence Redaction Profiles
- Minimize customer data in shared evidence; keep unredacted copies under legal hold with investigator-only access
Appendix FJ — Budget and Credits
- Cloud credits for incidents; customer credits; vendor concessions
Extended Templates — Incident Doc (Markdown)
# Incident: <ID> <Title>
- SEV: <1-4>
- IC: <name> | Comms: <name> | Forensics: <name>
- Declared: <ts> | Next Update: <ts>
- Scope: <assets, accounts, data>
- Actions Log:
  - [ts] <actor> <action> <result>
- Decisions:
  - [ts] <decision> by <approver>
- Evidence:
  - <link> (sha256:<hash>)
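Timestamped entries are easier to keep consistent when a helper formats them. The sketch below produces lines in the shape of the Actions Log and Decisions entries in the template; the helper names and the UTC timestamp format are assumptions.

```python
from datetime import datetime, timezone


def _stamp(ts=None):
    """Format a timestamp in UTC; naive datetimes are interpreted as local time."""
    ts = ts or datetime.now(timezone.utc)
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def action_entry(actor, action, result, ts=None):
    """One Actions Log line: - [ts] <actor> <action> <result>"""
    return f"- [{_stamp(ts)}] {actor} {action} {result}"


def decision_entry(decision, approver, ts=None):
    """One Decisions line: - [ts] <decision> by <approver>"""
    return f"- [{_stamp(ts)}] {decision} by {approver}"
```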
Dashboards — Exec Summary (Sketch)
{
"title": "Exec IR Summary",
"panels": [
{"type":"stat","title":"Active SEV1/2"},
{"type":"graph","title":"MTTR (rolling 90d)"},
{"type":"table","title":"Top Control Gaps"}
]
}
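The "MTTR (rolling 90d)" panel can be backed by a simple windowed mean. A minimal sketch, assuming MTTR is measured from declaration to recovery and that incidents are attributed to the window by their recovery time; both are conventions to agree on, not fixed definitions.

```python
from datetime import datetime, timedelta


def rolling_mttr_hours(incidents, as_of, window_days=90):
    """Mean hours from declared to recovered for incidents recovered within the window.

    incidents: iterable of (declared, recovered) datetimes; recovered is None while open.
    Returns None when no incident was recovered in the window.
    """
    cutoff = as_of - timedelta(days=window_days)
    durations = [
        (recovered - declared).total_seconds() / 3600
        for declared, recovered in incidents
        if recovered is not None and cutoff <= recovered <= as_of
    ]
    return sum(durations) / len(durations) if durations else None
```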
Training — Quarterly Cadence
- Q1: Identity takeover + customer comms
- Q2: Supply chain + build pipeline compromise
- Q3: Ransomware + DR restore
- Q4: Data exfil + regulator engagement
Mega FAQ (4001–4300)
Q: How do we prevent scope creep?
A: Maintain hypotheses; prove or disprove them quickly; log decisions; avoid thrash.
Q: What if Legal requests a pause on evidence sharing?
A: Record the hold; summarize status; proceed with permitted actions.
Q: How to keep engineers from burning out?
A: Shifts, caps, rotations, async updates, and aftercare.
Q: Can we run forensics on production?
A: Prefer snapshots; avoid destructive tooling; coordinate with SRE.
Q: Who signs off on closure?
A: The IC with Legal and Security leadership; verify gates passed.
Mega FAQ (4301–4500)
Q: Do we pay bounties to vulnerability finders during IR?
A: Only via the bug bounty program; keep it separate from extortion.
Q: Should we rotate all infra after SEV1?
A: Prioritize based on vector and evidence; follow a staged rotation plan.
Q: What’s the right comms tone?
A: Empathy, clarity, and accountability; avoid blame.
Final: strong muscle memory beats ad hoc heroics—drill and document.
Mega FAQ (4501–4700)
Q: Can we resume launches during IR?
A: Only for low-risk changes approved by the IC; log the decision and rationale.
Q: How to avoid noisy comms?
A: Cadence discipline, clear owners, concise updates, no speculation.
Q: What if customers demand technical details?
A: Share appropriate depth; protect investigations; provide remediation steps.
Final: credibility = speed × clarity × care.
Final Checklist
- [ ] Declared, SEV set, owners assigned
- [ ] Contained with minimal blast radius
- [ ] Evidence captured and preserved
- [ ] Eradication validated, recovery gates passed
- [ ] Stakeholders informed and next steps clear
- [ ] Postmortem scheduled with actions and owners
Quick Reference
- Declare early; contain surgically; preserve evidence; communicate clearly; improve relentlessly.
Troubleshooting Index
- Alert flood → dedupe, throttle, improve fidelity
- Containment stalls → reconsider scope, add isolation layers, seek exec support
- Evidence gaps → reconstruct from secondary sources, fix logging gaps post-IR
- Comms drift → reset cadence, reassign owners, publish summary
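For the alert flood case, dedupe and throttle can be as simple as suppressing repeats of the same rule on the same entity inside a time window. A minimal sketch, assuming alerts arrive as `(epoch_seconds, rule, entity)` tuples sorted by time; the 300-second window is an arbitrary default.

```python
def dedupe_alerts(alerts, window_seconds=300):
    """Keep the first alert per (rule, entity), then suppress repeats inside the window."""
    last_emitted = {}
    kept = []
    for ts, rule, entity in alerts:
        key = (rule, entity)
        if key not in last_emitted or ts - last_emitted[key] >= window_seconds:
            kept.append((ts, rule, entity))
            last_emitted[key] = ts  # the window restarts from the last emitted alert
    return kept
```

Suppressed alerts should still be counted somewhere: a rising repeat count on one key is itself a signal.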
Final Notes
- Practice tabletops quarterly; track action closure; refresh playbooks every six months.
References
- NIST SP 800-61 Rev. 2: Computer Security Incident Handling Guide
- NIST SP 800-53 Rev. 5: Security and Privacy Controls
- ISO/IEC 27035: Information Security Incident Management
- ENISA Guidelines on Incident Reporting
- FIRST CSIRT Services Framework
- SANS Incident Handler’s Handbook
- MITRE ATT&CK and D3FEND Knowledge Bases
- CISA Alerts and Known Exploited Vulnerabilities Catalog
- Cloud Provider IR Guides (AWS, Azure, GCP)
- OWASP ASVS and MASVS
Glossary (Selected)
- IOC: Indicator of Compromise
- TTP: Tactics, Techniques, and Procedures
- DFIR: Digital Forensics and Incident Response
- BCDR: Business Continuity and Disaster Recovery
- MTTD/MTTR: Mean Time to Detect/Recover
- DLP: Data Loss Prevention
- SOAR: Security Orchestration, Automation, and Response
- SIEM: Security Information and Event Management
- SSO/IdP: Single Sign-On / Identity Provider
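- PII/PHI/PCI: Sensitive data classes (Personally Identifiable Information, Protected Health Information, Payment Card Industry cardholder data)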
- WORM: Write Once, Read Many (immutable storage)
Quick Actions Cheat Sheet
- Declare early when credible harm is possible; set SEV and owners
- Start an incident doc; timestamp every action and decision
- Preserve evidence: logs, memory, disk, configs; hash and store immutably
- Contain with least disruption: isolate accounts/hosts, rotate keys
- Engage Legal/Comms early; align on cadence and audiences
- Eradicate persistence; patch vectors; validate with IOC sweeps
- Recover gradually with gates; monitor closely; canary critical paths
- Communicate with clarity and empathy; avoid speculation; set next-update time
- Schedule postmortem; assign actions; track to closure
- File learnings into detections, controls, and training backlog
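The "hash and store immutably" step from the cheat sheet can be sketched as follows. The helper names and the manifest line format are assumptions; immutable (WORM) storage itself is a property of the storage backend and is not shown here.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large disk or memory images do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_line(path, collected_by, collected_at):
    """One chain-of-custody line: when, who, file name, and content hash."""
    return f"{collected_at} {collected_by} {Path(path).name} sha256:{sha256_file(path)}"
```

Record the hash at collection time and again whenever the evidence is copied, so any later mismatch points to the step where the file changed.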