Technical Standards

Incident response runbook

Updated March 2026 7 min read

Incident response runbook defines outage handling for hosted services. It follows the published rule: confirmed outage and critical incidents are handled 7/24, while non-outage requests remain in business-hour workflow.

Classification

ClassTypical conditionHandling model
Outage / criticalSite unreachable, critical function unavailableImmediate triage in 7/24 mode
Degraded but livePartial failures or performance issuesBusiness-hour diagnosis and mitigation plan
Routine requestFeature ask, advisory, non-urgent changePlanned and scoped workflow

Response flow

01
Open one incident thread
Include domain, timestamp, visible symptom, and [OUTAGE] subject marker if applicable.
02
Confirm outage scope
Validate whether impact matches outage/critical criteria or routine-scope issue.
03
Contain and restore
Apply rollback or mitigation to return service to stable state.
04
Close with written summary
Share timeline, action log, and follow-up controls in the same thread.

Communication pattern

  • Keep one thread per event; do not split updates across channels.
  • Share verified facts first, then assumptions labeled clearly.
  • Track key actions with timestamps for post-incident review.
Outage markerFor Kernel Host outage events, use [OUTAGE] in the subject line. This is the documented trigger for immediate 7/24 handling.