Incident Response

Standard operating procedure for handling production incidents.

Overview

This document outlines the steps to follow when a production incident occurs: how to classify its severity, how to respond, and when a post-mortem is required.

Severity Levels

Level  Description                                  Response Time
P0     Service down, all users affected             Immediate
P1     Major feature broken, many users affected    Within 15 minutes
P2     Minor feature broken, some users affected    Within 1 hour
P3     Cosmetic or low-impact issue                 Next business day
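
For teams that wire these levels into alerting or ticketing tools, the table can be sketched as a small lookup. This is a minimal illustration, not part of the SOP itself: the names `Severity`, `RESPONSE_WINDOW`, and `response_deadline` are hypothetical, "Immediate" is modeled as a zero-length window, and "Next business day" is left as `None` because it depends on a business calendar that this document does not define.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Severity(Enum):
    """Severity levels and descriptions, copied from the table above."""
    P0 = "Service down, all users affected"
    P1 = "Major feature broken, many users affected"
    P2 = "Minor feature broken, some users affected"
    P3 = "Cosmetic or low-impact issue"


# Response windows from the table.
# Assumption: "Immediate" (P0) is modeled as zero delay.
# Assumption: "Next business day" (P3) has no fixed duration, so it is None.
RESPONSE_WINDOW = {
    Severity.P0: timedelta(0),
    Severity.P1: timedelta(minutes=15),
    Severity.P2: timedelta(hours=1),
    Severity.P3: None,
}


def response_deadline(severity: Severity, detected_at: datetime) -> Optional[datetime]:
    """Return the latest time a response should begin, or None when the
    deadline depends on a business calendar (P3)."""
    window = RESPONSE_WINDOW[severity]
    if window is None:
        return None
    return detected_at + window
```

For example, `response_deadline(Severity.P1, datetime(2024, 1, 1, 12, 0))` returns `datetime(2024, 1, 1, 12, 15)`.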

Response Steps

  1. Acknowledge the alert and join the incident channel.
  2. Assess the severity level based on the table above.
  3. Communicate status to stakeholders.
  4. Investigate root cause using logs and monitoring dashboards.
  5. Mitigate the issue (rollback, hotfix, or feature flag).
  6. Resolve the incident and confirm the fix is deployed.
  7. Hold a post-mortem within 48 hours for P0/P1 incidents.
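
The steps above can be sketched as a simple checklist tracker. This is an illustrative sketch under stated assumptions, not a prescribed tool: the `Incident` class and its methods are hypothetical, steps are enforced in strict order for simplicity (in practice, communication is likely to recur throughout an incident), and the 48-hour post-mortem window is assumed to start at resolution, which the SOP does not specify.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# Steps 1-6 from the list above; step 7 (post-mortem) is handled separately.
STEPS = ("acknowledge", "assess", "communicate", "investigate", "mitigate", "resolve")

POSTMORTEM_SEVERITIES = {"P0", "P1"}    # from step 7
POSTMORTEM_WINDOW = timedelta(hours=48)  # from step 7


@dataclass
class Incident:
    severity: str  # "P0" .. "P3"
    completed: List[str] = field(default_factory=list)
    resolved_at: Optional[datetime] = None

    def next_step(self) -> Optional[str]:
        """Return the next pending step, or None once all steps are done."""
        if len(self.completed) < len(STEPS):
            return STEPS[len(self.completed)]
        return None

    def complete(self, step: str, at: datetime) -> None:
        """Mark a step as done; steps must be completed in order (assumption)."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.completed.append(step)
        if step == "resolve":
            self.resolved_at = at

    def postmortem_due(self) -> Optional[datetime]:
        """Return the post-mortem deadline for resolved P0/P1 incidents.
        Assumption: the 48-hour window is counted from resolution."""
        if self.severity in POSTMORTEM_SEVERITIES and self.resolved_at is not None:
            return self.resolved_at + POSTMORTEM_WINDOW
        return None
```

A P1 incident resolved at noon on 1 January would have `postmortem_due()` return noon on 3 January, while a P2 or P3 incident returns `None`.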
