Incident template
This is a recommended template for documenting incidents. You might not always need all of it, and you might sometimes want to add new sections. Use your own judgement. We recommend tagging incidents as YYYYMMDD-inc, where inc is the incident's sequence number for that day. So, if two incidents happen on 2024-06-09, you would tag them 20240609-01 and 20240609-02.
Inspired by: https://github.com/dastergon/postmortem-templates
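The tagging convention can be sketched in a few lines of Python. This is only an illustration: the function name `incident_tag` is hypothetical, and the two-digit, 1-based sequence number is an assumption inferred from the example tags.

```python
from datetime import date


def incident_tag(day: date, sequence: int) -> str:
    """Build a tag in the YYYYMMDD-inc form from a date and a sequence number.

    Assumption: the sequence starts at 1 and is zero-padded to two digits,
    as in the example tags.
    """
    if sequence < 1:
        raise ValueError("sequence must be 1 or greater")
    return f"{day:%Y%m%d}-{sequence:02d}"


# Two incidents on the same day get consecutive sequence numbers.
tags = [incident_tag(date(2024, 6, 9), n) for n in (1, 2)]
print(tags)  # ['20240609-01', '20240609-02']
```

Because the date comes first and the sequence is zero-padded, tags built this way sort chronologically when sorted as plain strings.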
Title of the incident
Managed by: Author here
Summary
- Components involved: What parts of our system were affected or played a significant role
- Started at: When did the issue actually start?
- Detected at: When did we notice that the incident existed?
- Mitigated at: When did we bring things to a stable state without further impact?
A brief summary of what happened.
Impact
What were the negative consequences of the incident?
Timeline
Events as they happened over time. Make sure to note the time zone you are using.
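One way to keep the time zone explicit on every timeline entry is to format timestamps from timezone-aware values. The sketch below is a suggestion, not part of the template: the helper name `timeline_entry`, the choice of UTC, and the sample event are all hypothetical.

```python
from datetime import datetime, timezone


def timeline_entry(when: datetime, event: str) -> str:
    """Format one timeline line with an explicit time zone.

    Assumption: entries are normalised to UTC; the template only asks that
    the time zone in use is written down.
    """
    if when.tzinfo is None:
        raise ValueError("timestamp must carry a time zone")
    return f"{when.astimezone(timezone.utc):%Y-%m-%d %H:%M} UTC - {event}"


# Hypothetical event, for illustration only.
print(timeline_entry(datetime(2024, 6, 9, 14, 5, tzinfo=timezone.utc), "Alert fired"))
# 2024-06-09 14:05 UTC - Alert fired
```

Rejecting naive timestamps means an entry cannot be written without stating its zone.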
Root Cause(s)
An explanation of the root causes that started the incident and how they unfolded into the full-fledged incident
Resolution and recovery
What was done to fix the incident and go back to normal
Lessons Learned
List of knowledge acquired. Typically structured as: what went well, what went badly, and where we got lucky
Action Items
What should be done after the incident to prevent future occurrences of the same issue
Appendix
Miscellanea corner for anything else you might want to include