The time it takes to restore service after a service incident, rollback, or any other type of production failure.
This metric measures how well your team contains and recovers from failures, and how robust the software is.
MTTR = Total incident age (in hours) / number of incidents.
For example, if three incidents occurred in the given date range, lasting 1 hour, 2 hours, and 3 hours, your MTTR is (1 + 2 + 3) / 3 = 2 hours.
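The formula above can be sketched in a few lines of Python. This is only an illustration, not how the metric is actually computed: the function name `mean_time_to_restore` and the representation of an incident as a `(created, resolved)` pair of datetimes are assumptions made for this example.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents):
    """Return MTTR in hours for (created, resolved) datetime pairs.

    Hypothetical helper: returns None when there are no incidents,
    since the average is undefined in that case.
    """
    if not incidents:
        return None
    # Incident age in hours = resolved time minus created time.
    total_hours = sum(
        (resolved - created).total_seconds() / 3600
        for created, resolved in incidents
    )
    return total_hours / len(incidents)


# Three incidents lasting 1, 2, and 3 hours, as in the example above.
start = datetime(2023, 1, 1, 9, 0)
incidents = [
    (start, start + timedelta(hours=1)),
    (start, start + timedelta(hours=2)),
    (start, start + timedelta(hours=3)),
]
mttr = mean_time_to_restore(incidents)
print(mttr)  # 2.0
```

Returning `None` for an empty range keeps "no incidents" distinct from an MTTR of zero hours.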
Below are the benchmarks for different development teams:
| Groups | Benchmarks |
|---|---|
| Elite performers | Less than one hour |
| High performers | Less than one day |
| Medium performers | Between one day and one week |
| Low performers | More than six months |
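As a rough sketch, an MTTR value in hours could be mapped to the groups in the table like this. The function name `classify_mttr`, the 30-day month used to approximate six months, and the handling of the boundaries are all assumptions made for illustration. The table does not define a group for values between one week and six months, so the sketch returns `None` there rather than guessing.

```python
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 7 * HOURS_PER_DAY
# Assumption: six months approximated as 6 x 30 days.
HOURS_PER_SIX_MONTHS = 6 * 30 * HOURS_PER_DAY


def classify_mttr(mttr_hours):
    """Map an MTTR in hours to a benchmark group (hypothetical helper)."""
    if mttr_hours < 1:
        return "Elite performers"
    if mttr_hours < HOURS_PER_DAY:
        return "High performers"
    if mttr_hours <= HOURS_PER_WEEK:
        return "Medium performers"
    if mttr_hours > HOURS_PER_SIX_MONTHS:
        return "Low performers"
    # Between one week and six months: not covered by the table.
    return None


print(classify_mttr(2))
```

With the 2-hour MTTR from the earlier example, this returns the "High performers" group.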
Data Sources Required
This metric relies on:
Deployments collected in one of the following ways:
Incidents collected in one of the following ways:
Transformation Rules Required
This metric relies on:
Deployments.
Incidents.