FanDesk

Incident Analytics & Metrics

Measure your team's incident response performance, identify recurring patterns, and track improvement over time with FanDesk's incident analytics dashboard.

Key Metrics

MTTA — Mean Time to Acknowledge

The average time between an incident being triggered (created) and the first acknowledgment by a responder.

MTTA range | Description
< 5 minutes | Excellent: alert and response system is working
5-15 minutes | Good: responders are monitoring effectively
15-60 minutes | Needs improvement: consider better alerting or on-call coverage
> 1 hour | Critical gap: incidents are sitting unattended

A high MTTA usually points to an alerting problem, a gap in on-call coverage, or notifications that reach responders too slowly (WhatsApp notifications can help here).
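As a rough illustration of the definition above, the following sketch computes MTTA from a few made-up timestamp pairs and maps the result to the bands in the table. The sample data and the function names are hypothetical, and treating exactly 15 or 60 minutes as belonging to the lower band is an assumption, since the table does not say which side the boundaries fall on.

```python
from datetime import datetime
from statistics import mean

# Hypothetical sample data: (created, first acknowledged) timestamp pairs.
incidents = [
    (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 9, 3)),
    (datetime(2024, 3, 4, 14, 0), datetime(2024, 3, 4, 14, 12)),
    (datetime(2024, 3, 5, 2, 0), datetime(2024, 3, 5, 2, 27)),
]

def mtta_minutes(pairs):
    """Mean minutes from incident creation to first acknowledgment."""
    return mean((ack - created).total_seconds() / 60 for created, ack in pairs)

def mtta_band(minutes):
    """Map an MTTA value to a band (boundary handling is an assumption)."""
    if minutes < 5:
        return "Excellent"
    if minutes <= 15:
        return "Good"
    if minutes <= 60:
        return "Needs improvement"
    return "Critical gap"

mtta = mtta_minutes(incidents)  # (3 + 12 + 27) / 3 = 14.0
band = mtta_band(mtta)
print(f"MTTA: {mtta:.1f} min ({band})")
```

Only acknowledged incidents appear in the sample; how unacknowledged incidents should count toward the average is a choice this sketch leaves open.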

MTTR — Mean Time to Resolution

The average time between an incident being triggered (created) and being marked resolved.

Severity | Target MTTR
Critical | Under 2 hours
High | Under 4 hours
Medium | Under 24 hours
Low | Under 3 days

MTTR varies significantly by severity and incident type. Whether the trend is improving or worsening over time matters more than how it compares against an absolute number.
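To make the per-severity targets concrete, here is a minimal sketch that averages resolution times by severity and lists the severities that miss their target. The sample durations are invented for illustration, and the helper name is hypothetical; the target values come from the table above, with 3 days expressed as 72 hours.

```python
from collections import defaultdict
from statistics import mean

# Target MTTR in hours, taken from the table above (3 days = 72 hours).
TARGET_HOURS = {"Critical": 2, "High": 4, "Medium": 24, "Low": 72}

# Hypothetical sample: (severity, hours from creation to resolution).
resolved = [
    ("Critical", 1.5), ("Critical", 3.5),
    ("High", 2.0), ("Medium", 10.0), ("Low", 30.0),
]

def mttr_by_severity(rows):
    """Group resolution times by severity and average each group."""
    buckets = defaultdict(list)
    for severity, hours in rows:
        buckets[severity].append(hours)
    return {sev: mean(vals) for sev, vals in buckets.items()}

mttr = mttr_by_severity(resolved)
# "Under N hours" is read as strictly less than N.
over_target = sorted(s for s, h in mttr.items() if h >= TARGET_HOURS[s])
print(mttr)
print("Over target:", over_target)
```

In this sample the Critical average is 2.5 hours, so Critical is the only severity flagged.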

MTTA vs MTTR

The gap between MTTA and MTTR is the investigation-and-fix time. A very low MTTA with a high MTTR means responders acknowledge quickly but struggle to resolve. To close the gap, invest in better runbooks, clearer escalation paths, and follow-through on postmortem action items.

Analytics Dashboard

The Incidents analytics dashboard brings together summary cards, trend charts, severity and category breakdowns, a responder leaderboard, and an incident heatmap.

Summary Cards

Card | Description
Total Incidents (30 days) | All incidents created in the past 30 days
Open Incidents | Currently active incidents (Triggered, Acknowledged, or Investigating)
Critical / High Open | Highest-severity incidents requiring immediate attention
Average MTTA | Calculated across all incidents in the selected period
Average MTTR | Calculated across all resolved incidents in the selected period

Incident Trends Chart

Bar chart showing incident count over time (daily or weekly granularity):

  • Spot spikes in incident frequency — often correlate with major deployments or infrastructure changes
  • Identify recurring problem periods (e.g., Monday mornings after weekend maintenance)
  • Track whether incident frequency is trending up or down
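The daily and weekly granularity described above can be reproduced outside the dashboard with a simple count. This is a sketch over hypothetical creation dates, not a description of how FanDesk builds the chart; weeks are keyed by ISO year and ISO week number.

```python
from collections import Counter
from datetime import date

# Hypothetical incident creation dates.
created = [
    date(2024, 3, 4), date(2024, 3, 4), date(2024, 3, 6),
    date(2024, 3, 11), date(2024, 3, 12), date(2024, 3, 12),
    date(2024, 3, 12),
]

# Daily granularity: one bucket per calendar day.
daily = Counter(created)

# Weekly granularity: one bucket per (ISO year, ISO week).
weekly = Counter(d.isocalendar()[:2] for d in created)

# The busiest day is a candidate spike to compare against deployment dates.
busiest_day, busiest_count = daily.most_common(1)[0]
print(dict(weekly))
print(busiest_day, busiest_count)
```

Comparing the busiest days against your own deployment calendar is the manual step this sketch does not cover.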

Distribution by Severity

Pie or bar chart showing the breakdown of incidents by severity level:

  • A healthy pattern has more Low/Medium incidents than Critical/High
  • An increasing proportion of Critical incidents is a warning sign of systemic instability
  • Use this to justify stability investment to stakeholders
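One way to check for the warning sign above is to compare the Critical share between two periods. The sketch below uses invented severity lists for two months; the function name is hypothetical.

```python
from collections import Counter

def severity_share(severities):
    """Return each severity's fraction of the total incident count."""
    counts = Counter(severities)
    total = sum(counts.values())
    return {sev: n / total for sev, n in counts.items()}

# Hypothetical severity labels for two consecutive months.
last_month = ["Low"] * 6 + ["Medium"] * 3 + ["Critical"] * 1
this_month = ["Low"] * 4 + ["Medium"] * 3 + ["High"] * 1 + ["Critical"] * 2

before = severity_share(last_month).get("Critical", 0.0)
after = severity_share(this_month).get("Critical", 0.0)
critical_rising = after > before
print(f"Critical share: {before:.0%} -> {after:.0%}, rising: {critical_rising}")
```

A single month-to-month change can be noise, so a comparison like this is more convincing when the direction holds across several periods.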

Distribution by Category

Shows which types of incidents are most common:

  • Heavy on Outages → reliability and infrastructure investment needed
  • Heavy on Security → security posture review needed
  • Heavy on Performance → capacity planning or optimization needed
  • Heavy on Bugs → QA and testing process improvements needed

MTTA and MTTR Over Time

Line charts showing how your average acknowledgment and resolution times change week over week:

  • A downward trend means your team is getting faster
  • Spikes often correspond to periods of high complexity, team changes, or coverage gaps
  • Use this in team retrospectives to measure the impact of process changes
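For a retrospective, the week-over-week trend can be reduced to a list of changes between consecutive weeks. The following sketch does this for resolution times using hypothetical data; the same approach applies to acknowledgment times.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample: (creation date, minutes to resolve).
samples = [
    (date(2024, 3, 4), 120), (date(2024, 3, 6), 180),
    (date(2024, 3, 11), 90), (date(2024, 3, 14), 110),
    (date(2024, 3, 19), 60),
]

def weekly_mean(rows):
    """Average the values per (ISO year, ISO week), in chronological order."""
    buckets = defaultdict(list)
    for day, minutes in rows:
        buckets[day.isocalendar()[:2]].append(minutes)
    return {week: mean(vals) for week, vals in sorted(buckets.items())}

trend = weekly_mean(samples)
values = list(trend.values())
# Negative deltas mean the team got faster than in the previous week.
deltas = [later - earlier for earlier, later in zip(values, values[1:])]
improving = all(d < 0 for d in deltas)
print(trend)
print(deltas, improving)
```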

Top Responders

Leaderboard showing who handles the most incidents in the selected time period:

  • Name, avatar, and incident count
  • Use for workload balancing — one person handling 80% of incidents is a bus factor risk
  • Use for recognition — acknowledge top contributors in team meetings
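The workload-balancing check above amounts to asking what share of incidents the top responder handles. This sketch uses made-up responder names, and the 50% alert threshold is an assumption chosen for illustration rather than a FanDesk setting.

```python
from collections import Counter

# Hypothetical list of who handled each incident in the period.
handled_by = ["ana"] * 8 + ["ben"] + ["chen"]

# Assumed threshold: flag when one person handles more than half.
CONCENTRATION_THRESHOLD = 0.5

counts = Counter(handled_by)
top_responder, top_count = counts.most_common(1)[0]
top_share = top_count / len(handled_by)
concentrated = top_share > CONCENTRATION_THRESHOLD
print(f"{top_responder} handled {top_share:.0%} of incidents")
```

In this sample one person handles 80% of incidents, which matches the bus factor scenario described in the list.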

Incident Heatmap

Visual map of when incidents typically occur (day of week × hour of day):

  • Identify whether incidents cluster on Monday mornings (risk from post-weekend change deployments)
  • Identify whether incidents cluster during business hours or off-hours (informs on-call coverage planning)
  • Find patterns that suggest systemic causes rather than random failures
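The day-of-week by hour-of-day layout can be sketched as a 7 x 24 grid of counts. The timestamps below are hypothetical, and the sketch ignores time zones, which matter in practice when responders are spread across regions.

```python
from datetime import datetime

# Hypothetical incident creation timestamps.
created = [
    datetime(2024, 3, 4, 9, 15),   # Monday
    datetime(2024, 3, 11, 9, 40),  # Monday
    datetime(2024, 3, 6, 22, 5),   # Wednesday
]

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# grid[day][hour] holds the incident count; weekday() returns 0 for Monday.
grid = [[0] * 24 for _ in range(7)]
for ts in created:
    grid[ts.weekday()][ts.hour] += 1

# Find the hottest cell as (count, day index, hour).
hottest = max(
    (grid[d][h], d, h) for d in range(7) for h in range(24)
)
count, day_index, hour = hottest
print(f"Hottest slot: {DAYS[day_index]} {hour:02d}:00 ({count} incidents)")
```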

Exporting Incident Data

Export incident data for the selected date range for external analysis or reporting:

  1. Go to the Incidents analytics page
  2. Set the date range using the range picker
  3. Click Export
  4. Download as CSV
  5. Open in Excel, Google Sheets, or your BI tool of choice

The CSV includes: incident ID, title, severity, category, status, created time, acknowledged time, resolved time, MTTA, MTTR, and assignee.
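If you prefer a script to a spreadsheet, the export can be read with Python's standard csv module. The column names and the ISO 8601 timestamp format below are assumptions made for this sketch; check the header row of your actual export and adjust accordingly.

```python
import csv
import io
from datetime import datetime

# Hypothetical export content; real column names and formats may differ.
SAMPLE = """incident_id,severity,created_at,acknowledged_at,resolved_at
INC-1,Critical,2024-03-04T09:00:00,2024-03-04T09:04:00,2024-03-04T10:30:00
INC-2,Low,2024-03-05T14:00:00,2024-03-05T14:20:00,
"""

def minutes_between(start, end):
    """Minutes between two ISO timestamps, or None if either is blank."""
    if not start or not end:
        return None
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60

rows = []
# For a downloaded file, replace io.StringIO(SAMPLE) with open(path, newline="").
for row in csv.DictReader(io.StringIO(SAMPLE)):
    rows.append({
        "id": row["incident_id"],
        "ack_min": minutes_between(row["created_at"], row["acknowledged_at"]),
        "resolve_min": minutes_between(row["created_at"], row["resolved_at"]),
    })

for r in rows:
    print(r)
```

Unresolved incidents have a blank resolved time in this sample, so their resolution time is reported as None rather than zero.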

Using Analytics to Drive Improvement

Identify Patterns

Ask the right questions of your data:

  • Do incidents spike after every deployment? → Improve deployment testing and staging validation
  • Does MTTA jump on weekends? → Look for an on-call coverage gap or alerts that are not reaching responders
  • Does one category dominate? → Systemic issue in that area needing dedicated investment
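The weekend question above can be answered by splitting acknowledgment times by day of week. This is a minimal sketch with invented data; treating Saturday and Sunday as the weekend is an assumption that may not match your team's schedule.

```python
from datetime import datetime
from statistics import mean

# Hypothetical sample: (incident creation time, minutes to acknowledge).
acks = [
    (datetime(2024, 3, 4, 9, 0), 4),    # Monday
    (datetime(2024, 3, 6, 15, 0), 8),   # Wednesday
    (datetime(2024, 3, 9, 3, 0), 50),   # Saturday
    (datetime(2024, 3, 10, 11, 0), 40), # Sunday
]

# weekday() returns 0-4 for Monday-Friday and 5-6 for Saturday-Sunday.
weekday_minutes = [m for ts, m in acks if ts.weekday() < 5]
weekend_minutes = [m for ts, m in acks if ts.weekday() >= 5]

weekday_mtta = mean(weekday_minutes)
weekend_mtta = mean(weekend_minutes)
print(f"Weekday MTTA: {weekday_mtta} min, weekend MTTA: {weekend_mtta} min")
```

A large difference between the two averages, as in this sample, would point toward the coverage or alerting gap named in the list.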

Improve Response with Runbooks

  • High MTTR for a specific category? → Document resolution steps as runbooks linked from the category
  • Repeated postmortem action items that don't get done? → Ensure they are tracked as real FanDesk tasks with owners and due dates

Track Progress Month Over Month

  • Set a quarterly MTTA target (e.g., reduce from 45 minutes to under 15 minutes)
  • Review the trend chart in monthly incident retrospectives
  • Celebrate wins when metrics improve — it reinforces the behavior changes that caused improvement

Using DeskMate for Incident Analytics

Ask DeskMate for insights:

  • "How many critical incidents did we have last month?"
  • "What's our average MTTR for the last quarter?"
  • "Which team member resolved the most incidents in Q1?"
  • "Are there any patterns in when our outages occur?"

Next: Learn about the daily digest in Daily Digest.

Need help? Contact us at hello@fandesk.ai