Automated Diagnostics
Overview
Automated Diagnostics is a solution that integrates PagerDuty's Incident Response and Runbook Automation products. By automating the retrieval of "diagnostic" data during incidents, you can shorten incidents, reduce the number of people paged to help with resolution, and gather evidence for fixing the root cause after the incident.
Use Cases
The Automated Diagnostics solution has multiple use cases and benefits. Here are a few of the most common examples:
- Improve Triage: surfacing diagnostic data can reduce both the time spent troubleshooting and the number of people pulled into incidents.
- Capture Environment State: capturing the environment or application "state" during an incident gives operations engineers and developers evidence to help them fix code-level bugs and configuration errors, even well after the incident has been resolved.
- Real-Time Updates: by querying backend services in real time, an Incident Commander can more easily provide updates to stakeholders during an incident.
For more details on these use cases, see this section of the solution guide.
Prebuilt Automation
PagerDuty provides a solution that helps users start automating diagnostics quickly. This solution consists of prebuilt Automation Jobs that retrieve data from common infrastructure and services for investigating, debugging, and diagnosing incidents.
As an example, if an incident is triggered for a service running in Kubernetes, PagerDuty Runbook Automation can retrieve information from the logs, APIs, databases, and other sources that support this service. The retrieval can be triggered with the click of a button or through event-driven invocation.
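The event-driven flow described above can be pictured with a minimal Python sketch. It is illustrative only and does not use the PagerDuty or Runbook Automation APIs: the `Incident` shape, the `DIAGNOSTICS` mapping, and the `pod_status`, `recent_logs`, and `on_incident_triggered` functions are all hypothetical names, and the diagnostic steps return placeholder text where a real job would query Kubernetes or a log backend.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Incident:
    """Hypothetical incident shape; not PagerDuty's actual event schema."""
    incident_id: str
    service: str
    platform: str  # e.g. "kubernetes"


# A diagnostic step takes an incident and returns one line of findings.
DiagnosticStep = Callable[[Incident], str]


def pod_status(incident: Incident) -> str:
    # Placeholder: a real job would query the cluster here.
    return f"pods: status for {incident.service} (placeholder)"


def recent_logs(incident: Incident) -> str:
    # Placeholder: a real job would query the log backend here.
    return f"logs: recent entries for {incident.service} (placeholder)"


# Hypothetical mapping from platform to the diagnostics worth running.
DIAGNOSTICS: Dict[str, List[DiagnosticStep]] = {
    "kubernetes": [pod_status, recent_logs],
}


def run_diagnostics(incident: Incident) -> List[str]:
    """Run every diagnostic registered for the incident's platform."""
    results = []
    for step in DIAGNOSTICS.get(incident.platform, []):
        try:
            results.append(step(incident))
        except Exception as exc:
            # One failing step should not hide the other results.
            results.append(f"{step.__name__}: failed ({exc})")
    return results


def on_incident_triggered(incident: Incident) -> str:
    """Event-driven entry point: build a text summary for responders."""
    header = f"Diagnostics for incident {incident.incident_id}"
    lines = run_diagnostics(incident)
    if not lines:
        lines = ["(no diagnostics configured)"]
    return "\n".join([header, *lines])


if __name__ == "__main__":
    print(on_incident_triggered(Incident("INC-1", "checkout", "kubernetes")))
```

The same `on_incident_triggered` function could equally be called from a button handler, which mirrors the two invocation styles mentioned above. Keeping each diagnostic as a small independent step means a failure in one source does not block the others.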
Simplifying and Sharing Diagnostics
Diagnostics retrieved using Runbook Automation can be made available in multiple interfaces, such as PagerDuty's mobile app, Slack, and Microsoft Teams.
Examples & Templates
This guide includes a full section on Examples & Best Practices. A preview of that section is shown here: