# Automated Diagnostics
# What is PagerDuty's Automated Diagnostics Solution?
Automated Diagnostics is a solution that combines PagerDuty's Incident Response and Runbook Automation products. By automating the retrieval of “diagnostic” data during incidents, you can shorten incidents, reduce the number of people paged to help with resolution, and gather evidence for fixing the root cause after the incident.
# Use Cases
The Automated Diagnostics solution has multiple use cases and benefits. Here are a few of the most common examples:
- Improve Triage: surfacing diagnostic data can reduce the time spent troubleshooting and the number of people pulled into incidents.
- Capture Environment State: capturing the environment or application "state" during an incident gives operations engineers and developers evidence to help them fix code-level bugs and configuration errors, possibly well after the incident has been resolved.
- Real-Time Updates: by querying backend services in real time, an Incident Commander can more easily provide updates to stakeholders during an incident.
For more details on these use cases, see this section of the solution guide.
# Prebuilt Automation
PagerDuty provides a solution that helps users start automating diagnostics quickly. It consists of prebuilt Automation Jobs that retrieve data from common infrastructure and services for investigating, debugging, and diagnosing incidents.


For example, if an incident is triggered for a service running in Kubernetes, PagerDuty Runbook Automation can retrieve information from the logs, APIs, databases, and other sources that support this service. The retrieval can be triggered with the click of a button or through event-driven invocation.
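
To make the Kubernetes example concrete, the following is a minimal sketch of the kind of read-only commands a diagnostic job might run for an affected pod. It is not the prebuilt PagerDuty job: the function names, labels, and the `payments` / `api-7d9f` values in the usage note are hypothetical, and it assumes `kubectl` is installed and already authenticated against the cluster.

```python
import subprocess
from typing import Callable, Dict, List


def build_diagnostic_commands(namespace: str, pod: str, tail: int = 100) -> Dict[str, List[str]]:
    """Return read-only kubectl commands keyed by a short label (hypothetical labels)."""
    return {
        # Last `tail` log lines from the pod
        "recent_pod_logs": ["kubectl", "logs", pod, "-n", namespace, f"--tail={tail}"],
        # Namespace events, oldest first, so the newest appear at the bottom
        "recent_events": ["kubectl", "get", "events", "-n", namespace, "--sort-by=.lastTimestamp"],
        # Current status of the pod, including node and IP
        "pod_status": ["kubectl", "get", "pod", pod, "-n", namespace, "-o", "wide"],
    }


def run_command(args: List[str]) -> str:
    """Run one command and return stdout, or the error text if it fails."""
    try:
        result = subprocess.run(args, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"failed to run {args[0]}: {exc}"
    return result.stdout if result.returncode == 0 else result.stderr


def collect_diagnostics(
    commands: Dict[str, List[str]],
    runner: Callable[[List[str]], str] = run_command,
) -> Dict[str, str]:
    """Run every command and return its output under the same label."""
    return {label: runner(args) for label, args in commands.items()}
```

Calling `collect_diagnostics(build_diagnostic_commands("payments", "api-7d9f"))` would return one text blob per label, which a job could then attach to the incident. The `runner` parameter exists only so the collection step can be exercised without a live cluster.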
# Simplifying and Sharing Diagnostics
Diagnostics retrieved using Runbook Automation can be made available in multiple interfaces, such as PagerDuty's mobile app, Slack, and Microsoft Teams.

# Examples & Templates
This guide includes a full section on Examples & Best Practices. A preview is shown here:
Category | Examples |
---|---|
Amazon Web Services | Stopped ECS Task Errors; ELB Targets Health; CloudWatch Logs |
Microsoft Azure | Function App Health; Troubleshoot Azure File Sync; Load Balancer Health Probes |
Google Cloud Platform | Debug Load Balancer Health Checks; Troubleshooting Firewall Rules; GKE Cluster Connectivity |
Linux OS | List Top CPU Consuming Processes; Retrieve Errors from Syslog; List Top Disk Consuming Files |
Windows OS | Active Directory Replication Diagnostics; Retrieve IIS Web Server Logs; SMB Connection Failures |
APIs | Check Internal API Response Body; Retrieve Diagnostics from SaaS Tools |
Kubernetes | Retrieve Recent Pod Logs; Recent Kubernetes Events; Pod Status & Error Messages |
Databases | Top Resource Consuming Queries; Blocking Locks; Missing Indexes |
Network Devices | BGP Route Flapping; Spanning Tree Issues; Duplex Mismatch |
Observability Integrations | Retrieve Application Logs; Surface Relevant Graphs; Capture Time Sensitive Diagnostics |
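
As an illustration of one entry in the table, "List Top CPU Consuming Processes", here is a small sketch that ranks processes from `ps` output. It is an assumption about how such a diagnostic could be written, not the template PagerDuty ships: the function names and `PS_COMMAND` constant are hypothetical, and the command assumes the Linux procps version of `ps`.

```python
import subprocess
from typing import List, Tuple

# Assumed command: PID, CPU percentage, and command name, without a header row
PS_COMMAND = ["ps", "-eo", "pid,pcpu,comm", "--no-headers"]


def parse_ps_output(ps_output: str) -> List[Tuple[int, float, str]]:
    """Parse lines of 'PID %CPU COMMAND' into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        try:
            rows.append((int(parts[0]), float(parts[1]), parts[2].strip()))
        except ValueError:
            continue  # skip lines whose PID or CPU column is not numeric
    return rows


def top_cpu_processes(ps_output: str, limit: int = 5) -> List[Tuple[int, float, str]]:
    """Return the `limit` processes with the highest CPU percentage, highest first."""
    rows = parse_ps_output(ps_output)
    return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]


def read_ps_output() -> str:
    """Run the assumed ps command on the local host and return its raw output."""
    return subprocess.run(PS_COMMAND, capture_output=True, text=True, check=True).stdout
```

On a Linux host, `top_cpu_processes(read_ps_output())` would give the five busiest processes at the moment the job runs. Keeping the parsing separate from the command execution makes the ranking logic easy to check against captured output.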