Playbook: Quick Triage
When to use: First step for any customer-reported issue. It establishes context before you move on to a specific playbook.
Prerequisites
- Operator access token (ps_at_ prefix); get one from an operator user
- MCP client connected (Claude Code, Cursor, or direct API)
- Verify connectivity: run whoami
1. Identify the Customer
MCP: lookup_tenant
  query: "<customer name, email, or domain>"
Note the organization_id, workspace_id, and cluster_name from the response.
2. Check Agent Connectivity
MCP: remote_debug
  workspace_id: "<workspace_id>"
  command: "agent_health"
Healthy response: status: connected, recent heartbeat, capability list present.
Unhealthy indicators:
- status: disconnected (agent is down or there is a network issue)
- Stale heartbeat (>60s): agent may be stuck
- Empty capabilities: agent version is too old
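The health checks in this step can be sketched as a small function. This is an illustration only: the field names status, last_heartbeat, and capabilities are assumptions about the agent_health response, not its documented schema, and the sketch treats any status other than connected as unhealthy.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the playbook: a heartbeat older than 60s counts as stale.
STALE_AFTER = timedelta(seconds=60)


def assess_agent_health(response: dict, now: datetime) -> list[str]:
    """Return the unhealthy indicators found; an empty list means healthy.

    The field names (status, last_heartbeat, capabilities) are assumed for
    illustration and may differ from the real agent_health output.
    """
    problems = []
    if response.get("status") != "connected":
        problems.append("not connected: agent is down or there is a network issue")
    heartbeat = response.get("last_heartbeat")  # assumed ISO 8601 string with offset
    if heartbeat is None or now - datetime.fromisoformat(heartbeat) > STALE_AFTER:
        problems.append("stale heartbeat: agent may be stuck")
    if not response.get("capabilities"):
        problems.append("empty capabilities: agent version may be too old")
    return problems
```

Pass a timezone-aware now (for example datetime.now(timezone.utc)) so it can be compared with a heartbeat timestamp that carries an offset.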
3. Check System Health
MCP: get_system_health
Look for degraded components that could explain the customer’s issue.
4. Review Recent Activity
MCP: query_audit_logs
  workspace_id: "<workspace_id>"
  limit: 20
Check for unusual patterns: repeated errors, failed operations, config changes.
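One way to scan the returned entries for those patterns is sketched below. The entry fields action and outcome, the "failure" value, and the "config." action prefix are all hypothetical; adjust them to whatever query_audit_logs actually returns.

```python
from collections import Counter


def summarize_audit_entries(entries: list[dict]) -> dict:
    """Summarize repeated errors, failed operations, and config changes.

    Entry fields (action, outcome) are hypothetical, not the real schema.
    """
    failures = Counter(
        e.get("action", "unknown") for e in entries if e.get("outcome") == "failure"
    )
    return {
        # Actions that failed more than once among the fetched entries.
        "repeated_failures": {a: n for a, n in failures.items() if n > 1},
        # Total number of failed entries, repeated or not.
        "failed_operations": sum(failures.values()),
        # Entries whose action name marks a configuration change (assumed prefix).
        "config_changes": [
            e for e in entries if str(e.get("action", "")).startswith("config.")
        ],
    }
```

With limit: 20 the summary only covers the most recent entries, so an empty result does not rule out older problems.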
5. Assess and Route
| Finding | Next Playbook |
|---|---|
| Agent disconnected | Agent Connectivity |
| Analysis jobs failing | Analysis Failures |
| All systems healthy but customer sees issues | Check frontend logs, ask for screenshots |
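The routing table can also be expressed as a lookup, which is handy if triage is scripted. The finding keys below are made-up labels for the three rows, and falling back to escalation for anything unmatched is an assumption based on the Escalation section.

```python
# Routing table from step 5, keyed by hypothetical finding labels.
NEXT_STEP = {
    "agent_disconnected": "Agent Connectivity",
    "analysis_jobs_failing": "Analysis Failures",
    "healthy_but_customer_sees_issues": "Check frontend logs, ask for screenshots",
}


def route(finding: str) -> str:
    """Return the next playbook or action; unmatched findings go to escalation."""
    return NEXT_STEP.get(finding, "Escalation")
```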
Escalation
If the issue cannot be identified via MCP tools, escalate with:
- Organization ID and workspace ID
- Output from steps 1-4
- Customer’s exact error message or screenshot
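A minimal sketch of bundling those items so that nothing is left out. The function name and JSON layout are illustrative assumptions, not a required escalation format.

```python
import json


def build_escalation(organization_id, workspace_id, step_outputs, customer_error):
    """Assemble the escalation details listed above as a JSON string.

    step_outputs maps step numbers 1 to 4 to whatever each step returned.
    Raises ValueError if the output of any step is missing.
    """
    missing = [step for step in (1, 2, 3, 4) if step not in step_outputs]
    if missing:
        raise ValueError(f"missing output for steps: {missing}")
    return json.dumps(
        {
            "organization_id": organization_id,
            "workspace_id": workspace_id,
            "step_outputs": {str(s): step_outputs[s] for s in (1, 2, 3, 4)},
            "customer_error": customer_error,
        },
        indent=2,
    )
```

Failing on a missing step is a deliberate choice here: it forces steps 1-4 to be completed before the issue is handed off.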