# Anthropic AI Runbook Automation for SMB Field Service Dispatching
 
> Automatically detect agent failures in field service workflows and execute predefined runbooks to restore operations without human intervention.
 
SMB field service businesses lose revenue when their AI dispatch agents fail after hours or at peak times. Manual recovery is slow and depends on operations staff that small teams can't afford. This recipe deploys a reliability layer that uses Anthropic's reasoning engine (Claude) to analyze failure signals and the @reaatech/agent-runbook packages to triage and remediate failures automatically.
 
## Architecture
 
Failure events move through the pipeline in this order:

1. Trigger.dev sends a webhook, and `POST /api/runbooks/trigger` receives the failure event.
2. @reaatech/agent-runbook-health-checks runs health check probes.
3. @reaatech/agent-runbook-failure-modes classifies the failure.
4. @reaatech/agent-runbook-runbook generates the runbook.
5. @reaatech/agent-runbook-agent executes the runbook with Claude.
6. @slack/web-api posts a Slack alert when the run escalates to a human.

@reaatech/agent-runbook-observability exports observability data to Langfuse.
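
The flow above can be read as a linear pipeline with one conditional escalation step. The sketch below wires the stages together as injected functions so the ordering is easy to see and to test. Every type and function name here (`FailureEvent`, `Stages`, `runPipeline`, and so on) is hypothetical and does not reflect the actual @reaatech package APIs; see `packages/` for those.

```typescript
// Hypothetical shapes; the real @reaatech/* packages may expose different types.
interface FailureEvent {
  agentId: string;
  failureType: string;
  failureDetails: Record<string, unknown>;
  timestamp: string;
}

interface PipelineResult {
  runbookId: string;
  success: boolean;
  requiresHuman: boolean;
}

// Each stage is injected so the orchestration can be exercised without network access.
interface Stages {
  probeHealth(agentId: string): Promise<boolean>;
  classify(event: FailureEvent, healthy: boolean): Promise<string>;
  buildRunbook(failureMode: string): Promise<{ id: string; steps: string[] }>;
  execute(steps: string[]): Promise<{ success: boolean; requiresHuman: boolean }>;
  notifyHuman(runbookId: string, event: FailureEvent): Promise<void>;
}

async function runPipeline(event: FailureEvent, stages: Stages): Promise<PipelineResult> {
  const healthy = await stages.probeHealth(event.agentId);
  const failureMode = await stages.classify(event, healthy);
  const runbook = await stages.buildRunbook(failureMode);
  const outcome = await stages.execute(runbook.steps);
  // Escalate only when execution reports that a human is needed.
  if (outcome.requiresHuman) {
    await stages.notifyHuman(runbook.id, event);
  }
  return { runbookId: runbook.id, success: outcome.success, requiresHuman: outcome.requiresHuman };
}
```

Injecting the stages keeps the orchestration independent of any one package, which may also make it easier to stub stages in unit tests.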
 
## Setup
 
Copy `.env.example` to `.env` and fill in all 10 environment variables:
 
```
ANTHROPIC_API_KEY=<your-anthropic-api-key>
SLACK_TOKEN=<your-slack-bot-token>
SLACK_CHANNEL_ID=<your-slack-channel-id>
TRIGGER_API_KEY=<your-trigger-dev-api-key>
TRIGGER_PROJECT_ID=<your-trigger-dev-project-id>
TRIGGER_ENVIRONMENT=<production|staging>
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
LANGFUSE_HOST=<your-langfuse-host>
NODE_ENV=development
```
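
A startup check can catch an incomplete `.env` before the first webhook arrives. The following is a minimal sketch, not code from this recipe: `REQUIRED_ENV_VARS` and `findMissingEnv` are hypothetical names, and the list simply repeats the ten variables shown above.

```typescript
// The ten variable names listed in .env.example above.
const REQUIRED_ENV_VARS = [
  "ANTHROPIC_API_KEY",
  "SLACK_TOKEN",
  "SLACK_CHANNEL_ID",
  "TRIGGER_API_KEY",
  "TRIGGER_PROJECT_ID",
  "TRIGGER_ENVIRONMENT",
  "LANGFUSE_PUBLIC_KEY",
  "LANGFUSE_SECRET_KEY",
  "LANGFUSE_HOST",
  "NODE_ENV",
] as const;

type EnvName = (typeof REQUIRED_ENV_VARS)[number];

// Returns the names that are unset or blank; an empty array means the config is complete.
function findMissingEnv(env: Record<string, string | undefined>): EnvName[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Calling `findMissingEnv(process.env)` at boot and throwing when the result is non-empty would be one way to fail fast.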
 
## Quick start
 
```bash
pnpm install && pnpm typecheck && pnpm test && pnpm dev
```
 
## API reference
 
### `POST /api/runbooks/trigger`
 
Receives failure events from Trigger.dev webhooks and executes the runbook pipeline.
 
**Request body:**
```json
{
  "agentId": "dispatch-agent-01",
  "failureType": "timeout",
  "failureDetails": { "duration": 5000 },
  "timestamp": "2026-06-13T00:00:00Z"
}
```
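
The source does not state which fields are required, so the sketch below assumes all four fields in the example are mandatory. `TriggerRequest` and `validateTriggerRequest` are hypothetical names, and the error messages are illustrative rather than the ones the route returns.

```typescript
interface TriggerRequest {
  agentId: string;
  failureType: string;
  failureDetails: Record<string, unknown>;
  timestamp: string;
}

type ValidationResult =
  | { ok: true; value: TriggerRequest }
  | { ok: false; errors: string[] };

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Collects every problem instead of stopping at the first one.
function validateTriggerRequest(body: unknown): ValidationResult {
  if (!isPlainObject(body)) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const errors: string[] = [];
  const { agentId, failureType, failureDetails, timestamp } = body;
  if (typeof agentId !== "string" || agentId === "") errors.push("agentId is required");
  if (typeof failureType !== "string" || failureType === "") errors.push("failureType is required");
  if (!isPlainObject(failureDetails)) errors.push("failureDetails must be an object");
  // Date.parse accepts ISO 8601 strings such as the one in the example above.
  if (typeof timestamp !== "string" || Number.isNaN(Date.parse(timestamp))) {
    errors.push("timestamp must be a parseable date string");
  }
  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    value: {
      agentId: agentId as string,
      failureType: failureType as string,
      failureDetails: failureDetails as Record<string, unknown>,
      timestamp: timestamp as string,
    },
  };
}
```

Returning all errors at once lets a webhook sender fix a malformed payload in a single round trip.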
 
**Response body (200):**
```json
{
  "runbookId": "a1b2c3d4-...",
  "success": true,
  "summary": "# Runbook: timeout recovery\n...",
  "requiresHuman": false,
  "completenessScore": 0.92
}
```
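
A caller can use `requiresHuman` to decide whether to alert anyone. The sketch below turns a response into alert text; `RunbookResult` and `buildEscalationText` are hypothetical names, and treating `completenessScore` as a fraction between 0 and 1 is an assumption based on the example value.

```typescript
interface RunbookResult {
  runbookId: string;
  success: boolean;
  summary: string;
  requiresHuman: boolean;
  completenessScore: number; // assumed to be a fraction between 0 and 1
}

// Returns alert text for the on-call channel, or null when no human is needed.
function buildEscalationText(agentId: string, result: RunbookResult): string | null {
  if (!result.requiresHuman) return null;
  const status = result.success ? "completed" : "failed";
  const percent = Math.round(result.completenessScore * 100);
  return `Runbook ${result.runbookId} for ${agentId} ${status} and needs human review (completeness ${percent}%).`;
}
```

The returned string could be passed as the `text` argument to `chat.postMessage` in @slack/web-api, with `SLACK_CHANNEL_ID` as the channel.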
 
**Status codes:**
 
| Code | Description |
|------|-------------|
| 200  | Runbook executed successfully |
| 400  | Validation error — missing required fields |
| 500  | Server error — runbook or Anthropic API failure |
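
One way to keep the status mapping in the table in a single place is a small function over a tagged outcome. This is a sketch under stated assumptions: `Outcome` and `toHttpResponse` are hypothetical names, and the error body shapes are invented for illustration, since the source only documents the 200 response.

```typescript
type Outcome =
  | { kind: "ok"; result: Record<string, unknown> }
  | { kind: "invalid"; errors: string[] }
  | { kind: "failed"; cause: string };

interface HttpResponse {
  status: 200 | 400 | 500;
  body: Record<string, unknown>;
}

function toHttpResponse(outcome: Outcome): HttpResponse {
  switch (outcome.kind) {
    case "ok":
      return { status: 200, body: outcome.result };
    case "invalid":
      // Missing or malformed request fields.
      return { status: 400, body: { error: "validation_error", details: outcome.errors } };
    case "failed":
      // Runbook or Anthropic API failure; the cause is meant for logs, not echoed to the caller.
      return { status: 500, body: { error: "server_error" } };
  }
}
```

Because the `switch` is exhaustive over `kind`, adding a new outcome later should produce a compile error until it is mapped to a status.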
 
## Dependencies
 
### REAA packages (6)
 
| Package | Version | Description |
|---------|---------|-------------|
| @reaatech/agent-runbook-agent | 0.1.0 | Claude-powered execution of structured runbook steps |
| @reaatech/agent-runbook-runbook | 0.1.0 | Build, validate, and export runbook artifacts |
| @reaatech/agent-runbook-failure-modes | 0.1.0 | Map failure symptoms to mitigation playbooks |
| @reaatech/agent-runbook-health-checks | 0.1.0 | Probe agent endpoints and generate Kubernetes probe YAML |
| @reaatech/agent-runbook-alerts | 0.1.0 | Generate and format alerts for Slack, Prometheus, and PagerDuty |
| @reaatech/agent-runbook-observability | 0.1.0 | Logging, tracing, metrics, and Langfuse export |
 
### Third-party packages (3)
 
| Package | Version | Description |
|---------|---------|-------------|
| @anthropic-ai/sdk | 0.104.1 | Anthropic Claude API client |
| @trigger.dev/sdk | 4.4.6 | Durable, fault-tolerant webhook orchestration |
| @slack/web-api | 7.17.0 | Slack messaging client for human escalation alerts |
 
## Testing
 
The suite runs on **Vitest** with v8 coverage and enforces 90%+ thresholds for lines, branches, functions, and statements. Unit tests mock external packages with `vi.mock`, and **MSW** intercepts HTTP calls to the Anthropic API. Coverage excludes UI files (`app/**/*.tsx`).
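
The coverage settings described above might look roughly like this in `vitest.config.ts`. This is an illustrative fragment, not the recipe's actual config: it assumes the floor is exactly 90 for every metric and that a Vitest version with the `coverage.thresholds` option is installed.

```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      // Assumed floor of 90% on every metric.
      thresholds: { lines: 90, branches: 90, functions: 90, statements: 90 },
      // UI components are left out of the coverage report.
      exclude: ["app/**/*.tsx"],
    },
  },
});
```

With thresholds set, a coverage run exits non-zero when any metric falls below its floor.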
 
**54 tests across 8 test files:**
 
| File | Focus |
|------|-------|
| tests/observability.test.ts | Observability layer initialization and span tracking |
| tests/health-check.test.ts | Health probe HTTP calls and Kubernetes YAML generation |
| tests/runbook-engine.test.ts | Failure classification, mitigation, runbook generation |
| tests/notify.test.ts | Slack alert formatting and error handling |
| tests/trigger.test.ts | Webhook handler validation and response shaping |
| tests/index.test.ts | Public API exports and TypeScript compilation |
| tests/integration.test.ts | Full end-to-end webhook-to-result flow with MSW |
| tests/instrumentation.test.ts | Next.js instrumentation hook registration |
 
## Project layout
 
```
app/                  Next.js App Router pages + API routes
src/                  services, lib, adapters
tests/                vitest suite (mirrors src/)
packages/             API references for every dependency (read these first)
DEV_PLAN.md           build plan for this recipe
```
 
## License
 
MIT — see [LICENSE](./LICENSE).