These packages provide a CLI, a library, and an MCP server that scan a service repository and produce a complete operator runbook: alerts, dashboards, failure modes, rollback steps, incident workflows, health checks, and dependency maps. Adopt them to automate the creation and maintenance of runbooks for every service in your organization, replacing manual documentation that goes stale. The packages are independent, composable modules (analyzer, alerts, dashboards, etc.) that share core types and Zod schemas, so you can run the full pipeline via the CLI or pick individual packages for programmatic use.
An AI agent class (`AnalysisAgent`) and factory function (`createAnalysisAgent`) that wrap Anthropic Claude, OpenAI, and Google Gemini LLMs with pre-built prompt templates for automated code analysis, failure mode identification, and runbook section generation.
Extracts existing alert definitions from Prometheus, Datadog, and CloudWatch configs and generates new SLO-based, resource, and application alerts, providing functions like `extractAlerts`, `generateAlerts`, and `formatAlertsForPlatform`.
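To make the idea of an SLO-based alert concrete, here is a minimal sketch of deriving a Prometheus-style alerting rule from an availability target. It is not the package's implementation: the `SloTarget` and `AlertRule` types, the `buildBurnRateAlert` helper, the `http_requests_total` metric name, and the 14.4x fast-burn factor are all assumptions chosen for illustration.

```typescript
// Hypothetical types for illustration; the real package's types may differ.
interface SloTarget {
  service: string;
  objective: number; // e.g. 0.999 for 99.9% availability
}

interface AlertRule {
  alert: string;
  expr: string;
  for: string;
  labels: Record<string, string>;
}

// Build a fast-burn alert: fire when the 1h error ratio exceeds
// burnRate times the error budget (1 - objective).
function buildBurnRateAlert(slo: SloTarget, burnRate: number = 14.4): AlertRule {
  if (slo.objective <= 0 || slo.objective >= 1) {
    throw new RangeError("objective must be strictly between 0 and 1");
  }
  const errorBudget = 1 - slo.objective;
  // Round to 6 significant digits to keep floating-point noise out of the expression.
  const threshold = Number((burnRate * errorBudget).toPrecision(6));
  const selector = `service="${slo.service}"`;
  // Assumed metric name and label scheme; adjust to the service's real metrics.
  const expr =
    `sum(rate(http_requests_total{${selector},code=~"5.."}[1h])) / ` +
    `sum(rate(http_requests_total{${selector}}[1h])) > ${threshold}`;
  return {
    alert: `${slo.service}HighErrorBudgetBurn`,
    expr,
    for: "2m",
    labels: { severity: "critical" },
  };
}

const rule = buildBurnRateAlert({ service: "checkout", objective: 0.999 });
console.log(rule.expr);
```

For a 99.9% objective the error budget is 0.1%, so the sketch alerts when the one-hour error ratio exceeds 14.4 × 0.001 = 0.0144.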
Scans a service repository to detect its language, framework, deployment platform, configuration files, entry points, API endpoints, external service connections, and package dependencies, returning structured analysis objects from functions like `scanRepository`, `mapDependencies`, `parseConfigs`, and `analyzeCode`.
A CLI and programmatic entry point that generates operator runbooks from service repositories using AI analysis, providing five commands (`analyze`, `generate`, `validate`, `export`, `serve`) and re-exporting all public APIs from the `@reaatech/agent-runbook-*` ecosystem.
Generates Grafana and CloudWatch dashboard configurations by scanning a codebase to identify relevant service metrics and producing complete dashboard panels with queries, thresholds, and legends. Exports functions like `identifyMetrics`, `generateDashboard`, and platform-specific formatters.
Identifies potential failure points in a codebase by analyzing code patterns, categorizes them into 10 types (e.g., dependency, security, database) with severity scores, and generates actionable mitigation strategies including circuit breaker and retry configurations. Exports functions like `identifyFailureModes`, `generateMitigations`, and `getAllFailureModes`.
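As an illustration of what a generated retry mitigation might encode, the sketch below computes a capped exponential backoff schedule from a small config object. The `RetryConfig` shape and the `backoffSchedule` helper are hypothetical and are not taken from the package's schema.

```typescript
// Hypothetical shape for a retry mitigation; not the package's actual schema.
interface RetryConfig {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number;
  maxDelayMs: number;
}

// Delay before each retry: baseDelayMs * 2^retryIndex, capped at maxDelayMs.
// With maxAttempts total attempts there are maxAttempts - 1 retries.
function backoffSchedule(config: RetryConfig): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < config.maxAttempts - 1; retry++) {
    const uncapped = config.baseDelayMs * Math.pow(2, retry);
    delays.push(Math.min(uncapped, config.maxDelayMs));
  }
  return delays;
}

const schedule = backoffSchedule({ maxAttempts: 5, baseDelayMs: 200, maxDelayMs: 2000 });
console.log(schedule); // [ 200, 400, 800, 1600 ]
```

Production retry policies usually add random jitter to each delay so that many clients do not retry in lockstep; it is left out here to keep the schedule deterministic.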
Identifies existing health check endpoints in a codebase and generates liveness, readiness, and startup probe definitions for Kubernetes and load balancers. Exports functions like `identifyHealthChecks`, `generateHealthChecks`, `generateKubernetesProbeYaml`, and `generateLoadBalancerConfig`.
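The sketch below shows the kind of output a probe generator could produce: a YAML fragment for a container spec, built from a detected health endpoint. The `HttpProbe` type and `renderProbeYaml` function are hypothetical stand-ins, not the signature of `generateKubernetesProbeYaml`; the emitted field names (`httpGet`, `path`, `port`, `periodSeconds`, `failureThreshold`) follow the Kubernetes Probe spec.

```typescript
// Hypothetical input shape; emitted field names mirror the Kubernetes Probe spec.
interface HttpProbe {
  kind: "livenessProbe" | "readinessProbe" | "startupProbe";
  path: string;
  port: number;
  periodSeconds: number;
  failureThreshold: number;
}

// Render one HTTP probe as a YAML fragment to paste under a container entry.
function renderProbeYaml(probe: HttpProbe): string {
  return [
    `${probe.kind}:`,
    `  httpGet:`,
    `    path: ${probe.path}`,
    `    port: ${probe.port}`,
    `  periodSeconds: ${probe.periodSeconds}`,
    `  failureThreshold: ${probe.failureThreshold}`,
  ].join("\n");
}

const probeYaml = renderProbeYaml({
  kind: "readinessProbe",
  path: "/healthz", // assumed endpoint; use whatever the scan actually finds
  port: 8080,
  periodSeconds: 10,
  failureThreshold: 3,
});
console.log(probeYaml);
```

String templating is enough for a fixed, flat structure like this; a generator that handles arbitrary paths or nested options would be safer using a YAML serializer so that values are quoted correctly.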
Generates SEV1–SEV4 incident response workflows, escalation policies, and communication templates for the Agent Runbook Generator. Exports functions like `generateIncidentWorkflows`, `generateEscalationPolicy`, and `getTemplatesByCategory` that take an analysis context and configuration and return structured workflow and template objects.
An MCP server that exposes 16 tools for analyzing repository structure, generating operator runbooks, and validating runbook completeness, consumable by AI coding agents via stdio transport.
Provides structured logging via Pino, distributed tracing via OpenTelemetry, and Prometheus-compatible metrics for tracking runbook generation, agent costs, and quality. Exports a set of initialization functions and recording helpers (e.g., `initLogger`, `startGenerationSpan`, `recordGeneration`).